FlexChem

From Geos-chem
Revision as of 15:12, 17 February 2021 by Bmy (Talk | contribs) (Installing and building the KPP source code)

Jump to: navigation, search

On this page we provide information about FlexChem in GEOS-Chem.

Overview

FlexChem is a clean implementation of the Kinetic Pre-Processor (KPP) present in GEOS-Chem v11-01 and later versions. Within FlexChem, there is a single supported chemical mechanisms (named fullchem), but users may also define their own custom mechanisms.

The source code in flexchem_mod.F90 serves as the connection between the KPP chemical mechanism files and GEOS-Chem. It passes initial species concentrations, photolysis rates, meteorology fields, etc. to KPP., KPP then computes rates and runs the chemical solver, and finally the final concentrations are obtained from KPP and passed back to GEOS-Chem.

The main benefits of FlexChem are:

  1. Better documentation of chemical mechanisms;
  2. Easier to drop in other chemical mechanisms;
  3. Optimized chemistry computations; and
  4. Removal of the old SMVGEAR solver (used prior to GEOS-Chem v11-01).

Overview

The Kinetic Pre Processor (KPP) package creates optimized chemical solver code in Fortran-90 from a user-defined mechanism specification. The resulting Fortran-90 code can be added into chemical models such as GEOS-Chem. Adding changes to a mechanism can be simply done by editing the mechanism's configuration files and then re-running KPP to generate new Fortran output files.

Requirements

KPP requires the following:

  1. A C language compiler (such as gcc, from the GNU Compiler Collection)
  2. The flex library. This is often installed on many computer systems, or can be easily installed with a package manager such as Spack.

Depending on your setup, you might have to load these packages with the module load or spack load commands. Ask your sysadmin for more information.

Quick start

Installing KPP

Current KPP version for use with GEOS-Chem: 2.3.0_gc

Download KPP to your home directory with this command:

git clone -b GC_updates https://github.com/geoschem/KPP.git
  • The -b GC_updates will check out the GC_updates branch instead of the main branch.
  • IMPORTANT! Do not download the KPP source code into your GEOS-Chem source code directory. This will avoid confusion with the KPP folder that is already within GEOS-Chem, which contains Fortran files created by KPP that define the chemical mechanism.

Building KPP

Make sure that your computer system has a C compiler.

Once you have downloaded KPP and have made sure that a C compiler exists on your system, you may use these commands to build the KPP executable file:

cd KPP/kpp-code
make distclean
make all

If the build completes successfully, you will see an executable file named kpp in the KPP/kpp-code/bin/ folder.

Setting environment variables

Once have built KPP, you must add the path to the KPP executable file to your PATH environment variable.

Settings for bash

If you use the bash Unix shell, add these lines to your ~/.bash_aliases file. (If you don't have a ~/.bash_aliases file, you can add these lines to your ~/.bashrc file instead.)

export PATH=$PATH:/PATH_TO_KPP/KPP/kpp-code/bin/
export KPP_HOME=PATH_TO_KPP/KPP/kpp-code

Settings for csh or tcsh

If you use the csh or tcsh Unix shell, add these lines to your ~/.cshrc file:

setenv PATH $PATH:/PATH_TO_KPP/KPP/kpp-code/bin/
setenv KPP_HOME=PATH_TO_KPP/KPP/kpp-code

NOTES:

  • In the above examples, the PATH_TO_KPP is the path to the main KPP folder.
  • For example, if you installed KPP into your home directory, then PATH_TO_KPP would be ~/KPP, etc.

Specifying a custom chemical mechanism

Once you have downloaded and installed KPP on your computer system, you can proceed to creating a custom chemistry mechanism.

We recommend that you build your new chemistry mechanism in the KPP/custom folder. To create a custom mechanism, you will need to edit the following files:

  1. KPP/custom/custom.eqn: Specifies the chemical species and reaction list
  2. KPP/custom/gckpp.kpp: Specifies the choice of solver, language, list of chemical families, and rate-law functions

These are copies of the GEOS-Chem fullchem mechanism configuration files (KPP/fullchem/fullchem.eqn and KPP/fullchem/gckpp.kpp). This will easily let you create your own mechanism using the fullchem mechanism as your starting point.

Add species

List chemically-active (aka variable) species in the #DEFVAR section of custom.eqn, as shown below:

#DEFVAR

A3O2       = IGNORE; {CH3CH2CH2OO; Primary RO2 from C3H8}
ACET       = IGNORE; {CH3C(O)CH3; Acetone}
ACTA       = IGNORE; {CH3C(O)OH; Acetic acid}
...etc ...

List species whose concentrations do not change in the #DEFFIX of custom.eqn, as shown below:

#DEFFIX

H2         = IGNORE; {H2; Molecular hydrogen}
N2         = IGNORE; {N2; Molecular nitrogen}
O2         = IGNORE; {O2; Molecular oxygen}
... etc ...

Species may be listed in any order, but we have found it convenient to list them alphabetically.

Add gas-phase reactions

List gas-phase reactions first in the #EQUATIONS section of custom.eqn.

#EQUATIONS
//
// Gas-phase reactions
//
// NOTES:
// ------
// (1) Be sure to use "D" exponents to force double precision values!
//     (i.e. write 1.70d-12 instead of 1.70e-12, etc.).
//        -- Bob Yantosca (16 Dec 2020)
//
// (2) This file might not render properly if the right hand side of the
//     equation is longer than ~100 characters.  This seems to be an issue
//     with the KPP code itself.  See this Github issue at geoschem/KPP:
//     https://github.com/geoschem/KPP/issues/1
//        -- Bob Yantosca (16 Dec 2020)
//
// (3) To avoid useless CPU cycles, we have introduced new rate law functions
//     that skip computing Arrhenius terms (and other terms) that would
//     evaluate to 1.  The Arrhenius terms that are passed to the function
//     are in most cases now noted in the function name (e.g. GCARR_abc takes
//     Arrhenius A, B, C parameters but GCARR_ac only passes A and C
//     parameters because B=0 and the (300/T)*B would evaluate to 1).
//     This should be much more computationally efficient, as these functions
//     are called (sometimes multiple times) for each grid box where we
//     perform chemistry.
//        -- Bob Yantosca (25 Jan 2020)
//
//
O3 + NO = NO2 + O2 :                         GCARR_ac(3.00d-12, -1500.0d0);
O3 + OH = HO2 + O2 :                         GCARR_ac(1.70d-12, -940.0d0);
O3 + HO2 = OH + O2 + O2 :                    GCARR_ac(1.00d-14, -490.0d0);
O3 + NO2 = O2 + NO3 :                        GCARR_ac(1.20d-13, -2450.0d0);
... etc ...

General form

No matter what reaction is being added, the general procedure is the same. A new line must be added to custom.eqn of the following form:

A + B = C + 2.000D : RATE_LAW_FUNCTION(ARG_A, ARG_B ...);

The denotes the reactants (A and B) as well as the products (C and D) of the reaction. If exactly one molecule is consumed or produced, then the factor can be omitted; otherwise the number of molecules consumed or produced should be specified with at least 1 decimal place of accuracy. The final section, between the colon and semi-colon, specifies the function (RATE_LAW_FUNCTION) and its arguments which will be used to calculate the reaction rate constant k. Rate-law functions are specified in the gckpp.kpp file.

For an equation such as the one above, the overall rate at which the reaction will proceed is determined by k[A][B]. However, if the reaction rate does not depend on the concentration of A or B, you may write it with a constant value, such as:

A + B = C + 2.000D : 8.95d-17

This will save the overhead of a function call. As noted in the comments above, we use double-precision numerical constants (e.g. 8.95d-17 or 8.95e17_dp) as arguments to rate law functions.

Rates for two-body reactions according to the Arrhenius law

For many reactions the calculation of k follows the Arrhenius law:

k = a0 + ( 300 / TEMP )**b0 + EXP( c0 / TEMP )

For example, the JPL chemical data evaluation (Feb 2017) specifies that the reaction O3 + NO produces NO2 and O2, and its Arrhenius parameters are A = 3.0x10^-12 and E/R = c0 = 1500. The reaction rate k for this reaction is computed as:

k = 3.0x10^-12 + EXP( 1500 / TEMP )

Note that we are able to omit the computation of the term ( 300 / TEMP )**b0, since that would evaluate to 1. The EXP and ** mathematical operations are costly in terms of CPU clock cycles. To avoid a computational bottleneck, computing terms that evaluate to 1 should be avoided as much as possible.

For computational efficiency, we have defined the following Arrhenius rate law functions in gckpp.kpp for you to use:

  1. GCARR_abc(a0, b0, c0): Use when a0 > 0 and b0 > 0 and c0 > 0
  2. GCARR_ab(a0, b0): Use when a0 > 0 and b0 > 0
  3. GCARR_ac(a0, c0): Use when a0 > 0 and c0 > 0

For example, we would write the O3 + NO reaction in our KPP/custom/custom.eqn file as:

O3 + NO = NO2 + O2 : GCARR_ac(3.00d12, -1500.0d0); 

using the rate law function for when both a0 and c0 are nonzero.

Other rate-law functions

The KPP/custom/gckpp.kpp file contains other rate law functions, such as those required for three-body, pressure-dependent reactions. Any rate function which is to be referenced in the KPP/custom/custom.eqn file must be available in KPP/custom/gckpp.kpp prior to building the reaction mechanism.

Add heterogeneous reactions

List heterogeneous reactions after all of the gas-phase reactions in KPP/custom/custom.eqn, according to the format below:

//
// Heterogeneous reactions
//
HO2 = O2 :                                   HET(ind_HO2,1);                      {2013/03/22; Paulot2009; FP,EAM,JMAO,MJE}
NO2 = 0.500HNO3 + 0.500HNO2 :                HET(ind_NO2,1);
NO3 = HNO3 :                                 HET(ind_NO3,1);
NO3 = NIT :                                  HET(ind_NO3,2);                      {2018/03/16; XW}
... etc ...

Implementing new heterogeneous chemistry requires an additional step. For the reaction in question, a reaction should be added as usual, but this time the rate function should be given as an entry in the HET array. A simple example is uptake of HO2, specified as

HO2 = O2 : HET(ind_HO2,1);

Note that the product in this case, O2, is actually a fixed species, so no O2 will actually be produced. O2 is used in this case only as a dummy product to satisfy the KPP requirement that all reactions have at least one product. Here, HET is simply an array of pre-calculated rate constants. The rate constants in HET are actually calculated in KPP/custom/gckpp_HetRates.F90.

To implement an additional heterogeneous reaction, the rate calculation must be added to this file. The following example illustrates a (fictional) heterogeneous mechanism which converts the species XYZ into CH2O. This reaction is assumed to take place on the surface of all aerosols, but not cloud droplets (this requires additional steps not shown here). Three steps would be required:

  1. Add a new line to the KPP/custom/custom.eqn file, such as XYZ = CH2O : HET(ind_XYZ,1);

  2. Add a new function to gckpp_HetRates.F90 designed to calculate the heterogeneous reaction rate. As a simple example, we can copy the function HETNO3 and rename it HETXYZ. This function accepts two arguments: molecular mass of the impinging gas-phase species, in this case XYZ, and the reaction's "sticking coefficient" - the probability that an incoming molecule will stick to the surface and undergo the reaction in question. In the case of HETNO3, it is assumed that all aerosols will have the same sticking coefficient, and the function returns a first-order rate constant based on the total available aerosol surface area and the frequency of collisions.

  3. Add a new line to the function SET_HET in gckpp_HetRates.F90 which calls the new function with the appropriate arguments and passes the calculated constant to HET. Example: assuming a molar mass of 93 g/mol, and a sticking coefficient of 0.2, we would write HET(ind_XYZ, 1) = HETXYZ(9.30E1_fp, 2E-1_fp)

The function HETXYZ can then be specialized to distinguish between aerosol types, or extended to provide a second-order reaction rate, or whatever the user desires.

Add photolysis reactions

List photolysis reactions after the heterogeneous reactions, as shown below.

//
// Photolysis reactions
//
O3 + hv = O + O2 :                           PHOTOL(2);      {2014/02/03; Eastham2014; SDE}
O3 + hv = O1D + O2 :                         PHOTOL(3);      {2014/02/03; Eastham2014; SDE}
O2 + hv = 2.000O :                           PHOTOL(1);      {2014/02/03; Eastham2014; SDE}
... etc ...
NO3 + hv = NO2 + O :                         PHOTOL(12);     {2014/02/03; Eastham2014; SDE}
... etc ...

A photolysis reaction can be specified by giving the correct index of the PHOTOL array. This index can be determined by inspecting the file FJX_j2j.dat file.

NOTE: See the PHOTOLYSIS MENU section of the input.geos file in your run directory for the folder in which FJX_j2j.dat is located).

For example, one branch of the NO3 photolysis reaction is specified in the KPP/custom/custom.eqn file as

NO3 + hv = NO2 + O : PHOTOL(12) 

Referring back to FJX_j2j.dat shows that reaction 12, as specified by the left-most index, is indeed NO3 = NO2 + O:

 12 NO3       PHOTON    NO2       O                       0.886 /NO3   /

If your reaction is not already in FJX_j2j.dat, you may add it there. You may also need to modify FJX_spec.dat (in the same folder ast FJX_j2j.dat) to include cross-sections for your species. Note that if you add new reactions to FJX_j2j.dat you will also need to set the parameter JVN_ in module Headers/CMN_FJX_MOD.F to match the total number of entries.

If your reaction involves new cross section data, you will need to follow an additional set of steps. Specifically, you will need to:

  1. Estimate the cross section of each wavelength bin (using the correlated-k method), and
  2. Add this data to the FJX_spec.dat file.

For the first step, you can use tools already available on the Prather research group website. To generate the cross-sections used by Fast-JX, download the file "UCI_fastJ_addX_73cx.zip" from: [1]. You can then simply add your data to FJX_spec.dat and refer to it in FJX_j2j.dat as specified above. The following then describes how to generate a new set of cross-section data for the example of some new species MEKR:

To generate the photolysis cross sections of a new species, come up with some unique name which you will use to refer to it in the FJX_j2j.dat and FJX_spec.dat files - e.g. MEKR. You will need to copy one of the addX_*.f routines and make your own (say, addX_MEKR.f). Your edited version will need to read in whatever cross section data you have available, and you'll need to decide how to handle out-of-range information - this is particularly crucial if your cross section data is not defined in the visible wavelengths, as there have been some nasty problems in the past caused by implicitly assuming that the XS can be extrapolated (I would recommend buffering your data with zero values at the exact limits of your data as a conservative first guess). Then you need to compile that as a standalone code and run it; this will spit out a file fragment containing the aggregated 18-bin cross sections, based on a combination of your measured/calculated XS data and the non-contiguous bin subranges used by Fast-JX. Once that data has been generated, just add it to FJX_spec.dat and refer to it as above. There are examples in the addX files of how to deal with variations of cross section with temperature or pressure, but the main takeaway is that you will generate multiple cross section entries to be added to FJX_spec.dat with the same name.

An important complication: if your cross section data varies as a function of temperature AND pressure, you need to do something a little different. The acetone XS documentation shows one possible way to handle this; Fast-JX currently interpolates over either T or P, but not both, so if your data varies over both simultaneously then this will take some thought. The general idea seems to be that one determines which dependence is more important and uses that to generate a set of 3 cross sections (for interpolation), assuming values for the unused variable based on the standard atmosphere.

Modifying the gckpp.kpp file

(2) Modify the gckpp.kpp file if needed. This file defines the KPP integrator and tells KPP to use the specified .eqn file to build the mechanism. This file also defines the prod/loss families as described below.

  1. Rebuild the mechanism with KPP by navigating to the top-level KPP directory and typing ./build_mechanism.sh Custom. The build_mechanism.sh script will call on KPP to build the gckpp_*.F90 files. You may choose to rebuild the other supported chemical mechanisms by using ./build_mechanism.sh [NAME], but we recommend testing with the Custom mechanism first to avoid breaking the other mechanisms.
  2. Configure GEOS-Chem for the new mechanism with <

Add production & loss families (if desired)

Certain common families (e.g. POx, LOx) have been pre-defined for you. You will find the family definitions near the top of the KPP/custom/gckpp.kpp file:

#FAMILIES
POx : O3 + NO2 + 2NO3 + PAN + PPN + MPAN + HNO4 + 3N2O5 + HNO3 + BrO + HOBr + BrNO2 + 2BrNO3 + MPN + ETHLN + MVKN + MCRHN + MCRHNB + PROPNN + R4N2 + PRN1 + PRPN + R4N1 + HONIT + MONITS + MONITU + OLND + OLNN + IHN1 + IHN2 + IHN3 + IHN4 + INPB + INPD + ICN + 2IDN + ITCN + ITHN + ISOPNOO1 + ISOPNOO2 + INO2B + INO2D + INA + IDHNBOO + IDHNDOO1 + IDHNDOO2 + IHPNBOO + IHPNDOO + ICNOO + 2IDNOO + MACRNO2 + ClO + HOCl + ClNO2 + 2ClNO3 + 2Cl2O2 + 2OClO + O + O1D + IO + HOI + IONO + 2IONO2 + 2OIO + 2I2O2 + 3I2O3 + 4I2O4;
LOx : O3 + NO2 + 2NO3 + PAN + PPN + MPAN + HNO4 + 3N2O5 + HNO3 + BrO + HOBr + BrNO2 + 2BrNO3 + MPN + ETHLN + MVKN + MCRHN + MCRHNB + PROPNN + R4N2 + PRN1 + PRPN + R4N1 + HONIT + MONITS + MONITU + OLND + OLNN + IHN1 + IHN2 + IHN3 + IHN4 + INPB + INPD + ICN + 2IDN + ITCN + ITHN + ISOPNOO1 + ISOPNOO2 + INO2B + INO2D + INA + IDHNBOO + IDHNDOO1 + IDHNDOO2 + IHPNBOO + IHPNDOO + ICNOO + 2IDNOO + MACRNO2 + ClO + HOCl + ClNO2 + 2ClNO3 + 2Cl2O2 + 2OClO + O + O1D + IO + HOI + IONO + 2IONO2 + 2OIO + 2I2O2 + 3I2O3 + 4I2O4;
PCO : CO;
LCO : CO;
PSO4 : SO4;
LCH4 : CH4;
PH2O2 : H2O2;

NOTES:

  1. The Ox and CO rates are used in GEOS-Chem for computing budgets in the 1-month benchmark simulations.
  2. PSO4 is required for simulations using TOMAS aerosol microphysics.

To add a new prod/loss family, add a new line to the #FAMILIES section with the format

FAM_NAME : MEMBER_1 + MEMBER_2 + ... + MEMBER_N;

The family name must start with P or L to indicate whether KPP should calculate a production or a loss rate.

The maximum number of families allowed by KPP is currently set to 300. Depending on how many prod/loss families you add, you may need to increase that to a larger number to avoid errors in KPP. You can change the number for MAX_FAMILIES in KPP/kpp-code/src/gdata.h and then rebuild KPP.

#define MAX_EQN        1500    /* KPP 2.3.0_gc, Bob Yantosca (11 Feb 2021)  */
#define MAX_SPECIES    1000    /* KPP 2.3.0_gc, Bob Yantosca (11 Feb 2021)  */
#define MAX_SPNAME       30
#define MAX_IVAL         40
#define MAX_EQNTAG       12    /* Max length of equation ID in eqn file     */
#define MAX_K           150    /* Max length of rate expression in eqn file */
#define MAX_ATOMS        10
#define MAX_ATNAME       10
#define MAX_ATNR        250
#define MAX_PATH        120
#define MAX_FILES        20
#define MAX_FAMILIES    300
#define MAX_MEMBERS     150
#define MAX_EQNLEN      200

IMPORTANT: When adding a prod/loss family or changing any of the other settings in gckpp.kpp, the chemistry mechanism will need to be rebuilt with KPP as described above.

--Melissa Sulprizio (talk) 17:13, 9 November 2016 (UTC)

Saving out production & loss rates

NetCDF format

GEOS-Chem v11-02 introduces the option to save GEOS-Chem diagnostics to netCDF format by compiling with NC_DIAG=y. To save out the prod/loss diagnostics to netCDF format, you can add the following field names to your defined collection(s) in HISTORY.rc:

  'Prod_?PRD?',                  'GIGCchem',  
  'Loss_?LOS?',                  'GIGCchem',  

where ?PRD? and ?LOS? are wildcards that will be placed with all production and loss species as defined in GEOS-Chem. NOTE: GCHP doesn't accept wildcards at this time, so each prod/loss species will need to be listed separately.

--Melissa Sulprizio (talk) 17:21, 6 February 2018 (UTC)

Building the solver files for GEOS-Chem with KPP

TBD

Previous issues that are now resolved

FlexChem bug fix: do not zero ACTA, EOH, HCOOH

This fix was included in GEOS-Chem 12.0.0.

Katie Travis wrote:

I am working on a VOC simulation, and noticed that in my copy of v11-02f, the following species are set to zero in two places:
     ! Zero certain species
     C(ind_ACTA) = 0.e0_dp
     C(ind_EOH) = 0.e0_dp
     C(ind_HCOOH) = 0.e0_dp
And
  C(ind_ACTA)  = 0.0_dp
  C(ind_HCOOH) = 0.0_dp
Since none of these species are fixed in Tropchem.eqn, shouldn’t they NOT be set to zero?

Mike Long wrote:

I think the code should be removed. This must have been a patch added to maintain parity with SMVGEAR w/o anticipating that the species would become active.

--Bob Yantosca (talk) 16:19, 17 May 2018 (UTC)

Remove calls to UPDATE_SUN, UPDATE_RCONST from gckpp_Integrator.F90

This update was included in GEOS-Chem v11-01 provisional release

A slow down in GEOS-Chem run time was observed following the implementation of FlexChem in v11-01. To resolve this, a temporary workaround was implemented. This fix may be replaced with a more permanent solution in GEOS-Chem v11-01 public release.

Bob Yantosca wrote:

KPP automatically places calls to UPDATE_SUN and UPDATE_RCONST in the routines FunTemplate and JacTemplate (both in the gckpp_Integrator.F90 module). This assumes that you are not interfacing KPP into any other model, and that you will use KPP to compute the sun angles at each timestep.

We now call UPDATE_RCONST once per grid box before calling the KPP integrator. Also, because we use FAST-JX to get the photo rates, we no longer need to call UPDATE_SUN. These duplicate calls were causing a performance bottleneck, as UPDATE_RCONST was being called more than 7 million times per day.

We have removed the duplicate calls from the gckpp_Integrator.F90 modules in each of the chemistry mechanisms. But we will also need to make sure that when building KPP fresh from an equation file, that this step gets automatically added to the build sequence.

--Melissa Sulprizio (talk) 19:53, 9 January 2017 (UTC)

Known issues

None at this time.