Difference between revisions of "The HEMCO User's Guide"

From Geos-chem
(Formatting page still (work in progress))
(Formatting page still (work in progress))
Line 526: Line 526:
 
Currently, HEMCO can read data from the following data sources:
  
Gridded data from a netCDF file. More details on the netCDF file are given below. In an ESMF environment, the MAPL/ESMF generic I/O routines are used to read/remap the data. In a non-ESMF environment, the HEMCO generic reading and remapping algorithms are used. These support vertical regridding, unit conversion, and more (see below).
+
* Gridded data from a netCDF file. More details on the netCDF file are given below. In an ESMF environment, the MAPL/ESMF generic I/O routines are used to read/remap the data. In a non-ESMF environment, the HEMCO generic reading and remapping algorithms are used. These support vertical regridding, unit conversion, and more (see below).
Scalar data directly specified in the HEMCO configuration file. If multiple values, separated by the separator sign (/), are provided, they are interpreted as temporally changing values: 7 values = Sun, Mon, ..., Sat; 12 values = Jan, Feb, ..., Dec; 24 values = hours 0, 1, ..., 23 (local time).
+
* Scalar data directly specified in the HEMCO configuration file. If multiple values, separated by the separator sign (/), are provided, they are interpreted as temporally changing values: 7 values = Sun, Mon, ..., Sat; 12 values = Jan, Feb, ..., Dec; 24 values = hours 0, 1, ..., 23 (local time).
 
For masks, exactly four values must be provided, interpreted as lower left and upper right mask box corners (lon1/lat1/lon2/lat2).
Country-specific data specified in a separate ASCII file. This file must end with the suffix ’.txt’ and hold the country-specific values listed by country ID. The IDs must correspond to the IDs of a corresponding (netCDF) mask file. The container name of this mask file must be given in the first line of the file, and must be listed in the HEMCO configuration file. ID 0 is reserved for the default values, applied to all countries with no specific values listed. The .txt file must be structured as follows:
+
* Country-specific data specified in a separate ASCII file. This file must end with the suffix ’.txt’ and hold the country-specific values listed by country ID. The IDs must correspond to the IDs of a corresponding (netCDF) mask file. The container name of this mask file must be given in the first line of the file, and must be listed in the HEMCO configuration file. ID 0 is reserved for the default values, applied to all countries with no specific values listed. The .txt file must be structured as follows:
CountryMask
+
#CountryMask
# CountryName CountryID CountryValues
+
## CountryName CountryID CountryValues
DEFAULT 0 1.0/2.0/3.0/4.0/5.0/6.0/7.0
+
#DEFAULT 0 1.0/2.0/3.0/4.0/5.0/6.0/7.0
 
The CountryValues are interpreted the same way as scalar values, except that they are applied to all grid boxes with the given country ID.
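
As a rough illustration of the scalar and country-value rules above, the following Python sketch splits a value string at the separator and selects the entry that applies at a given local date and time. It is not HEMCO code: the function names (parse_values, select_value, parse_country_line) are hypothetical, and it assumes the default separator (/) and whitespace-separated columns in the .txt file.

```python
from datetime import datetime

def parse_values(text, separator="/"):
    """Split a value string such as '1.0/2.0/3.0' into a list of floats."""
    return [float(item) for item in text.split(separator)]

def select_value(values, local_time):
    """Pick the entry that applies at local_time, based on the number of values."""
    n = len(values)
    if n == 1:
        return values[0]
    if n == 7:
        # datetime.weekday(): Monday = 0 ... Sunday = 6; shift so that Sunday = 0
        return values[(local_time.weekday() + 1) % 7]
    if n == 12:
        return values[local_time.month - 1]
    if n == 24:
        return values[local_time.hour]
    raise ValueError(f"cannot interpret {n} values as a time series")

def parse_country_line(line, separator="/"):
    """Parse one 'CountryName CountryID CountryValues' line (hypothetical helper)."""
    name, country_id, values = line.split()
    return name, int(country_id), parse_values(values, separator)

name, cid, vals = parse_country_line("DEFAULT 0 1.0/2.0/3.0/4.0/5.0/6.0/7.0")
# 22 March 2015 was a Sunday, so the first of the seven values applies
sunday_value = select_value(vals, datetime(2015, 3, 22, 12))
```

Here the seven default values are treated as Sunday-to-Saturday factors, so sunday_value is the first entry of the list.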
 +
 
Gridded input files are expected to be in the Network Common Data Form (netCDF) format (http://www.unidata.ucar.edu/software/netcdf/) and must adhere to the COARDS metadata conventions (http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html). In particular, the following requirements must be fulfilled:
  
Latitude and Longitude dimension
+
{| border=1 cellspacing=0 cellpadding=5
At this stage of development, only rectilinear (lon-lat) grids are supported. The data is automatically regridded onto the simulation grid (see section 15 for more details).
+
|-bgcolor="#CCCCCC"
Vertical dimension
+
!width="100px"|Attribute
In a non-ESMF environment, 3D data is interpolated onto the simulation levels if (and only if) the number of vertical levels is greater than one and not equal to the number of vertical levels of the simulation grid. In all other cases, it is assumed that data is already on the simulation levels. In particular, this explicitly assumes that the vertical coordinate direction is upwards, i.e. the first level index corresponds to the surface layer. Currently, only hybrid sigma pressure coordinate systems are supported. In order to properly determine the vertical pressure levels of the input data, the file must contain the surface pressure values and the hybrid coefficients (a, b) of the coordinate system. Further, the ’level’ variable must contain the attributes ’standard_name’ and ’formula_terms’ (the attribute ’positive’ is recommended but not required). A header excerpt of a valid netCDF file is shown below:
+
!width="825px"|Description
double lev(lev) ;
+
 
lev:standard_name = ”atmosphere_hybrid_sigma_pressure_coordinate” ;
+
|-valign="top"
lev:units = ”level” ;
+
|<tt>Latitude and Longitude dimension</tt>
lev:positive = ”down” ;
+
|At this stage of development, only rectilinear (lon-lat) grids are supported. The data is automatically regridded onto the simulation grid (see section 15 for more details).
lev:formula_terms = ”ap: hyam b: hybm ps: PS” ;
+
 
double hyam(nhym) ;
+
|-valign="top"
hyam:long_name = ”hybrid A coefficient at layer midpoints” ;
+
|<tt>Vertical dimension</tt>
hyam:units = ”hPa” ;
+
|In a non-ESMF environment, 3D data is interpolated onto the simulation levels if (and only if) the number of vertical levels is greater than one and not equal to the number of vertical levels of the simulation grid. In all other cases, it is assumed that data is already on the simulation levels. In particular, this explicitly assumes that the vertical coordinate direction is upwards, i.e. the first level index corresponds to the surface layer. Currently, only hybrid sigma pressure coordinate systems are supported. In order to properly determine the vertical pressure levels of the input data, the file must contain the surface pressure values and the hybrid coefficients (a, b) of the coordinate system. Further, the ’level’ variable must contain the attributes ’standard_name’ and ’formula_terms’ (the attribute ’positive’ is recommended but not required). A header excerpt of a valid netCDF file is shown below:
double hybm(nhym) ;
+
#double lev(lev) ;
hybm:long_name = ”hybrid B coefficient at layer midpoints” ;
+
#lev:standard_name = ”atmosphere_hybrid_sigma_pressure_coordinate” ;
hybm:units = ”1” ;
+
#lev:units = ”level” ;
double time(time) ;
+
#lev:positive = ”down” ;
time:standard_name = ”time” ;
+
#lev:formula_terms = ”ap: hyam b: hybm ps: PS” ;
time:units = ”days since 2000-01-01 00:00:00” ;
+
#double hyam(nhym) ;
time:calendar = ”standard” ;
+
#hyam:long_name = ”hybrid A coefficient at layer midpoints” ;
double PS(time, lat, lon) ;
+
#hyam:units = ”hPa” ;
PS:long_name = ”surface pressure” ;
+
#double hybm(nhym) ;
PS:units = ”hPa” ;
+
#hybm:long_name = ”hybrid B coefficient at layer midpoints” ;
double EMIS(time, lev, lat, lon) ;
+
#hybm:units = ”1” ;
EMIS:long_name = ”emissions” ;
+
#double time(time) ;
EMIS:units = ”kg m-2 s-1” ;
+
#time:standard_name = ”time” ;
 +
#time:units = ”days since 2000-01-01 00:00:00” ;
 +
#time:calendar = ”standard” ;
 +
#double PS(time, lat, lon) ;
 +
#PS:long_name = ”surface pressure” ;
 +
#PS:units = ”hPa” ;
 +
#double EMIS(time, lev, lat, lon) ;
 +
#EMIS:long_name = ”emissions” ;
 +
#EMIS:units = ”kg m-2 s-1” ;
  
 
Vertical regridding is currently not supported in an ESMF environment.
  
Time
+
 
Times should be given as relative times, i.e. relative to a specified reference date. Accepted formats are ’days since yyyy-mm-dd’, ’hours since yyyy-mm-dd hh:mm:ss’, and ’minutes since yyyy-mm-dd hh:mm:ss’. There have been problems with some netCDF files with reference dates prior to 1901 (e.g. days since 1900-1-1), so reference years after 1900 should be used if possible.
+
|-valign="top"
 +
|<tt>Time</tt>
 +
|Times should be given as relative times, i.e. relative to a specified reference date. Accepted formats are ’days since yyyy-mm-dd’, ’hours since yyyy-mm-dd hh:mm:ss’, and ’minutes since yyyy-mm-dd hh:mm:ss’. There have been problems with some netCDF files with reference dates prior to 1901 (e.g. days since 1900-1-1), so reference years after 1900 should be used if possible.
 
Weekly data must contain seven time slices in increments of one day. The first entry must represent Sunday data, irrespective of the real weekday of the assigned datetime. It is possible to store weekly data for more than one time interval, in which case the first weekday (i.e. Sunday) must hold the starting date for the given set of (seven) time slices. For instance, weekly data for every month of a year can be stored as 12 sets of 7 time slices. The datetime of the first entry of each set must fall on the first day of every month, and the following six entries must be increments of one day. Currently, weekly data from netCDF files is not correctly read in an ESMF environment.
Data units
+
 
It is recommended to store data in one of the HEMCO standard units: ’kg/m2/s’ and ’kg(C)/m2/s’ for fluxes; ’kg/m3’ and ’kg(C)/m3’ for concentrations; ’1’ for unitless data; and ’count’ for index-based data, i.e. discrete distributions (for instance, land types represented as integer values between 1 and 28). HEMCO will attempt to convert all data to one of those units, unless otherwise specified via the ’srcUnit’ attribute (see section 7).
+
|-valign="top"
 +
|<tt>Data units</tt>
 +
|It is recommended to store data in one of the HEMCO standard units: ’kg/m2/s’ and ’kg(C)/m2/s’ for fluxes; ’kg/m3’ and ’kg(C)/m3’ for concentrations; ’1’ for unitless data; and ’count’ for index-based data, i.e. discrete distributions (for instance, land types represented as integer values between 1 and 28). HEMCO will attempt to convert all data to one of those units, unless otherwise specified via the ’srcUnit’ attribute (see section 7).
 
Mass conversion (e.g. from molecules to kg) is performed based on the properties (e.g. molecular weight) of the species assigned to the given data set. It is also possible to convert between species-based and molecule-based units (e.g. kg vs. kg(C)). This conversion is based on the emitted molecular weight and the molecular ratio of the given species (see section 15.5). More details on unit conversion are given in module hco_unit_mod.F90.
 
Index-based data is regridded in such a manner that every grid box on the new grid represents the index with the largest relative contribution from the overlapping boxes of the original grid. All other data are regridded as ’concentration’ quantities, i.e. conserving the global weighted average.
 +
 +
|}
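
The formula_terms attribute in the header excerpt above (ap: hyam b: hybm ps: PS) names the variables needed to reconstruct the pressure at each level. Under the CF definition of atmosphere_hybrid_sigma_pressure_coordinate, the pressure is p(k) = ap(k) + b(k) * ps. The Python sketch below evaluates this for a single column; the function name and the coefficient values are made up for illustration, and it does not show how HEMCO itself performs this step.

```python
def layer_pressures(hyam, hybm, ps):
    """Return mid-layer pressures p(k) = ap(k) + b(k) * ps for one column.

    hyam and ps must share the same pressure unit (hPa in the excerpt above);
    hybm is unitless.
    """
    if len(hyam) != len(hybm):
        raise ValueError("hyam and hybm must have the same length")
    return [a + b * ps for a, b in zip(hyam, hybm)]

# Illustrative coefficients only (not taken from a real model grid), in hPa
hyam = [0.0, 50.0, 100.0]
hybm = [1.0, 0.5, 0.0]
pressures = layer_pressures(hyam, hybm, ps=1000.0)
```

With these made-up coefficients the first level sits at the surface pressure and the pressure decreases with increasing level index.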
  
 
== pyHEMCO GUI ==
Line 576: Line 591:
 
To facilitate the creation and modification of a HEMCO configuration file, a Graphical User Interface (GUI) is currently being developed. The GUI contains the following features:
  
Modification of existing configuration files.
+
* Modification of existing configuration files.
Creation of new configuration files, either from scratch or starting from an existing file.
+
* Creation of new configuration files, either from scratch or starting from an existing file.
Direct execution of the stand-alone version of HEMCO.
+
* Direct execution of the stand-alone version of HEMCO.
Preview of emissions from individual inventories, extensions, or any combination of these.
+
* Preview of emissions from individual inventories, extensions, or any combination of these.
 +
 
 
The pyHEMCO GUI is written in Python and linked to the pyGC visualization and data analysis package developed for GEOS-Chem. The developer version of the pyHEMCO GUI can be found at https://github.com/christophkeller. This version is still under development and not yet operational.
  
Line 602: Line 618:
 
Before running HEMCO, all variables and objects have to be initialized properly. The initialization of HEMCO occurs in three steps:
  
Read the HEMCO configuration file (Config_ReadFile in hco_config_mod.F90). This writes the content of the entire configuration file into a buffer, and creates a data container for each data item (base emission, scale factor, mask) in ConfigList.
+
# Read the HEMCO configuration file (Config_ReadFile in hco_config_mod.F90). This writes the content of the entire configuration file into a buffer, and creates a data container for each data item (base emission, scale factor, mask) in ConfigList.
Initialize HcoState.
+
# Initialize HcoState.
Call HCO_INIT, passing HcoState to it. This initializes the HEMCO clock object (see hco_clock_mod.F90) and creates the ReadList (hco_readlist_mod.F90). The ReadList links to the data containers in ConfigList, but sorted by data update frequency. Data that is not used at all (e.g. scale factors that are not used by any base emission, or regional emissions that are outside of the emission grid). The EmisList linked list is only created in the run call.
+
# Call HCO_INIT, passing HcoState to it. This initializes the HEMCO clock object (see hco_clock_mod.F90) and creates the ReadList (hco_readlist_mod.F90). The ReadList links to the data containers in ConfigList, but sorted by data update frequency. Data that is not used at all (e.g. scale factors that are not used by any base emission, or regional emissions that are outside of the emission grid). The EmisList linked list is only created in the run call.
 
Note that steps 1 and 2 occur at the interface level (see section 15.5).
  
Line 611: Line 627:
 
This is the main function to run HEMCO. It can be repeated as often as necessary. Before calling this routine, the internal clock object has to be updated to the current simulation time (HcoClock_Set, see hco_clock_mod.F90). HCO_RUN performs the following steps:
  
Updates the time slice index pointers. This is to make sure that the correct time slices are used for every data container. For example, hourly scale factors can be stored in a data container holding 24 individual 2D fields. Module hco_tidx_mod.F90 organizes how to properly access these fields.
+
# Updates the time slice index pointers. This is to make sure that the correct time slices are used for every data container. For example, hourly scale factors can be stored in a data container holding 24 individual 2D fields. Module hco_tidx_mod.F90 organizes how to properly access these fields.
Read/update the content of the data containers (ReadList_Read). Checks if there are any fields that need to be read/updated (e.g. if this is a new month compared to the previous time step) and updates these fields if so by calling the data interface (see section 15.5).
+
# Read/update the content of the data containers (ReadList_Read). Checks if there are any fields that need to be read/updated (e.g. if this is a new month compared to the previous time step) and updates these fields if so by calling the data interface (see section 15.5).
Creates/updates the EmisList object. Similar to ReadList, EmisList points to the data containers in ConfigList, but sorted according to species, emission hierarchy, emissions category. To optimize emission calculations, EmisList already combines base emission fields that share the same species, category, hierarchy, scale factors, and field name (without the field name tag, see section 7).
+
# Creates/updates the EmisList object. Similar to ReadList, EmisList points to the data containers in ConfigList, but sorted according to species, emission hierarchy, emissions category. To optimize emission calculations, EmisList already combines base emission fields that share the same species, category, hierarchy, scale factors, and field name (without the field name tag, see section 7).
Calculate core emissions for the current simulation time. This is performed by subroutine hco_calcemis (hco_calc_mod.F90). This routine walks through EmisList and calculates the emissions for every base emission field by applying the assigned scale factors to it. The (up to 10) container IDs of all scale factors connected to the given base emission field (as set in the HEMCO configuration file) are stored in the data container variable ScalIDs. A container ID index list is used to efficiently retrieve a pointer to each of those containers (see cIDList in hco_datacont_mod.F90).
+
# Calculate core emissions for the current simulation time. This is performed by subroutine hco_calcemis (hco_calc_mod.F90). This routine walks through EmisList and calculates the emissions for every base emission field by applying the assigned scale factors to it. The (up to 10) container IDs of all scale factors connected to the given base emission field (as set in the HEMCO configuration file) are stored in the data container variable ScalIDs. A container ID index list is used to efficiently retrieve a pointer to each of those containers (see cIDList in hco_datacont_mod.F90).
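
The core calculation in the last step can be pictured with a small Python sketch: a base emission field is multiplied, grid box by grid box, by the current time slice of every scale factor assigned to it. This is a simplified illustration, not the logic of hco_calcemis. The function names are hypothetical, only multiplication is shown (other operators and the emission hierarchy are ignored), and a scale factor is assumed to hold either a single 2D field or 24 hourly fields.

```python
from math import prod

def select_slice(fields, hour):
    """Return the field for the given hour if 24 hourly slices are stored."""
    return fields[hour] if len(fields) == 24 else fields[0]

def calc_emissions(base, scale_factors, hour):
    """Multiply a 2D base emission field by all assigned scale factors."""
    current = [select_slice(sf, hour) for sf in scale_factors]
    nlat, nlon = len(base), len(base[0])
    return [
        [base[j][i] * prod(sf[j][i] for sf in current) for i in range(nlon)]
        for j in range(nlat)
    ]

# 2x2 base field, an hourly factor equal to the hour, and a 0/1 mask
base = [[1.0, 2.0], [3.0, 4.0]]
hourly = [[[float(h)] * 2 for _ in range(2)] for h in range(24)]
mask = [[[1.0, 0.0], [1.0, 1.0]]]
emis = calc_emissions(base, [hourly, mask], hour=2)
```

Treating the mask as just another scale factor with values 0 and 1 keeps the sketch short; the grid box where the mask is 0 ends up with zero emissions.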
  
 
==== Finalize: HCO_FINAL ====
Line 626: Line 642:
 
In analogy to the core module, the three main routines for the extensions are (in hcox_driver_mod.F90):
  
HCOX_INIT
+
* HCOX_INIT
HCOX_RUN
+
* HCOX_RUN
HCOX_FINAL
+
* HCOX_FINAL
 +
 
 
These subroutines invoke the corresponding calls of all (enabled) extensions and must be called at the interface level (after the core routines).
 
Extension settings (as specified in the configuration file, see also section 5) are automatically read by HEMCO. For any given extension, routines GetExtNr and GetExtOpt can be used to obtain the extension number and desired setting value, respectively (see HCO_ExtList_Mod.F90). Routine HCO_GetExtHcoID should be used to extract the HEMCO species IDs of all species registered for this extension.
Line 641: Line 658:
 
Initialization:
  
Read the configuration file (Config_ReadFile in hco_config_mod.F90).
+
* Read the configuration file (Config_ReadFile in hco_config_mod.F90).
Initialize HcoState object (HcoState_Init in hco_state_mod.F90).
+
* Initialize HcoState object (HcoState_Init in hco_state_mod.F90).
Define the emission grid. Grid definitions are stored in HcoState%Grid. The emission grid is defined by its horizontal mid points and edges (all 2D fields), the hybrid sigma coordinate edges (3D), the grid box areas (2D), and the grid box heights. The latter is only used by some extensions (DEAD dust emissions and lightning NOx) and may be left undefined if those are not used.
+
* Define the emission grid. Grid definitions are stored in HcoState%Grid. The emission grid is defined by its horizontal mid points and edges (all 2D fields), the hybrid sigma coordinate edges (3D), the grid box areas (2D), and the grid box heights. The latter is only used by some extensions (DEAD dust emissions and lightning NOx) and may be left undefined if those are not used.
Define emission species. Species definitions are stored in vector HcoState%Spc(:) (one entry per species). For each species, the following parameters are required:
+
* Define emission species. Species definitions are stored in vector HcoState%Spc(:) (one entry per species). For each species, the following parameters are required:
HEMCO species ID: unique integer index for species identification. For internal use only.
+
*# HEMCO species ID: unique integer index for species identification. For internal use only.
Model species ID: the integer index assigned to this species by the employed model.
+
*# Model species ID: the integer index assigned to this species by the employed model.
Species name
+
*# Species name
Species molecular weight in g/mol.
+
*# Species molecular weight in g/mol.
Emitted species molecular weight in g/mol. This value can be different to the species molecular weight if species are emitted on a molecular basis, e.g. in mass carbon (in which case the emitted molecular weight becomes 12 g/mol).
+
*# Emitted species molecular weight in g/mol. This value can be different to the species molecular weight if species are emitted on a molecular basis, e.g. in mass carbon (in which case the emitted molecular weight becomes 12 g/mol).
Molecular ratio: molecules of emitted species per molecule of species. For example, if C3H8 is emitted as kg C, the molecular ratio becomes 3.
+
*# Molecular ratio: molecules of emitted species per molecule of species. For example, if C3H8 is emitted as kg C, the molecular ratio becomes 3.
K0: Liquid over gas Henry constant in M/atm.
+
*# K0: Liquid over gas Henry constant in M/atm.
CR: Temperature dependency of K0 in K.
+
*# CR: Temperature dependency of K0 in K.
pKa: The species pKa, used for correction of the Henry constant.
+
*# pKa: The species pKa, used for correction of the Henry constant.
 
The molecular weights, together with the molecular ratio, determine the mass scaling factors used for unit conversion in hco_unit_mod.F90. The Henry coefficients are only used by the air-sea exchange extension (and only for the specified species) and may be left undefined for other species and/or if the extension is not used.
  
Define simulation time steps. The emission, chemical and dynamic time steps can be defined separately.
+
* Define simulation time steps. The emission, chemical and dynamic time steps can be defined separately.
Initialize HEMCO core (HCO_INIT in hco_driver_mod.F90)
+
* Initialize HEMCO core (HCO_INIT in hco_driver_mod.F90)
Initialize HEMCO extensions (HCOX_INIT in hcox_driver_mod.F90)
+
* Initialize HEMCO extensions (HCOX_INIT in hcox_driver_mod.F90)
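
To make the role of the species parameters listed above concrete, the following Python sketch converts a mass given in terms of the emitted species (e.g. kg C) into a mass of the full species, using the species molecular weight, the emitted molecular weight, and the molecular ratio. The function name is hypothetical and the code is only a worked example of the arithmetic; the actual scaling factors are computed in hco_unit_mod.F90.

```python
def emitted_to_species_mass(mass_emitted, mw_species, mw_emitted, molec_ratio):
    """Convert mass of the emitted species (e.g. kg C) to mass of the full species.

    One molecule of the species carries molec_ratio units of the emitted
    species, so the species mass per emitted mass is
    mw_species / (mw_emitted * molec_ratio).
    """
    return mass_emitted * mw_species / (mw_emitted * molec_ratio)

# Propane (C3H8): about 44.1 g/mol, emitted as carbon (12 g/mol),
# with 3 carbon atoms per molecule
kg_c3h8 = emitted_to_species_mass(
    1.0, mw_species=44.1, mw_emitted=12.0, molec_ratio=3.0
)
```

For a species that is emitted as itself (emitted molecular weight equal to the species molecular weight and a molecular ratio of 1), the factor reduces to 1 and the mass is unchanged.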
 +
 
 
Run:
  
Set current time (HcoClock_Set in hco_clock_mod.F90)
+
* Set current time (HcoClock_Set in hco_clock_mod.F90)
Reset all emission and deposition values (hco_FluxArrReset in hco_fluxarr_mod.F90)
+
* Reset all emission and deposition values (hco_FluxArrReset in hco_fluxarr_mod.F90)
Run HEMCO core to calculate emissions (hco_Run in hco_driver_mod.F90)
+
* Run HEMCO core to calculate emissions (hco_Run in hco_driver_mod.F90)
Link the (used) met. field objects of ExtState to desired data arrays (this step may also be done during initialization)
+
* Link the (used) met. field objects of ExtState to desired data arrays (this step may also be done during initialization)
Run HEMCO extensions to add extension emissions (hcox_Run in hcox_driver_mod.F90)
+
* Run HEMCO extensions to add extension emissions (hcox_Run in hcox_driver_mod.F90)
Export HEMCO emissions into desired environment
+
* Export HEMCO emissions into desired environment
 +
 
 
Finalization:
  
Finalize HEMCO extensions and extension state object ExtState (hcox_final in hcox_driver_mod.F90).
+
* Finalize HEMCO extensions and extension state object ExtState (hcox_final in hcox_driver_mod.F90).
Finalize HEMCO core (hco_final in hco_driver_mod.F90).
+
* Finalize HEMCO core (hco_final in hco_driver_mod.F90).
Clean up HEMCO state object HcoState (hcoState_final in hco_state_mod.F90).
+
* Clean up HEMCO state object HcoState (hcoState_final in hco_state_mod.F90).
  
 
==== Data interface (reading and regridding) ====
Line 678: Line 697:
 
Data processing is performed in three steps:
  
Read data from file using the source file information (file name, source variable, desired time stamp) provided in the configuration file.
+
# Read data from file using the source file information (file name, source variable, desired time stamp) provided in the configuration file.
Convert unit to HEMCO units based on the unit attribute read from disk and the srcUnit attribute set in the configuration file. See section 13 for more information.
+
# Convert unit to HEMCO units based on the unit attribute read from disk and the srcUnit attribute set in the configuration file. See section 13 for more information.
Remap original data onto the HEMCO emission grid. The grid dimensions of the input field are determined from the source file. If only horizontal regridding is required, e.g. for 2D data or if the number of vertical levels of the input data is equal to the number of vertical levels of the HEMCO grid, the horizontal interpolation routine used by GEOS-Chem is invoked. If vertical regridding is required or to interpolate index-based values (e.g. discrete integer values), the NcRegrid tool described in Joeckel  (2006) is used.
+
# Remap original data onto the HEMCO emission grid. The grid dimensions of the input field are determined from the source file. If only horizontal regridding is required, e.g. for 2D data or if the number of vertical levels of the input data is equal to the number of vertical levels of the HEMCO grid, the horizontal interpolation routine used by GEOS-Chem is invoked. If vertical regridding is required or to interpolate index-based values (e.g. discrete integer values), the NcRegrid tool described in Joeckel  (2006) is used.
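
The treatment of index-based values in the remapping step can be illustrated in one dimension: each target cell receives the index whose source cells contribute the largest overlap. The Python sketch below is a toy version of that idea and is not the NcRegrid algorithm; the function name is hypothetical, and overlap lengths stand in for the overlap areas of a real 2D grid.

```python
from collections import defaultdict

def regrid_index_1d(src_edges, src_index, dst_edges):
    """For each target cell, return the index with the largest overlap length."""
    result = []
    for lo, hi in zip(dst_edges[:-1], dst_edges[1:]):
        weight = defaultdict(float)
        for s_lo, s_hi, idx in zip(src_edges[:-1], src_edges[1:], src_index):
            overlap = min(hi, s_hi) - max(lo, s_lo)
            if overlap > 0:
                weight[idx] += overlap
        # None marks a target cell that no source cell overlaps
        result.append(max(weight, key=weight.get) if weight else None)
    return result

# Six fine cells carrying land-type indices, remapped onto two coarse cells
src_edges = [0, 1, 2, 3, 4, 5, 6]
src_index = [1, 1, 2, 3, 3, 3]
dst_edges = [0, 3, 6]
coarse = regrid_index_1d(src_edges, src_index, dst_edges)
```

Averaging the indices instead would produce values such as 1.33, which do not correspond to any valid category; this is why index-based data needs its own regridding rule.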
  
 
== References ==

Revision as of 19:33, 26 March 2015

HEMCO user guide v1.1 by Christoph Keller (ckeller@seas.harvard.edu)

March 22, 2015


Overview

The Harvard-NASA Emissions Component (HEMCO) is a software component for computing (atmospheric) emissions from different sources, regions, and species on a user-defined grid. It can combine, overlay, and update a set of data inventories (base emissions) and scale factors, as specified by the user through the HEMCO configuration file. Emissions that depend on environmental variables and non-linear parameterizations are calculated in separate HEMCO extensions. HEMCO can be run in standalone mode or coupled to an atmospheric model. It is included in the standard version of GEOS-Chem. A more detailed description of HEMCO is given in Keller et al. (2014).

Download and installation

HEMCO is a collection of FORTRAN-90 routines that are freely available at http://wiki.geos-chem.org/HEMCO. Installation instructions are provided on the same page. Some basic knowledge of a Unix operating system is expected. To couple HEMCO to an atmospheric model, some working experience with FORTRAN is also required. HEMCO is already included in the standard distribution of GEOS-Chem (http://wiki.geos-chem.org).

Example simulations

HEMCO includes a simple example of a HEMCO standalone simulation. It calculates emissions of carbon monoxide (CO) from the EDGAR inventory (Janssens-Maenhout et al., 2010) on January 1, 2008 on a global grid of 4x5 degrees. The full suite of emission inventories and extensions used by the chemical transport model GEOS-Chem are available at ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO, along with the corresponding configuration file.

Getting started

All emission calculation settings are specified in the HEMCO configuration file. Modification of the HEMCO source code (and recompilation) is only required if new extensions are added, or to use HEMCO in a new model environment (see sections ?? and 11). Suppose monthly anthropogenic CO emissions from the MACCity inventory (Lamarque et al., 2010) are stored in file MACCity.nc as variable ’CO’. The following configuration file then simulates CO emissions with gridded hourly scale factors applied to them (the latter taken from variable ’factor’ of file hourly.nc). The horizontal grid and simulation datetimes employed by HEMCO depend on the HEMCO-model interface. If HEMCO is coupled to GEOS-Chem, these values are taken from the chemistry model. If run standalone, the grid specification and desired datetimes need to be specified as described in section 11.

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#### END SECTION EXTENSION SWITCHES
#### BEGIN SECTION SETTINGS
#Logfile: HEMCO.log
#DiagnPrefix: HEMCO_diagnostics
#Wildcard: *
#Separator: /
#Unit tolerance: 1
#Negative values: 0
#Verbose: false
#Track code: false
#Show warnings: true
#ROOT: /dir/to/data
#### END SECTION SETTINGS
#### BEGIN SECTION BASE EMISSIONS
#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#0	MACCITY_CO	$ROOT/MACCity.nc	CO	1980-2014/1-12/1/0	C	xy	kg/m2/s	CO	1	1	1
#### END SECTION BASE EMISSIONS
#### BEGIN SECTION SCALE FACTORS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper			
#1	HOURLY_SCALFACT	$ROOT/hourly.nc	factor	2000/1/1/0-23	C	xy	unitless	1			
#### END SECTION SCALE FACTORS
#### BEGIN SECTION MASKS
#### END SECTION MASKS

The various attributes are explained in more detail in sections 7 and 8.

To add regional monthly anthropogenic CO emissions from the EMEP inventory (Vestreng et al., 2009) (in EMEP.nc) to the simulation, the configuration file can be modified as follows (changes are highlighted in bold face):

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#### END SECTION EXTENSION SWITCHES
#### BEGIN SECTION SETTINGS
#Logfile: HEMCO.log
#DiagnPrefix: HEMCO_diagnostics
#Wildcard: *
#Separator: /
#Unit tolerance: 1
#Negative values: 0
#Verbose: false
#Track code: false
#Show warnings: true
#ROOT: /dir/to/data
#### END SECTION SETTINGS
#### BEGIN SECTION BASE EMISSIONS
#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#0	MACCITY_CO	$ROOT/MACCity.nc	CO	1980-2014/1-12/1/0	C	xy	kg/m2/s	CO	1	1	1
#0	EMEP_CO	$ROOT/EMEP.nc	CO	2000-2014/1-12/1/0	C	xy	kg/m2/s	CO	1/1001	1	2
#### END SECTION BASE EMISSIONS
#### BEGIN SECTION SCALE FACTORS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper			
#1	HOURLY_SCALFACT	$ROOT/hourly.nc	factor	2000/1/1/0-23	C	xy	unitless	1			
#### END SECTION SCALE FACTORS
#### BEGIN SECTION MASKS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper	Box		
#1001	MASK_EUROPE	$ROOT/mask_europe.nc	MASK	2000/1/1/0	C	xy	unitless	1	-30/30/45/70		
#### END SECTION MASKS

Note the increased hierarchy of the regional EMEP inventory compared to the global MACCITY emissions (column Hier). To add aircraft emissions from the AEIC inventory (Stettler et al., 2011), available in file AEIC.nc, add the following line to the base emission section of the configuration file:

#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#0	AEIC_CO	$ROOT/AEIC.nc	CO	2005/1-12/1/0	C	xyz	kg/m2/s	CO	-	2	1

Note the change in the emission category (column Cat) compared to the anthropogenic CO emissions. Biomass burning emissions calculated by GFED3 (van der Werf et al., 2010) can be added by adding the corresponding extension to section ’Extension switches’, and adding all the input data needed by GFED3 to section ’Base emissions’. The extension number defined in section extension switches must match the corresponding ExtNr entry in the base emissions section:

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#100	GFED3	on	CO								
#→	CO scale factor:	1.05									
#### END SECTION EXTENSION SWITCHES
#...
#### BEGIN SECTION BASE EMISSIONS
#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#100	GFED3_WDL	$ROOT/GFED3.nc	WDL	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_AGW	$ROOT/GFED3.nc	AGW	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_DEF	$ROOT/GFED3.nc	DEF	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_FOR	$ROOT/GFED3.nc	FOR	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_PET	$ROOT/GFED3.nc	PET	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_SAV	$ROOT/GFED3.nc	SAV	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	HUMTROP	$ROOT/GFED3_humtrop.nc	humtrop	2000/1/1/0	C	xy	unitless	*	-	1	1
#### END SECTION BASE EMISSIONS
#...


The HEMCO configuration file can hold emission specifications of as many species as desired. For example, to add anthropogenic NO emissions from the MACCity inventory, add the following line to the configuration file:

#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#0	MACCITY_NO	$ROOT/MACCity.nc	NO	1980-2014/1-12/1/0	C	xy	kg/m2/s	NO	1	1	1

To include NO in GFED3:

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#100	GFED3	on	CO/NO								


Finally, let’s add sulfate emissions to the simulation. Emissions of SO4 are approximated from the MACCity SO2 data, assuming that SO4 constitutes 3.1% of the SO2 emissions. The final configuration file then becomes

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#100	GFED3	on	CO/NO/SO2								
#→	CO scale factor:	1.05									
#### END SECTION EXTENSION SWITCHES
#### BEGIN SECTION SETTINGS
#Logfile: HEMCO.log
#DiagnPrefix: HEMCO_diagnostics
#Wildcard: *
#Separator: /
#Unit tolerance: 1
#Negative values: 0
#Verbose: false
#Track code: false
#Show warnings: true
#ROOT: /dir/to/data
#### END SECTION SETTINGS
#### BEGIN SECTION BASE EMISSIONS
#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#0	MACCITY_CO	$ROOT/MACCity.nc	CO	1980-2014/1-12/1/0	C	xy	kg/m2/s	CO	1	1	1
#0	MACCITY_NO	$ROOT/MACCity.nc	NO	1980-2014/1-12/1/0	C	xy	kg/m2/s	NO	1	1	1
#0	MACCITY_SO2	$ROOT/MACCity.nc	SO2	1980-2014/1-12/1/0	C	xy	kg/m2/s	SO2	-	1	1
#0	MACCITY_SO4	-	-	-	-	-	-	SO4	2	1	1
#0	AEIC_CO	$ROOT/AEIC.nc	CO	2005/1-12/1/0	C	xyz	kg/m2/s	CO	-	2	1
#0	EMEP_CO	$ROOT/EMEP.nc	CO	2000-2014/1-12/1/0	C	xy	kg/m2/s	CO	1/1001	1	2
#100	GFED3_WDL	$ROOT/GFED3.nc	WDL	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_AGW	$ROOT/GFED3.nc	AGW	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_DEF	$ROOT/GFED3.nc	DEF	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_FOR	$ROOT/GFED3.nc	FOR	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_PET	$ROOT/GFED3.nc	PET	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	GFED3_SAV	$ROOT/GFED3.nc	SAV	1997-2011/1-12/1/0	C	xy	kg/m2/s	*	-	1	1
#100	HUMTROP	$ROOT/GFED3_humtrop.nc	humtrop	2000/1/1/0	C	xy	unitless	*	-	1	1
#### END SECTION BASE EMISSIONS
#### BEGIN SECTION SCALE FACTORS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper			
#1	HOURLY_SCALFACT	$ROOT/hourly.nc	factor	2000/1/1/0-23	C	xy	unitless	1			
#2	SO2toSO4	0.031	-	-	-	-	unitless	1			
#### END SECTION SCALE FACTORS
#### BEGIN SECTION MASKS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper	Box		
#1001	MASK_EUROPE	$ROOT/mask_europe.nc	MASK	2000/1/1/0	C	xy	unitless	1	-30/30/45/70		
#### END SECTION MASKS
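The column layout of the base emissions section can be illustrated with a short Python sketch. This is not part of HEMCO: the helper `parse_base_line` is hypothetical, it assumes whitespace-separated columns in the order of the header line above, it strips a leading '#' if one is present (as printed in the listings), and it treats '-' in the ScalIDs column as "no scale factors", which is an interpretation of the examples.

```python
# Column names taken from the header line of the base emissions section.
BASE_COLUMNS = ["ExtNr", "Name", "srcFile", "srcVar", "srcTime", "CRE",
                "Dim", "Unit", "Species", "ScalIDs", "Cat", "Hier"]


def parse_base_line(line, separator="/"):
    """Split one base emission entry into a dict keyed by column name.

    Hypothetical helper for illustration only; HEMCO's own parser is
    written in Fortran and may behave differently.
    """
    fields = line.lstrip("#").split()
    if len(fields) != len(BASE_COLUMNS):
        raise ValueError(
            f"expected {len(BASE_COLUMNS)} columns, got {len(fields)}")
    entry = dict(zip(BASE_COLUMNS, fields))
    # Assumption: '-' means that no scale factors or masks are attached.
    ids = entry["ScalIDs"]
    entry["ScalIDs"] = [] if ids == "-" else [int(i) for i in ids.split(separator)]
    return entry


entry = parse_base_line(
    "0 EMEP_CO $ROOT/EMEP.nc CO 2000-2014/1-12/1/0 C xy kg/m2/s CO 1/1001 1 2")
print(entry["Name"], entry["ScalIDs"], entry["Hier"])  # EMEP_CO [1, 1001] 2
```

The EMEP entry carries two identifiers in its ScalIDs column: 1 refers to the hourly scale factor and 1001 to the Europe mask.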

Extension switches

HEMCO performs automatic emission calculations using all fields that belong to the ’base’ extension (see below). Additional emissions that depend on environmental parameters such as wind speed or air temperature - and/or that use non-linear parameterizations - are calculated through HEMCO extensions. A list of currently implemented extensions in HEMCO is given in Keller et al. (2014). To add additional extensions to HEMCO, modifications of the source code are required, as described further in section 15. The first section of the configuration file lists all available extensions and whether they shall be used or not. For each extension, the following attributes need to be specified:

Extension Description
ExtNr (Unique) extension number.
ExtName Extension name.
on/off Extension toggle. Extension is only used if this attribute is set to ’on’.
Species List of species to be used by this extension. Multiple species are separated by the separator symbol. All listed species must be supported by the given extension. For example, the soil NO emissions extension only supports one species (NO), and an error is raised if additional species are added.

Additional - extension specific - settings can also be specified in the ’Extensions settings’ section (see also example in section 4 and definition of data collections, section 10). These settings must immediately follow the extension definition. HEMCO expects an extension with extension number zero, denoted the ’base’ extension. All emission fields linked to the base extension will be used for automatic emission calculation. Fields assigned to any other extension number will be ignored for emission calculation, but they are still read/regridded by HEMCO (and can be made available readily anywhere in the model code). These data are only read if the corresponding extension is enabled. All species to be used by HEMCO must be listed in column ’Species’ of the base extension switch. In particular, all species used by any of the other extensions must also be listed as base species, otherwise they will not be recognized. It is possible (and recommended) to use the wildcard character, in which case HEMCO automatically determines what species to use by matching the atmospheric model species names with the species names assigned to the base emission fields and/or any emission extension.

HEMCO settings

The settings section of the configuration file defines a number of parameters and variables used by HEMCO. The order in which they appear in the configuration file is irrelevant. The settings are:

Name Description
Logfile Path and name of the output logfile. If set to the wildcard character, all output is written to standard output.
DiagnPrefix Path and prefix of the HEMCO diagnostics output. All diagnostics will be written to DiagnPrefix_YYYYMMDDHH.nc.
Wildcard Wildcard character symbol. Defaults to ’*’.
Separator Separator character symbol. Defaults to ’/’.
Unit tolerance Integer value denoting the tolerance against differences between the units set in the configuration file and data units found in the source file (0 = no tolerance, 2 = high tolerance). See section 13 for details.
Negative values Defines how negative values are handled. If set to 0, no negative values are allowed (default). If set to 1, all negative values are set to zero and a warning is issued. If set to 2, negative values are kept as they are.
Verbose 0 = no verbose; 3 = very verbose.
Show warnings If true, prompts all warnings to the logfile.
ROOT The root directory. Can be used to specify the root directory of file data (see section 7).
MODEL Can be used to set the $MODEL token (see section 7). If omitted, this value is determined based on compiler switches.
RES Can be used to set the $RES token (see section 7). If omitted, this value is determined based on compiler switches.

In standalone mode, the three simulation description files can also be specified (see also section 11):

Name Description
GridFile Path and name of the grid description file. Defaults to HEMCO_sa_Grid.rc.
SpecFile Path and name of the species description file. Defaults to HEMCO_sa_Spec.rc.
TimeFile Path and name of the time specification file. Defaults to HEMCO_sa_Time.rc.

Base emissions

The base emission section lists all base emission fields and how they are linked to scale factors. For each base emission field, the following attributes need to be defined:

Attribute Description
ExtNr Extension number associated with this field. All base emissions should have extension number zero. The ExtNr of the data listed in section ’Extensions data’ must match with the corresponding extension number (see section 5). The extension number can be set to the wildcard character. In that case, the field is read by HEMCO (if the assigned species name matches any of the HEMCO species, see ’Species’ below) but not used for emission calculation. This is particularly useful if HEMCO is only used for data I/O but not for emission calculation.
Name Descriptive field identification name. Two consecutive underscore characters (’__’) can be used to attach a ’tag’ to a name. This is only of relevance if multiple base emission fields share the same species, category, hierarchy, and scale factors. In this case, emission calculation can be optimized by assigning field names that only differ by their tag to those fields (e.g. ’DATA__SECTOR1’, ’DATA__SECTOR2’, etc.).

For fields assigned to extensions other than the base extension (ExtNr = 0), the field names are prescribed and must not be modified because the data is identified by these extensions by name.

sourceFile Path and name of the input file. See section 13 for more details on the input file format requirements.

Name tokens can be provided that are evaluated at runtime. For example, to use the root directory specified in the settings (see section 6), the token ’$ROOT’ can be used. Similarly, the token ’$CFDIR’ refers to the location of the configuration file. This makes it possible to reference data relative to the location of the configuration file. For instance, if the data is located in subfolder ’data’ of the same directory as the configuration file, the file name can be set to ’$CFDIR/data/filename.nc’. Similarly, the date tokens ’$YYYY’, ’$MM’, ’$DD’, and ’$HH’ can be used to refer to the current valid year, month, day, and hour, respectively. These values are determined from the current simulation datetime and the ’SrcTime’ specification for this entry (see below). The tokens ’$MODEL’ and ’$RES’ refer to the meteorological model and resolution. These tokens can be set explicitly in the settings section. In GEOS-Chem, they are set to compiler-flag specific values if not set in the settings section. As an alternative to an input file, spatially uniform values can be specified directly in the configuration file (see e.g. scale factor SO2toSO4 in the example of section 4). If multiple values are provided (separated by the separator character), they are interpreted as different time slices. In this case, the sourceTime attribute can be used to specify the times associated with the individual slices. If no time attribute is set, HEMCO attempts to determine the time slices from the number of data values: 7 values are interpreted as weekday (Sun, Mon, ... Sat); 12 values as month (Jan, ..., Dec); 24 values as hour-of-day (12am, 1am, ..., 11pm). Country-specific data can be provided through an ASCII file (.txt). More details on this option are given in section 13. If this entry is left empty (’-’), the filename from the preceding entry is taken, and the next 5 attributes will be ignored (see entry MACCITY_SO4 in section 4).

sourceVar Source file variable of interest. Leave empty (’-’) if values are directly set through the sourceFile attribute or if sourceFile is empty.
sourceTime This attribute defines the time slices to be used and the data refresh frequency. The format is year/month/day/hour. Accepted are discrete dates for time-independent data (e.g. 2000/1/1/0) and time ranges for temporally changing fields (e.g. 1980-2007/1-12/1-31/0-23). Data will automatically be updated as soon as the simulation date enters a new time interval.

The provided time attribute determines the data refresh frequency. It does not need to correspond to the datetimes of the input file. For example, if the input file contains daily data of year 2005 and the time attribute is set to 2005/1/1/0, the file will be read just once (at the beginning of the simulation) and the data of Jan 1, 2005 is used throughout the simulation. If the time attribute is set to 2005/1-12/1/0, the data is updated every month, using the first-day data of the given month. For instance, if the simulation starts on July 15, the data of July 1, 2005 are used until August 1, at which point the data will be refreshed to values from August 1, 2005. Only a time attribute of 2005/1-12/1-31/0 will make sure that the input data are refreshed daily to the current day’s data. Finally, if the time attribute is set to 2005/1-12/1-31/0-23, the data file is read every simulation hour, but the same daily data is used throughout the day (since there are no hourly data in the file). Providing too high update frequencies is not recommended unless the data interpolation option is enabled (see below). If the provided time attributes do not match a datetime of the input file, the ’most likely’ time slice is selected. The most likely time slice is determined based on the specified source time attribute, the datetimes available in the input file, and the current simulation date. In most cases, this is just the closest available time slice that lies in the past. For example, if a file contains annual data from 2005 to 2010 and the source time attribute is set to 2005-2010/1-12/1/0, the data of 2005 is used for all simulation months in 2005. More complex datetime selections occur for files with discontinuous time slices, e.g. a file with monthly data for year 2005, 2010, 2020, and 2050.
In this case, if the time attribute is set to 2005-2020/1-12/1/0, the monthly values of 2005 are (re-)used for all years between 2005 and 2010, the monthly values of 2010 are used for simulation years 2010 - 2020, etc. It is possible to use the tokens $YYYY, $MM, $DD, and $HH, which will automatically be replaced by the current simulation date. Weekly data (e.g. data changing by the day of the week) can be indicated by setting the day attribute to ’WD’ (the wildcard character will work, too, but is not recommended). Weekly data needs to consist of at least seven time slices - in increments of one day - representing data for every weekday starting on Sunday. It is possible to store multiple weekly data, e.g. for every month of a year: 2000/1-12/WD/0. These data must contain time slices for the first seven days of every month, with the first day per month representing Sunday data, then followed by Monday, etc. (irrespective of the real weekdays of the given month). If the wildcard character is used for the days, the data will be interpreted if (and only if) there are exactly seven time slices. See section 13 for more details. Similar to the weekday option, there is an option to indicate hourly data that represents local time: ’LH’. If using this flag, all hourly data of a given time interval (day, month, year) are read into memory and the local hour is picked at every location. A downside of this is that all hourly time slices in memory are updated based on UTC time. For instance, if a file holds local hourly data for every day of the year, the source time attribute can be set to 2011/1-12/1-31/LH. On every new day (according to UTC time), this will read all 24 hourly time slices of that UTC day and use those hourly data for the next 24 hours. For the US, for instance, this results in the wrong daily data being used for the last 6-9 hours of the day (when UTC time is one day ahead of local US time). 
There is a difference between source time attributes ’2005-2008/$MM/1/0’ and ’2005-2008/1-12/1/0’. In the first case, the file will be updated annually, while the update frequency is monthly in the second case. The token $MM simply indicates that the current simulation month shall be used whenever the file is updated, but it doesn’t imply a refresh interval. Thus, if the source time attribute is set to ’$YYYY/$MM/$DD/$HH’, the file will be read only once and the data of the simulation start date is taken (and used throughout the simulation). For uniform values directly set in the configuration file, all time attributes but one must be fixed, e.g. valid entries are 1990-2007/1/1/0 or 2000/1-12/1/1, but not 1990-2007/1-12/1/1. All data read from netCDF file are assumed to be in UTC time, except for weekday data, which are always assumed to be in local time. Data read from the configuration file and/or from ASCII are always assumed to be in local time. It is legal to keep different time slices in different files, e.g. monthly data of multiple years can be stored in files file_200501.nc, file_200502.nc, ..., file_200712.nc. By setting the source file attribute to ’file_$YYYY$MM.nc’ and the source time attribute to ’2005-2007/1-12/1/0’, data of file_200501.nc is used for simulation dates of January 2005 (or any January of a previous year), etc. The individual files can also contain only a subset of the provided data range, e.g. all monthly data of a year can be stored in one file: file_2005.nc, file_2006.nc, file_2007.nc. In this case, the source file name should be set to ’file_$YYYY.nc’, but the source time attribute should still be ’2005-2007/1-12/1/0’ to indicate that the field shall be updated monthly.
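The file-name tokens described above can be illustrated with a minimal Python sketch. It is not HEMCO code: the function `expand_tokens` is hypothetical, the zero-padded formatting is inferred from the example file names (file_200501.nc), and the sketch simply substitutes the date it is given. It does not reproduce HEMCO's selection of the valid date from the source time attribute.

```python
from datetime import datetime


def expand_tokens(template, when, root="/dir/to/data"):
    """Replace $ROOT and the date tokens in a source file name.

    Hypothetical helper; 'when' stands for the date that HEMCO has
    already selected for this entry.
    """
    replacements = {
        "$ROOT": root,
        "$YYYY": f"{when.year:04d}",
        "$MM": f"{when.month:02d}",
        "$DD": f"{when.day:02d}",
        "$HH": f"{when.hour:02d}",
    }
    for token, value in replacements.items():
        template = template.replace(token, value)
    return template


print(expand_tokens("$ROOT/file_$YYYY$MM.nc", datetime(2005, 7, 15)))
# /dir/to/data/file_200507.nc
```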

CRE Determines the time slice selection for simulation dates outside the provided source time range. Allowed are ’C’, ’R’, ’E’, and ’I’.

Value ’C’ is interpreted as climatology and data are recycled once the end of the last time slice is reached. For instance, if the input data contains monthly data of year 2000, and the source time attribute is set to 2000/1-12/1/0, the same monthly data will be re-used every year. If the input data spans multiple years (e.g. monthly data from 2000-2003), the closest available year will be used outside of the available range (e.g. the monthly data of 2003 is used for all simulation years after 2003). If the value is set to ’R’, data are only considered as long as the simulation time is within the time range specified in entry sourceTime. The provided range does not necessarily need to match the time stamps of the input file. If it is outside of the range of the netCDF time stamps, the closest available date will be used. For instance, if a netCDF file contains data for years 2003 to 2010 and the provided range is set to 2006-2100/1/1/0, the file will only be considered if the simulation year is 2006 or higher. For simulation years 2006 through 2009, the corresponding netCDF time stamp is used. For all years beyond 2009, data of year 2010 is used. If the flag is set to ’E’, an error is returned if none of the time slices specified in sourceTime matches the current simulation time. This setting is useful for restart files or other data that is highly sensitive to the datetime. A flag of ’I’ indicates that data fields shall be interpolated in time. As an example, let’s assume a file contains annual data for years 2005, 2010, 2020, and 2050. If the source time attribute is set to 2005-2050/1/1/0 ’I’, data are interpolated between the two closest years every time we enter a new simulation year. If the simulation starts in January 2004, the value of 2005 is used for years 2004 and 2005. At the beginning of 2006, the used data is calculated as a weighted mean of the 2005 and 2010 data, with 0.8 weight given to 2005 and 0.2 weight given to 2010 values.
Once the simulation year changes to 2007, the weights change to 0.6 for 2005 and 0.4 for 2010, etc. The interpolation frequency is determined by the source time attribute: in the given example, setting the source time attribute to 2005-2050/1-12/1/0 ’I’ would result in a recalculation of the weights on every new simulation month. Interpolation works in a very similar manner for discontinuous monthly, daily, and hourly data. For instance, if a file contains monthly data of 2005, 2010, 2020, and 2050 - and the source time attribute is set to 2005-2050/1-12/1/0 ’I’ - the field is recalculated every month using the two bracketing fields of the given month: July 2007 values are calculated from July 2005 and July 2010 data (with weights of 0.6 and 0.4, respectively), etc. Data interpolation also works between multiple files. For instance, if monthly data are stored in files ’file_200501.nc’, ’file_200502.nc’, etc., a combination of source file name ’file_$YYYY$MM.nc’ and source time attribute ’2005-2007/1-12/1-31/0’ will result in daily data interpolation between the two bracketing files, e.g. if the simulation day is July 15, 2005, the field’s current values are calculated from files ’file_200507.nc’ and ’file_200508.nc’, respectively. Data interpolation across multiple files also works if there are file ’gaps’, for example if there is a file only every three hours: file_20120101_0000.nc, file_20120101_0300.nc, etc. Hourly data interpolation between those files can be achieved by setting source file to ’file_$YYYY$MM$DD_$HH00.nc’, and source time to ’2000-2015/1-12/1-31/0-23’ (or whatever the covered year range is).
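The annual interpolation weights quoted above (0.8/0.2 for 2006, 0.6/0.4 for 2007) follow from linear interpolation between the two bracketing years. The Python sketch below reproduces that arithmetic; the function name `interpolation_weights` is hypothetical, and the handling of dates outside the available range (nearest slice with weight 1) is an assumption based on the 2004 example.

```python
def interpolation_weights(year, available_years):
    """Return ((lower_year, w_lower), (upper_year, w_upper)).

    Linear interpolation in time between the two bracketing years.
    Outside the available range the nearest year gets weight 1
    (assumption, consistent with the 2004 example in the text).
    """
    years = sorted(available_years)
    if year <= years[0]:
        return (years[0], 1.0), (years[0], 0.0)
    if year >= years[-1]:
        return (years[-1], 1.0), (years[-1], 0.0)
    for lower, upper in zip(years, years[1:]):
        if lower <= year < upper:
            w_upper = (year - lower) / (upper - lower)
            return (lower, 1.0 - w_upper), (upper, w_upper)


for simulation_year in (2004, 2006, 2007):
    print(simulation_year,
          interpolation_weights(simulation_year, [2005, 2010, 2020, 2050]))
```

For 2006 the sketch gives weights of about 0.8 for 2005 and 0.2 for 2010, matching the example in the text.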

SrcDim Spatial dimension of input data. xy for horizontal data, xyz for 3-dimensional data.
SrcUnit Units of input data. In combination with the unit tolerance parameter specified in the HEMCO settings (see section 6), this parameter is used by HEMCO for unit conversion and regridding. In general, HEMCO will attempt to convert all data to HEMCO standard units. The data units are determined from the source file. If unit tolerance is set to zero, HEMCO stops with an error if the SrcUnit attribute does not match with the unit string read from the source file (only a warning is issued for higher unit tolerances). For higher unit tolerances, SrcUnit can be set to ’1’ or ’count’ to force HEMCO to perform no unit conversions. If unit tolerance is set to 1, this behavior is only accepted for data recognized by HEMCO as being unitless or in HEMCO units. Data with units of ’count’ are assumed to represent index-based scalar fields (e.g. land types) and regridding is performed accordingly. For data directly specified in the configuration file, unit conversion is always performed based on SrcUnit. See section 13 for more details on data units.
Species HEMCO emission species name. Emissions will be added to this species. All HEMCO emission species are defined at the beginning of the simulation (see section 11). If the species name does not match any of the HEMCO species, the field is ignored altogether.

The species name can be set to the wildcard character, in which case the field is always read by HEMCO but no species is assigned to it. This can be useful for extensions that import some (species-independent) fields by name. The following three entries only take effect for fields that are assigned to the base extension (ExtNr = 0), i.e. that are used for automatic emission calculation. They are used by HEMCO to determine the priority of the emission fields, i.e. how the final emission fields are assembled from all provided data fields.

ScalIDs Identification numbers of all scale factors and masks that shall be applied to this base emission field. Multiple entries must be separated by the separator character. The ScalIDs must correspond to the ScalID numbers provided in sections ’Scale factors’ and ’Masks’.
Cat Emission category. Used to distinguish different, independent emission sources. Emissions of different categories are always added to each other. Up to three emission categories can be assigned to each entry (separated by the separator character). Emissions are always entirely written into the first listed category, while emissions of zero are used for any other assigned category.
Hier Emission hierarchy. Used to prioritize emission fields within the same emission category. Emissions of higher hierarchy overwrite lower-hierarchy data. Fields are only considered within their defined domain, i.e. regional inventories are only considered within their mask boundaries.
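The interplay of categories and hierarchies can be sketched in a few lines of Python. This is a simplified illustration on a one-dimensional grid, not HEMCO's actual algorithm: the function `assemble` and the dictionary keys are hypothetical, and each field's mask stands in for its defined domain.

```python
def assemble(fields, ncells):
    """Combine base emission fields into one total per grid cell.

    Each field is a dict with keys 'cat', 'hier', 'values' (one value
    per cell) and 'mask' (1 where the field is defined, 0 elsewhere).
    """
    total = [0.0] * ncells
    for cat in {f["cat"] for f in fields}:
        cat_emis = [0.0] * ncells
        in_category = [f for f in fields if f["cat"] == cat]
        # Apply fields in order of increasing hierarchy so that
        # higher-hierarchy data overwrite lower-hierarchy data.
        for f in sorted(in_category, key=lambda f: f["hier"]):
            for i in range(ncells):
                if f["mask"][i]:
                    cat_emis[i] = f["values"][i]
        # Emissions of different categories are added to each other.
        for i in range(ncells):
            total[i] += cat_emis[i]
    return total


fields = [
    # Global inventory: category 1, hierarchy 1.
    {"cat": 1, "hier": 1, "values": [1.0] * 4, "mask": [1, 1, 1, 1]},
    # Regional inventory: category 1, hierarchy 2, two cells only.
    {"cat": 1, "hier": 2, "values": [5.0] * 4, "mask": [0, 1, 1, 0]},
    # Independent source: category 2.
    {"cat": 2, "hier": 1, "values": [0.5] * 4, "mask": [1, 1, 1, 1]},
]
print(assemble(fields, 4))  # [1.5, 5.5, 5.5, 1.5]
```

Inside the regional mask the hierarchy-2 field replaces the global one, while the category-2 emissions are added everywhere.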

Scale factors

The scale factors section of the configuration file lists all scale factors applied to the base emission field. Scale factors that are not used by any of the base emission fields are ignored. Scale factors can represent (1) temporal emission variations including diurnal, seasonal, or interannual variability; (2) regional masks that restrict the applicability of the base inventory to a given region; or (3) species-specific scale factors, e.g., to split lumped organic compound emissions into individual species. Most attributes in this section are very similar to the base emissions, except for attributes ’ScalID’ and ’Oper’:

Attribute Description
ScalID Scale factor identification number. Used to link the scale factors to the base emissions through the corresponding ScalIDs attribute in the base emissions section.
Name See section 7.
sourceFile See section 7.
sourceVar See section 7.
sourceTime See section 7.
CRE See section 7.
SrcDim See section 7.
SrcUnit As described in section 7, with the exception that scale factors are assumed to be unitless and no automatic unit conversion is performed.
Oper Scale factor operator. Determines the operation performed on the scale factor. Possible values are: 1 for multiplication (Emission = Base⋅Scale); -1 for division (E = B∕S); 2 for squared (E = B⋅S^2).
MaskID (optional) ScalID of a mask field. This optional value can be used if a scale factor shall only be used over a given region. The provided MaskID must have a corresponding entry in section Masks of the configuration file.
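The three Oper values can be summarized in a short Python sketch. The function `apply_scale` is hypothetical and operates on single numbers, whereas HEMCO applies scale factors to gridded fields.

```python
def apply_scale(base, scale, oper):
    """Apply one scale factor to a base emission value according to Oper."""
    if oper == 1:       # multiplication: E = B * S
        return base * scale
    if oper == -1:      # division: E = B / S
        return base / scale
    if oper == 2:       # squared: E = B * S**2
        return base * scale ** 2
    raise ValueError(f"unsupported Oper value: {oper}")


so2_flux = 2.0e-10  # hypothetical SO2 emission flux in kg/m2/s
# SO2toSO4 scale factor from the example configuration (Oper = 1).
print(apply_scale(so2_flux, 0.031, 1))
```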

Masks

This section lists all masks used by HEMCO. Masks are binary scale factors (1 inside the mask region, 0 outside). If masks are regridded, the remapped mask values (1 and 0) are determined through regular rounding, i.e. a remapped mask value of 0.49 will be set to 0 while 0.5 will be set to 1. Required attributes for mask fields are:

Attribute Description
ScalID Mask identification number. See section 8.
Name See section 7.
sourceFile See section 7. In addition to a netCDF input file, it is also possible to directly provide the lower left and upper right box coordinates, i.e. Lon1/Lat1/Lon2/Lat2.
sourceVar See section 7.
sourceTime See section 7.
CRE See section 7.
SrcDim See section 7.
SrcUnit See section 7.
Oper Data operator. As for scale factors, except that value 2 (squared) is not allowed. Instead, Oper can be set to 3, which will ’mirror’ the mask, i.e. Y = 1-X, where Y and X are the new and original mask value, respectively.
Box The approximate mask region grid box edges (Lon1/Lat1/Lon2/Lat2; lower left and upper right). This is only of relevance for regional emission grids or in a parallel computing environment (to exclude fields that have no coverage on the area covered by this CPU or within the specified emission region).
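The rounding of regridded mask values and the mirror operator can be illustrated as follows. The function `finalize_mask` is hypothetical, and applying the mirror after the rounding step is an assumption; the text does not state the order.

```python
def finalize_mask(remapped, oper=1):
    """Round a regridded mask value to 0 or 1 and optionally mirror it."""
    # Regular rounding: values of 0.5 and above become 1, others 0.
    value = 1 if remapped >= 0.5 else 0
    if oper == 3:  # mirror the mask: Y = 1 - X
        value = 1 - value
    return value


samples = (0.0, 0.49, 0.5, 1.0)
print([finalize_mask(x) for x in samples])          # [0, 0, 1, 1]
print([finalize_mask(x, oper=3) for x in samples])  # [1, 1, 0, 0]
```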

Data collections

The fields listed in the configuration file can be grouped into data collections. Collections can be enabled/disabled in section extension switches. Only the fields that are part of an enabled collection will be used by HEMCO. The beginning and end of a collection is indicated by an opening and closing bracket, respectively: ’(((CollectionName’ and ’)))CollectionName’. These brackets must be on individual lines immediately preceding / following the first/last entry of a collection. The same collection bracket can be used as many times as needed. The collections are enabled/disabled in section ’Extension switches’ (see section 5). Each collection name must be provided as an extension setting and can then be readily enabled/disabled:

#### BEGIN SECTION EXTENSION SWITCHES
#ExtNr	ExtName	on/off	Species								
#0	Base	on	*								
#→	MACCITY:	on									
#→	EMEP:	on									
#### END SECTION EXTENSION SWITCHES
#### BEGIN SECTION SETTINGS
#Logfile: HEMCO.log
#DiagnPrefix: HEMCO_diagnostics
#Wildcard: *
#Separator: /
#Unit tolerance: 1
#Negative values: 0
#Verbose: false
#Track code: false
#Show warnings: true
#ROOT: /dir/to/data
#### END SECTION SETTINGS
#### BEGIN SECTION BASE EMISSIONS
#ExtNr	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Species	ScalIDs	Cat	Hier
#(((MACCITY
#0	MACCITY_CO	$ROOT/MACCity.nc	CO	1980-2014/1-12/1/0	C	xy	kg/m2/s	CO	1	1	1
#0	MACCITY_NO	$ROOT/MACCity.nc	NO	1980-2014/1-12/1/0	C	xy	kg/m2/s	NO	1	1	1
#0	MACCITY_SO2	$ROOT/MACCity.nc	SO2	1980-2014/1-12/1/0	C	xy	kg/m2/s	SO2	-	1	1
#0	MACCITY_SO4	-	-	-	-	-	-	SO4	2	1	1
#)))MACCITY
#(((EMEP
#0	EMEP_CO	$ROOT/EMEP.nc	CO	2000-2014/1-12/1/0	C	xy	kg/m2/s	CO	1/1001	1	2
#)))EMEP
#### END SECTION BASE EMISSIONS
#### BEGIN SECTION SCALE FACTORS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper			
#1	HOURLY_SCALFACT	$ROOT/hourly.nc	factor	2000/1/1/0-23	C	xy	unitless	1			
#2	SO2toSO4	0.031	-	-	-	-	unitless	1			
#### END SECTION SCALE FACTORS
#### BEGIN SECTION MASKS
#ScalID	Name	srcFile	srcVar	srcTime	CRE	Dim	Unit	Oper	Box		
#1001	MASK_EUROPE	$ROOT/mask_europe.nc	MASK	2000/1/1/0	C	xy	unitless	1	-30/30/45/70		
#### END SECTION MASKS

If no corresponding entry is found in the extensions section for a given collection, it will be ignored. Collections are also ignored if the collection is defined in an extension that is disabled. It is recommended to list all collections under the base extension.
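The effect of collection brackets can be sketched in Python. This is an illustration only: the generator `active_lines` and the `switches` dictionary are hypothetical, the entries are abbreviated, and a collection without a switch is treated as disabled, which is one reading of the statement above.

```python
def active_lines(lines, switches):
    """Yield entries outside any collection or inside enabled ones.

    'switches' maps collection names to True (on) or False (off).
    A collection without an entry is treated as disabled (assumption).
    """
    open_collections = []
    for line in lines:
        text = line.strip()
        if text.startswith("((("):
            open_collections.append(text[3:])
        elif text.startswith(")))"):
            open_collections.remove(text[3:])
        elif all(switches.get(name, False) for name in open_collections):
            yield text


lines = [
    "(((MACCITY",
    "0 MACCITY_CO",   # abbreviated entry, remaining columns omitted
    ")))MACCITY",
    "(((EMEP",
    "0 EMEP_CO",      # abbreviated entry, remaining columns omitted
    ")))EMEP",
]
print(list(active_lines(lines, {"MACCITY": True, "EMEP": False})))
# ['0 MACCITY_CO']
```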

Interfaces

In order to perform an emission simulation, information on the simulation grid, species, dates and times must be provided to HEMCO. This information can be passed from an atmospheric model (e.g. GEOS-Chem) or from a suite of configuration files (for stand-alone applications). The emission fields calculated by HEMCO are either returned to the atmospheric model or written to disk.

Stand-alone interface

HEMCO can be employed as a stand-alone model, in which case all simulation information is read from separate input files through the stand-alone interface, as described in detail in hcoi_standalone_mod.F90. Total emissions per species are written to a netCDF file (via the HEMCO diagnostics). For the stand-alone version of HEMCO, all extension input data has to be provided through input files, e.g. all required environmental data (wind speed, radiation, etc.) must be read from disk. These input files should be listed in the extensions data section of the configuration file.

Interfaces to atmospheric models

HEMCO can be coupled to an atmospheric model, in which case all simulation specifications are obtained from that model through a model-specific interface. Currently, HEMCO is implemented in the NASA Goddard Earth Observing System (GEOS-5) Earth system model and the GEOS-Chem chemical transport model. The GEOS-5 interface is based on the Earth System Modeling Framework (ESMF) software environment and is thus easily adaptable to other ESMF applications. The HEMCO-model interface provides the link between the atmospheric model and HEMCO. It invokes the calls to the HEMCO driver routines (see section 15).

Diagnostics

tbd

Input file format

Currently, HEMCO can read data from the following data sources:

  • Gridded data from netCDF files. More details on the netCDF file format are given below. In an ESMF environment, the MAPL/ESMF generic I/O routines are used to read/remap the data. In a non-ESMF environment, the HEMCO generic reading and remapping algorithms are used. Those support vertical regridding, unit conversion, and more (see below).
  • Scalar data specified directly in the HEMCO configuration file. If multiple values, separated by the separator sign (/), are provided, they are interpreted as temporally changing values: 7 values = Sun, Mon, ..., Sat; 12 values = Jan, Feb, ..., Dec; 24 values = hours 0, 1, ..., 23 (local time!).

For masks, exactly four values must be provided, interpreted as lower left and upper right mask box corners (lon1/lat1/lon2/lat2).

  • Country-specific data specified in a separate ASCII file. This file must end with the suffix ’.txt’ and hold the country-specific values listed by country ID. The IDs must correspond to the IDs of a corresponding (netCDF) mask file. The container name of this mask file must be given in the first line of the file, and must be listed in the HEMCO configuration file. ID 0 is reserved for the default values, applied to all countries with no specific values listed. The .txt file must be structured as follows:
#CountryMask
## CountryName CountryID CountryValues
#DEFAULT 0 1.0/2.0/3.0/4.0/5.0/6.0/7.0

The CountryValues are interpreted the same way as scalar values, except that they are applied to all grid boxes with the given country ID.
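The interpretation of slash-separated values and of the country file can be sketched as follows. This is a minimal illustration under stated assumptions, not HEMCO's reader: the helper names `pick_value`, `parse_country_file` and `values_for` are hypothetical, the entry `EXAMPLELAND` is made up, country names are assumed to contain no blanks, and `when` is assumed to already be in local time.

```python
from datetime import datetime


def pick_value(values, when):
    """Return the entry of `values` that applies at local time `when`."""
    n = len(values)
    if n == 1:
        return values[0]                         # constant value
    if n == 7:
        # Python counts Monday as 0; the file order is Sun, Mon, ..., Sat.
        return values[(when.weekday() + 1) % 7]
    if n == 12:
        return values[when.month - 1]            # Jan, Feb, ..., Dec
    if n == 24:
        return values[when.hour]                 # hour of day, 0 to 23
    raise ValueError("expected 1, 7, 12 or 24 values, got %d" % n)


def parse_country_file(text, sep="/"):
    """Parse the .txt layout: mask name first, then 'name ID values' rows."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    mask_name, table = rows[0], {}
    for row in rows[1:]:
        if row.startswith("#"):
            continue                             # comment line
        _name, country_id, values = row.split()
        table[int(country_id)] = [float(v) for v in values.split(sep)]
    return mask_name, table


def values_for(table, country_id):
    """Countries without their own row fall back to ID 0 (the defaults)."""
    return table.get(country_id, table[0])


text = """CountryMask
# CountryName CountryID CountryValues
DEFAULT 0 1.0/2.0/3.0/4.0/5.0/6.0/7.0
EXAMPLELAND 5 0.5
"""
mask_name, table = parse_country_file(text)
sunday = datetime(2014, 7, 6)
print(mask_name, pick_value(values_for(table, 99), sunday))   # CountryMask 1.0
```

Country ID 99 has no row of its own, so the seven default values are used and the Sunday entry is returned.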

Gridded input files are expected to be in the Network Common Data Form (netCDF) format (http://www.unidata.ucar.edu/software/netcdf/) and must adhere to the COARDS metadata conventions (http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html). In particular, the following points must be fulfilled:

Attribute Description
Latitude and longitude dimension: At this stage of development, only rectilinear (lon-lat) grids are supported. The data is automatically regridded onto the simulation grid (see section 15 for more details).
Vertical dimension: In a non-ESMF environment, 3D data is interpolated onto the simulation levels if (and only if) the number of vertical levels is greater than one and not equal to the number of vertical levels of the simulation grid. In all other cases, it is assumed that data is already on the simulation levels. In particular, this explicitly assumes that the vertical coordinate direction is upwards, i.e. the first level index corresponds to the surface layer. Currently, only hybrid sigma pressure coordinate systems are supported. In order to properly determine the vertical pressure levels of the input data, the file must contain the surface pressure values and the hybrid coefficients (a, b) of the coordinate system. Further, the ’level’ variable must contain the attributes ’standard_name’ and ’formula_terms’ (the attribute ’positive’ is recommended but not required). A header excerpt of a valid netCDF file is shown below:
#double lev(lev) ;
#lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
#lev:units = "level" ;
#lev:positive = "down" ;
#lev:formula_terms = "ap: hyam b: hybm ps: PS" ;
#double hyam(nhym) ;
#hyam:long_name = "hybrid A coefficient at layer midpoints" ;
#hyam:units = "hPa" ;
#double hybm(nhym) ;
#hybm:long_name = "hybrid B coefficient at layer midpoints" ;
#hybm:units = "1" ;
#double time(time) ;
#time:standard_name = "time" ;
#time:units = "days since 2000-01-01 00:00:00" ;
#time:calendar = "standard" ;
#double PS(time, lat, lon) ;
#PS:long_name = "surface pressure" ;
#PS:units = "hPa" ;
#double EMIS(time, lev, lat, lon) ;
#EMIS:long_name = "emissions" ;
#EMIS:units = "kg m-2 s-1" ;

Vertical regridding is currently not supported in an ESMF environment.
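For reference, the formula_terms entry "ap: hyam b: hybm ps: PS" in the header above corresponds to the hybrid sigma pressure relation p(k) = ap(k) + b(k) * ps. The sketch below evaluates it for a single column. The function name `midpoint_pressures` and the coefficient values are made up for illustration, and `hyam` and `PS` are assumed to share one unit (hPa, as in the header).

```python
def midpoint_pressures(hyam, hybm, ps):
    """Layer-midpoint pressures of one column: p(k) = ap(k) + b(k) * ps.

    hyam: hybrid A coefficients at layer midpoints (same unit as ps).
    hybm: hybrid B coefficients at layer midpoints (unitless).
    ps:   surface pressure of the column.
    """
    if len(hyam) != len(hybm):
        raise ValueError("hyam and hybm must have the same length")
    return [a + b * ps for a, b in zip(hyam, hybm)]


# Illustrative coefficients only; index 0 is the level closest to the surface.
hyam = [0.0, 50.0, 100.0]
hybm = [1.0, 0.5, 0.0]
print(midpoint_pressures(hyam, hybm, 1000.0))   # [1000.0, 550.0, 100.0]
```

Pressure decreases with increasing level index here, which matches the expectation that the first level index is the surface layer.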


Time: Times should be given as relative times, i.e. relative to a specified reference date. Accepted are ’days since yyyy-mm-dd’, ’hours since yyyy-mm-dd hh:mm:ss’, and ’minutes since yyyy-mm-dd hh:mm:ss’. Some netCDF files with reference dates prior to 1901 (e.g. days since 1900-1-1) have caused problems, so reference years after 1900 should be used if possible.

Weekly data must contain seven time slices in increments of one day. The first entry must represent Sunday data, irrespective of the real weekday of the assigned datetime. It is possible to store weekly data for more than one time interval, in which case the first weekday (i.e. Sunday) must hold the starting date for the given set of (seven) time slices. For instance, weekly data for every month of a year can be stored as 12 sets of 7 time slices. The datetime of the first entry of each set must fall on the first day of every month, and the following six entries must be increments of one day. Currently, weekly data from netCDF files is not correctly read in an ESMF environment.
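A file's time axis can be checked against these rules before it is handed to HEMCO. The sketch below is a hypothetical pre-check and not part of HEMCO: `decode_times` and `is_weekly_set` are made-up names, a standard calendar without time zones is assumed, and only the three accepted unit strings are handled.

```python
from datetime import datetime, timedelta

_UNIT_SECONDS = {"days": 86400, "hours": 3600, "minutes": 60}


def decode_times(units, offsets):
    """Turn offsets such as [0, 1, 2] plus a units string into datetimes."""
    unit, since, ref = units.split(None, 2)
    if since != "since" or unit not in _UNIT_SECONDS:
        raise ValueError("unsupported time units: " + units)
    fmt = "%Y-%m-%d %H:%M:%S" if ":" in ref else "%Y-%m-%d"
    start = datetime.strptime(ref, fmt)
    return [start + timedelta(seconds=off * _UNIT_SECONDS[unit])
            for off in offsets]


def is_weekly_set(times):
    """True if `times` holds seven slices in increments of one day."""
    return len(times) == 7 and all(
        later - earlier == timedelta(days=1)
        for earlier, later in zip(times, times[1:]))


times = decode_times("days since 2000-01-01 00:00:00", range(7))
print(is_weekly_set(times))       # True
print(times[0].strftime("%A"))    # Saturday
```

The first slice is dated on a Saturday, yet it would still be read as the Sunday entry, because only the position within the set of seven matters.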

Data units: It is recommended to store data in one of the HEMCO standard units: ’kg/m2/s’ and ’kg(C)/m2/s’ for fluxes; ’kg/m3’ and ’kg(C)/m3’ for concentrations; ’1’ for unitless data; and ’count’ for index-based data, i.e. discrete distributions (for instance, land types represented as integer values between 1 and 28). HEMCO will attempt to convert all data to one of those units, unless otherwise specified via the ’srcUnit’ attribute (see section 7).

Mass conversion (e.g. from molecules to kg) is performed based on the properties (e.g. molecular weight) of the species assigned to the given data set. It is also possible to convert between species-based and molecule-based units (e.g. kg vs. kg(C)). This conversion is based on the emitted molecular weight and the molecular ratio of the given species (see section 15.5). More details on unit conversion are given in module hco_unit_mod.F90. Index-based data is regridded in such a manner that every grid box on the new grid represents the index with the largest relative contribution from the overlapping boxes of the original grid. All other data are regridded as ’concentration’ quantities, i.e. conserving the global weighted average.
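The arithmetic behind these mass conversions can be written out as a short sketch. It is not the code in hco_unit_mod.F90: the function names are hypothetical, and the propane molecular weight of 44.1 g/mol is an assumed illustrative value. The emitted molecular weight and molecular ratio follow the definitions used for the species parameters (12 g/mol and 3 for C3H8 emitted as carbon).

```python
AVOGADRO = 6.02214076e23   # molecules per mol


def molecules_to_kg(n_molecules, mw_species):
    """Mass in kg of `n_molecules` molecules with molecular weight in g/mol."""
    return n_molecules / AVOGADRO * mw_species * 1.0e-3


def carbon_to_species_mass(mass_emitted, mw_species, mw_emitted, ratio):
    """Convert mass of the emitted unit (e.g. kg C) to mass of the species.

    mw_emitted: emitted molecular weight in g/mol (12 for carbon).
    ratio: emitted molecules per molecule of species (3 for C3H8 as C).
    """
    return mass_emitted / mw_emitted / ratio * mw_species


def species_to_carbon_mass(mass_species, mw_species, mw_emitted, ratio):
    """Inverse conversion: mass of the species to mass of the emitted unit."""
    return mass_species / mw_species * ratio * mw_emitted


# 36 kg of carbon emitted as propane (assumed MW 44.1 g/mol):
print(carbon_to_species_mass(36.0, 44.1, 12.0, 3.0))   # 44.1
```

In other words, 36 kg C corresponds to 44.1 kg C3H8, since each propane molecule carries three carbon atoms of 12 g/mol each.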

pyHEMCO GUI

To facilitate the creation and modification of a HEMCO configuration file, a Graphical User Interface (GUI) is currently being developed. The GUI provides the following features:

  • Modification of existing configuration files.
  • Creation of new configuration files, either from scratch or starting from an existing file.
  • Direct execution of the stand-alone version of HEMCO.
  • Preview of emissions from individual inventories, extensions, or any combination of these.

The pyHEMCO GUI is written in Python and linked to the pyGC visualization and data analysis package developed for GEOS-Chem. The developer version of the pyHEMCO GUI can be found at https://github.com/christophkeller. This version is still under development and not yet operational.

Behind the scenes of HEMCO

Overview

This section provides a short description of the main principles of HEMCO. More details are provided in the source code, and references to the corresponding modules are given where appropriate. The HEMCO code can be broken up into three parts: core code, extensions, and interfaces. The core code consists of all core modules that are essential for every HEMCO simulation. The extensions are a collection of emission parameterizations that can be optionally selected (e.g. dust emissions, air-sea exchange, etc.). Most of the extensions require meteorological variables (2D or 3D fields) passed from an atmospheric model or an external input file to HEMCO. The interfaces are top-level routines that are only required in a given model environment (e.g. in stand-alone mode or under an ESMF framework). The HEMCO-model interface routines are located outside of the HEMCO code structure, calling down to the HEMCO driver routines for both the HEMCO core and extensions. HEMCO stores all emission data (base emissions, scale factors, masks) in a generic data structure (a ’HEMCO data container’). Input data read from disk is translated into this data structure by the HEMCO input/output module (hcoio_dataread_mod.F90 in HEMCO core). This step includes unit conversion and regridding.

HEMCO data objects

All emission data (base emissions, scale factors, masks) are internally stored in a data container. For each data element of the HEMCO configuration file, a separate data container object is created when reading the configuration file at the beginning of the simulation. The data container object is a FORTRAN derived type that holds the information of one entry of the configuration file. All file data information such as filename, file variable, time slice options, etc. is stored in a ’FileData’ derived type object (defined in hco_filedata_mod.F90). This object also holds a pointer to the data itself. All data is stored as 2- or 3-dimensional data arrays. HEMCO can keep multiple time slices in memory simultaneously, e.g. for diurnal scale factors, in which case a vector of data arrays is created. Data arrays are defined in module hco_arr_mod.F90. Data containers (and as such, emissions data) are accessed through three different linked lists: ConfigList, ReadList, EmisList. These lists all point to the same content (i.e. the same containers) but are ordered in the manner that is most efficient for the intended purpose: for example, ReadList contains sub-lists of all containers that need to be updated annually, monthly, daily, hourly, or never. Thus, if a new month is entered, only a few lists (monthly, daily and hourly) have to be scanned and updated instead of going through the whole list of data containers. Similarly, EmisList sorts the data containers by model species, emission category and hierarchy. This allows an efficient emission calculation since the EmisList has to be scanned only once. List containers and generic linked list routines are defined in hco_datacont_mod.F90. Specific routines for ConfigList, ReadList and EmisList are defined in hco_config_mod.F90, hco_readlist_mod.F90, and hco_emislist_mod.F90, respectively.
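The idea of several lists pointing to the same containers can be illustrated with a small Python sketch. It is a loose analogy, not the FORTRAN data structure: plain Python lists stand in for the linked lists, and the class `DataContainer`, its fields, and the helper functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

FREQUENCIES = ("annually", "monthly", "daily", "hourly", "never")


@dataclass
class DataContainer:
    name: str
    species: str
    category: int
    hierarchy: int
    update: str          # one of FREQUENCIES


def build_read_list(config_list):
    """Group the containers by update frequency (no copies are made)."""
    read_list = {freq: [] for freq in FREQUENCIES}
    for cont in config_list:
        read_list[cont.update].append(cont)
    return read_list


def build_emis_list(config_list):
    """Order the same containers by species, category and hierarchy."""
    return sorted(config_list,
                  key=lambda c: (c.species, c.category, c.hierarchy))


def sublists_to_scan(changed):
    """Sub-lists to revisit when the given time unit has rolled over."""
    start = {"year": 0, "month": 1, "day": 2, "hour": 3}[changed]
    return list(FREQUENCIES[start:4])


config_list = [
    DataContainer("MACCITY_NO", "NO", 1, 1, "monthly"),
    DataContainer("EMEP_CO", "CO", 1, 2, "monthly"),
    DataContainer("MACCITY_CO", "CO", 1, 1, "monthly"),
    DataContainer("HOURLY_SCALFACT", "-", 0, 0, "hourly"),
]
read_list = build_read_list(config_list)
emis_list = build_emis_list(config_list)
print(sublists_to_scan("month"))   # ['monthly', 'daily', 'hourly']
```

Because every view holds references to the same objects, updating a container through one list is immediately visible through the others.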

Core

HEMCO core consists of all routines and variables required to read, store, and update data used for emissions calculation. The driver routines to execute (initialize, run and finalize) a HEMCO core simulation are (see hco_driver_mod.F90): HCO_INIT, HCO_RUN, HCO_FINAL. These are also the routines that are called at the interface level (see section 15.5). Each HEMCO simulation is defined by its state object ’HcoState’, which is a derived type that holds all simulation information, including a list of the defined HEMCO species, emission grid information, configuration file name, and additional run options. More details on the HEMCO state object can be found in hco_state_mod.F90. HcoState is defined at the interface level and then passed down to all HEMCO routines (see also section 15.5).

Initialize: HCO_INIT

Before running HEMCO, all variables and objects have to be initialized properly. The initialization of HEMCO occurs in three steps:

  1. Read the HEMCO configuration file (Config_ReadFile in hco_config_mod.F90). This writes the content of the entire configuration file into a buffer, and creates a data container for each data item (base emission, scale factor, mask) in ConfigList.
  2. Initialize HcoState.
  3. Call HCO_INIT, passing HcoState to it. This initializes the HEMCO clock object (see hco_clock_mod.F90) and creates the ReadList (hco_readlist_mod.F90). The ReadList links to the data containers in ConfigList, but sorted by data update frequency. Data that is not used at all (e.g. scale factors that are not used by any base emission, or regional emissions that are outside of the emission grid) is not added to the ReadList. The EmisList linked list is only created in the run call.

Note that steps 1 and 2 occur at the interface level (see section 15.5).

Run: HCO_RUN

This is the main function to run HEMCO. It can be repeated as often as necessary. Before calling this routine, the internal clock object has to be updated to the current simulation time (HcoClock_Set, see hco_clock_mod.F90). HCO_RUN performs the following steps:

  1. Update the time slice index pointers. This is to make sure that the correct time slices are used for every data container. For example, hourly scale factors can be stored in a data container holding 24 individual 2D fields. Module hco_tidx_mod.F90 organizes how to properly access these fields.
  2. Read/update the content of the data containers (ReadList_Read). This checks whether any fields need to be read/updated (e.g. if this is a new month compared to the previous time step) and, if so, updates these fields by calling the data interface (see section 15.5).
  3. Create/update the EmisList object. Similar to ReadList, EmisList points to the data containers in ConfigList, but sorted according to species, emission hierarchy, and emission category. To optimize emission calculations, EmisList already combines base emission fields that share the same species, category, hierarchy, scale factors, and field name (without the field name tag, see section 7).
  4. Calculate core emissions for the current simulation time. This is performed by subroutine hco_calcemis (hco_calc_mod.F90). This routine walks through EmisList and calculates the emissions for every base emission field by applying the assigned scale factors to it. The (up to 10) container IDs of all scale factors connected to the given base emission field (as set in the HEMCO configuration file) are stored in the data container variable ScalIDs. A container ID index list is used to efficiently retrieve a pointer to each of those containers (see cIDList in hco_datacont_mod.F90).
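Step 4 can be pictured with a small sketch that multiplies a base emission field by its scale factors. This is an illustration only, not hco_calcemis: the names `box_mask` and `calc_emissions` are hypothetical, fields are plain nested lists indexed as [lat][lon], every factor is assumed to be applied by multiplication, and the mask is built directly from a lon1/lat1/lon2/lat2 box (a grid cell counts as inside if its centre lies in the box, which is an assumption of this sketch).

```python
MAX_SCALE_FACTORS = 10   # upper limit quoted in the text


def box_mask(lons, lats, box):
    """Build a 0/1 mask from a 'lon1/lat1/lon2/lat2' box string."""
    lon1, lat1, lon2, lat2 = (float(v) for v in box.split("/"))
    return [[1.0 if (lon1 <= lon <= lon2 and lat1 <= lat <= lat2) else 0.0
             for lon in lons]
            for lat in lats]


def calc_emissions(base, factors):
    """Multiply a 2D base field by scalar or 2D scale factors."""
    if len(factors) > MAX_SCALE_FACTORS:
        raise ValueError("too many scale factors")
    out = [row[:] for row in base]            # leave the base field untouched
    for factor in factors:
        is_scalar = isinstance(factor, (int, float))
        for j, row in enumerate(out):
            for i in range(len(row)):
                row[i] *= (factor if is_scalar else factor[j][i])
    return out


lons = [-60.0, 0.0, 60.0]                     # grid cell centres
lats = [20.0, 50.0]
mask = box_mask(lons, lats, "-30/30/45/70")   # box of MASK_EUROPE above
base = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]
print(calc_emissions(base, [0.5, mask]))
# [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

Only the cell centred at 0°E, 50°N lies inside the box, so it is the only one that keeps a non-zero (scaled) emission.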

Finalize: HCO_FINAL

This routine cleans up all internal lists, variables, and objects. This does not clean up the HEMCO state object, which is removed at the interface level.

Extensions

HEMCO extensions are used to calculate emissions based on meteorological input variables and/or non-linear parameterizations. Each extension is provided in a separate FORTRAN module. Each module must contain a public subroutine to initialize, run and finalize the extension. Emissions calculated in the extensions are added to the HEMCO emission array using subroutine HCO_Emis_Add (HCO_FluxArr_mod.F90). Meteorological input data is passed to the individual extension routines through the extension state object ’ExtState’, which provides a pointer slot for all met fields used by any of the extensions (see hcox_state_mod.F90). These pointers must be assigned at the interface level (see section 15.5). In analogy to the core module, the three main routines for the extensions are (in hcox_driver_mod.F90):

  • HCOX_INIT
  • HCOX_RUN
  • HCOX_FINAL

These subroutines invoke the corresponding calls of all (enabled) extensions and must be called at the interface level (after the core routines). Extension settings (as specified in the configuration file, see also section 5) are automatically read by HEMCO. For any given extension, routines GetExtNr and GetExtOpt can be used to obtain the extension number and desired setting value, respectively (see HCO_ExtList_Mod.F90). Routine HCO_GetExtHcoID should be used to extract the HEMCO species IDs of all species registered for this extension. Gridded data associated with an extension (i.e. listed in the extension data section of the configuration file) is automatically added to the EmisList, but ignored by the HEMCO core module during emissions calculation. Pointers to these data arrays can be obtained through routine EmisList_GetDataArr (HCO_EmisList_Mod.F90). Note that this routine identifies the array based on its container name. It is therefore important that the container name set in the configuration file matches the name used by this routine!

Interfaces

HEMCO - model interface

The interface provides the link between HEMCO and the model environment. This may be a sophisticated Earth System model or a simple environment that allows the user to run HEMCO in standalone mode. The standalone interface is provided along with the HEMCO distribution (hcoi_standalone_mod.F90). The HEMCO-GEOS-Chem model interface is included in the GEOS-Chem source code (hcoi_gc_main_mod.F90 in GeosCore). HEMCO has also been successfully employed as a stand-alone gridded component within an ESMF environment. Please contact Christoph Keller for more information on the ESMF implementation. The interface routines provide HEMCO with all the necessary information to perform the emission calculation. This includes the following tasks:

Initialization:

  • Read the configuration file (Config_ReadFile in hco_config_mod.F90).
  • Initialize HcoState object (HcoState_Init in hco_state_mod.F90).
  • Define the emission grid. Grid definitions are stored in HcoState%Grid. The emission grid is defined by its horizontal mid points and edges (all 2D fields), the hybrid sigma coordinate edges (3D), the grid box areas (2D), and the grid box heights. The latter are only used by some extensions (DEAD dust emissions and lightning NOx) and may be left undefined if those are not used.
  • Define emission species. Species definitions are stored in vector HcoState%Spc(:) (one entry per species). For each species, the following parameters are required:
    1. HEMCO species ID: unique integer index for species identification. For internal use only.
    2. Model species ID: the integer index assigned to this species by the employed model.
    3. Species name
    4. Species molecular weight in g/mol.
    5. Emitted species molecular weight in g/mol. This value can be different to the species molecular weight if species are emitted on a molecular basis, e.g. in mass carbon (in which case the emitted molecular weight becomes 12 g/mol).
    6. Molecular ratio: molecules of emitted species per molecules of species. For example, if C3H8 is emitted as kg C, the molecular ratio becomes 3.
    7. K0: Liquid over gas Henry constant in M/atm.
    8. CR: Temperature dependency of K0 in K.
    9. pKa: The species pKa, used for correction of the Henry constant.

The molecular weight - together with the molecular ratio - determines the mass scaling factors used for unit conversion in hco_unit_mod.F90. The Henry coefficients are only used by the air-sea exchange extension (and only for the specified species) and may be left undefined for other species and/or if the extension is not used.

  • Define simulation time steps. The emission, chemical and dynamic time steps can be defined separately.
  • Initialize HEMCO core (HCO_INIT in hco_driver_mod.F90)
  • Initialize HEMCO extensions (HCOX_INIT in hcox_driver_mod.F90)

Run:

  • Set current time (HcoClock_Set in hco_clock_mod.F90)
  • Reset all emission and deposition values (hco_FluxArrReset in hco_fluxarr_mod.F90)
  • Run HEMCO core to calculate emissions (hco_Run in hco_driver_mod.F90)
  • Link the (used) met. field objects of ExtState to desired data arrays (this step may also be done during initialization)
  • Run HEMCO extensions to add extensions emissions (hcox_Run in hcox_driver_mod.F90)
  • Export HEMCO emissions into desired environment

Finalization:

  • Finalize HEMCO extensions and extension state object ExtState (hcox_final in hcox_driver_mod.F90).
  • Finalize HEMCO core (hco_final in hco_driver_mod.F90).
  • Clean up HEMCO state object HcoState (hcoState_final in hco_state_mod.F90).

Data interface (reading and regridding)

The data interface (hcoi_dataread_mod.F90) organizes reading, unit conversion, and remapping of data from source files. Its public routine HCOI_DataRead is only called by subroutine ReadList_Fill in hco_readlist_mod.F90. Data processing is performed in three steps:

  1. Read data from file using the source file information (file name, source variable, desired time stamp) provided in the configuration file.
  2. Convert unit to HEMCO units based on the unit attribute read from disk and the srcUnit attribute set in the configuration file. See section 13 for more information.
  3. Remap original data onto the HEMCO emission grid. The grid dimensions of the input field are determined from the source file. If only horizontal regridding is required, e.g. for 2D data or if the number of vertical levels of the input data is equal to the number of vertical levels of the HEMCO grid, the horizontal interpolation routine used by GEOS-Chem is invoked. If vertical regridding is required or to interpolate index-based values (e.g. discrete integer values), the NcRegrid tool described in Joeckel (2006) is used.
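The difference between regridding index-based data and all other data can be shown with a one-dimensional sketch. This is neither NcRegrid nor the GEOS-Chem horizontal interpolation routine: it is a simplified overlap-weighting scheme with hypothetical function names, and it assumes that the target grid is fully covered by the source grid. Ordinary values become overlap-weighted averages, while index values take the index with the largest overlap share.

```python
def overlaps(src_edges, dst_lo, dst_hi):
    """Yield (source cell index, overlap length) for one target cell."""
    for k in range(len(src_edges) - 1):
        length = min(src_edges[k + 1], dst_hi) - max(src_edges[k], dst_lo)
        if length > 0:
            yield k, length


def regrid_concentration(src_edges, values, dst_edges):
    """Overlap-weighted average, treating values as concentrations."""
    out = []
    for lo, hi in zip(dst_edges, dst_edges[1:]):
        pairs = list(overlaps(src_edges, lo, hi))
        total = sum(weight for _, weight in pairs)
        out.append(sum(values[k] * weight for k, weight in pairs) / total)
    return out


def regrid_index(src_edges, indices, dst_edges):
    """Pick, per target cell, the index with the largest overlap share."""
    out = []
    for lo, hi in zip(dst_edges, dst_edges[1:]):
        share = {}
        for k, weight in overlaps(src_edges, lo, hi):
            share[indices[k]] = share.get(indices[k], 0.0) + weight
        out.append(max(share, key=share.get))
    return out


src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
print(regrid_concentration(src_edges, [1.0, 3.0, 5.0, 7.0], [0.0, 2.0, 4.0]))
# [2.0, 6.0]
print(regrid_index(src_edges, [4, 4, 7, 9], [0.0, 3.0, 4.0]))
# [4, 9]
```

Averaging the land-type indices 4, 4 and 7 would give 5, a type that does not occur in the source data, which is why index-based data needs the separate treatment.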

References

  Janssens-Maenhout, A., Petrescu, A., Muntean, M., and Blujdea, V.: Verifying Greenhouse Gas Emissions: Methods to Support International Climate Agreements, The National Academies Press, Washington, DC., 2010.
  Joeckel, P.: Technical note: Recursive rediscretisation of geo-scientific data in the Modular Earth Submodel System (MESSy), Atmos. Chem. Phys., 6, 3557–3562, 2006.
  Keller, C. A., Long, M. S., Yantosca, R. M., Da Silva, A. M., Pawson, S., and Jacob, D. J.: HEMCO v1.0: a versatile, ESMF-compliant component for calculating emissions in atmospheric models, Geosci. Model Dev., 7, 1409–1417, 2014.
  Lamarque, J.-F., Bond, T. C., Eyring, V., Granier, C., Heil, A., Klimont, Z., Lee, D., Liousse, C., Mieville, A., Owen, B., Schultz, M. G., Shindell, D., Smith, S. J., Stehfest, E., Van Aardenne, J., Cooper, O. R., Kainuma, M., Mahowald, N., McConnell, J. R., Naik, V., Riahi, K., and van Vuuren, D. P.: Historical (1850–2000) gridded anthropogenic and biomass burning emissions of reactive gases and aerosols: methodology and application, Atmospheric Chemistry and Physics, 10, 7017–7039, 2010.
  Stettler, M., Eastham, S., and Barrett, S.: Air quality and public health impacts of UK airports. Part I: Emissions, Atmospheric Environment, 45, 5415–5424, 2011.
  Vestreng, V., Ntziachristos, L., Semb, A., Reis, S., Isaksen, I. S. A., and Tarrasón, L.: Evolution of NOx emissions in Europe with focus on road transport control measures, Atmospheric Chemistry and Physics, 9, 1503–1520, 2009.
  van der Werf, G. R., Randerson, J. T., Giglio, L., Collatz, G. J., Mu, M., Kasibhatla, P. S., Morton, D. C., DeFries, R. S., Jin, Y., and van Leeuwen, T. T.: Global fire emissions and the contribution of deforestation, savanna, forest, agricultural, and peat fires (1997–2009), Atmospheric Chemistry and Physics, 10, 11707–11735, 2010.
  Wiedinmyer, C., Yokelson, R. J., and Gullett, B. K.: Global Emissions of Trace Gases, Particulate Matter, and Hazardous Air Pollutants from Open Burning of Domestic Waste, Environmental Science & Technology, 48, 9523–9530, 2014.