HEMCO data directories

From Geos-chem

Revision as of 18:55, 12 February 2015

NOTE: Page under construction

Overview

HEMCO data directory structure

The following sections list the directories that contain emissions inventories and related data files for use with HEMCO. Each data directory is given relative to the HEMCO root directory, which is specified by the $ROOT token.

For example, on the Harvard disk server, the value of $ROOT is /mnt/gcgrid/data/ExtData/HEMCO/. All of the data files read by HEMCO are stored in subfolders of this root directory.
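The token substitution described above can be sketched in Python. This is an illustration only: the function name expand_root is hypothetical and is not part of HEMCO or of the download package.

```python
def expand_root(path: str, root: str) -> str:
    """Replace the first $ROOT token in `path` with the root directory."""
    # Strip any trailing slash from the root so the result has no "//".
    return path.replace("$ROOT", root.rstrip("/"), 1)


# Root directory on the Harvard disk server, as quoted in the text above.
harvard_root = "/mnt/gcgrid/data/ExtData/HEMCO/"
print(expand_root("$ROOT/VOLCANO/v2014-10", harvard_root))
# /mnt/gcgrid/data/ExtData/HEMCO/VOLCANO/v2014-10
```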

--Bob Y. 13:06, 12 February 2015 (EST)

Aerosols

Inventory                           | Path                      | Status
AEROCOM volcanic emissions          | $ROOT/VOLCANO/v2014-10    | CURRENTLY USED
Tami Bond et al BC and OC emissions | $ROOT/BCOC_BOND/v2014-07  | CURRENTLY USED
Cooke et al BC and OC emissions     | $ROOT/BCOC_COOKE/v2014-07 | CURRENTLY USED
Secondary organic aerosols          | $ROOT/SOA/v2014-07        | CURRENTLY USED

--Bob Y. 11:50, 12 February 2015 (EST)

Anthropogenic and biofuel emissions

Inventory                                     | Path                      | Status
GEIA global anthropogenic                     | $ROOT/GEIA/v2014-07       | Slated for replacement
GEIA NH3 (anthro, biofuel, natural source)    | $ROOT/NH3/v2014-07        | CURRENTLY USED
EDGAR global anthropogenic                    | $ROOT/EDGAR/v2014-07      | CURRENTLY USED
RETRO VOC emissions                           | $ROOT/RETRO/v2014-07      | CURRENTLY USED
Yevich & Logan biofuels                       | $ROOT/BIOFUEL/v2014-07    | CURRENTLY USED
BRAVO regional anthropogenic                  | $ROOT/BRAVO/v2014-07      | CURRENTLY USED
CAC regional anthropogenic                    | $ROOT/CAC/v2014-07        | CURRENTLY USED
EMEP regional anthropogenic                   | $ROOT/EMEP/v2015-01       | CURRENTLY USED
NEI2005 regional anthro/biofuel               | $ROOT/NEI2005/v2014-09    | CURRENTLY USED
NEI/VISTAS scale factors                      | $ROOT/VISTAS/v2014-07     | CURRENTLY USED
Streets regional anthro                       | $ROOT/STREETS/v2014-07    | CURRENTLY USED
MASAGE agricultural NH3                       | $ROOT/MASAGE_NH3/v2015-02 | To debut in GEOS-Chem v10-01
Yaping Xiao et al C2H6 and C3H8 anthropogenic | $ROOT/XIAO/v2014-09       | CURRENTLY USED

--Bob Y. 12:13, 12 February 2015 (EST)

Anthropogenic aircraft and ship emissions

Inventory                    | Path                        | Status
AEIC aircraft                | $ROOT/AEIC/v2014-10         | CURRENTLY USED
ARCTAS ship emissions        | $ROOT/ARCTAS_SHIP/v2014-07  | CURRENTLY USED
Corbett et al ship emissions | $ROOT/CORBETT_SHIP/v2014-07 | CURRENTLY USED
ICOADS ship                  | $ROOT/ICOADS_SHIP/v2014-07  | CURRENTLY USED

--Bob Y. 12:19, 12 February 2015 (EST)

Biomass burning emissions

Inventory            | Path                   | Status
GFED3 biomass        | $ROOT/GFED3/v2014-10   | CURRENTLY USED
  • This is the default biomass burning inventory.
  • Soon to be replaced by GFED4.
FINN biomass         | $ROOT/FINN/v2015-02    | OPTIONAL
  • You may choose to replace GFED3 with FINN for research purposes.
  • If you do not need FINN, you can skip downloading it to save disk space.
QFED biomass         | $ROOT/QFED/v2014-09    | OPTIONAL
  • You may choose to replace GFED3 with QFED for research purposes.
  • If you do not need QFED, you can skip downloading it to save disk space.
GFED2 biomass        | $ROOT/GFED2/v2014-07   | OBSOLETE
  • Superseded by GFED3.
Duncan et al biomass | $ROOT/BIOBURN/v2014-07 | OBSOLETE
  • Superseded by GFED3.

--Bob Y. 12:27, 12 February 2015 (EST)

Emissions implemented as HEMCO extensions

Inventory                 | Path                       | Status
DEAD dust model           | $ROOT/DUST_DEAD/v2014-07   | CURRENTLY USED
GINOUX dust model         | $ROOT/DUST_GINOUX/v2014-07 | CURRENTLY USED
MEGAN biogenic emissions  | $ROOT/MEGAN/v2014-07       | CURRENTLY USED
NO from lightning         | $ROOT/LIGHTNOX/v2014-07    | CURRENTLY USED
NO from soils/fertilizers | $ROOT/SOILNOX/v2014-07     | CURRENTLY USED
PARANOX ship plume model  | $ROOT/PARANOX/v2014-07     | CURRENTLY USED

--Bob Y. 12:43, 12 February 2015 (EST)

Future and historical emissions

Inventory            | Path            | Status
RCP future scenarios | $ROOT/RCP/RCP26 | STILL BEING IMPLEMENTED

--Bob Y. 12:40, 12 February 2015 (EST)

GEOS-Chem specialty simulation data

Inventory                    | Path                           | Status
Aerosol-only simulation      | $ROOT/OFFLINE_AEROSOL/v2014-09 | CURRENTLY USED
CH4 simulation               | $ROOT/CH4/v2014-09             | CURRENTLY USED
CO2 simulation               | $ROOT/CO2/v2014-09             | CURRENTLY USED
Mercury simulation           | $ROOT/MERCURY/v2014-09         | CURRENTLY USED
POPs simulation              | $ROOT/POPs/v2014-09            | CURRENTLY USED
Tagged CO simulation         | $ROOT/TAGGED_CO/v2014-08       | CURRENTLY USED
Tagged O3 simulation         | $ROOT/TAGGED_O3/v2014-09       | CURRENTLY USED
O3 for offline simulations   | $ROOT/O3/v2014-09              | CURRENTLY USED
OH for offline simulations   | $ROOT/OH/v2014-09              | CURRENTLY USED
H2O2 for offline simulations | $ROOT/OXIDANTS/v2014-07        | CURRENTLY USED
CH3I simulation              | $ROOT/CH3I/v2014-07            | OBSOLETE

--Bob Y. 12:53, 12 February 2015 (EST)

Seawater concentrations

Inventory        | Path                | Status
Acetone seawater | $ROOT/ACET/v2014-07 | CURRENTLY USED
DMS seawater     | $ROOT/DMS/v2014-07  | CURRENTLY USED

--Bob Y. 12:55, 12 February 2015 (EST)

Stratospheric data

Inventory              | Path                 | Status
Stratospheric Bry data | $ROOT/STRAT/v2015-01 | CURRENTLY USED

--Bob Y. 12:58, 12 February 2015 (EST)

Other inputs for HEMCO

Inventory                 | Path                        | Status
Annual scale factors      | $ROOT/AnnualScalar/v2014-07 | CURRENTLY USED
Mask files                | $ROOT/MASKS/v2014-07        | CURRENTLY USED
MAP_A2A regridding data   | $ROOT/MAP_A2A/v2014-07      | CURRENTLY USED
Timezone offsets from UTC | $ROOT/TIMEZONES/v2015-02    | Will debut in GEOS-Chem v10-01
Weekly scale factors      | $ROOT/WEEKSCALE/v2014-07    | CURRENTLY USED

--Bob Y. 12:59, 12 February 2015 (EST)

Downloading the HEMCO data directories

The GEOS-Chem Support Team has created a package called hemco_data_download. With this package, you can download the various emissions inventories and related data files for HEMCO to your own disk server. A configuration file lets you specify which data directories you would like to download and which you would like to skip.

Obtaining the hemco_data_download package

To obtain the hemco_data_download package, use Git to clone this repository:

git clone https://github.com/GCST/hemco_data_download.git

This will create a directory named hemco_data_download, in which you should see the following files:

README 
File with an overall description of the directory contents
hemcoDataDownload.pl 
Perl script to download HEMCO data directories
hemcoDataDownload.rc 
Configuration file for the hemcoDataDownload.pl script. In this file you can specify which HEMCO data directories you would like to download and which you would like to omit.
forTesting.rc 
A configuration file that you can use for testing or debugging. This will tell hemcoDataDownload.pl only to download a couple of emissions inventories with files that do not take up much disk space.

Setting up the configuration file

The configuration files (i.e. hemcoDataDownload.rc and forTesting.rc) are largely self-explanatory. At the top of each file you will see this section:

###############################################################################
#                                                                             #
#  Specify the remote and local HEMCO data paths, plus other options.         #
#                                                                             #
###############################################################################

Remote HEMCO data path | ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO
Your HEMCO data path   | /as/scratch/bmy/HEMCO
Verbose output         | yes
Dryrun only?           | no

  1. Remote HEMCO data path is the location on the FTP server from which the data will be downloaded. This can be either the Harvard server or the Dalhousie server (for now we will use the Harvard server). Edit this setting accordingly.
  2. Your HEMCO data path specifies the root-level directory for HEMCO data on your own disk space. If you are not sure where to place this, ask your sysadmin.
  3. Verbose output specifies whether to print extra output during the download process. This can be set to either "yes" or "no".
  4. Dryrun only? prints the data download commands without actually downloading the data. This is useful for debugging.
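
The "name | value" layout of this options section is simple to read programmatically. The sketch below is a minimal Python illustration, not the parser used by hemcoDataDownload.pl (whose internals are not shown here). The function name parse_options and the yes/no-to-boolean conversion are assumptions made for the example.

```python
def parse_options(text: str) -> dict:
    """Parse 'Name | value' lines into a dict, skipping comments and blanks."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines, comment lines, and anything without a separator.
        if not line or line.startswith("#") or "|" not in line:
            continue
        name, value = (part.strip() for part in line.split("|", 1))
        # Assumption: treat yes/no values as booleans; keep others as strings.
        if value.lower() in ("yes", "no"):
            options[name] = value.lower() == "yes"
        else:
            options[name] = value
    return options


sample = """
Remote HEMCO data path | ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO
Your HEMCO data path   | /as/scratch/bmy/HEMCO
Verbose output         | yes
Dryrun only?           | no
"""
opts = parse_options(sample)
for key, val in opts.items():
    print(f"{key!r}: {val!r}")
```

Splitting on the first "|" only keeps any later "|" characters inside the value intact.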

In the next section you specify all of the HEMCO inventories that you want to download. You will see this header:

###############################################################################
#                                                                             #
#  THE FOLLOWING DATA DIRECTORIES WILL BE DOWNLOADED.                         #
#                                                                             #
#  These data directories comprise the recommended emissions configuration    #
#  for typical GEOS-Chem full-chemistry and specialty simulations.            #
#                                                                             #
###############################################################################

#=============================+================================================
# AEROSOLS                    | Directory paths
#=============================+================================================
AEROCOM volcano emissions     | $ROOT/VOLCANO/v2014-10
Bond et al BC/OC              | $ROOT/BCOC_BOND/v2014-07
Cooke et al BC/OC             | $ROOT/BCOC_COOKE/v2014-07
Secondary organic aerosols    | $ROOT/SOA/v2014-07
... etc ...

Lines starting with the comment character # will be ignored. Each remaining line specifies the name of a HEMCO emissions inventory and the data path where it can be found on disk, relative to the root data path. NOTE: The script will replace the $ROOT token with the value you gave for "Remote HEMCO data path" above.
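
As an illustration of these rules, the following Python sketch reads inventory lines, skips comment lines, and expands $ROOT. It is not the actual logic of hemcoDataDownload.pl. The function name inventory_paths is hypothetical, and expanding $ROOT against the local data path as well as the remote one is an assumption made to show where each directory would end up.

```python
def inventory_paths(text, remote_root, local_root):
    """Return (name, remote, local) tuples for each uncommented inventory line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Lines starting with "#" are comments; lines without "|" carry no path.
        if not line or line.startswith("#") or "|" not in line:
            continue
        name, path = (part.strip() for part in line.split("|", 1))
        if not path.startswith("$ROOT"):
            continue
        remote = path.replace("$ROOT", remote_root.rstrip("/"), 1)
        # Assumption: the same relative path is used under the local root.
        local = path.replace("$ROOT", local_root.rstrip("/"), 1)
        entries.append((name, remote, local))
    return entries


sample = """
#=============================+================================================
# AEROSOLS                    | Directory paths
#=============================+================================================
AEROCOM volcano emissions     | $ROOT/VOLCANO/v2014-10
#Bond et al BC/OC             | $ROOT/BCOC_BOND/v2014-07
... etc ...
"""
entries = inventory_paths(
    sample,
    "ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO",
    "/as/scratch/bmy/HEMCO",
)
for name, remote, local in entries:
    print(f"{name}: {remote} -> {local}")
```

In the sample, the Bond et al entry is commented out, so only the AEROCOM entry is returned.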

Any inventory found in this section will be downloaded. To prevent an inventory from being downloaded, you can either comment it out (i.e. place a # in the first column) or move the inventory to the next section.

The final section specifies HEMCO emission inventories that you do not wish to download. The section looks like this:

###############################################################################
#                                                                             #
#  THE FOLLOWING DATA DIRECTORIES WILL NOT BE DOWNLOADED.                     #
#                                                                             #
#  These data directories contain optional emissions inventories that         #
#  are not used in typical GEOS-Chem simulations.  If you wish to download    #
#  any of these inventories, simply move the corresponding entry for each     #
#  inventory to the previous section.                                         #
#                                                                             #
###############################################################################

CH3I simulation (obsolete)    | $ROOT/CH3I/v2014-07
Chlorophyll A                 | $ROOT/CHLA/v2014-07
... etc ...
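
To check which inventories currently sit in each section of a configuration file, something like the Python sketch below could be used. It is an inspection aid only. Identifying sections by the banner text "WILL BE DOWNLOADED" and "WILL NOT BE DOWNLOADED" is an assumption made for this example and may not match how hemcoDataDownload.pl itself distinguishes them. The function name split_sections is hypothetical.

```python
def split_sections(text):
    """Group inventory names by the banner section they appear under."""
    sections = {"download": [], "skip": []}
    current = None  # No section until the first banner is seen.
    for line in text.splitlines():
        # Banner lines are comments, so check them before skipping comments.
        if "WILL NOT BE DOWNLOADED" in line:
            current = "skip"
        elif "WILL BE DOWNLOADED" in line:
            current = "download"
        stripped = line.strip()
        if (current is None or not stripped
                or stripped.startswith("#") or "|" not in stripped):
            continue
        sections[current].append(stripped.split("|", 1)[0].strip())
    return sections


sample = """
Verbose output         | yes
#  THE FOLLOWING DATA DIRECTORIES WILL BE DOWNLOADED.                         #
# AEROSOLS                    | Directory paths
AEROCOM volcano emissions     | $ROOT/VOLCANO/v2014-10
#  THE FOLLOWING DATA DIRECTORIES WILL NOT BE DOWNLOADED.                     #
CH3I simulation (obsolete)    | $ROOT/CH3I/v2014-07
Chlorophyll A                 | $ROOT/CHLA/v2014-07
"""
sections = split_sections(sample)
print("download:", sections["download"])
print("skip:", sections["skip"])
```

Option lines that appear before the first banner are ignored, since they do not belong to either section.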

--Bob Y. 13:55, 12 February 2015 (EST)

Downloading

To run the script you can type:

hemcoDataDownload.pl

which will read the default configuration file, hemcoDataDownload.rc. You can also specify a different configuration file as an argument to the script. For example, we have provided a configuration file named forTesting.rc that you can use to test whether the data is being downloaded to the right directory path. Typing:

hemcoDataDownload.pl forTesting.rc

will download only a couple of inventories that do not take up much disk space. This lets you confirm that the data transfer is successful without a long wait.