Downloading GEOS-Chem data directories
This page describes where you can obtain the GEOS-Chem source code and required data files.
- 1 Overview
- 2 Shared data directory archives
- 3 1-month and 1-year benchmark output
- 4 Using wget to download the shared data directories
- 5 Downloading the HEMCO data directories
- 6 For more information
Overview
In addition to the files contained in the run directories, GEOS-Chem also needs to access data directories containing:
- Meteorological data (a.k.a. the "met fields") used to drive GEOS-Chem
- Emissions inventories used by GEOS-Chem
- Scale factors used to scale emissions from a base year to a given year
- Sample restart files that you can use to spin up your GEOS-Chem simulations
- Oxidant (OH, O3) concentrations for both full-chemistry and offline simulations
- IPCC future scenarios (for GCAP simulations)
- Other GEOS-Chem-specific data files.
These files are often too large to store in a single user's disk space. Therefore, they are meant to be stored in shared disk space where all GEOS-Chem users in your group can have access to them.
The GEOS-Chem shared data directories can be downloaded from the Compute Canada archive. Unlike the source code and run directories, the data directories can be downloaded either by anonymous FTP or with the freely available GNU wget utility. (We recommend wget because it is more flexible and can download several directories recursively.)
NOTE: If you are using GEOS-Chem on the Amazon Web Services cloud computing platform, then you can access the GEOS-Chem shared data directories as an S3 bucket (s3://gcgrid). Please see our cloud computing tutorial (cloud.geos-chem.org) for more details.
This page describes how you can download the GEOS-Chem shared data directories. Please also be sure to view the following pages for additional information:
Shared data directory archives
The GEOS-Chem shared data directories may be downloaded from the following locations:
|Amazon Web Services S3 storage||s3://gcgrid||
Compute Canada data directory archive
GEOS-Chem input data may be downloaded from Compute Canada at: http://geoschemdata.computecanada.ca/
We recommend that you use the wget utility to download these directories. Wget allows you to download multiple directories at once. For example:
wget -r -np -nH -R "*.html" "http://geoschemdata.computecanada.ca/DIRECTORY_NAME"
See the table below for the DIRECTORY_NAME options. The -R "*.html" option will reject HTML files.
Compute Canada directory structure
|ExtData/||Root data directory containing all meteorology fields, emissions data, and chemistry input data.|
|ExtData/CHEM_INPUTS/||Contains non-emissions data for GEOS-Chem chemistry modules|
|ExtData/HEMCO/||Contains emissions data for the HEMCO emissions component|
|ExtData/GEOSCHEM_RESTARTS/||Contains sample restart files used to initialize GEOS-Chem simulations.|
|0.25° x 0.3125° Data Directories||Description|
|ExtData/GEOS_0.25x0.3125/GEOS_FP/YYYY/MM/||0.25° x 0.3125° GEOS-FP global met fields|
|ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/YYYY/MM/||0.25° x 0.3125° GEOS-FP met fields cropped to the Asia domain|
|ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/YYYY/MM/||0.25° x 0.3125° GEOS-FP met fields cropped to the China domain|
|ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/YYYY/MM/||0.25° x 0.3125° GEOS-FP met fields cropped to the Europe domain|
|ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/YYYY/MM/||0.25° x 0.3125° GEOS-FP met fields cropped to the North America domain|
|0.5° x 0.625° Data Directories||Description|
|ExtData/GEOS_0.5x0.625/MERRA2/YYYY/MM/||0.5° x 0.625° MERRA-2 global met fields|
|ExtData/GEOS_0.5x0.625_AS/MERRA2/YYYY/MM/||0.5° x 0.625° MERRA-2 met fields cropped to the Asia domain|
|ExtData/GEOS_0.5x0.625_CH/MERRA2/YYYY/MM/||0.5° x 0.625° MERRA-2 met fields cropped to the China domain|
|ExtData/GEOS_0.5x0.625_EU/MERRA2/YYYY/MM/||0.5° x 0.625° MERRA-2 met fields cropped to the Europe domain|
|ExtData/GEOS_0.5x0.625_NA/MERRA2/YYYY/MM/||0.5° x 0.625° MERRA-2 met fields cropped to the North America domain|
|2° x 2.5° Data Directories||Description|
|ExtData/GEOS_2x2.5/GEOS_FP/YYYY/MM/||2° x 2.5° GEOS-FP global met fields|
|ExtData/GEOS_2x2.5/MERRA2/YYYY/MM/||2° x 2.5° MERRA-2 global met fields|
|4° x 5° Data Directories||Description|
|ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/||4° x 5° GEOS-FP global met fields|
|ExtData/GEOS_4x5/MERRA2/YYYY/MM/||4° x 5° MERRA-2 global met fields|
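To illustrate how a DIRECTORY_NAME from the tables above fits into the wget command, here is a minimal sketch. The helper names met_dir and wget_cmd are hypothetical, and the grid, met product, year, and month are illustrative only (they do not imply that data exist for that date). The script prints the command instead of running it, so you can check the URL first.

```shell
# Base URL of the Compute Canada archive, as given on this page.
BASE_URL="http://geoschemdata.computecanada.ca"

# met_dir GRID MET YEAR MONTH  (hypothetical helper)
# Prints a met-field directory name, e.g. ExtData/GEOS_4x5/GEOS_FP/2019/07/
met_dir() {
    # ${4#0} drops one leading zero so that "08" is not parsed as octal;
    # %02d then pads the month back to two digits.
    printf 'ExtData/GEOS_%s/%s/%s/%02d/\n' "$1" "$2" "$3" "${4#0}"
}

# wget_cmd DIRECTORY_NAME  (hypothetical helper)
# Prints the wget command for that directory without executing it.
wget_cmd() {
    printf 'wget -r -np -nH -R "*.html" "%s/%s"\n' "$BASE_URL" "$1"
}

# Example: one month of 4x5 GEOS-FP met fields (illustrative date).
wget_cmd "$(met_dir 4x5 GEOS_FP 2019 7)"
```

For a cropped domain, pass the grid name with its suffix, for example met_dir 0.5x0.625_NA MERRA2 2019 7.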
Amazon Web Services S3 storage
The Harvard data directory archive is mirrored as an Amazon Web Services S3 directory. You can get a directory listing with this command:
aws s3 ls --request-payer=requester s3://gcgrid/...
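and you can copy a directory to your Elastic Block Store (EBS) volume with this command:
aws s3 cp --request-payer=requester --recursive s3://gcgrid/DIRECTORY_NAME LOCAL_DIRECTORY
where LOCAL_DIRECTORY is the destination path on your volume.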
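Because the bucket is accessed with requester-pays, it can help to preview a copy before transferring anything. The sketch below assembles an aws s3 cp command with the --recursive and --dryrun options, which are standard AWS CLI flags. The helper name s3_copy_cmd and both path arguments are placeholders, not names taken from the bucket.

```shell
# s3_copy_cmd DIRECTORY_NAME LOCAL_DIRECTORY  (hypothetical helper)
# Prints an "aws s3 cp" command that would copy a whole directory from the
# gcgrid bucket. --dryrun makes the AWS CLI list the operations it would
# perform without transferring any data.
s3_copy_cmd() {
    printf 'aws s3 cp --request-payer=requester --recursive --dryrun "s3://gcgrid/%s" "%s"\n' "$1" "$2"
}

# Placeholder arguments: substitute a real directory name and a path
# on your own volume.
s3_copy_cmd "DIRECTORY_NAME" "./DIRECTORY_NAME"
```

Once the preview looks right, run the printed command without --dryrun to perform the actual copy.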
For more information, please see:
- Our GEOS-Chem cloud computing tutorial cloud.geos-chem.org
- aws s3 ls command reference
- aws s3 cp command reference
- More AWS S3 examples
1-month and 1-year benchmark output
You can find output files and evaluation plots for 1-month and 1-year benchmark simulations in the following locations:
|http://ftp.as.harvard.edu/gcgrid/geos-chem/1mo_benchmarks/||Output files and evaluation plots from the 1-month benchmarks used to evaluate GEOS-Chem|
|http://ftp.as.harvard.edu/gcgrid/geos-chem/1yr_benchmarks/||Output files and evaluation plots from the 1-year benchmarks used to evaluate GEOS-Chem|
Using wget to download the shared data directories
A simple way to download the GEOS-Chem emissions and met field data is to use the Unix wget utility. This allows you to download multiple files and directories at a time.
The wget utility is free and open-source software published by the GNU Project, and it is included with most Unix-like systems (including Linux distributions such as Ubuntu, Fedora, and CentOS). See the wget manual for more information.
Most of the time, the syntax you will use to download multiple directories is as follows:
wget -r -np -nH -R "*.html" "http://geoschemdata.computecanada.ca/DIRECTORY_NAME/"
The options to wget are as follows:
-r            Specifies recursive directory transfer (i.e. will download all subdirectories)
-np           Will not allow ascent to the parent directory
-nH           Will store all directories and subdirectories in DIRECTORY_NAME, not geoschemdata.computecanada.ca/DIRECTORY_NAME
-R "*.html"   Will reject any file ending in .html
If you wish to trim the name of the downloaded directory (i.e., so it downloads as DIRECTORY_NAME, not pub/geos-chem/data/DIRECTORY_NAME), then use the --cut-dirs option:
wget -r -np -nH -R "*.html" --cut-dirs=X "http://geoschemdata.computecanada.ca/DIRECTORY_NAME/"
where X is the number of leading directory levels to trim from the path under which the files are saved.
NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.
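As a rough aid for choosing X, the sketch below counts how many leading directory levels must be cut so that only the last directory of a remote path is kept locally (when combined with -nH). The function name cut_dirs_for is hypothetical, and the paths are taken from the directory tables on this page.

```shell
# cut_dirs_for REMOTE_PATH  (hypothetical helper)
# Prints the --cut-dirs value that keeps only the last directory of
# REMOTE_PATH in the local path.
cut_dirs_for() {
    path="${1#/}"      # drop one leading slash, if present
    path="${path%/}"   # drop one trailing slash, if present
    # Each remaining "/" separates a parent directory from its child,
    # so the number of slashes equals the number of parents to cut.
    echo $(( $(printf '%s' "$path" | tr -cd '/' | wc -c) + 0 ))
}

cut_dirs_for "ExtData/HEMCO/"                      # prints 1
cut_dirs_for "ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/"   # prints 4
```

With the first result, --cut-dirs=1 applied to "http://geoschemdata.computecanada.ca/ExtData/HEMCO/" would save the files under HEMCO/ rather than ExtData/HEMCO/.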
Obtain only updated files
Prasad Kasibhatla wrote:
- Maybe this is common knowledge, but I just discovered that using the -N option in wget ensures that only files with newer timestamps than what resides on my local machines are downloaded - found this very useful to update my shared data directories.
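The -N tip can be combined with the recursive options described earlier to refresh several shared directories in one pass. The following is a sketch only: the function name update_dir and the DRY_RUN switch are hypothetical, and the two directories are examples from the Compute Canada table. By default it prints the commands and downloads nothing.

```shell
# Base URL of the Compute Canada archive, as given on this page.
BASE_URL="http://geoschemdata.computecanada.ca"

# Hypothetical switch: leave at 1 to print commands, set to 0 to download.
DRY_RUN="${DRY_RUN:-1}"

# update_dir DIRECTORY_NAME  (hypothetical helper)
# Uses -N so that wget fetches only files newer than the local copies.
update_dir() {
    url="$BASE_URL/$1"
    if [ "$DRY_RUN" = "1" ]; then
        printf 'wget -r -np -nH -N -R "*.html" "%s"\n' "$url"
    else
        wget -r -np -nH -N -R "*.html" "$url"
    fi
}

# Example directories; replace with the ones your group keeps locally.
for dir in ExtData/CHEM_INPUTS/ ExtData/HEMCO/; do
    update_dir "$dir"
done
```

Run it from the top of your shared data directory so that the local files line up with the remote paths and the timestamp comparison can take effect.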
Downloading the HEMCO data directories
The GEOS-Chem Support Team has created a package called hemco_data_download. With this package, you can download the various emissions inventories and related data files for HEMCO to your own disk server. For complete instructions, please see our HEMCO data directories wiki page.
--Melissa Sulprizio 16:52, 8 April 2015 (EDT)
For more information
For GCAP users
--Bob Y. 09:52, 8 March 2010 (EST)