Downloading GEOS-Chem data directories

Previous | Next | Getting Started with GEOS-Chem

  1. Minimum system requirements
  2. Downloading source code
  3. Downloading data directories
  4. Creating run directories
  5. Configuring runs
  6. Compiling
  7. Running
  8. Output files
  9. Visualizing and processing output
  10. Coding and debugging
  11. Further reading


This page describes where you can obtain the data files that GEOS-Chem requires.

Overview

In addition to the files contained in the run directories, GEOS-Chem also needs to access data directories containing:

  • Meteorological data (a.k.a. the "met fields") used to drive GEOS-Chem
  • Emissions inventories used by GEOS-Chem
  • Scale factors used to scale emissions from a base year to a given year
  • Sample restart files that you can use to spin up your GEOS-Chem simulations
  • Oxidant (OH, O3) concentrations for both full-chemistry and offline simulations
  • IPCC future scenarios (for GCAP simulations)
  • Other GEOS-Chem-specific data files

These files are often too large to store in a single user's disk space. Therefore, they are meant to be stored in shared disk space where all GEOS-Chem users in your group can have access to them.

The GEOS-Chem shared data directories can be downloaded from the Compute Canada archive. Unlike the source code and run directories, the data directories can be downloaded either by anonymous FTP or with the freely available GNU wget utility. (We recommend wget because it is more flexible and can download several directories recursively.)

NOTE: If you are using GEOS-Chem on the Amazon Web Services cloud computing platform, then you can access the GEOS-Chem shared data directories as an S3 bucket (s3://gcgrid). Please see our cloud computing tutorial (cloud.geos-chem.org) for more details.

This page describes how you can download the GEOS-Chem shared data directories.

Shared data directory archives

The GEOS-Chem shared data directories may be downloaded from the following locations:

Name | Location | Notes
Compute Canada | http://geoschemdata.computecanada.ca | Recommended
Amazon Web Services S3 storage | s3://gcgrid | See the following notes:
  • This is an AWS S3 bucket containing a mirror of the Harvard University storage server. It will not contain the complete record of met fields, but additional data may be added by submitting a request to the GCST.
  • See our cloud computing tutorial (cloud.geos-chem.org) for more information.
  • NOTE: Downloading data from the cloud to your local server will incur an egress fee.

Compute Canada data directory archive

GEOS-Chem input data may be downloaded from Compute Canada at:

http://geoschemdata.computecanada.ca

We recommend that you use the wget utility to download these directories. Wget allows you to download multiple directories at once. For example:

wget -r -np -nH -R "*.html" "http://geoschemdata.computecanada.ca/DIRECTORY_NAME"

See the table below for the DIRECTORY_NAME options. The -R "*.html" option will reject HTML files.

Compute Canada directory structure

If you are downloading the GEOS-FP or MERRA-2 met data, then please note the following:

  1. You will need to download the "CN" (constant) data files for each horizontal grid that you are using.
    • For GEOS-FP these are timestamped for 2011/01/01 and are found in these data directories of ftp.as.harvard.edu:
      • ExtData/GEOS_0.25x0.3125/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.nc
      • ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.AS.nc
      • ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.CH.nc
      • ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.EU.nc
      • ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.NA.nc
      • ExtData/GEOS_2x2.5/GEOS_FP/2011/01/GEOSFP.20110101.CN.2x25.nc
      • ExtData/GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc
    • For MERRA-2 these are timestamped for 2015/01/01 and are found in these data directories of ftp.as.harvard.edu:
      • ExtData/GEOS_0.5x0.625/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.nc4
      • ExtData/GEOS_0.5x0.625_AS/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.AS.nc4
      • ExtData/GEOS_0.5x0.625_CH/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.CH.nc4
      • ExtData/GEOS_0.5x0.625_EU/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.EU.nc4
      • ExtData/GEOS_0.5x0.625_NA/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.NA.nc4
      • ExtData/GEOS_2x2.5/MERRA2/2015/01/MERRA2.20150101.CN.2x25.nc4
      • ExtData/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4
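The CN file names listed above follow a regular pattern, so the path for a given grid can be assembled in a script. The following is a minimal sketch rather than an official tool: the function name cn_path is hypothetical, and the pattern is inferred only from the file names listed above (the resolution token appears to be the grid name with the decimal points removed).

```shell
# cn_path MET GRID [REGION]
#   MET    : GEOS_FP or MERRA2
#   GRID   : grid token as used in the directory name, e.g. 4x5 or 0.25x0.3125
#   REGION : optional nested-domain suffix, e.g. AS, CH, EU, NA
# Prints the relative path of the constant ("CN") file, following the
# naming pattern of the files listed on this page. Hypothetical helper.
cn_path() {
    met=$1
    grid=$2
    region=${3:-}

    case $met in
        GEOS_FP) prefix=GEOSFP; date=20110101; ext=nc  ;;
        MERRA2)  prefix=MERRA2; date=20150101; ext=nc4 ;;
        *) echo "unknown met product: $met" >&2; return 1 ;;
    esac

    # Resolution token: the grid name without decimal points (2x2.5 -> 2x25)
    res=$(printf '%s' "$grid" | tr -d '.')

    # Year and month directories are taken from the timestamp
    yyyy=$(printf '%s' "$date" | cut -c1-4)
    mm=$(printf '%s' "$date" | cut -c5-6)

    if [ -n "$region" ]; then
        printf 'ExtData/GEOS_%s_%s/%s/%s/%s/%s.%s.CN.%s.%s.%s\n' \
            "$grid" "$region" "$met" "$yyyy" "$mm" \
            "$prefix" "$date" "$res" "$region" "$ext"
    else
        printf 'ExtData/GEOS_%s/%s/%s/%s/%s.%s.CN.%s.%s\n' \
            "$grid" "$met" "$yyyy" "$mm" "$prefix" "$date" "$res" "$ext"
    fi
}

# Examples: the global 4x5 GEOS-FP file and the nested North America MERRA-2 file
cn_path GEOS_FP 4x5
cn_path MERRA2 0.5x0.625 NA
```

The printed paths can be compared against the list above before they are used in a download command.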


Additional notes:

  • Prior to downloading GEOS-FP data, please be aware of caveats regarding use of GEOS-FP. See the GEOS-FP wiki page for more information.


Directory | Description
ExtData/ | Root data directory containing all meteorology fields, emissions data, and chemistry input data
ExtData/CHEM_INPUTS/ | Contains non-emissions data for GEOS-Chem chemistry modules
ExtData/HEMCO/ | Contains emissions data for the HEMCO emissions component
ExtData/GEOSCHEM_RESTARTS/ | Contains sample restart files used to initialize GEOS-Chem simulations

0.25° x 0.3125° Data Directories | Description
ExtData/GEOS_0.25x0.3125/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP global met fields
ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the Asia domain
ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the China domain
ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the Europe domain
ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the North America domain

0.5° x 0.625° Data Directories | Description
ExtData/GEOS_0.5x0.625/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 global met fields
ExtData/GEOS_0.5x0.625_AS/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the Asia domain
ExtData/GEOS_0.5x0.625_CH/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the China domain
ExtData/GEOS_0.5x0.625_EU/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the Europe domain
ExtData/GEOS_0.5x0.625_NA/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the North America domain

2° x 2.5° Data Directories | Description
ExtData/GEOS_2x2.5/GEOS_FP/YYYY/MM/ | 2° x 2.5° GEOS-FP global met fields
ExtData/GEOS_2x2.5/MERRA2/YYYY/MM/ | 2° x 2.5° MERRA-2 global met fields

4° x 5° Data Directories | Description
ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/ | 4° x 5° GEOS-FP global met fields
ExtData/GEOS_4x5/MERRA2/YYYY/MM/ | 4° x 5° MERRA-2 global met fields
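
Because the met field directories are organized by year (YYYY) and month (MM), a download for a given period can be scripted as a loop. The sketch below only prints the wget commands (a dry run) so that they can be inspected before anything is transferred. The function name is hypothetical, and the year 2016 is only an example, not a statement about which years are available in the archive.

```shell
# Base URL of the Compute Canada archive (as given on this page)
BASE_URL="http://geoschemdata.computecanada.ca"

# print_met_downloads GRID MET YYYY
# Prints one wget command per month for the given grid directory
# (e.g. GEOS_4x5), met product (GEOS_FP or MERRA2) and year.
# Hypothetical helper: it prints the commands instead of running them.
print_met_downloads() {
    grid=$1
    met=$2
    year=$3
    month=1
    while [ "$month" -le 12 ]; do
        mm=$(printf '%02d' "$month")   # zero-pad the month: 1 -> 01
        printf 'wget -r -np -nH -R "*.html" "%s/ExtData/%s/%s/%s/%s/"\n' \
            "$BASE_URL" "$grid" "$met" "$year" "$mm"
        month=$((month + 1))
    done
}

# Example: list the commands for one (assumed) year of 4x5 MERRA-2 met fields
print_met_downloads GEOS_4x5 MERRA2 2016
```

Once the printed commands look right, the output can be piped to sh to start the transfers.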

Amazon Web Services S3 storage

The Harvard data directory archive is mirrored as an Amazon Web Services S3 bucket (s3://gcgrid). You can get a directory listing with this command:

aws s3 ls --request-payer=requester s3://gcgrid/...

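and you can copy a directory to your Elastic Block Store (EBS) volume with this command, where LOCAL-DIRECTORY is the destination path:

aws s3 cp --request-payer=requester --recursive s3://gcgrid/DIRECTORY-NAME LOCAL-DIRECTORY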
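
As a hedged illustration, the copy command can be wrapped in a small function that prints the aws invocation for one directory of the bucket, so it can be reviewed before any data is transferred. DIRECTORY-NAME remains a placeholder supplied by the caller, the function name and the local path are hypothetical, and --recursive is the aws s3 cp flag for copying a whole directory tree.

```shell
# S3 bucket holding the GEOS-Chem shared data (as given on this page)
BUCKET="s3://gcgrid"

# print_s3_copy DIRECTORY_NAME LOCAL_DIR
# Prints the command that would copy one directory of the bucket to
# LOCAL_DIR. Hypothetical helper; nothing is transferred here, which
# matters because downloading from the cloud can incur egress fees.
print_s3_copy() {
    printf 'aws s3 cp --request-payer=requester --recursive "%s/%s" "%s"\n' \
        "$BUCKET" "$1" "$2"
}

# Example with placeholder names: replace both arguments before use
print_s3_copy DIRECTORY-NAME ./DIRECTORY-NAME
```

The aws s3 cp command also accepts a --dryrun flag, which lists the operations that would be performed without running them.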

For more information, please see:

  1. Our GEOS-Chem cloud computing tutorial (cloud.geos-chem.org)
  2. aws s3 ls command reference
  3. aws s3 cp command reference
  4. More AWS S3 examples

--Bob Yantosca (talk) 14:31, 3 October 2019 (UTC)

1-month and 1-year benchmark output

You can find output files and evaluation plots for 1-month and 1-year benchmark simulations in the following locations:

Directory | Description
http://ftp.as.harvard.edu/gcgrid/geos-chem/1mo_benchmarks/ | Contains the following data from the 1-month benchmarks used to evaluate GEOS-Chem:
  • Restart files
  • Model output
  • Log files
  • Input files
  • Evaluation plots
http://ftp.as.harvard.edu/gcgrid/geos-chem/1yr_benchmarks/ | Contains the following data from the 1-year benchmarks used to evaluate GEOS-Chem:
  • Restart files
  • Model output
  • Log files
  • Input files
  • Evaluation plots

Using wget to download the shared data directories

A simple way to download the GEOS-Chem emissions and met field data is to use the Unix wget utility. This allows you to download multiple files and directories at a time.

The wget utility is free and open-source (published by GNU) and is included with most Unix-like systems, including common Linux distributions such as Ubuntu, Fedora, and CentOS. See the wget manual for more information.

Syntax

Most of the time, the syntax you will use to download multiple directories is as follows:

wget -r -np -nH -R "*.html" "http://geoschemdata.computecanada.ca/DIRECTORY_NAME/"

The options to wget are as follows:

-r           Specifies recursive directory transfer (i.e. will download all subdirectories)
-np          Will not allow ascent to the parent directory
-nH          Will store all directories and subdirectories in DIRECTORY_NAME, not geoschemdata.computecanada.ca/DIRECTORY_NAME
-R "*.html"  Will reject any file ending in .html

If you wish to trim the name of the downloaded directory (i.e., so it downloads as DIRECTORY_NAME, not pub/geos-chem/data/DIRECTORY_NAME), then use the --cut-dirs option:

wget -r -np -nH -R "*.html" --cut-dirs=X "http://geoschemdata.computecanada.ca/DIRECTORY_NAME/"

where X is the number of leading directory levels to remove from the local path.
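
To see what --cut-dirs does to the local path, the following sketch mimics the renaming in plain shell: with -nH the host name is dropped, and --cut-dirs=X then removes the first X directory levels. The function name is hypothetical, the logic is an approximation of wget's behavior rather than its implementation, and it assumes X is smaller than the number of directory levels in the path.

```shell
# local_path URL_PATH X
# Approximates where "wget -nH --cut-dirs=X" stores a file whose path on
# the server (without the host name) is URL_PATH. Hypothetical helper.
local_path() {
    path=$1
    n=$2
    # Drop the first n slash-separated components and keep the rest
    printf '%s\n' "$path" | cut -d/ -f"$((n + 1))"-
}

local_path ExtData/HEMCO/some_file.nc 0   # prints ExtData/HEMCO/some_file.nc
local_path ExtData/HEMCO/some_file.nc 1   # prints HEMCO/some_file.nc
```

The file name some_file.nc is made up for the example; only the directory names come from the table on this page.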

NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.

Obtain only updated files

Prasad Kasibhatla wrote:

Maybe this is common knowledge, but I just discovered that using the -N option in wget ensures that only files with newer timestamps than what resides on my local machines are downloaded - found this very useful to update my shared data directories.
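
Applied to the syntax described on this page, the -N option would be added to the usual command roughly as sketched below. The function name is hypothetical and the sketch only prints the command; the directory passed in the example is taken from the directory table on this page.

```shell
# Base URL of the Compute Canada archive (as given on this page)
BASE_URL="http://geoschemdata.computecanada.ca"

# print_update_cmd DIRECTORY_NAME
# Prints a wget command that downloads only files whose remote timestamp
# is newer than the local copy (-N is the short form of --timestamping).
# Hypothetical helper: it prints the command instead of running it.
print_update_cmd() {
    printf 'wget -r -np -nH -N -R "*.html" "%s/%s/"\n' "$BASE_URL" "$1"
}

# Example: refresh the chemistry input directory
print_update_cmd ExtData/CHEM_INPUTS
```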

Downloading the HEMCO data directories

The GEOS-Chem Support Team has created a package called hemco_data_download. With this package, you can download the various emissions inventories and related data files for HEMCO to your own disk server. For complete instructions, please see our HEMCO data directories wiki page.

--Melissa Sulprizio 16:52, 8 April 2015 (EDT)

For more information

For GCAP users

For those users who wish to run GEOS-Chem with the GISS/GCAP met fields, please contact Loretta Mickley.

--Bob Y. 09:52, 8 March 2010 (EST)


