Downloading GEOS-Chem data directories

Getting Started with GEOS-Chem

  1. Minimum system requirements
  2. Configuring your computational environment
  3. Downloading source code
  4. Downloading data directories
  5. Creating run directories
  6. Configuring runs
  7. Compiling
  8. Running
  9. Output files
  10. Visualizing and processing output
  11. Coding and debugging
  12. Further reading


This page describes where you can obtain the data files that GEOS-Chem requires.

Overview

What are the GEOS-Chem shared data directories?

In addition to the configuration files that ship with GEOS-Chem run directories, GEOS-Chem also needs to access data directories containing:

  • Meteorological data (a.k.a. the "met fields") used to drive GEOS-Chem
  • Emissions inventories used by GEOS-Chem
  • Scale factors used to scale emissions from a base year to a given year
  • Sample restart files that you can use to spin up your GEOS-Chem simulations
  • Oxidant (OH, O3) concentrations for both full-chemistry and offline simulations
  • Other GEOS-Chem specific data files

These files are often too large to store in a single user's disk space. Therefore, they are meant to be stored in shared disk space where all GEOS-Chem users in your group can have access to them.

Do I really need to download ALL of this data?

Maybe not! If you are located at an institution that has multiple GEOS-Chem users, then your computer system might already have a copy of the GEOS-Chem shared data directories. In that case you will not have to download any data, unless your system is missing something you need (for example, you need met field data for 2020 but your system only has data up to 2019). If you are unsure whether the shared data directories are available to you, ask your sysadmin or IT staff.

Also, starting with GEOS-Chem 12.7.0, you can use a GEOS-Chem dry-run to download only the data files you need for a specific GEOS-Chem simulation. This can drastically reduce the number of data files that you need to download.
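
The idea of downloading only what a simulation needs can be sketched in a few lines of Python. This is an illustration only, not the dry-run implementation: it assumes you already have a list of the file paths that a simulation will read, and it simply reports which of them are absent from local disk. The function name find_missing_files is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(required_paths: Iterable[str]) -> List[str]:
    """Return the required paths that do not exist on local disk.

    Hypothetical helper: `required_paths` stands in for whatever list of
    files a simulation reports that it needs. Input order is preserved
    and duplicate entries are reported only once.
    """
    missing = []
    seen = set()
    for path in required_paths:
        if path in seen:
            continue  # skip files that were already checked
        seen.add(path)
        if not Path(path).is_file():
            missing.append(path)
    return missing
```

Only the paths returned by such a check would need to be fetched from a data archive; files that are already on disk are left alone.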

What if I am running GEOS-Chem on the AWS cloud?

A copy of the GEOS-Chem data directories is synced from the Harvard University FTP site (ftp.as.harvard.edu) to the Amazon Web Services s3://gcgrid bucket. You can download the data files you need from s3://gcgrid to the Elastic Block Store (EBS) volume that is attached to your cloud instance. This is described in our cloud-computing tutorial (cloud.geos-chem.org).

To simplify matters even further, we recommend that you use a GEOS-Chem dry-run to download data from s3://gcgrid to your EBS volume.
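
As a rough sketch of what copying a single file from s3://gcgrid to an EBS volume involves, the Python helper below turns a local file path into an aws s3 cp command. It assumes that the AWS CLI is installed and that the bucket mirrors the directory layout under your local data root. The function name s3_copy_command and the example paths are hypothetical, so check the cloud-computing tutorial for the actual layout.

```python
from pathlib import PurePosixPath
from typing import List

BUCKET = "s3://gcgrid"  # bucket named in the text above


def s3_copy_command(local_path: str, data_root: str) -> List[str]:
    """Build an `aws s3 cp` command that fetches `local_path` from the bucket.

    Assumption (not confirmed by this page): a file stored locally at
    <data_root>/<rel> is stored in the bucket at s3://gcgrid/<rel>.
    Raises ValueError if `local_path` is not located under `data_root`.
    """
    rel = PurePosixPath(local_path).relative_to(data_root)
    return ["aws", "s3", "cp", f"{BUCKET}/{rel}", local_path]


# Hypothetical example paths, for illustration only.
cmd = s3_copy_command(
    "/home/ubuntu/ExtData/HEMCO/example/file.nc",
    "/home/ubuntu/ExtData",
)
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) on the cloud instance to perform the copy.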

Do I still need to use the hemco_data_download package?

The hemco_data_download package has long been the method of choice for downloading HEMCO emissions data. But hemco_data_download typically downloads the entire contents of the HEMCO data directory folders, which can give you more data than you actually need to run a given GEOS-Chem simulation.

For this reason, we have replaced the hemco_data_download package with the GEOS-Chem dry-run capability, starting in GEOS-Chem 12.7.0. With the dry-run option, you can download only those data files that your GEOS-Chem simulation needs.

I am located in China and data download speeds are slow. What can I do?

At present we are working on a better solution for our Chinese GEOS-Chem users. This will probably involve a point person located in China who can oversee and/or centralize data download activities. Stay tuned for more information.

--Bob Yantosca (talk) 18:17, 6 January 2020 (UTC)

Shared data directory archives

The GEOS-Chem shared data directories may be downloaded from the following locations:

Archive: Compute Canada
  Location: http://geoschemdata.computecanada.ca
  Description: This is the main GEOS-Chem data archive.
    • Use this archive to download data to your local computer system.
  How to download:
    1. GEOS-Chem dry-run
      • Our preferred method
      • Available in 12.7.0 or later
    2. Manual download

Archive: Amazon Web Services S3 storage
  Location: s3://gcgrid
  Description: This is an AWS S3 bucket containing a mirror of the Harvard University storage server. It will not contain the complete record of met fields, but additional data may be added by submitting a request to the GCST. See our cloud computing tutorial (cloud.geos-chem.org) for more information.
    • Use this archive to download data to your AWS cloud instance.
    • NOTE: Downloading data from this archive to your local computer system will incur an egress fee. Use with caution!
  How to download:
    1. GEOS-Chem dry-run
      • Our preferred method
      • Available in 12.7.0 or later
    2. Manual download

--Bob Yantosca (talk) 18:09, 13 December 2019 (UTC)
