Downloading data from Amazon Web Services cloud storage


Previous | Next | Getting Started with GEOS-Chem

  1. Minimum system requirements
  2. Configuring your computational environment
  3. Downloading source code
  4. Downloading data directories
  5. Creating run directories
  6. Configuring runs
  7. Compiling
  8. Running
  9. Output files
  10. Visualizing and processing output
  11. Coding and debugging
  12. Further reading


This page describes how to manually download GEOS-Chem input data (met fields, emissions, etc.) from the Amazon Web Services s3://gcgrid bucket. We recommend instead downloading data with the GEOS-Chem dry-run option (which will be available in GEOS-Chem 12.7.0), as this greatly simplifies the data download process.

NOTE: If you have already used the GEOS-Chem dry-run option to download data, you can skip ahead to Creating Run Directories.

Amazon Web Services S3 directory structure

The Harvard data directory archive is mirrored as an Amazon Web Services S3 bucket:

s3://gcgrid

which has the following directory structure:

Directory                               Description
CHEM_INPUTS/                            Contains non-emissions data for GEOS-Chem chemistry modules
HEMCO/                                  Contains emissions data for the HEMCO emissions component
GEOSCHEM_RESTARTS/                      Contains sample restart files used to initialize GEOS-Chem simulations

0.25° x 0.3125° Data Directories        Description
GEOS_0.25x0.3125/GEOS_FP/YYYY/MM/       0.25° x 0.3125° GEOS-FP global met fields
GEOS_0.25x0.3125_AS/GEOS_FP/YYYY/MM/    0.25° x 0.3125° GEOS-FP met fields cropped to the Asia domain
GEOS_0.25x0.3125_CH/GEOS_FP/YYYY/MM/    0.25° x 0.3125° GEOS-FP met fields cropped to the China domain
GEOS_0.25x0.3125_EU/GEOS_FP/YYYY/MM/    0.25° x 0.3125° GEOS-FP met fields cropped to the Europe domain
GEOS_0.25x0.3125_NA/GEOS_FP/YYYY/MM/    0.25° x 0.3125° GEOS-FP met fields cropped to the North America domain

0.5° x 0.625° Data Directories          Description
GEOS_0.5x0.625/MERRA2/YYYY/MM/          0.5° x 0.625° MERRA-2 global met fields
GEOS_0.5x0.625_AS/MERRA2/YYYY/MM/       0.5° x 0.625° MERRA-2 met fields cropped to the Asia domain
GEOS_0.5x0.625_CH/MERRA2/YYYY/MM/       0.5° x 0.625° MERRA-2 met fields cropped to the China domain
GEOS_0.5x0.625_EU/MERRA2/YYYY/MM/       0.5° x 0.625° MERRA-2 met fields cropped to the Europe domain
GEOS_0.5x0.625_NA/MERRA2/YYYY/MM/       0.5° x 0.625° MERRA-2 met fields cropped to the North America domain

2° x 2.5° Data Directories              Description
GEOS_2x2.5/GEOS_FP/YYYY/MM/             2° x 2.5° GEOS-FP global met fields
GEOS_2x2.5/MERRA2/YYYY/MM/              2° x 2.5° MERRA-2 global met fields

4° x 5° Data Directories                Description
GEOS_4x5/GEOS_FP/YYYY/MM/               4° x 5° GEOS-FP global met fields
GEOS_4x5/MERRA2/YYYY/MM/                4° x 5° MERRA-2 global met fields
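The met field directory names listed above follow a regular pattern, so they can be assembled from the grid, the met product, the date, and an optional nested-grid suffix. The following is a minimal sketch only: met_dir is a hypothetical helper name (not part of GEOS-Chem or the AWS CLI), and it does not check that the requested combination actually exists in the bucket (for example, the 0.25° x 0.3125° grid is listed only for GEOS-FP).

```shell
#!/bin/sh
# met_dir (hypothetical helper): build a met field directory name
# relative to s3://gcgrid, following the pattern in the listing above.
# Usage: met_dir RES MET YYYY MM [REGION]
#   RES    horizontal grid: 4x5, 2x2.5, 0.5x0.625, or 0.25x0.3125
#   MET    met product directory: GEOS_FP or MERRA2
#   REGION optional nested-grid suffix: AS, CH, EU, or NA
met_dir() {
    res=$1 met=$2 yyyy=$3 mm=$4 region=${5:-}
    if [ -n "$region" ]; then
        printf 'GEOS_%s_%s/%s/%s/%s/\n' "$res" "$region" "$met" "$yyyy" "$mm"
    else
        printf 'GEOS_%s/%s/%s/%s/\n' "$res" "$met" "$yyyy" "$mm"
    fi
}

met_dir 4x5 GEOS_FP 2016 01          # prints GEOS_4x5/GEOS_FP/2016/01/
met_dir 0.5x0.625 MERRA2 2016 07 NA  # prints GEOS_0.5x0.625_NA/MERRA2/2016/07/
```

The printed name can be appended to s3://gcgrid/ when listing or copying data.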

NOTE: Unlike the Compute Canada archive, the s3://gcgrid bucket does not contain an ExtData root folder.

--Bob Yantosca (talk) 21:34, 12 December 2019 (UTC)

GEOS-FP and MERRA-2 constant data files

If you are downloading the GEOS-FP or MERRA-2 met data, then please note that you must also download the "CN" (constant) data files for each horizontal grid that you are using.

For GEOS-FP these are timestamped for 2011/01/01 and are found in these data directories:

  • GEOS_0.25x0.3125/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.nc
  • GEOS_0.25x0.3125_AS/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.AS.nc
  • GEOS_0.25x0.3125_CH/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.CH.nc
  • GEOS_0.25x0.3125_EU/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.EU.nc
  • GEOS_0.25x0.3125_NA/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.NA.nc
  • GEOS_2x2.5/GEOS_FP/2011/01/GEOSFP.20110101.CN.2x25.nc
  • GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc

For MERRA-2 these are timestamped for 2015/01/01 and are found in these data directories:

  • GEOS_0.5x0.625/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.nc4
  • GEOS_0.5x0.625_AS/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.AS.nc4
  • GEOS_0.5x0.625_CH/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.CH.nc4
  • GEOS_0.5x0.625_EU/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.EU.nc4
  • GEOS_0.5x0.625_NA/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.NA.nc4
  • GEOS_2x2.5/MERRA2/2015/01/MERRA2.20150101.CN.2x25.nc4
  • GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4
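The constant file names in the two lists above follow a common pattern: the grid label with its dots removed, an optional nested-grid suffix, and a product-specific date and file extension. As a rough sketch under that assumption, the path of a CN file can be built as shown below. Note that cn_key is a hypothetical helper name, and the pattern is inferred only from the files listed on this page.

```shell
#!/bin/sh
# cn_key (hypothetical helper): build the path, relative to s3://gcgrid,
# of the "CN" (constant) file for a met product and horizontal grid.
# Usage: cn_key MET RES [REGION]
cn_key() {
    met=$1 res=$2 region=${3:-}
    case $met in
        GEOS_FP) prefix=GEOSFP yyyy=2011 mm=01 ext=nc  ;;
        MERRA2)  prefix=MERRA2 yyyy=2015 mm=01 ext=nc4 ;;
        *) echo "unknown met product: $met" >&2; return 1 ;;
    esac
    # File names use the grid label with the dots removed (0.5x0.625 -> 05x0625).
    tag=$(printf '%s' "$res" | tr -d '.')
    dir_suffix='' file_suffix=''
    if [ -n "$region" ]; then
        dir_suffix="_$region"
        file_suffix=".$region"
    fi
    printf 'GEOS_%s%s/%s/%s/%s/%s.%s%s01.CN.%s%s.%s\n' \
        "$res" "$dir_suffix" "$met" "$yyyy" "$mm" \
        "$prefix" "$yyyy" "$mm" "$tag" "$file_suffix" "$ext"
}

cn_key GEOS_FP 4x5           # prints GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc
cn_key MERRA2 0.5x0.625 EU   # prints GEOS_0.5x0.625_EU/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.EU.nc4
```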

Additional notes:

  • Prior to downloading GEOS-FP data, please be aware of caveats regarding use of GEOS-FP. See the GEOS-FP wiki page for more information.

--Bob Yantosca (talk) 21:34, 12 December 2019 (UTC)

Data download commands

Basic Syntax

Get a complete directory listing:

aws s3 ls --request-payer=requester s3://gcgrid

Get a directory listing of a single directory:

aws s3 ls --request-payer=requester s3://gcgrid/DIRECTORY-NAME

Copy data from a single directory:

aws s3 cp --request-payer=requester s3://gcgrid/DIRECTORY-NAME LOCAL-DIRECTORY-NAME

Recursively copy an entire directory and all of its subdirectories:

aws s3 cp --request-payer=requester --recursive s3://gcgrid/DIRECTORY-NAME LOCAL-DIRECTORY-NAME
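Because the bucket has no ExtData root folder, it is often convenient to copy each bucket directory to the same relative path under a local root directory. The sketch below only prints the corresponding command so that it can be inspected before anything is transferred. The helper name gcgrid_cp_cmd and the choice of local root are assumptions made for illustration, and the sketch assumes paths without spaces.

```shell
#!/bin/sh
# gcgrid_cp_cmd (hypothetical helper): print, but do not run, the command
# that recursively copies one s3://gcgrid directory to the same relative
# path under a local root directory.
# Usage: gcgrid_cp_cmd DIRECTORY-NAME LOCAL-ROOT
gcgrid_cp_cmd() {
    dir=${1%/}     # drop one trailing slash so the paths join cleanly
    root=${2%/}
    printf 'aws s3 cp --request-payer=requester --recursive s3://gcgrid/%s/ %s/%s/\n' \
        "$dir" "$root" "$dir"
}

gcgrid_cp_cmd HEMCO/TIMEZONES/v2015-02 /home/ubuntu/ExtData
```

Once the printed command looks right, it can be pasted into the terminal or piped to sh to run it.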

--Bob Yantosca (talk) 21:50, 12 December 2019 (UTC)

Examples

1. Copy a single file in a s3://gcgrid subdirectory to the /home/ubuntu/ExtData directory structure in your AWS cloud instance:

aws s3 cp --request-payer=requester s3://gcgrid/HEMCO/TIMEZONES/v2015-02/timezones_voronoi_1x1.nc /home/ubuntu/ExtData/HEMCO/TIMEZONES/v2015-02/timezones_voronoi_1x1.nc

NOTE: The command will replicate the same subdirectory structure in the /home/ubuntu/ExtData folder.


2. Recursively copy all GEOS-FP 4° x 5° met field files for 2016 from s3://gcgrid to the /home/ubuntu/ExtData directory structure in your AWS cloud instance:

aws s3 cp --request-payer=requester --recursive s3://gcgrid/GEOS_4x5/GEOS_FP/2016/ /home/ubuntu/ExtData/GEOS_4x5/GEOS_FP/2016/


3. Recursively copy all GEOS-FP 4° x 5° met field files for January 2016 from s3://gcgrid to the /home/ubuntu/ExtData directory structure in your AWS cloud instance:

aws s3 cp --request-payer=requester --recursive s3://gcgrid/GEOS_4x5/GEOS_FP/2016/01/ /home/ubuntu/ExtData/GEOS_4x5/GEOS_FP/2016/01/
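When only a few months of a year are needed, the monthly copy can be repeated once per month instead of downloading the whole year. The following is a minimal sketch of that idea: print_month_cmds is a hypothetical helper name, months are passed as plain integers without zero padding, and the commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# print_month_cmds (hypothetical helper): print one recursive copy command
# per month for a met field directory, mirroring the bucket layout under a
# local root directory.
# Usage: print_month_cmds MET-DIR YEAR FIRST-MONTH LAST-MONTH LOCAL-ROOT
#   FIRST-MONTH and LAST-MONTH are integers without zero padding (1..12).
print_month_cmds() {
    metdir=$1 year=$2 m=$3 last=$4 root=$5
    while [ "$m" -le "$last" ]; do
        mm=$(printf '%02d' "$m")   # zero-pad to match the MM directory names
        printf 'aws s3 cp --request-payer=requester --recursive s3://gcgrid/%s/%s/%s/ %s/%s/%s/%s/\n' \
            "$metdir" "$year" "$mm" "$root" "$metdir" "$year" "$mm"
        m=$((m + 1))
    done
}

# Print the commands for January through March 2016 (GEOS-FP, 4° x 5°).
print_month_cmds GEOS_4x5/GEOS_FP 2016 1 3 /home/ubuntu/ExtData
```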

--Bob Yantosca (talk) 21:47, 12 December 2019 (UTC)

Further reading

For more information, please see:

  1. Our GEOS-Chem cloud computing tutorial (cloud.geos-chem.org)
  2. aws s3 ls command reference
  3. aws s3 cp command reference
  4. More AWS S3 examples


