Downloading GEOS-Chem data directories

From Geos-chem
Revision as of 16:26, 4 March 2010 by Bmy (Talk | contribs)

This page describes where you can obtain the GEOS-Chem source code and required data files.

Data Directory Access

The GEOS-Chem source code, data and meteorological field directories may be accessed by anonymous FTP from one of the following FTP sites.

Primary download site

The primary GEOS-Chem data download site is located at:

ftp ftp.as.harvard.edu
cd pub/geos-chem

(NOTE: The wget utility may be more convenient than FTP because it can download multiple directories at once. See the section on using wget below for more information.)

The geos-chem directory is further divided into the following subdirectories:

1month_plots/
1yr_benchmarks/
NRT/
NRT-ARCTAS/
NRT_archive/
beta_releases/
dao/
data/
downloads/
mean_OH/
patches/
public_releases/

Here is a quick look at the contents of these subdirectories of pub/geos-chem/:

Data Directories under pub/geos-chem/ Description
data Root Data Directory
data/aerosol_optics/ Contains files which specify the aerosol optical properties for the FAST-J photolysis mechanism.
data/GEOS_MEAN Contains P(O3), L(O3) and mean OH data for offline simulations.
0.5° x 0.666° Data Directories Description
data/GEOS_0.5x0.666_CH Emissions etc. files for the China/SE Asia 0.5° x 0.666° nested-grid simulation
data/GEOS_0.5x0.666_CH/GEOS_5/YYYY/MM/ GEOS-5 met data for the China/SE Asia 0.5° x 0.666° nested grid simulation
data/GEOS_0.5x0.666_NA Emissions etc. files for the North American 0.5° x 0.666° nested grid simulation
data/GEOS_0.5x0.666_NA/GEOS_5/YYYY/MM/ GEOS-5 met data for the North American 0.5° x 0.666° nested grid simulation
1° x 1° Data Directories Description
data/GEOS_1x1 1° x 1° emissions etc. data files for use with GEOS-Chem global simulations
data/GEOS_1x1_CH Emissions etc. data for the China/SE Asia 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_CH/GEOS_3/YYYY/MM/ GEOS-3 met data for the China/SE Asia 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_NA/ Emissions etc. data for the North American 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_NA/GEOS_3/YYYY/MM/ GEOS-3 met data for the North American 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
2° x 2.5° Data Directories Description
data/GEOS_2x2.5/ Emissions etc. data for GEOS-Chem 2° x 2.5° global simulations
data/GEOS_2x2.5/GEOS_3/YYYY/MM GEOS-3 met data for 2° x 2.5° global simulations
data/GEOS_2x2.5/GEOS_4_v4/YYYY/MM GEOS-4 met data for 2° x 2.5° global simulations (late-look)
NOTE: This is what you should use for your simulations!
data/GEOS_2x2.5/GEOS_4_flk/YYYY/MM GEOS-4 met data for 2° x 2.5° global simulations (1st-look)
NOTE: This was look-ahead data, only used during ICARTT/ITCT 2k2
data/GEOS_2x2.5/GEOS_5/YYYY/MM GEOS-5 met data for 2° x 2.5° global simulations
4° x 5° Data Directories Description
data/GEOS_4x5/ Emissions etc. data for GEOS-Chem 4° x 5° global simulations
data/GEOS_4x5/GEOS_3/YYYY/MM GEOS-3 met data for 4° x 5° global simulations
data/GEOS_4x5/GEOS_4_v4/YYYY/MM GEOS-4 met data for 4° x 5° global simulations (late-look)
NOTE: This is what you should use for your simulations!
data/GEOS_4x5/GEOS_4_flk/YYYY/MM GEOS-4 met data for 4° x 5° global simulations (1st-look)
NOTE: This was look-ahead data, only used during ICARTT/ITCT 2k2
data/GEOS_4x5/GEOS_5/YYYY/MM GEOS-5 met data for 4° x 5° global simulations
   
Other Subdirectories under pub/geos-chem/ Description
/1month_plots Contains plots from selected 1-month benchmark simulations

/1yr_benchmarks Contains the following types of data from the 1-year benchmarks used to evaluate GEOS-Chem.
  • Restart files
  • Model output (bpch and netCDF formats)
  • Log files
  • Input files
  • Evaluation plots
/evaluation Plots from 1-yr benchmark simulations
/HDF Contains user code for reading HDF and HDF-EOS data files
/NRT-ARCTAS Contains output from the GEOS-Chem Near-Real-Time simulations for ARCTAS
/public_releases Directory containing TAR file source code and run directories for GEOS-Chem public releases
/beta_releases Directory containing TAR file source code and run directories for beta GEOS-Chem releases
/patches Directory containing bug-fix software patches (if necessary)
/mean_OH Directory containing 3-D mean OH fields archived from previous GEOS-Chem simulations.

Please view the catalog of met data at the Harvard archive to determine if the data period you wish to download is available.
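
The met field directories listed above all follow the same GEOS_<resolution>/<met product>/YYYY/MM pattern. As a minimal sketch of that convention, the hypothetical helper below (met_dir is not part of GEOS-Chem) prints the directory for a given resolution, met product, year and month. It assumes a two-digit, zero-padded month, as the MM in the table suggests.

```shell
# Hypothetical helper (not part of GEOS-Chem): print the met field
# directory for a resolution, met product, year and month, following
# the data/GEOS_<res>/<met>/YYYY/MM layout from the table above.
met_dir() {
    res=$1
    met=$2
    year=$3
    # Strip one leading zero so that "08" is not parsed as an octal number
    month=${4#0}
    # Zero-pad the month to two digits (7 becomes 07)
    printf 'data/GEOS_%s/%s/%04d/%02d\n' "$res" "$met" "$year" "$month"
}

met_dir 4x5 GEOS_5 2008 7    # prints data/GEOS_4x5/GEOS_5/2008/07
```

You can compare the printed path against the catalog before starting a transfer for that period.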

Alternative download site

The GEOS-Chem data and meteorological fields used by Dalhousie University are also available via anonymous FTP from:

ftp rain.ucis.dal.ca

This site mirrors many of the Harvard directories listed above, but it is less extensive. It also hosts the following datasets, which are not available from the Harvard site:

0.5° x 0.666° Data Directories Description
/GEOS_0.5x0.666_EU/ Emissions etc. files for the European 0.5° x 0.666° nested grid
/GEOS_0.5x0.666_EU.d/ GEOS-5 met fields for the European 0.5° x 0.666° nested grid
/GEOS_0.5x0.666_NA/ Emissions etc. files for the North American 0.5° x 0.666° nested grid
/GEOS_0.5x0.666_NA.d/ GEOS-5 met fields for the North American 0.5° x 0.666° nested grid
1° x 1.25° Data Directories Description
/GEOS_1x1.25/ Emissions etc. files for 1° x 1.25° global GEOS-4 simulations
/GEOS_1x1.25.d/ GEOS-4 met fields for 1° x 1.25° global simulations

A catalog of available data may be found HERE.

Question about directory structure

Shanna Shaked wrote:

We are working again on trying to run GEOS-Chem. However, we are encountering some errors that may be due to the directory structure. We find a discrepancy between the directory structure described in the GEOS-Chem manual and that available on the ftp site.
The GEOS-Chem manual describes a directory structure of:
   data/GEOS_4x5/GEOS_5/YYYY/MM
However, on the ftp site, we find a directory structure with an extra '.d':
   data/GEOS_4x5.d/GEOS_5/YYYY/MM
(the GEOS_5 folder is in GEOS_4x5.d rather than GEOS_4x5). There does exist a GEOS_4x5 that contains many of the emissions data, but does not contain GEOS_5.
If we leave the structure as is, and enter ../data/GEOS_4x5/ as our root data directory in input.geos, we get a file not found error when it looks for GEOS_5 within this directory (obviously).
If we instead enter ../data/GEOS_4x5.d as our root data directory, we get a file not found error when the program looks for emissions within this directory (lightning NOx emissions, in this case).
QUESTION: To solve this problem, we have moved the GEOS_5 folder into the GEOS_4x5 directory. [Is this] okay?

Bob Yantosca replied:

The only difference on our system between e.g. GEOS_4x5 and GEOS_4x5.d is that our sysadmin (Jack Yatteau) set up the ".d" directories separately so that they only contain met data (which is much larger than the emissions etc. data). That way he could separate the disks that hold only met data from the disks that hold the emissions data, which facilitates our configuration here. There are symbolic links from GEOS_4x5 to GEOS_4x5.d etc. (i.e. the directory GEOS_4x5/GEOS_5 is actually a symbolic link to the corresponding directory GEOS_4x5.d/GEOS_5/, and likewise for the other met field resolutions and directories).
You don't necessarily have to do this on your end, but this is what we did here. You can just make the GEOS_4x5/GEOS_5 etc. real subdirectories and not symbolic links and store the data there. The solution you picked above is OK.
Also to facilitate FTP file transfer, you could do the following:
  • Write a script or an FTP macro
  • Use a 3rd-party GUI program like FireFTP in Mozilla Firefox.
  • Or even better yet, use the Unix wget utility (see below)
Each user is responsible for their own file transfers.

--Bob Y. 11:04, 5 February 2009 (EST)
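
If you prefer to keep the ".d" directories and reproduce the symbolic-link layout described in the reply above, the following is a minimal sketch. The function name link_met_dir and its arguments are hypothetical. It assumes that GEOS_<res> and GEOS_<res>.d sit side by side under the same root directory, and it uses a relative link target so that the link still resolves if the root directory is later moved.

```shell
# Hypothetical helper: make <root>/GEOS_<res>/<met> a symbolic link to
# <root>/GEOS_<res>.d/<met>, mirroring the layout described above.
# Usage: link_met_dir ROOT RES MET    e.g. link_met_dir ../data 4x5 GEOS_5
link_met_dir() {
    root=$1
    res=$2
    met=$3
    link="$root/GEOS_${res}/$met"

    # The met data must already exist in the ".d" directory
    if [ ! -d "$root/GEOS_${res}.d/$met" ]; then
        echo "missing directory: $root/GEOS_${res}.d/$met" >&2
        return 1
    fi

    # Refuse to touch a real (non-link) directory that holds data
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link exists and is not a symbolic link" >&2
        return 1
    fi

    mkdir -p "$root/GEOS_${res}"
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file to replace
    ln -sfn "../GEOS_${res}.d/$met" "$link"
}
```

After running link_met_dir ../data 4x5 GEOS_5, the path ../data/GEOS_4x5/GEOS_5/YYYY/MM should resolve to the met data stored under ../data/GEOS_4x5.d/GEOS_5/YYYY/MM.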

Using wget to download files

Probably the simplest way to download the GEOS-Chem emissions and met field data is to use the Unix wget utility. This allows you to download multiple files and directories at a time.

The wget utility is free and open-source (published by GNU) and is included with most Unix-like systems, including Linux distributions such as Ubuntu, Fedora and CentOS. You can check out the user manual for more information.

Syntax

Most of the time, the syntax you will use to download multiple directories is as follows:

Downloading data from Harvard:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/DIRECTORY_NAME/"

Downloading data from Dalhousie:

wget -r -nH "ftp://rain.ucis.dal.ca/DIRECTORY_NAME/"

The options to wget are as follows:

-r   Specifies recursive directory transfer (i.e. will download all subdirectories)
-nH  Omits the host name from the local path, so files are stored under pub/geos-chem/data/DIRECTORY_NAME/ rather than ftp.as.harvard.edu/pub/geos-chem/data/DIRECTORY_NAME/

NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.
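
With -r and -nH alone, wget still recreates the remote path (pub/geos-chem/data/...) on your local disk. If you would rather have the files land directly in DIRECTORY_NAME/, wget's --cut-dirs option can drop the leading path components. The sketch below is a hypothetical helper (harvard_wget_cmd) that only prints the command so you can inspect it before running it. It assumes the three leading components pub/geos-chem/data used by the Harvard site.

```shell
# Hypothetical helper: print (but do not run) a wget command for one
# directory under pub/geos-chem/data/ on the Harvard site.
harvard_wget_cmd() {
    dir=$1
    # --cut-dirs=3 drops the three leading remote components
    # pub/geos-chem/data, so files are saved under $dir/ locally.
    printf 'wget -r -nH --cut-dirs=3 "ftp://ftp.as.harvard.edu/pub/geos-chem/data/%s/"\n' "$dir"
}

harvard_wget_cmd GEOS_4x5
```

To start the transfer, copy the printed line into your shell, or pipe the output to sh.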

Examples

1. Download all emissions files in the GEOS_2x2.5 data directory structure:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5/" &

The trailing & character runs the file transfer as a background job, so you can keep using the shell while the download proceeds.


2. Download all available GEOS-Chem 2 x 2.5 met field data files in the GEOS_2x2.5.d directory structure:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/" &

NOTE: Because of the huge volume of data involved, this is not recommended; the downloads may overwhelm your system. It is better to restrict the transfer to a single met field product, as in the next example:


3. Download all GEOS-5 met data at 2 x 2.5 resolution:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/GEOS_5/" &

And it may even be better to download one year at a time:


4. Download all GEOS-5 met data at 2 x 2.5 resolution for 2008:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/GEOS_5/2008/" &

etc.

--Bob Y. 11:01, 3 December 2009 (EST)
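
The one-year-at-a-time approach from example 4 can be scripted. As a rough sketch, the hypothetical helper below (yearly_wget_cmds) prints one wget command per year for a given resolution and met product, using the same Harvard ".d" paths as the examples above. It only prints the commands, so nothing is downloaded until you choose to run them.

```shell
# Hypothetical helper: emit one wget command per year so that each
# transfer can be run and checked separately.
# Usage: yearly_wget_cmds RES MET FIRST_YEAR LAST_YEAR
yearly_wget_cmds() {
    res=$1
    met=$2
    year=$3
    last=$4
    base="ftp://ftp.as.harvard.edu/pub/geos-chem/data"
    while [ "$year" -le "$last" ]; do
        printf 'wget -r -nH "%s/GEOS_%s.d/%s/%s/"\n' "$base" "$res" "$met" "$year"
        year=$((year + 1))
    done
}

yearly_wget_cmds 2x2.5 GEOS_5 2007 2008
```

Piping the output to sh runs the transfers one after another rather than all at once, which keeps the load on your system and on the FTP server lower.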