Difference between revisions of "Downloading GEOS-Chem data directories"

Revision as of 18:12, 18 October 2010

This page describes where you can obtain the GEOS-Chem source code and required data files.

Use Git for all new source code and run directory downloads

Starting with GEOS-Chem v8-03-01, the GEOS-Chem source code and run directories are now stored as publicly available Git repositories. You can download these directly to your computer if you have Git installed locally. For detailed download instructions, please see our Downloading a new GEOS-Chem version wiki post.

The old method of distribution (i.e. *.tar.gz aka "TARBALL" files) has now been discontinued. If you are looking for GEOS-Chem source code versions prior to v8-03-01, then you can still download the TARBALL files for these versions via FTP:

ftp ftp.as.harvard.edu
cd pub/geos-chem/beta_releases      (for beta releases)
cd pub/geos-chem/public_releases    (or, for public releases)

Data Directory Access

The GEOS-Chem source code, data and meteorological field directories may be accessed by anonymous FTP from one of the following FTP sites.

Primary download site

The primary GEOS-Chem data download site is located at:

ftp ftp.as.harvard.edu
cd pub/geos-chem

NOTE: The wget utility may be more convenient than FTP, as it can allow you to download multiple directories at once. See below for more information.

The pub/geos-chem directory is further divided into the following subdirectories:

1month_plots/
1yr_benchmarks/
NRT/
NRT-ARCTAS/
NRT_archive/
beta_releases/
dao/
data/
downloads/
mean_OH/
patches/
public_releases/

As described above, the beta_releases and public_releases subdirectories contain GEOS-Chem source code and run directories for versions prior to GEOS-Chem v8-03-01. For downloads of GEOS-Chem v8-03-01 and higher, you must download via the Git version control software.

Here is a quick look at the contents of these subdirectories of pub/geos-chem/:

Data Directories under pub/geos-chem/ Description
data/ Root Data Directory
data/aerosol_optics/ Contains files which specify the aerosol optical properties for the FAST-J photolysis mechanism.
data/GEOS_MEAN/ Contains P(O3), L(O3) and mean OH data for offline simulations.
0.5° x 0.666° Data Directories Description
data/GEOS_0.5x0.666_CH/ Emissions etc. files for the China/SE Asia 0.5° x 0.666° nested-grid simulation
data/GEOS_0.5x0.666_CH/GEOS_5/YYYY/MM/ GEOS-5 met data for the China/SE Asia 0.5° x 0.666° nested grid simulation
data/GEOS_0.5x0.666_NA/ Emissions etc. files for the North American 0.5° x 0.666° nested grid simulation
data/GEOS_0.5x0.666_NA/GEOS_5/YYYY/MM/ GEOS-5 met data for the North American 0.5° x 0.666° nested grid simulation
1° x 1° Data Directories Description
data/GEOS_1x1/ 1° x 1° emissions etc. data files for use with GEOS-Chem global simulations
data/GEOS_1x1_CH/ Emissions etc. data for the China/SE Asia 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_CH/GEOS_3/YYYY/MM/ GEOS-3 met data for the China/SE Asia 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_NA/ Emissions etc. data for the North American 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
data/GEOS_1x1_NA/GEOS_3/YYYY/MM/ GEOS-3 met data for the North American 1° x 1° nested grid simulation
NOTE: This simulation is now obsolete!
2° x 2.5° Data Directories Description
data/GEOS_2x2.5/ Emissions etc. data for GEOS-Chem 2° x 2.5° global simulations
data/GEOS_2x2.5/GEOS_3/YYYY/MM/ GEOS-3 met data for 2° x 2.5° global simulations
data/GEOS_2x2.5/GEOS_4_v4/YYYY/MM/ GEOS-4 met data for 2° x 2.5° global simulations (late-look)
NOTE: This is what you should use for your simulations!
data/GEOS_2x2.5/GEOS_4_flk/YYYY/MM/ GEOS-4 met data for 2° x 2.5° global simulations (1st-look)
NOTE: This was look-ahead data, only used during ICARTT/ITCT 2k2
data/GEOS_2x2.5/GEOS_5/YYYY/MM/ GEOS-5 met data for 2° x 2.5° global simulations
4° x 5° Data Directories Description
data/GEOS_4x5/ Emissions etc. data for GEOS-Chem 4° x 5° global simulations
data/GEOS_4x5/GEOS_3/YYYY/MM/ GEOS-3 met data for 4° x 5° global simulations
data/GEOS_4x5/GEOS_4_v4/YYYY/MM/ GEOS-4 met data for 4° x 5° global simulations (late-look)
NOTE: This is what you should use for your simulations!
data/GEOS_4x5/GEOS_4_flk/YYYY/MM/ GEOS-4 met data for 4° x 5° global simulations (1st-look)
NOTE: This was look-ahead data, only used during ICARTT/ITCT 2k2
data/GEOS_4x5/GEOS_5/YYYY/MM/ GEOS-5 met data for 4° x 5° global simulations
Other Subdirectories under pub/geos-chem/ Description
1month_plots/ Contains plots from selected 1-month benchmark simulations
1yr_benchmarks/ Contains the following types of data from the 1-year benchmarks used to evaluate GEOS-Chem.
  • Restart files
  • Model output (bpch and netCDF formats)
  • Log files
  • Input files
  • Evaluation plots
beta_releases/ Directory containing TAR file source code and run directories for beta GEOS-Chem releases

NOTE: Only contains versions prior to GEOS-Chem v8-03-01!

dao/ Internal use only
downloads/ Contains the following code packages:
  • Code to read HDF4 and HDF4-EOS data files
  • Code to read HDF5 and HDF5-EOS data files
  • Code to read netCDF data files
  • Code to process GEOS-5 met data
NRT/ Contains some data from the GEOS-Chem near-real-time simulations for ICARTT/ITCT 2k2
NOTE: To save disk space, some data may not have been preserved after these missions ended.
NRT-ARCTAS/ Contains output from the GEOS-Chem Near-Real-Time simulations for ARCTAS
patches/ Directory containing bug-fix software patches (if necessary)
public_releases/ Directory containing TAR file source code and run directories for GEOS-Chem public releases

NOTE: Only contains versions prior to GEOS-Chem v8-03-01!

mean_OH/ Directory containing 3-D mean OH fields archived from previous GEOS-Chem simulations.

Please view the catalog of met data at the Harvard archive to determine if the data period you wish to download is available.

Downtime

The Harvard FTP archive is unavailable during its weekly maintenance period, every Monday from 7 to 10 AM ET. Please plan your data downloads accordingly.
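A download script can check for this window before it starts a transfer. The following is a minimal sketch, not an official tool: the function name in_maintenance_window is hypothetical, the window is assumed to run from 07:00 up to (but not including) 10:00, and "ET" is assumed to correspond to the America/New_York time zone.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given day and hour fall inside the
# Monday 7-10 AM maintenance window described above.
#   $1 = day of week, 1 = Monday ... 7 = Sunday (as printed by `date +%u`)
#   $2 = hour of day, 0-23 (as printed by `date +%H`)
# Assumption: the window covers 07:00 up to, but not including, 10:00.
in_maintenance_window() {
    day=$1
    hour=$2
    [ "$day" -eq 1 ] && [ "$hour" -ge 7 ] && [ "$hour" -lt 10 ]
}

# Evaluate the current time in US Eastern time (assumed meaning of "ET").
if in_maintenance_window "$(TZ=America/New_York date +%u)" \
                         "$(TZ=America/New_York date +%H)"; then
    echo "The Harvard FTP archive may be down for maintenance; try again later."
fi
```

Passing the day and hour as arguments, rather than reading the clock inside the function, makes the check easy to try out with fixed values.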

Alternative download site

The GEOS-Chem data and meteorological fields used by Dalhousie University are also available via anonymous FTP from:

ftp rain.ucis.dal.ca

This site mirrors many of the Harvard directories listed above, but it is not as extensive. It does, however, host the following unique datasets:

0.5° x 0.666° Data Directories Description
/GEOS_0.5x0.666_EU/ Emissions etc. files for the European 0.5° x 0.666° nested grid simulation
/GEOS_0.5x0.666_EU.d/ GEOS-5 met fields for the European 0.5° x 0.666° nested grid simulation
/GEOS_0.5x0.666_NA/ Emissions etc. files for the North American 0.5° x 0.666° nested grid simulation
/GEOS_0.5x0.666_NA.d/ GEOS-5 met fields for the North American 0.5° x 0.666° nested grid simulation
1° x 1.25° Data Directories Description
/GEOS_1x1.25/ Emissions etc. files for GEOS-4 1° x 1.25° global simulations
/GEOS_1x1.25.d/ GEOS-4 met fields for 1° x 1.25° global simulations

A catalog of available data may be found HERE.

For GCAP users

For those users who wish to run GEOS-Chem with the GISS/GCAP met fields, please contact Loretta Mickley.

--Bob Y. 09:52, 8 March 2010 (EST)

Question about directory structure

Shanna Shaked wrote:

We are working again on trying to run GEOS-Chem. However, we are encountering some errors that may be due to the directory structure. We find a discrepancy between the directory structure described in the GEOS-Chem manual and that available on the ftp site.
The GEOS-Chem manual describes a directory structure of:
   data/GEOS_4x5/GEOS_5/YYYY/MM
However, on the ftp site, we find a directory structure with an extra '.d':
   data/GEOS_4x5.d/GEOS_5/YYYY/MM
(the GEOS_5 folder is in GEOS_4x5.d rather than GEOS_4x5). There does exist a GEOS_4x5 that contains many of the emissions data, but does not contain GEOS_5.
If we leave the structure as is, and enter ../data/GEOS_4x5/ as our root data directory in input.geos, we get a file not found error when it looks for GEOS_5 within this directory (obviously).
If we instead enter ../data/GEOS_4x5.d as our root data directory, we get a file not found error when the program looks for emissions within this directory (lightning NOx emissions, in this case).
QUESTION: To solve this problem, we have moved the GEOS_5 folder into the GEOS_4x5 directory. [Is this] okay?

Bob Yantosca replied:

The only difference on our system between e.g. GEOS_4x5 and GEOS_4x5.d is that our sysadmin (Jack Yatteau) set up the ".d" directories separately so that they only contain met data (which is much larger than the emissions etc. data). That way he could separate the met data disks from the emissions data storage, which simplifies our configuration here. There are symbolic links from GEOS_4x5 to GEOS_4x5.d etc. (i.e. the directory GEOS_4x5/GEOS_5 is actually a symbolic link to the corresponding directory GEOS_4x5.d/GEOS_5/, and likewise for the other met field resolutions and directories).
You don't necessarily have to do this on your end, but this is what we did here. You can just make the GEOS_4x5/GEOS_5 etc. real subdirectories and not symbolic links and store the data there. The solution you picked above is OK.
Also to facilitate FTP file transfer, you could do the following:
  • Write a script or an FTP macro
  • Use a 3rd-party GUI program like FireFTP in Mozilla Firefox.
  • Or even better yet, use the Unix wget utility (see below)
Each user is responsible for their own file transfers.

--Bob Y. 11:04, 5 February 2009 (EST)
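For users who prefer to keep the ".d" directories exactly as downloaded, the symbolic-link layout described in the reply above could be recreated locally along these lines. This is only a sketch: the function name link_met_dir and its three arguments are hypothetical, and it assumes the met data already sits in a ".d" directory next to the emissions directory.

```shell
#!/bin/sh
# Hypothetical helper: make <root>/<grid>/<met> a symbolic link to
# <root>/<grid>.d/<met>, mirroring the layout described in the reply above.
#   $1 = root data directory (e.g. ../data)
#   $2 = grid directory name (e.g. GEOS_4x5)
#   $3 = met field directory (e.g. GEOS_5)
link_met_dir() {
    root=$1
    grid=$2
    met=$3
    target="$root/$grid.d/$met"
    link="$root/$grid/$met"

    # Refuse to create a dangling link if the met data is not there.
    if [ ! -d "$target" ]; then
        echo "missing met directory: $target" >&2
        return 1
    fi

    # Leave an existing directory or link untouched.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists, leaving untouched: $link" >&2
        return 0
    fi

    mkdir -p "$root/$grid"
    # Use a relative target so the link survives moving the whole data tree.
    ln -s "../$grid.d/$met" "$link"
}

# Example call (not executed here):
#   link_met_dir ../data GEOS_4x5 GEOS_5
```

With the link in place, ../data/GEOS_4x5/ can serve as the root data directory for both the emissions files and the GEOS_5 met fields.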

Using wget to download files

Probably the simplest way to download the GEOS-Chem emissions and met field data is to use the Unix wget utility. This allows you to download multiple files and directories at a time.

The wget utility is free and open-source (published by GNU), and comes standard with pretty much all *nix systems, including most Linux distributions (Ubuntu, Fedora, CentOS, etc.). You can check out the user manual for more information.

Syntax

Most of the time, the syntax you will use to download multiple directories is as follows:

Downloading data from Harvard:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/DIRECTORY_NAME/"

Downloading data from Dalhousie:

wget -r -nH "ftp://rain.ucis.dal.ca/DIRECTORY_NAME/"

The options to wget are as follows:

-r   Specifies recursive directory transfer (i.e. will download all subdirectories)
-nH  Disables host-prefixed directories: for the Harvard URL above, files are stored under pub/geos-chem/data/DIRECTORY_NAME/, not ftp.as.harvard.edu/pub/geos-chem/data/DIRECTORY_NAME/

NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.
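To avoid retyping the long URLs, the two command forms above can be wrapped in a small helper. The sketch below is illustrative only: the function name geos_wget_cmd and the site keywords "harvard" and "dalhousie" are hypothetical, while the host names and paths are copied from the syntax shown above. It prints the command instead of running it, so that it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper: print the wget command for one data directory.
#   $1 = site keyword, "harvard" or "dalhousie" (names invented for this sketch)
#   $2 = directory name, e.g. GEOS_4x5
geos_wget_cmd() {
    site=$1
    dir=$2
    case "$site" in
        harvard)   base="ftp://ftp.as.harvard.edu/pub/geos-chem/data" ;;
        dalhousie) base="ftp://rain.ucis.dal.ca" ;;
        *)         echo "unknown site: $site" >&2; return 1 ;;
    esac
    # Keep the URL quoted and keep the trailing slash, as in the syntax above.
    printf 'wget -r -nH "%s/%s/"\n' "$base" "$dir"
}

geos_wget_cmd harvard GEOS_4x5
geos_wget_cmd dalhousie GEOS_1x1.25
```

Once the printed command looks right, it can be pasted into the shell or piped to sh to start the transfer.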

Examples

1. Download all emissions files in the GEOS_2x2.5/ data directory structure:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5/" &

The trailing & character runs the file transfer in the background, so the shell prompt returns immediately.


2. Download all available GEOS-Chem 2° x 2.5° met field data files in the GEOS_2x2.5.d directory structure:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/" &

NOTE: Due to the huge volume of data involved, this is not recommended, as the file downloads may swamp your system. It's better to do this:


3. Download all GEOS-5 met data at 2° x 2.5° resolution:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/GEOS_5/" &

And it may even be better to download one year at a time:


4. Download all GEOS-5 met data at 2° x 2.5° resolution for 2008:

wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_2x2.5.d/GEOS_5/2008/" &

etc.
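The year-by-year approach from example 4 can be scripted with a short loop. This is a sketch under stated assumptions: the function name geos5_yearly_cmds is hypothetical, and whether a given year is actually present on the archive has to be checked against the met data catalog first. The function only prints the commands, one per year.

```shell
#!/bin/sh
# Hypothetical helper: print one wget command per year of GEOS-5 met data.
#   $1 = resolution string as used in the directory names, e.g. 2x2.5 or 4x5
#   $2 = first year, $3 = last year (inclusive)
# Assumption: the requested years exist on the Harvard archive.
geos5_yearly_cmds() {
    res=$1
    year=$2
    last=$3
    while [ "$year" -le "$last" ]; do
        printf 'wget -r -nH "ftp://ftp.as.harvard.edu/pub/geos-chem/data/GEOS_%s.d/GEOS_5/%s/"\n' \
            "$res" "$year"
        year=$((year + 1))
    done
}

geos5_yearly_cmds 2x2.5 2008 2009

# To run the transfers one after another instead of just printing them:
#   geos5_yearly_cmds 2x2.5 2008 2009 | sh
```

Running the years sequentially rather than in parallel background jobs keeps the load on the local system and on the FTP server modest.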

--Bob Y. 14:27, 8 September 2010 (EDT)