Downloading GEOS-Chem data directories


Revision as of 15:48, 23 July 2015

This page describes where you can obtain the GEOS-Chem source code and required data files.

What you need to download before you can run GEOS-Chem

When setting up GEOS–Chem on your system, you will need to install the following components:

  1. The GEOS–Chem source code directory. This is the directory where the Fortran-90 source code files (i.e. *.F, *.F90 files) and Makefiles reside. Your Fortran compiler will create an executable (geos) from these source code files.
  2. A GEOS–Chem run directory. Here is where you will run the compiled GEOS–Chem executable geos. Each run directory contains:
    1. Various input files that you can modify in order to select different options for your GEOS–Chem simulation
    2. Files that define GEOS–Chem's chemical and photolysis mechanisms
    3. "Restart files" that hold the initial conditions for your GEOS–Chem simulation
  3. The GEOS–Chem shared data directories. This is the directory tree where the following types of data are stored:
    1. Meteorological data (a.k.a. the "met fields") used to drive GEOS–Chem
    2. Emissions inventories for the HEMCO emissions component
    3. Scale factors used to scale emissions from a base year to a given year
    4. Oxidant (OH, O3) concentrations for both full-chemistry and specialty simulations
    5. Other GEOS–Chem specific data files.
  4. A netCDF library installation. Starting with GEOS-Chem v9-01-03, the model can read data files in netCDF format. Eventually, we shall convert all of GEOS-Chem's input and output files from binary punch file format to netCDF format. We are doing this as part of our Grid-independent GEOS-Chem project, which seeks to interface GEOS-Chem with the NASA GEOS–5 GCM.

The GEOS–Chem source code and run directories are small enough to download directly to your own disk space in your Unix account. You can download these with the Git version control software.

On the other hand, the GEOS–Chem shared data directories contain many large files that probably cannot all fit into your own personal disk quota. Therefore, you (or your IT staff) should download the shared data directories to a common disk space where all GEOS–Chem users in your group can access them. The volume of data contained in the shared data directories precludes using Git; you must instead download these files via FTP, wget, or similar file transfer programs. See the Minimum system requirements for GEOS-Chem wiki page for more information on typical disk space requirements for GEOS-Chem.
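Before starting a large transfer, it can help to confirm how much space is free on the shared disk. The following is a minimal sketch, not part of the GEOS-Chem distribution: the helper name free_gib and the DATA_ROOT variable are assumptions made for illustration.

```shell
#!/bin/sh
# Report free space (in GiB) on the file system that holds a directory.
# DATA_ROOT is a hypothetical location for the shared data directories.
DATA_ROOT="${DATA_ROOT:-.}"

free_gib() {
    # df -P prints one line per file system; -k reports 1024-byte blocks.
    # Column 4 of the second output line is the available space.
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

echo "Free space under $DATA_ROOT: $(free_gib "$DATA_ROOT") GiB"
```

Compare the reported value with the disk space requirements for the met fields and emissions data that you plan to download.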

If your computer system already has a pre-built netCDF library installed, we recommend that you use that. Check with your IT staff about how to load the netCDF libraries into your Unix environment. (Usually this is done with the module command, if that exists on your system.) But if you don't have a pre-built netCDF library on your system, you (or your IT staff) can use our GEOS-Chem-Libraries netCDF installer to build the libraries.
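One way to check whether a netCDF installation is already visible in your environment is to look for nc-config, the helper script that the netCDF-C library installs. This is only a sketch: the helper name have_netcdf is hypothetical, and your site may expose netCDF in a different way.

```shell
#!/bin/sh
# Check whether a netCDF installation is visible in the current environment.
# nc-config is the helper script installed by the netCDF-C library.
have_netcdf() {
    command -v nc-config >/dev/null 2>&1
}

if have_netcdf; then
    echo "netCDF found: $(nc-config --version) under $(nc-config --prefix)"
else
    echo "nc-config not found; try 'module avail netcdf' or ask your IT staff"
fi
```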

For complete downloading instructions, please see:

  1. GEOS-Chem User's Guide: Chapter 2.2: Downloading the GEOS-Chem source code
  2. GEOS-Chem wiki: Creating_GEOS-Chem_run_directories
  3. GEOS-Chem wiki: Using wget to download the shared data directories
  4. GEOS-Chem wiki: Installing libraries for GEOS-Chem

For more information on the files contained in each of these directories, please see:

  1. GEOS-Chem User's Guide: Chapter 3: Compiling the GEOS-Chem source code
  2. GEOS-Chem User's Guide: Chapter 5: GEOS-Chem run directories
  3. GEOS-Chem wiki: Directories for met fields and emissions data
  4. GEOS-Chem wiki: HEMCO data directories

--Melissa Sulprizio 16:43, 8 April 2015 (EDT)

Shared data directory archives

There are currently two data archives from which you may download the GEOS–Chem shared data directories:

  1. Harvard data directory archive (ftp.as.harvard.edu)
  2. Dalhousie data directory archive (rain.ucis.dal.ca)

The Dalhousie archive is not as comprehensive as the Harvard archive. However, the Dalhousie archive stores data files for the various nested grids that are not available on the Harvard archive.

--Bob Y. 14:19, 16 January 2014 (EST)

Harvard data directory archive

You can access the primary GEOS-Chem data download site via anonymous FTP at:

ftp ftp.as.harvard.edu

We recommend that you use the wget utility to download these directories instead of anonymous FTP. Wget allows you to download multiple directories at once, with a command such as this one:

wget -r -nH --cut-dirs=4 "ftp://ftp.as.harvard.edu/DIRECTORY_NAME"

See the tables below for the DIRECTORY_NAME options.

Downtime

The Harvard FTP archive is unavailable during the weekly maintenance period, every Monday between 7 and 10 AM ET. Please plan your data downloads accordingly.

Directories for met fields and emissions data

If you are downloading the GEOS-FP met data, then please note the following:

  1. Make sure you only take the files named GEOSFP*. Met field files named GEOS572* are obsolete and have been replaced by the GEOSFP* files.
  2. You will need to download the "CN" (constant) data files for each horizontal grid that you are using. These are timestamped for 2011/01/01 and are found in these data directories of ftp.as.harvard.edu:
    • gcgrid/geos-chem/data/GEOS_4x5.d/GEOS_FP/2011/01/GEOSFP.20110101.4x5.nc
    • gcgrid/geos-chem/data/GEOS_2x25.d/GEOS_FP/2011/01/GEOSFP.20110101.2x25.nc
    • gcgrid/geos-chem/data/GEOS_0.25x0.3125_NA.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.NA.nc
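To take only the GEOSFP* files from a directory, wget's -A (--accept) option can restrict a recursive download to matching file names. The sketch below only prints the command so that you can review it first; the helper name geosfp_only_cmd is hypothetical.

```shell
#!/bin/sh
# Build (but do not run) a wget command that keeps only GEOSFP* files.
# -A/--accept restricts a recursive download to matching file names.
geosfp_only_cmd() {
    # $1 = directory path on the FTP server, e.g. a GEOS_FP/YYYY/MM/ directory
    printf 'wget -r -nH -A "GEOSFP*" "ftp://ftp.as.harvard.edu/%s"\n' "$1"
}

geosfp_only_cmd "gcgrid/geos-chem/data/GEOS_4x5.d/GEOS_FP/2011/01/"
```

With this filter in place, files named GEOS572* in the same directory would not be kept.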

(Bob Yantosca, 05 Feb 2014)

The path gcgrid/geos-chem/data/ExtData/ (also reachable as gcgrid/data/ExtData/) is the root path under which the GEOS–Chem shared data directories reside. This is where you will find the GEOS-Chem met fields and emissions data.

Starting with GEOS-Chem v10-01, the data directory naming structure has changed since we no longer need to store emissions data on multiple grids. All of the GEOS-Chem data directories are now subdirectories of the ExtData directory. For more information, please see our Setting up the ExtData directory wiki page.

Directory Description
gcgrid/geos-chem/data/ExtData/ Root Data Directory
gcgrid/geos-chem/data/ExtData/CHEM_INPUTS/ Contains non-emissions data for GEOS-Chem chemistry modules
gcgrid/geos-chem/data/ExtData/HEMCO/ Contains emissions data for the HEMCO emissions component
0.25° x 0.3125° Data Directories Description
gcgrid/geos-chem/data/ExtData/GEOS_0.25x0.3125_NA.d/GEOS_FP/YYYY/MM/ GEOS-FP met data for 1/4° x 5/16° North America nested grid simulations
0.5° x 0.666° Data Directories Description
gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_CH/GEOS_5/YYYY/MM/ GEOS-5 met fields for 1/2° x 2/3° China nested grid simulations
gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_NA/GEOS_5/YYYY/MM/ GEOS-5 met fields for 1/2° x 2/3° North America nested grid simulations
2° x 2.5° Data Directories Description
gcgrid/geos-chem/data/ExtData/GEOS_2x2.5/GEOS_4_v4/YYYY/MM/ GEOS-4 met fields for 2° x 2.5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_2x2.5/GEOS_5/YYYY/MM GEOS-5 met fields for 2° x 2.5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_2x2.5/GEOS_FP/YYYY/MM GEOS-FP met data for 2° x 2.5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_2x2.5/MERRA/YYYY/MM MERRA met fields for 2° x 2.5° global simulations
4° x 5° Data Directories Description
gcgrid/geos-chem/data/ExtData/GEOS_4x5/GEOS_4_v4/YYYY/MM/ GEOS-4 met fields for 4° x 5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_4x5/GEOS_5/YYYY/MM/ GEOS-5 met fields for 4° x 5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/ GEOS-FP met fields for 4° x 5° global simulations
gcgrid/geos-chem/data/ExtData/GEOS_4x5/MERRA/YYYY/MM/ MERRA met fields for 4° x 5° global simulations

Please view the catalog of met data at the Harvard archive to determine if the data period you wish to download is available.
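The YYYY/MM placeholders in the tables above stand for a 4-digit year and a 2-digit month. As a minimal sketch, a small helper can assemble the path for a given grid, met product, and date; the name met_dir is an assumption made for illustration and is not part of GEOS-Chem.

```shell
#!/bin/sh
# Build the ExtData path of a met field directory from its parts.
# met_dir is a hypothetical helper name, not part of GEOS-Chem.
met_dir() {
    # $1 = grid directory (e.g. GEOS_4x5), $2 = met product (e.g. GEOS_FP)
    # $3 = 4-digit year, $4 = month number without a leading zero (1..12)
    printf 'gcgrid/geos-chem/data/ExtData/%s/%s/%04d/%02d/\n' "$1" "$2" "$3" "$4"
}

met_dir GEOS_4x5 GEOS_FP 2013 7
```

The resulting string can be used as the DIRECTORY_NAME in the wget command shown earlier.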

--Melissa Sulprizio 14:48, 8 April 2015 (EDT)

Directories for 1-month and 1-year benchmarks

You can find output files and evaluation plots for 1-month and 1-year benchmark simulations in the following directories:

Directory Description
gcgrid/geos-chem/1mo_benchmarks/ Contains the following types of data from the 1-month benchmark simulations that are used to evaluate GEOS-Chem:
  • Restart files
  • Model output (bpch and netCDF formats)
  • Log files
  • Input files
  • Evaluation plots
gcgrid/geos-chem/1yr_benchmarks/ Contains the following types of data from the 1-year benchmarks used to evaluate GEOS-Chem:
  • Restart files
  • Model output (bpch and netCDF formats)
  • Log files
  • Input files
  • Evaluation plots

--Bob Y. 14:22, 16 January 2014 (EST)

Data directories to ignore

You can ignore the contents of these directories. Most of these contain obsolete data that are only preserved for archival purposes.

Directory Description
gcgrid/geos-chem/dao/ Internal use only
gcgrid/geos-chem/beta_releases Contains TARBALL files with source code and run directories for GEOS-Chem beta versions prior to v8-03-01. We now use Git to manage the GEOS-Chem source code & run directories.
gcgrid/geos-chem/data/GEOS_0.25x0.3125_NA/ Emissions etc. files for 1/4° x 5/16° North America nested grid simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_0.25x0.3125_SEAC4RS Emissions etc. data for the 0.25° x 0.3125° simulation on the SE Asia grid. This was created when the SEAC4RS mission was scheduled to occur over Thailand. This is now obsolete.
gcgrid/geos-chem/data/GEOS_0.25x0.3125_SEAC4RS/GEOS_5.7/YYYY/MM Met field data for the 0.25° x 0.3125° simulation on the SE Asia grid. This was created when the SEAC4RS mission was scheduled to occur over Thailand. This is now obsolete.
gcgrid/geos-chem/data/GEOS_0.5x0.666_CH/ Emissions etc. files for 1/2° x 2/3° China nested grid simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_0.5x0.666_NA/ Emissions etc. files for 1/2° x 2/3° North America nested grid simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_1x1_CH/ Emissions etc. data for the China/SE Asia 1° x 1° simulation. This simulation has been superseded by the GEOS-5 0.5° x 0.666° nested simulation.
gcgrid/geos-chem/data/GEOS_1x1_CH/GEOS_3/YYYY/MM/ GEOS-3 met data for the China/SE Asia 1° x 1° nested grid simulation. This simulation has been superseded by the GEOS-5 0.5° x 0.666° nested simulation.
gcgrid/geos-chem/data/GEOS_1x1_NA/ Emissions etc. data for the North American 1° x 1° nested grid simulation. This simulation has been superseded by the GEOS-5 0.5° x 0.666° nested simulation.
gcgrid/geos-chem/data/GEOS_1x1_NA/GEOS_3/YYYY/MM/ GEOS-3 met data for the North American 1° x 1° nested grid simulation. This simulation has been superseded by the GEOS-5 0.5° x 0.666° nested simulation.
gcgrid/geos-chem/data/GEOS_2x2.5/ Emissions etc. data for 2° x 2.5° global simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_2x2.5/GEOS_4_flk/YYYY/MM/ GEOS-4 met data for 2° x 2.5° global simulations (1st-look). These data were only used for the ITCT/2k2 campaign. They have been replaced by the data in GEOS_4_v4.
gcgrid/geos-chem/data/GEOS_4x5/ Emissions etc. data for GEOS-Chem 4° x 5° global simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_4x5/GEOS_4_flk/YYYY/MM/ GEOS-4 met data for 4° x 5° global simulations (1st-look). These data were only used for the ITCT/2k2 campaign. They have been replaced by the data in GEOS_4_v4.
gcgrid/geos-chem/data/GEOS_MEAN/ Contains P(O3), L(O3) and mean OH data for offline simulations. We now read this data from the HEMCO data directories.
gcgrid/geos-chem/data/GEOS_NATIVE/ 1° x 1° and higher resolution emissions etc. data for use with GEOS-Chem simulations. We now read emission data at its native resolution from the HEMCO data directories.
gcgrid/geos-chem/data/aerosol_optics/ Contains versions of the jv_spec_aod.dat files which specify the aerosol optical properties used for the AOD diagnostics for the FAST-J and FAST-JX photolysis mechanisms. We now read this data from gcgrid/geos-chem/data/ExtData/CHEM_INPUTS/.
gcgrid/geos-chem/downloads/ Formerly used to contain the following code packages:
  • Code to read HDF4 and HDF4-EOS data files
  • Code to read HDF5 and HDF5-EOS data files
  • Code to read netCDF data files
  • Code to process GEOS-5 met data

These are now maintained as Git repositories. Contact Bob Yantosca for more information.

gcgrid/geos-chem/NRT/ Contains some data from the GEOS-Chem near-real-time simulations for ICARTT/ITCT 2k2. We maintain this directory for archival purposes. Not all data have been preserved.
gcgrid/geos-chem/NRT-ARCTAS/ Contains output from the GEOS-Chem Near-Real-Time simulations for ARCTAS. We maintain this directory for archival purposes.
gcgrid/geos-chem/mean_OH/ Directory containing 3-D mean OH fields archived from previous GEOS-Chem simulations. We now read this data from the HEMCO data directories.
gcgrid/geos-chem/patches/ Directory that formerly contained bug-fix software patches (if necessary). We now use Git to issue GEOS-Chem software patches.
gcgrid/geos-chem/public_releases/ Contains TARBALL files with source code and run directories for GEOS-Chem public versions prior to v8-03-01. We now use Git to manage the GEOS-Chem source code & run directories.

--Melissa Sulprizio 15:00, 8 April 2015 (EDT)

Dalhousie data directory archive

The GEOS-Chem data and meteorological fields used by Dalhousie University are also available via anonymous FTP from:

ftp rain.ucis.dal.ca

We recommend that you use the wget utility to download these directories instead of anonymous FTP. Wget allows you to download multiple directories at once. The Wget command will take the form:

wget -r -nH --cut-dirs=4 "ftp://rain.ucis.dal.ca/DIRECTORY_NAME"

See the table below for the DIRECTORY_NAME options.

Directory structure

If you are downloading the GEOS-FP met data, then please note the following:

  1. Make sure you only take the files named GEOSFP*. Met field files named GEOS572* are obsolete and have been replaced by the GEOSFP* files.
  2. You will need to download the "CN" (constant) data files for each horizontal grid that you are using. These are timestamped for 2011/01/01 and are found in these data directories of rain.ucis.dal.ca:
    • ctm/GEOS_4x5.d/GEOS_FP/2011/01/GEOSFP.20110101.4x5.nc
    • ctm/GEOS_2x25.d/GEOS_FP/2011/01/GEOSFP.20110101.2x25.nc
    • ctm/GEOS_0.25x0.3125_CH.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.CH.nc
    • ctm/GEOS_0.25x0.3125_EU.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.EU.nc
    • ctm/GEOS_0.25x0.3125_NA.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.NA.nc
    • ctm/GEOS_0.25x0.3125_SE.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.SE.nc

(Bob Yantosca, 05 Feb 2014)

The Dalhousie archive overlaps with many of the above directories from the Harvard site, but it is not as extensive. This site, however, hosts several nested grid data sets that are not archived at Harvard.

0.25° x 0.3125° Data Directories Description
ctm/GEOS_0.25x0.3125_CH.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 1/4° x 5/16° China nested grid simulations
ctm/GEOS_0.25x0.3125_EU.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 1/4° x 5/16° Europe nested grid simulations
ctm/GEOS_0.25x0.3125_NA.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 1/4° x 5/16° North America nested grid simulations
ctm/GEOS_0.25x0.3125_SE.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 1/4° x 5/16° SE Asia nested grid simulations
0.5° x 0.666° Data Directories Description
ctm/GEOS_0.5x0.666_CH/ Emissions etc. files for 1/2° x 2/3° China nested grid simulations
ctm/GEOS_0.5x0.666_CH.d/GEOS_5/YYYY/MM GEOS-5 met field files for 1/2° x 2/3° China nested grid simulations
ctm/GEOS_0.5x0.666_EU/ Emissions etc. files for 1/2° x 2/3° Europe nested grid simulations
ctm/GEOS_0.5x0.666_EU.d/GEOS_5/YYYY/MM GEOS-5 met field files for 1/2° x 2/3° Europe nested grid simulations
ctm/GEOS_0.5x0.666_NA/ Emissions etc. files for 1/2° x 2/3° North America nested grid simulations
ctm/GEOS_0.5x0.666_NA.d/GEOS_5/YYYY/MM GEOS-5 met field files for 1/2° x 2/3° North America nested grid simulations
2° x 2.5° Data Directories Description
ctm/GEOS_2x2.5/ Emissions etc. files for 2° x 2.5° global simulations
ctm/GEOS_2x2.5.d/GEOS_4_v4/YYYY/MM/ GEOS-4 met fields for 2° x 2.5° global simulations
ctm/GEOS_2x2.5.d/GEOS_5/YYYY/MM GEOS-5 met fields for 2° x 2.5° global simulations
ctm/GEOS_2x2.5.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 2° x 2.5° global simulations
ctm/GEOS_2x2.5.d/MERRA/YYYY/MM MERRA met fields for 2° x 2.5° global simulations
4° x 5° Data Directories Description
ctm/GEOS_4x5/ Emissions etc. files for 4° x 5° global simulations
ctm/GEOS_4x5.d/GEOS_4_v4/YYYY/MM GEOS-4 met fields for 4° x 5° global simulations
ctm/GEOS_4x5.d/GEOS_5/YYYY/MM GEOS-5 met fields for 4° x 5° global simulations
ctm/GEOS_4x5.d/GEOS_FP/YYYY/MM GEOS-FP met fields for 4° x 5° global simulations
ctm/GEOS_4x5.d/MERRA/YYYY/MM MERRA met fields for 4° x 5° global simulations

A catalog of available data may be found HERE.

--Lizzie Lundgren (talk) 15:48, 16 June 2015 (UTC)

Using wget to download the shared data directories

Probably the simplest way to download the GEOS-Chem emissions and met field data is to use the Unix wget utility. This allows you to download multiple files and directories at a time.

The wget utility is free and open-source software published by GNU, and it is installed by default on most Linux distributions (e.g. Ubuntu, Fedora, and CentOS). See the user manual for more information.

NOTE: Using wget will not retrieve symbolically linked directories. To download directories that are symbolically linked, use wget with the paths that the symbolically linked directories point to. See the Setting up the ExtData directory wiki page for more information.

Syntax

Most of the time, the syntax you will use to download multiple directories is as follows:

Downloading data from Harvard:

wget -r -nH "ftp://ftp.as.harvard.edu/DIRECTORY_NAME/"

Downloading data from Dalhousie:

wget -r -nH "ftp://rain.ucis.dal.ca/DIRECTORY_NAME/"

The options to wget are as follows:

-r   Specifies recursive directory transfer (i.e. will download all subdirectories)
-nH  Will store all directories and subdirectories in DIRECTORY_NAME, not ftp.as.harvard.edu/DIRECTORY_NAME

If you wish to trim the name of the downloaded directory (i.e., so it downloads as DIRECTORY_NAME, not pub/geos-chem/data/DIRECTORY_NAME), then use the --cut-dirs option:

wget -r -nH --cut-dirs=X "ftp://ftp.as.harvard.edu/DIRECTORY_NAME/"

where X is the number of directories to trim.
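To see what a given value of X does, the effect of -nH --cut-dirs=X on the local path can be mimicked with the standard cut utility. This is a sketch for illustration only; local_path is a hypothetical helper and does not call wget.

```shell
#!/bin/sh
# Show where a remote file would be stored locally under -nH --cut-dirs=N.
# local_path is a hypothetical helper used only for illustration.
local_path() {
    # $1 = remote path (no leading slash), $2 = number of leading directories to cut
    echo "$1" | cut -d/ -f"$(( $2 + 1 ))"-
}

local_path "gcgrid/geos-chem/data/ExtData/HEMCO/file.nc" 4
```

Here the four leading directories (gcgrid, geos-chem, data, ExtData) are dropped, so the file would be stored as HEMCO/file.nc under the directory where wget was started.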

NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.

Examples

1. Download all available GEOS-Chem 2° x 2.5° met field data files in the GEOS_2x2.5.d directory structure:

wget -r -nH "ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/GEOS_2x2.5.d/" &

Due to the huge volume of data involved, this is not recommended, as the file downloads may swamp your system. It is better to download the data in smaller chunks. For example:

2. Download all GEOS-5 met data at 2° x 2.5° resolution:

wget -r -nH "ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/GEOS_2x2.5.d/GEOS_5/" &

It may be even better to download one year at a time:

3. Download all GEOS-5 met data at 2° x 2.5° resolution for 2008:

wget -r -nH "ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/GEOS_2x2.5.d/GEOS_5/2008/" &
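The one-year-at-a-time approach can be scripted with a simple loop. The sketch below prints one wget command per year rather than running it, so each transfer can be reviewed and started separately. The helper name yearly_cmds and the year range are assumptions; check the met data catalog for the years that are actually available.

```shell
#!/bin/sh
# Print one wget command per year so each download can be reviewed and run separately.
# The year range below is only an example; check the met data catalog for availability.
BASE="ftp://ftp.as.harvard.edu/gcgrid/geos-chem/data/GEOS_2x2.5.d/GEOS_5"

yearly_cmds() {
    # $1 = first year, $2 = last year (inclusive)
    year=$1
    while [ "$year" -le "$2" ]; do
        printf 'wget -r -nH "%s/%d/"\n' "$BASE" "$year"
        year=$(( year + 1 ))
    done
}

yearly_cmds 2007 2008
```

To run the downloads one after another instead of just printing them, the output can be piped to the shell, e.g. yearly_cmds 2007 2008 | sh.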

--Melissa Sulprizio 16:57, 8 April 2015 (EDT)

Obtain only updated files

Prasad Kasibhatla wrote:

Maybe this is common knowledge, but I just discovered that using the -N option in wget ensures that only files with newer timestamps than what resides on my local machines are downloaded - found this very useful to update my shared data directories.
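As a sketch of the tip above, -N (long form --timestamping) can be added to the usual recursive command. The helper below only prints the command; the name update_cmd is hypothetical.

```shell
#!/bin/sh
# Build (but do not run) a wget command that refreshes an existing local copy.
# -N (--timestamping) skips files whose local copy is not older than the remote one.
update_cmd() {
    # $1 = directory path on the FTP server
    printf 'wget -r -nH -N "ftp://ftp.as.harvard.edu/%s"\n' "$1"
}

update_cmd "gcgrid/geos-chem/data/ExtData/HEMCO/"
```

Run the resulting command from the same directory (and with the same --cut-dirs value, if any) as the original download, so that wget finds the existing local files to compare against.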

--Melissa Payer 10:39, 1 June 2012 (EDT)

Downloading the shared data directories via anonymous FTP

You can also download the shared data directories via anonymous FTP:

ftp ftp.as.harvard.edu
cd gcgrid/geos-chem/data/ExtData/GCAP_4x5
cd gcgrid/geos-chem/data/ExtData/GEOS_0.25x0.3125_CH
cd gcgrid/geos-chem/data/ExtData/GEOS_0.25x0.3125_NA
cd gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_CH
cd gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_NA
cd gcgrid/geos-chem/data/ExtData/GEOS_2x2.5
cd gcgrid/geos-chem/data/ExtData/GEOS_4x5
cd gcgrid/geos-chem/data/ExtData/GEOS_MEAN
cd gcgrid/geos-chem/data/ExtData/GEOS_NATIVE
cd gcgrid/geos-chem/data/ExtData/HEMCO
cd gcgrid/geos-chem/data/ExtData/CHEM_INPUTS
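An anonymous FTP session can also be driven from a script instead of being typed interactively. The following is a minimal sketch under stated assumptions: ftp_script is a hypothetical helper, the e-mail address is a placeholder for the anonymous password, and mget fetches only the plain files in one directory (it does not descend into subdirectories).

```shell
#!/bin/sh
# Print a command script for a non-interactive anonymous FTP session.
# ftp_script is a hypothetical helper; the e-mail address is a placeholder password.
ftp_script() {
    # $1 = remote directory to fetch plain files from (mget does not recurse)
    cat <<EOF
user anonymous user@example.com
binary
cd $1
mget *
bye
EOF
}

ftp_script "gcgrid/geos-chem/data/ExtData/CHEM_INPUTS"
# To run it: ftp_script "<dir>" | ftp -i -n ftp.as.harvard.edu
```

The -n flag stops the ftp client from attempting an automatic login, and -i turns off the per-file prompt during mget. For whole directory trees, wget -r remains the simpler choice.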

--Melissa Sulprizio 16:52, 8 April 2015 (EDT)

Downloading the HEMCO data directories

The GEOS-Chem Support Team has created a package called hemco_data_download. With this package, you can download the various emissions inventories and related data files for HEMCO to your own disk server. For complete instructions, please see our HEMCO data directories wiki page.

--Melissa Sulprizio 16:52, 8 April 2015 (EDT)

For more information

For GCAP users

For those users who wish to run GEOS-Chem with the GISS/GCAP met fields, please contact Loretta Mickley.

--Bob Y. 09:52, 8 March 2010 (EST)

Question about directory structure

Shanna Shaked wrote:

We are working again on trying to run GEOS-Chem. However, we are encountering some errors that may be due to the directory structure. We find a discrepancy between the directory structure described in the GEOS-Chem manual and that available on the ftp site.
The GEOS-Chem manual describes a directory structure of:
   data/GEOS_4x5/GEOS_5/YYYY/MM
However, on the ftp site, we find a directory structure with an extra '.d':
   data/GEOS_4x5.d/GEOS_5/YYYY/MM
(the GEOS_5 folder is in GEOS_4x5.d rather than GEOS_4x5). There does exist a GEOS_4x5 that contains many of the emissions data, but does not contain GEOS_5.
If we leave the structure as is, and enter ../data/GEOS_4x5/ as our root data directory in input.geos, we get a file not found error when it looks for GEOS_5 within this directory (obviously).
If we instead enter ../data/GEOS_4x5.d as our root data directory, we get a file not found error when the program looks for emissions within this directory (lightning NOx emissions, in this case).
QUESTION: To solve this problem, we have moved the GEOS_5 folder into the GEOS_4x5 directory. [Is this] okay?

Bob Yantosca replied:

The only difference on our system between e.g. GEOS_4x5 and GEOS_4x5.d is that our sysadmin (Jack Yatteau) set up the ".d" directories separately so that they only contain met data (which is much larger than the emissions etc. data). That way he could separate the disks that just had met data from the disk that have the emissions data to facilitate our configuration here. There are symbolic links from GEOS_4x5 to GEOS_4x5.d etc. (i.e. the directory GEOS_4x5/GEOS_5 is actually a symbolic link to the corresponding directory in GEOS_4x5.d/GEOS_5/ and etc. for the other met field resolutions & directories).
You don't necessarily have to do this on your end, but this is what we did here. You can just make the GEOS_4x5/GEOS_5 etc. real subdirectories and not symbolic links and store the data there. The solution you picked above is OK.
Also to facilitate FTP file transfer, you could do the following:
  • Write a script or an FTP macro
  • Use a 3rd-party GUI program like FireFTP in Mozilla Firefox.
  • Or even better yet, use the Unix wget utility (see below)
Each user is responsible for their own file transfers.
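The link layout described in the reply above can be reproduced in a scratch directory to see how it behaves. This is a sketch for illustration only; it uses a temporary directory and does not touch any real data.

```shell
#!/bin/sh
# Recreate the GEOS_4x5 -> GEOS_4x5.d link layout in a scratch directory.
# mktemp -d keeps the demonstration away from any real data directory.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/GEOS_4x5" "$demo_root/GEOS_4x5.d/GEOS_5"

# GEOS_4x5/GEOS_5 becomes a symbolic link to the met-only tree.
ln -s "../GEOS_4x5.d/GEOS_5" "$demo_root/GEOS_4x5/GEOS_5"

ls -l "$demo_root/GEOS_4x5"
```

With this layout, a root data directory of GEOS_4x5 finds both the emissions files stored there and the met fields reached through the GEOS_5 link. Making GEOS_4x5/GEOS_5 a real subdirectory, as in the question above, works equally well.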

--Bob Y. 11:04, 5 February 2009 (EST)