Difference between revisions of "Downloading data from WashU"


Revision as of 17:18, 12 December 2019

Compute Canada directory structure

The GEOS-Chem shared data directories may be downloaded from the Compute Canada server:

http://geoschemdata.computecanada.ca

which has the following directory structure:

Directory Description
ExtData/ Root data directory containing all meteorology fields, emissions data, and chemistry input data.
ExtData/CHEM_INPUTS/ Contains non-emissions data for GEOS-Chem chemistry modules
ExtData/HEMCO/ Contains emissions data for the HEMCO emissions component
ExtData/GEOSCHEM_RESTARTS/ Contains sample restart files used to initialize GEOS-Chem simulations.
0.25° x 0.3125° Data Directories Description
ExtData/GEOS_0.25x0.3125/GEOS_FP/YYYY/MM/ 0.25° x 0.3125° GEOS-FP global met fields
ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/YYYY/MM/ 0.25° x 0.3125° GEOS-FP met fields cropped to the Asia domain
ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/YYYY/MM/ 0.25° x 0.3125° GEOS-FP met fields cropped to the China domain
ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/YYYY/MM/ 0.25° x 0.3125° GEOS-FP met fields cropped to the Europe domain
ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/YYYY/MM/ 0.25° x 0.3125° GEOS-FP met fields cropped to the North America domain
0.5° x 0.625° Data Directories Description
ExtData/GEOS_0.5x0.625/MERRA2/YYYY/MM/ 0.5° x 0.625° MERRA-2 global met fields
ExtData/GEOS_0.5x0.625_AS/MERRA2/YYYY/MM/ 0.5° x 0.625° MERRA-2 met fields cropped to the Asia domain
ExtData/GEOS_0.5x0.625_CH/MERRA2/YYYY/MM/ 0.5° x 0.625° MERRA-2 met fields cropped to the China domain
ExtData/GEOS_0.5x0.625_EU/MERRA2/YYYY/MM/ 0.5° x 0.625° MERRA-2 met fields cropped to the Europe domain
ExtData/GEOS_0.5x0.625_NA/MERRA2/YYYY/MM/ 0.5° x 0.625° MERRA-2 met fields cropped to the North America domain
2° x 2.5° Data Directories Description
ExtData/GEOS_2x2.5/GEOS_FP/YYYY/MM/ 2° x 2.5° GEOS-FP global met fields
ExtData/GEOS_2x2.5/MERRA2/YYYY/MM/ 2° x 2.5° MERRA-2 global met fields
4° x 5° Data Directories Description
ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/ 4° x 5° GEOS-FP global met fields
ExtData/GEOS_4x5/MERRA2/YYYY/MM/ 4° x 5° MERRA-2 global met fields
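
The directory layout above maps directly onto download URLs. As a minimal sketch, the hypothetical helper below (met_url is not part of GEOS-Chem or wget) assembles the URL for one month of met fields from the grid directory, the met product, the year, and the month. The month shown is an arbitrary example, not a statement about which months are available on the server.

```shell
# Hypothetical helper: build the URL of one month of met fields.
# $1 = grid directory (e.g. GEOS_4x5)
# $2 = met product directory (GEOS_FP or MERRA2)
# $3 = four-digit year, $4 = two-digit month
met_url() {
    printf 'http://geoschemdata.computecanada.ca/ExtData/%s/%s/%s/%s/\n' \
        "$1" "$2" "$3" "$4"
}

# Print the URL for 4x5 MERRA-2 fields for an arbitrary example month.
met_url GEOS_4x5 MERRA2 2016 07
```

The printed URL can be passed, in quotes, to the wget commands described under "Data download commands".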

--Bob Yantosca (talk) 19:21, 11 December 2019 (UTC)

GEOS-FP and MERRA-2 constant data files

If you are downloading the GEOS-FP or MERRA-2 met data, then please note that you must also download the "CN" (constant) data files for each horizontal grid that you are using.

For GEOS-FP these are timestamped for 2011/01/01 and are found in these data directories:

  • ExtData/GEOS_0.25x0.3125/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.nc
  • ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.AS.nc
  • ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.CH.nc
  • ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.EU.nc
  • ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.NA.nc
  • ExtData/GEOS_2x2.5/GEOS_FP/2011/01/GEOSFP.20110101.CN.2x25.nc
  • ExtData/GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc

For MERRA-2 these are timestamped for 2015/01/01 and are found in these data directories:

  • ExtData/GEOS_0.5x0.625/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.nc4
  • ExtData/GEOS_0.5x0.625_AS/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.AS.nc4
  • ExtData/GEOS_0.5x0.625_CH/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.CH.nc4
  • ExtData/GEOS_0.5x0.625_EU/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.EU.nc4
  • ExtData/GEOS_0.5x0.625_NA/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.NA.nc4
  • ExtData/GEOS_2x2.5/MERRA2/2015/01/MERRA2.20150101.CN.2x25.nc4
  • ExtData/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4
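
Because the constant files sit in a different year and month from most of the met data you will download, it can help to fetch them in a separate step. The sketch below only prints the URLs of the two 4° x 5° constant files listed above. The function name cn_urls_4x5 and the BASE variable are illustrative assumptions, not part of any GEOS-Chem tool.

```shell
# Base URL of the ExtData tree on the Compute Canada server.
BASE="http://geoschemdata.computecanada.ca/ExtData"

# Print the URLs of the 4x5 constant (CN) files listed above, one per line.
cn_urls_4x5() {
    echo "$BASE/GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc"
    echo "$BASE/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4"
}

cn_urls_4x5
```

One possible way to use the output is to pipe it into wget, which reads URLs from standard input when given -i -, for example: cn_urls_4x5 | wget -nH -x -P /your/data/root -i -. Here -x (force creation of the remote directory hierarchy locally) is a wget option not covered elsewhere on this page.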

Additional notes:

  • Prior to downloading GEOS-FP data, please be aware of caveats regarding use of GEOS-FP. See the GEOS-FP wiki page for more information.

--Bob Yantosca (talk) 19:20, 11 December 2019 (UTC)

Data download commands

We recommend that you use the free and open-source wget utility to download data from Compute Canada. Most modern Unix systems have wget already installed.

Basic syntax

The basic formula to download data from Compute Canada to your local server is:

wget OPTIONS "http://geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME"

Commonly used options with wget are:

wget option Description
-np Will not allow ascent to the parent directory
-nH Omits the remote root directory name from the local directory name.
  • i.e. Downloads geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME to local folder ExtData/DIRECTORY_NAME.
-N Downloads only those files having newer timestamps than any local copies.
-P path Copies data to the specified directory
  • e.g. Specifying -P /home/data will copy geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME
    to /home/data/ExtData/DIRECTORY_NAME, etc.
-r Specifies recursive directory transfer (i.e. will download all subdirectories).
-R "*.html" Skips downloading files whose names end in .html (e.g. the index.html directory listings).

Examples

For example, this command will download an entire directory (and its subdirectories) from Compute Canada to your current folder:

wget -r -np -nH -R "*.html" "http://geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME"

Or if you wish to download the folder to a different directory:

wget -r -np -nH -R "*.html" -P /your/data/root "http://geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME"

If you wish to trim leading directories from the downloaded path (e.g., so that the data downloads as DIRECTORY_NAME rather than ExtData/DIRECTORY_NAME), then use the --cut-dirs option:

wget -r -np -nH -R "*.html" --cut-dirs=X "http://geoschemdata.computecanada.ca/ExtData/DIRECTORY_NAME/"

where X is the number of leading remote directories to trim. In this example, X=1 removes ExtData/.
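
To see what a given --cut-dirs value does before starting a long transfer, the trimming can be imitated on a path string. The snippet below is an illustration only: local_path is a hypothetical helper that mimics how -nH combined with --cut-dirs=N shortens the local path, and file.nc is a made-up file name. It does not call wget.

```shell
# Illustration only: mimic how -nH plus --cut-dirs=N shortens the local path.
# $1 = remote path after the host name
# $2 = number of leading directories to drop
local_path() {
    echo "$1" | cut -d/ -f"$(( $2 + 1 ))"-
}

local_path "ExtData/HEMCO/file.nc" 0   # prints ExtData/HEMCO/file.nc
local_path "ExtData/HEMCO/file.nc" 1   # prints HEMCO/file.nc
```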


Prasad Kasibhatla wrote:

   Maybe this is common knowledge, but I just discovered that using the -N option in wget ensures that only files with newer timestamps than what resides on my local machines are downloaded - found this very useful to update my shared data directories.
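
The -N tip above combines naturally with the recursive options from the earlier examples. As a sketch, the hypothetical wrapper below (update_dir and the DRY_RUN variable are assumptions made for illustration) refreshes a local copy of one ExtData directory. With DRY_RUN=1 it only prints the wget command, so you can check it before transferring any data.

```shell
# Hypothetical wrapper: refresh a local copy of one ExtData directory.
# $1 = directory name under ExtData (e.g. HEMCO)
# $2 = local data root
# With DRY_RUN=1 the wget command is printed instead of executed.
update_dir() {
    url="http://geoschemdata.computecanada.ca/ExtData/$1/"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "wget -r -np -nH -N -R \"*.html\" -P $2 \"$url\""
    else
        # -N skips files whose local copy is already up to date.
        wget -r -np -nH -N -R "*.html" -P "$2" "$url"
    fi
}

# Show the command that would be run, without downloading anything.
DRY_RUN=1 update_dir HEMCO /your/data/root
```

Running the same call without DRY_RUN=1 performs the actual update.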

--Bob Yantosca (talk) 19:15, 11 December 2019 (UTC)