Downloading data from WashU

Latest revision as of 16:35, 11 October 2021
Transition to http://geoschemdata.wustl.edu
The Washington University in St. Louis (WashU) server at http://geoschemdata.wustl.edu will soon replace the Compute Canada server at http://geoschemdata.computecanada.ca, which will be retired in late 2021 or 2022. You can access the WashU server the same way you accessed the Compute Canada server; the only change you need to make is to use the new URL. The new server has higher throughput and is easier for the GCST to maintain. Please report any issues at https://github.com/geoschem/geos-chem/issues/new/choose.
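If you have existing download scripts that reference the Compute Canada server, the switch can be as simple as swapping the host name. A minimal sketch (the wget command shown is a placeholder, not a real script; it only prints the rewritten command):

```shell
# Swap the old Compute Canada host for the new WashU host in a command string.
OLD="http://geoschemdata.computecanada.ca"
NEW="http://geoschemdata.wustl.edu"
echo "wget -r ${OLD}/ExtData/HEMCO" | sed "s|${OLD}|${NEW}|"
```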
Previous | Next | Getting Started with GEOS-Chem
- Minimum system requirements
- Installing required software
- Configuring your computational environment
- Downloading source code
- Downloading data directories
  - ...with the GEOS-Chem dry-run option
  - ... from WashU (this page)
  - ... from Amazon Web Services cloud storage
- Creating run directories
- Configuring runs
- Compiling
- Running
- Output files
- Python tools for use with GEOS-Chem
- Coding and debugging
- Further reading
On this page, we provide information about how to manually download GEOS-Chem input data (met fields, emissions, etc.) from the WashU storage archive. However, we recommend downloading data with the GEOS-Chem dry-run option (which will be available in GEOS-Chem 12.7.0), as this greatly simplifies the data download process.
NOTE: If you have already used the GEOS-Chem dry-run option to download data, you can skip ahead to Creating Run Directories.
Directory structure
The GEOS-Chem shared data directories may be downloaded from the WashU server:
http://geoschemdata.wustl.edu
which has the following directory structure:
Directory | Description |
---|---|
ExtData/ | Root data directory containing all meteorology fields, emissions data, and chemistry input data. |
ExtData/CHEM_INPUTS/ | Contains non-emissions data for GEOS-Chem chemistry modules |
ExtData/HEMCO/ | Contains emissions data for the HEMCO emissions component |
ExtData/GEOSCHEM_RESTARTS/ | Contains sample restart files used to initialize GEOS-Chem simulations. |
0.25° x 0.3125° Data Directories | Description |
---|---|
ExtData/GEOS_0.25x0.3125/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP global met fields |
ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the Asia domain |
ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the China domain |
ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the Europe domain |
ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/YYYY/MM/ | 0.25° x 0.3125° GEOS-FP met fields cropped to the North America domain |
0.5° x 0.625° Data Directories | Description |
---|---|
ExtData/GEOS_0.5x0.625/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 global met fields |
ExtData/GEOS_0.5x0.625_AS/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the Asia domain |
ExtData/GEOS_0.5x0.625_CH/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the China domain |
ExtData/GEOS_0.5x0.625_EU/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the Europe domain |
ExtData/GEOS_0.5x0.625_NA/MERRA2/YYYY/MM/ | 0.5° x 0.625° MERRA-2 met fields cropped to the North America domain |
2° x 2.5° Data Directories | Description |
---|---|
ExtData/GEOS_2x2.5/GEOS_FP/YYYY/MM | 2° x 2.5° GEOS-FP global met fields |
ExtData/GEOS_2x2.5/MERRA2/YYYY/MM | 2° x 2.5° MERRA-2 global met fields |
4° x 5° Data Directories | Description |
---|---|
ExtData/GEOS_4x5/GEOS_FP/YYYY/MM/ | 4° x 5° GEOS-FP global met fields |
ExtData/GEOS_4x5/MERRA2/YYYY/MM/ | 4° x 5° MERRA-2 global met fields |
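The YYYY/MM placeholders in the tables above expand to a four-digit year and a two-digit month. For instance, a sketch assuming you want the MERRA-2 4° x 5° met fields for July 2019:

```shell
# Expand the YYYY/MM placeholders for a hypothetical month (July 2019).
YYYY=2019
MM=07
echo "ExtData/GEOS_4x5/MERRA2/${YYYY}/${MM}/"
```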
--Bob Yantosca (talk) 19:21, 11 December 2019 (UTC)
GEOS-FP and MERRA-2 constant data files
If you are downloading the GEOS-FP or MERRA-2 met data, then please note that you must also download the "CN" (constant) data files for each horizontal grid that you are using.
For GEOS-FP these are timestamped for 2011/01/01 and are found in these data directories:
- ExtData/GEOS_0.25x0.3125/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.nc
- ExtData/GEOS_0.25x0.3125_AS/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.AS.nc
- ExtData/GEOS_0.25x0.3125_CH/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.CH.nc
- ExtData/GEOS_0.25x0.3125_EU/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.EU.nc
- ExtData/GEOS_0.25x0.3125_NA/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.NA.nc
- ExtData/GEOS_2x2.5/GEOS_FP/2011/01/GEOSFP.20110101.CN.2x25.nc
- ExtData/GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.CN.4x5.nc
For MERRA-2 these are timestamped for 2015/01/01 and are found in these data directories:
- ExtData/GEOS_0.5x0.625/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.nc4
- ExtData/GEOS_0.5x0.625_AS/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.AS.nc4
- ExtData/GEOS_0.5x0.625_CH/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.CH.nc4
- ExtData/GEOS_0.5x0.625_EU/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.EU.nc4
- ExtData/GEOS_0.5x0.625_NA/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.NA.nc4
- ExtData/GEOS_2x2.5/MERRA2/2015/01/MERRA2.20150101.CN.2x25.nc4
- ExtData/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4
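Combined with the wget instructions below, fetching one of these constant files might look like the sketch here. It only prints the command so you can inspect it before running; the 4° x 5° MERRA-2 path is taken from the list above:

```shell
# Build (but do not run) a wget command for one MERRA-2 constant (CN) file.
ROOT="http://geoschemdata.wustl.edu"
CN_PATH="ExtData/GEOS_4x5/MERRA2/2015/01/MERRA2.20150101.CN.4x5.nc4"
# -N re-downloads only if the remote file is newer than any local copy.
echo "wget -N ${ROOT}/${CN_PATH}"
```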
Additional notes:
- Prior to downloading GEOS-FP data, please be aware of caveats regarding use of GEOS-FP. See the GEOS-FP wiki page for more information.
--Bob Yantosca (talk) 19:20, 11 December 2019 (UTC)
Data download commands
We recommend that you use the free and open-source wget utility to download data from WashU. Most modern Unix systems have wget already installed.
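You can confirm that wget is available on your system (and see which version is installed) before proceeding:

```shell
# Check whether wget is on the PATH; print its version line if so.
if command -v wget >/dev/null 2>&1; then
  wget --version | head -n 1
else
  echo "wget not found; install it via your package manager"
fi
```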
Basic syntax
The basic formula to download data from WashU to your local server is:
wget OPTIONS "http://geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME"
NOTE: The URL must be enclosed in quotes for file transfer to occur. If you omit the quotes then wget will just return a directory listing in a file named index.html without any files being downloaded.
Commonly used options with wget are:
wget option | Description |
---|---|
-np | Will not allow ascent to the parent directory |
-nH | Omits the remote host name from the local directory name (i.e. downloads geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME to local folder ExtData/DIRECTORY_NAME) |
-N | Downloads only those files having newer timestamps than any local copies. |
-P path | Copies data to the specified directory (e.g. -P /home/data copies geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME to /home/data/ExtData/DIRECTORY_NAME) |
-r | Specifies recursive directory transfer (i.e. will download all subdirectories). |
-R "*.html" | Skips downloading files ending in *.html. |
Examples
1. Download remote directory geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME to ./ExtData/DIRECTORY_NAME.
wget -r -np -nH -N -R "*.html" "http://geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME"
2. Similar to Example 1 above, but will download remote directory geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME to ./DIRECTORY_NAME. We have used --cut-dirs=1 to trim one level off the downloaded directory.
wget -r -np -nH -N -R "*.html" --cut-dirs=1 "http://geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME/"
3. Similar to Example 1 above, but will download the directory geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME to /pub/gcgrid/data/ExtData/DIRECTORY_NAME:
wget -r -np -nH -N -R "*.html" -P /pub/gcgrid/data "http://geoschemdata.wustl.edu/ExtData/DIRECTORY_NAME"
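The three examples differ only in where the downloaded files land locally. That bookkeeping can be sketched with shell parameter expansion (file.nc is a hypothetical file name standing in for any file under DIRECTORY_NAME):

```shell
# Where a hypothetical file lands locally under each example's options.
REMOTE="ExtData/DIRECTORY_NAME/file.nc"    # remote path after the host name
echo "./${REMOTE}"                  # Example 1: -nH keeps the full remote path
echo "./${REMOTE#*/}"               # Example 2: --cut-dirs=1 also trims one level
echo "/pub/gcgrid/data/${REMOTE}"   # Example 3: -P prepends the target directory
```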
File transfers with Globus
To simplify downloading data from WashU, you may also use Globus. Globus requires a subscription, but many organizations already have access. Check with your local IT staff to determine whether your institution supports Globus. For more information, see Data transfer with Globus.
Follow these instructions to log into the Globus web interface and begin your file transfer. To access the endpoint for GEOS-Chem data on WashU, type GEOS-Chem Data (WashU) in the Collection field on the File Manager page.
Further reading
- wget @ GNU.org
- Linux wget command help and examples (Computer Hope)
- What is the wget command and how to use it? (Hostinger tutorials)