Difference between revisions of "GEOS-Chem output files"
Revision as of 06:52, 20 January 2017
This page contains information about files produced by GEOS-Chem simulations, including diagnostic data and restart files used for initial conditions. For information on the input files that ship with the GEOS-Chem run directories, please see our GEOS-Chem Input Files wiki page.
File Formats
Binary Punch File Format
In GEOS-Chem v10-01 and earlier versions, the GEOS-Chem diagnostic output and restart files are in "binary punch" format. These binary punch files may be viewed and manipulated with the IDL-based data visualization package GAMAP which is maintained and supported by the GEOS-Chem Support Team. Several users have also developed Python software packages for reading, visualizing, and processing GEOS-Chem binary punch files.
The binary punch file format will be phased out beginning with GEOS-Chem v11-01. Starting in v11-01f, restart files will be read and written in netCDF format by default for all simulations except mercury, as described in the next section. Temporarily, users may choose to read a binary punch restart file by specifying BPCH_RST_IN=y at compile time. Similarly, users may opt to write restart files in binary punch format by specifying the compile option BPCH_RST_OUT=y. Diagnostic output will remain in binary punch format in v11-01f; BPCH_DIAG=y is set automatically at compile time.
NetCDF File Format
The HEMCO emissions component contains utility functions and linked-list objects to save diagnostic output (time-averaged or instantaneous) as well as end-of-run restart information to netCDF files. In v10-01, this feature is used only for emissions-related data. The GEOS-Chem Support Team is currently leveraging HEMCO’s I/O capabilities to replace all binary punch input/output with netCDF within GEOS-Chem v11-01. This work has started in v11-01f and will continue into future internal versions.
Migrating to netCDF format will bring the following benefits:
- Using HEMCO’s diagnostic archival functions will ensure that all quantities will be archived consistently throughout GEOS-Chem.
- Use of netCDF output will facilitate further GEOS-Chem High Performance development since binary files cannot be written efficiently in high-performance computing (HPC) environments.
- GEOS-Chem users will have more options for visualizing GEOS-Chem output since many free and open-source software packages (e.g. Python tools) are compatible with netCDF format.
This work will be done in four phases:
Phase | Description | Version | Status |
---|---|---|---|
Phase 1 | | v11-01f | 1-month benchmark: Approved 29 Mar 2016; 1-year benchmark: |
Phase 2 | | v11-01g | 1-month benchmark: Approved 14 Sep 2016; 1-year benchmark: |
Phase 3a | | TBD | In Development |
Phase 3b | | TBD | In the pipeline |
Phase 3c | | TBD | In the pipeline |
Phase 3d | | TBD | In the pipeline |
Phase 4 | NOTE: This will not be done until all benchmarking utility tools are updated to work with GEOS-Chem netCDF diagnostic output. | TBD | In the pipeline |
In order to facilitate testing, there will be a transition period in which both "binary punch" and netCDF diagnostic output formats will be available. For the time being, the default option will be to write out diagnostics to binary output (i.e. BPCH_DIAG=yes) and read/write restart files in netCDF. To select netCDF diagnostic output, you can compile GEOS-Chem with the NC_DIAG=yes option.
Viewing and manipulating netCDF files
There are many free and open-source software packages readily available for visualizing and manipulating netCDF files. These tools will reduce the need for the GEOS-Chem user community to rely on IDL (and GAMAP), which can be prohibitively expensive for some user groups. Some recommended tools are listed below.
- 1. ncdump: This command-line tool generates a text representation of netCDF data and can be used to quickly view the variables contained in a netCDF file. For example:
# Display header information (dimensions, variables, attributes) only
ncdump -h GEOSChem_restart.201308010000 | less
# Display header information followed by data values for variable(s) specified
ncdump -v SPC_NO GEOSChem_restart.201308010000 | less
- 2. ncview: Visualization package for netCDF files, recommended for a quick and easy look at netCDF files.
- 3. Panoply: Data viewer for netCDF files. This package offers an alternative to ncview. From our experience, Panoply works nicely when installed on the desktop, but is slow to respond in the Linux environment.
- 5. Other programming languages may be used for viewing or manipulating netCDF files, including but not limited to:
Some of the tools listed above, such as ncdump and ncview, may come pre-installed on your system. Others like NCO, CDO, and Panoply may need to be installed or loaded (e.g. via the module load command). Check with your system administrator or IT staff to see what is available on your system.
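Before asking your system administrator, you can check from the command line which tools are already on your PATH. The following is a minimal sketch, not an official GEOS-Chem utility: check_tools is a hypothetical helper name, and the list of executables (ncks and ncatted for NCO, cdo for CDO) is an assumption about how these packages are installed on your system.

```shell
#!/bin/sh
# Report which of the given command-line tools are on the current PATH.
# check_tools is a hypothetical helper, not part of any netCDF package.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
        fi
    done
}

# ncks and ncatted belong to NCO; cdo is the CDO executable.
check_tools ncdump ncview ncks ncatted cdo
```

A tool reported as "not found" may still be available after loading the corresponding module (e.g. via the module load command).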
--Melissa Sulprizio (talk) 15:58, 17 January 2017 (UTC)
Regridding netCDF restart files
The same netCDF tools listed above can also be used to regrid netCDF restart files produced by GEOS-Chem.
- 1. CDO: The Climate Data Operators include tools for regridding netCDF files. For example, the following command will apply distance-weighted regridding:
cdo remapdis,gridfile infile.nc outfile.nc
For gridfile, you can use the files in ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO/grids/. See the Interpolation section in the CDO Guide for more information.
- 2. NCO: The netCDF Operators also include tools for regridding. See the Regridding section of the NCO User Guide for more information.
- 3. Jenny Fisher provided an IDL script regridh_restart_nc.pro which is now included with GAMAP. Jenny wrote:
The one caveat that I cannot figure out is something weird with the units. I have done this in such a way that it reads in whatever attributes (including units) are in the original file and writes the same units back to the regridded file. But when GEOS-Chem reads the new file it gives an “incompatible units” error — although printing them in the code shows something identical to what the case statement is searching for, as far as I can tell. If I use ncatted to overwrite the units (with identical name) they are fine. So it must be something to do with the way IDL encodes the units string, but I have no idea what. The relevant nco command is ncatted -O -a units,,m,c,"mol mol-1" INFILENAME.nc
- 4. Regridding routines written in NCL are also available in the NCL4GC package available on Bitbucket.
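Note that the ncatted command quoted above leaves the variable name blank. According to the NCO documentation, this applies the edit to every variable that already has a units attribute, which would include coordinate variables such as latitude and longitude. The sketch below is one possible variation that resets the units only on variables you name explicitly. It is not part of GAMAP or NCO: fix_units and the DRY_RUN switch are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Reset the units attribute to "mol mol-1" on named variables only.
# Usage: fix_units FILE VAR [VAR ...]
# fix_units and DRY_RUN are hypothetical names used for this sketch.
fix_units() {
    file="$1"
    shift
    for var in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of modifying the file
            echo "ncatted -O -a units,$var,m,c,\"mol mol-1\" $file"
        else
            # Mode "m" modifies an existing attribute; type "c" is character
            ncatted -O -a "units,$var,m,c,mol mol-1" "$file" || return 1
        fi
    done
}
```

For example, DRY_RUN=1 fix_units regridded_restart.nc SPC_NO prints the command that would be run for the SPC_NO variable (regridded_restart.nc is a placeholder file name). Afterwards, ncdump -h on the file shows whether the units attributes look as expected.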
--Melissa Sulprizio (talk) 22:06, 19 January 2017 (UTC)
GEOS-Chem v11-02
Planned updates for this version include:
- wetscav_mod.F module-level variables H2O2s and SO2s will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both H2O2s and SO2s were initialized to the H2O2 and SO2 tracer concentrations at the start of every simulation. This change will impact multi-segmented runs only.
- get_ndep_mod.F variables DRY_TOTN and WET_TOTN, which represent total deposited nitrogen, will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both variables were initialized to zero at the start of every simulation. Storing them in the restart file may improve accuracy of soil NOx emissions over multi-segmented runs. This change will impact multi-segmented runs only.
GEOS-Chem v11-01
Impact of Output File Format Change
While migrating from binary punch to netCDF file formats is a structural change, several changes are also being introduced which will result in output differences. These changes include the following:
- Starting in v11-01f, species will always be written to the GEOS-Chem netCDF restart file and will always be initialized to restart file values if present in the file. In contrast, when using binary punch format restart files, species are initialized to background values stored in globchem.dat by default unless the option to read/write the species restart file is turned on in input.geos. This change will only impact multi-segmented full chemistry runs where users previously did not enable the species restart file.
- Starting in v11-01f, species units will be [mol/mol] in the netCDF restart file, whereas they are [molec/cm3/box] in the binary punch restart file. This change will facilitate the future replacement of tracers with species. However, because the unit conversion from [molec/cm3] to [mol/mol] depends on the meteorological state, there will be small differences in [molec/cm3] species values between binary punch and netCDF restart files in multi-segmented runs. This is because the meteorology used when the restart file is written is from the timestep before the simulation end time, while the meteorology used when the restart file is read is from the simulation start time, which is the same as the previous end time. This change will only impact multi-segmented full chemistry runs where users previously enabled the species restart file.
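To illustrate why the unit conversion depends on the meteorological state, the sketch below converts a mixing ratio to a number density using the ideal-gas number density of air, n = P / (k_B T). This is a simplified illustration only and is not the conversion routine used inside GEOS-Chem; vmr_to_molec_cm3 is a hypothetical helper name, and the pressure and temperature values are arbitrary examples.

```shell
#!/bin/sh
# Convert a mixing ratio [mol/mol] to a number density [molec/cm3]
# using the ideal-gas number density of air, n = P / (k_B * T).
# Usage: vmr_to_molec_cm3 VMR PRESSURE_PA TEMPERATURE_K
# (hypothetical helper; not GEOS-Chem's own conversion code)
vmr_to_molec_cm3() {
    awk -v vmr="$1" -v p="$2" -v t="$3" 'BEGIN {
        kb = 1.380649e-23             # Boltzmann constant [J/K]
        n_air = p / (kb * t) * 1e-6   # air number density [molec/cm3]
        printf "%.3e\n", vmr * n_air
    }'
}

# The same 1 ppb mixing ratio under two slightly different temperatures
vmr_to_molec_cm3 1e-9 101325 288.15
vmr_to_molec_cm3 1e-9 101325 289.15
```

The two calls return different number densities for an identical mixing ratio, which is the kind of small discrepancy to expect when the meteorology at restart write and restart read differ.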
Output files in v11-01
Output files in GEOS-Chem v11-01 are listed below.
File | Data Format | Description |
---|---|---|
trac_avg.{met}_{grid}_{sim}.YYYYMMDDhh | BPCH | GEOS-Chem output diagnostic data |
GEOSChem_restart.YYYYMMDDhhmm | NetCDF | GEOS-Chem output restart file containing advected tracer concentrations in units of [mol/mol]. If running a full chemistry simulation, then the restart file will also contain species concentrations, also in units of [mol/mol]. If running a mercury simulation, then the restart file will also contain ocean mercury data. |
HEMCO_restart.YYYYMMDDhhmm | NetCDF | HEMCO output restart file |
HEMCO_diagnostics.YYYYMMDDhhmm | NetCDF | HEMCO output diagnostics data |
Optional output files: | | |
stations.YYYYMMDDhh | BPCH | Diagnostic output file that contains ND48 station timeseries output as specified in input.geos. |
tsYYYYMMDDhh.bpch | BPCH | Diagnostic output file that contains ND49 instantaneous timeseries output as specified in input.geos. |
ts_24hr_avg.YYYYMMDDhh.bpch | BPCH | Diagnostic output file that contains ND50 24-hour average timeseries output as specified in input.geos. |
ts_satellite.YYYYMMDDhh.bpch | BPCH | Diagnostic output file that contains ND51 local-time-average (satellite) timeseries output as specified in input.geos. |
paranox_ts.YYYYMMDDhh.bpch | BPCH | Diagnostic output file that contains ND63 ship timeseries output as specified in input.geos. |
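After a run, you may want to confirm that the netCDF files named in the table above were written. The following is a rough sketch under stated assumptions: report_outputs is a hypothetical helper, and it assumes that all three netCDF files carry the same YYYYMMDDhhmm timestamp, which may not hold for every run configuration.

```shell
#!/bin/sh
# Report whether the netCDF output files for a given timestamp exist.
# Usage: report_outputs RUN_DIR YYYYMMDDhhmm
# (hypothetical helper; assumes one shared timestamp for all three files)
report_outputs() {
    dir="$1"
    stamp="$2"
    for prefix in GEOSChem_restart HEMCO_restart HEMCO_diagnostics; do
        file="$dir/$prefix.$stamp"
        if [ -e "$file" ]; then
            echo "found    $file"
        else
            echo "missing  $file"
        fi
    done
}
```

For example, report_outputs . 201308010000 checks the current directory for files stamped 1 August 2013, 00:00.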
Legacy files made obsolete in v11-01
The following files will be rendered obsolete starting in GEOS-Chem v11-01.
File | Description | Reason for removal |
---|---|---|
smv2.log | SMVGEAR II log file containing information about reactions and species used by the ND65 prod-loss diagnostic. | This was removed when SMVGEAR was replaced by FlexChem. |
trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.mp or trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.sp | Tracer restart file that contains instantaneous tracer concentrations at simulation end time. | The new netCDF restart file format (prefix GEOSChem_restart.) contains concentrations for all species (whether advected or not). |
spec_rst.geosfp_4x5_fullchem.YYYYMMDDhh | Species restart file that contains instantaneous species concentrations at simulation end time. This file saves instantaneous concentrations of all chemical species (listed in globchem.dat) on all levels and may be used to initialize another GEOS-Chem simulation. | The new netCDF restart file format (prefix GEOSChem_restart.) contains concentrations for all species (whether advected or not). |
ocean_rst.YYYYMMDDhhmm.nc | Ocean mercury restart file containing several quantities pertaining to the ocean mercury module saved at the end of the GEOS-Chem simulation. Like the tracer restart file, this file may be used to initialize another GEOS-Chem mercury simulation. | The new netCDF restart file format (prefix GEOSChem_restart.) contains concentrations for all species (whether advected or not). |
GEOS-Chem v10-01 and earlier versions
After a GEOS-Chem simulation has completed successfully, you will see several new files in the run directory. Some of these are generated for all simulations while others may be simulation-dependent.
For full-chemistry simulations, the following files are saved out:
File | Description |
---|---|
trac_avg.geosfp_4x5_fullchem.YYYYMMDDhh.mp or trac_avg.geosfp_4x5_fullchem.YYYYMMDDhh.sp | Diagnostic output file that contains time-averaged output for diagnostics enabled in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.mp or trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.sp | Tracer restart file that contains instantaneous tracer concentrations at simulation end time. This file saves instantaneous concentrations of all transported tracers (listed in the Tracer Menu of input.geos) on all levels and may be used to initialize another GEOS-Chem simulation. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
HEMCO_restart.YYYYMMDDhh.nc.mp or HEMCO_restart.YYYYMMDDhh.nc.sp | HEMCO restart file containing several quantities tracked by HEMCO for use in initializing another GEOS-Chem simulation. This file includes data used to calculate soil NOx emissions, PARANOX ship plume chemistry, and MEGAN biogenic emissions. Data in this file are in netCDF format. |
v10-01.geosfp_4x5_fullchem.mp or v10-01.geosfp_4x5_fullchem.sp | GEOS-Chem simulation log file. Open this file in a text editor to diagnose compile and run errors. |
HEMCO.log.mp or HEMCO.log.sp | HEMCO log file containing detailed information about the emissions read from disk into HEMCO. Open this file in a text editor to diagnose HEMCO errors. |
smv2.log | SMVGEAR II log file containing information about reactions and species used by the ND65 prod-loss diagnostic. This file is only produced when SMVGEAR is used, and thus it is not created for the specialty simulations. Open this file in a text editor to see if SMVGEAR read the globchem.dat file properly. |
diaginfo.dat | File generated by GEOS-Chem that contains diagnostic quantities for use with GAMAP. |
tracerinfo.dat | File generated by GEOS-Chem that contains tracer name metadata for use with GAMAP. |
Additional files may be created by GEOS-Chem, depending on your simulation type and the options specified in the input.geos file. Other optional GEOS-Chem output files include:
File | Description |
---|---|
spec_rst.geosfp_4x5_fullchem.YYYYMMDDhh | Species restart file that contains instantaneous species concentrations at simulation end time. This file saves instantaneous concentrations of all chemical species (listed in globchem.dat) on all levels and may be used to initialize another GEOS-Chem simulation. NOTE: If you are going to be running a very long GEOS-Chem simulation and must split your simulation into several stages (i.e. in order to stay within the computational time limits of your system), then you should set LSVCSPEC to T in the Chemistry Menu of input.geos. That will make sure that the chemical species concentrations are preserved when the next run stage starts. Otherwise, GEOS-Chem uses the default species concentrations specified in globchem.dat at the beginning of each run. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
stations.YYYYMMDDhh | Diagnostic output file that contains ND48 station timeseries output as specified in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. For more information on working with ND48 timeseries output, see our timeseries tutorial. |
tsYYYYMMDDhh.bpch | Diagnostic output file that contains ND49 instantaneous timeseries output as specified in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. For more information on working with ND49 timeseries output, see our timeseries tutorial. |
ts_24hr_avg.YYYYMMDDhh.bpch | Diagnostic output file that contains ND50 24-hour average timeseries output as specified in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
ts_satellite.YYYYMMDDhh.bpch | Diagnostic output file that contains ND51 local-time-average (satellite) timeseries output as specified in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
paranox_ts.YYYYMMDDhh.bpch | Diagnostic output file that contains ND63 ship timeseries output as specified in input.geos. Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. |
ocean_rst.YYYYMMDDhhmm.nc | Ocean mercury restart file containing several quantities pertaining to the ocean mercury module saved at the end of the GEOS-Chem simulation. Like the tracer restart file, this file may be used to initialize another GEOS-Chem mercury simulation. Data in this file are in netCDF format. |
plane.log.YYYYMMDD | Diagnostic output file containing planeflight information scheduled via the Planeflight.dat file. This diagnostic is turned on in the Planeflight Menu of input.geos. The data in this text file can be read and plotted using GAMAP routines CTM_READ_PLANEFLIGHT and PLANE_PLOT. |
--Melissa Sulprizio 15:42, 3 April 2015 (EDT)