GEOS-Chem Output Files

This page contains information about files produced by GEOS-Chem simulations, including diagnostic data and restart files used for initial conditions. For information about the input files that ship with the GEOS-Chem run directories, please see our GEOS-Chem Input Files wiki page.

File Formats

Binary Punch File Format

In GEOS-Chem v10-01 and earlier versions, the GEOS-Chem diagnostic output and restart files are in "binary punch" format. These binary punch files may be viewed and manipulated with the IDL-based data visualization package GAMAP which is maintained and supported by the GEOS-Chem Support Team. Several users have also developed Python software packages for reading, visualizing, and processing GEOS-Chem binary punch files.

The binary punch file format will be phased out beginning with GEOS-Chem v11-01. Starting in v11-01f, netCDF will become the default format for reading and writing restart files in all simulations except mercury, as described in the next section. As a temporary measure, users may read a binary punch restart file by specifying BPCH_RST_IN=y at compile time, and may write restart files in binary punch format by specifying BPCH_RST_OUT=y. Diagnostic output will remain in binary punch format in v11-01f, with BPCH_DIAG=y set automatically at compile time.

NetCDF File Format

The HEMCO emissions component contains utility functions and linked-list objects to save diagnostic output (time-averaged or instantaneous) as well as end-of-run restart information to netCDF files. In v10-01, this feature is used only for emissions-related data. The GEOS-Chem Support Team is currently leveraging HEMCO’s I/O capabilities to replace all binary punch input/output with netCDF within GEOS-Chem v11-01. This work has started in v11-01f and will continue into future internal versions.

Migrating to netCDF format will bring the following benefits:

  1. Using HEMCO’s diagnostic archival functions will ensure that all quantities will be archived consistently throughout GEOS-Chem.
  2. Use of netCDF output will facilitate further GEOS-Chem High Performance development since binary files cannot be written efficiently in high-performance computing (HPC) environments.
  3. GEOS-Chem users will have more options for visualizing GEOS-Chem output since many free and open-source software packages (e.g. Python tools) are compatible with netCDF format.

This work will be done in four phases:

Phase 1 (Version: v11-01f; Status: 1-month benchmark approved 29 Mar 2016, 1-year benchmark approved 16 Apr 2016)
  • Change default input and output GEOS-Chem tracer restart files from binary punch to netCDF format for all simulations except mercury.
  • If performing a full chemistry simulation, include species concentrations in the output netCDF restart file in units of [mol/mol] along with advected tracer concentrations.
  • Include option to enable input and output binary punch restart files with BPCH_RST_IN=y and BPCH_RST_OUT=y.

Phase 2 (Version: v11-01g; Status: 1-month benchmark approved 14 Sep 2016, 1-year benchmark approved 28 Sep 2016)
  • Change default input and output GEOS-Chem tracer restart files from binary punch to netCDF format for the mercury simulation.
  • Ocean mercury restart data will be included in the output netCDF restart file along with advected tracer concentrations.
  • Remove option to enable input and output binary punch restart files with BPCH_RST_IN=y and BPCH_RST_OUT=y.

Phase 3a (Version: TBD; Status: In development)
  • Change default GEOS-Chem diagnostic output file from binary punch to netCDF format.
  • Include option to enable output binary punch diagnostic files with BPCH_DIAG=y.

Phase 3b (Version: TBD; Status: In the pipeline)
  • Move PSC_RESTART variable from HEMCO restart file to GEOS-Chem restart file.
  • Apply restart file updates to eliminate differences between single and multi-segmented GEOS-Chem runs:
    • Archive fields DRY_TOTN, WET_TOTN, H2O2s, SO2s, and HSAVE_FOR_KPP as restart file variables.
    • Additional code updates as needed.

Phase 3c (Version: TBD; Status: In the pipeline)
  • Change default input and output GTMM restart files from binary punch to netCDF format for the mercury simulation.

Phase 3d (Version: TBD; Status: In the pipeline)
  • Change default input and output BC files from binary punch to netCDF format for the nested grid simulations.

Phase 4 (Version: TBD; Status: In the pipeline)
  • Remove all binary punch format input/output capabilities from GEOS-Chem.

  NOTE: This will not be done until all benchmarking utility tools are updated to work with GEOS-Chem netCDF diagnostic output.

In order to facilitate testing, there will be a transition period in which both "binary punch" and netCDF diagnostic output formats will be available. For the time being, the default option will be to write out diagnostics to binary output (i.e. BPCH_DIAG=yes) and read/write restart files in netCDF. To select netCDF diagnostic output you can compile GEOS-Chem with the NC_DIAG=yes option, but note that the netCDF diagnostics are still in development and have not been properly validated.

Viewing and manipulating netCDF files

There are many free and open-source software packages readily available for visualizing and manipulating netCDF files. These tools will reduce the need for the GEOS-Chem user community to rely on IDL (and GAMAP), which can be prohibitively expensive for some user groups. Some recommended tools are listed below.

1. ncdump: This command-line tool generates a text representation of netCDF data and can be used to quickly view the variables contained in a netCDF file. For example:
     # Display header information (dimensions, variables, attributes) only
     ncdump -h GEOSChem_restart.201308010000 | less

     # Display header information followed by data values for variable(s) specified
     ncdump -v SPC_NO GEOSChem_restart.201308010000 | less
2. ncview: Visualization package for netCDF files, recommended for a quick and easy look at netCDF files.
3. Panoply: Data viewer for netCDF files. This package offers an alternative to ncview. From our experience, Panoply works nicely when installed on the desktop, but is slow to respond in the Linux environment.
4. NCO and CDO: Command-line tools for manipulating and analyzing netCDF files.
5. Other programming languages may also be used for viewing or manipulating netCDF files.

Some of the tools listed above, such as ncdump and ncview, may come pre-installed on your system. Others like NCO, CDO, and Panoply may need to be installed or loaded (e.g. via the module load command). Check with your system administrator or IT staff to see what is available on your system.

--Melissa Sulprizio (talk) 15:58, 17 January 2017 (UTC)

Regridding netCDF restart files

The same netCDF tools listed above can also be used to regrid netCDF restart files produced by GEOS-Chem.

1. CDO: The Climate Data Operators include tools for regridding netCDF files. For example, the following command will apply distance-weighted regridding:
     cdo remapdis,gridfile infile.nc outfile.nc
For gridfile, you can use the files in ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO/grids/. See the Interpolation section in the CDO Guide for more information.
2. NCO: The netCDF Operators also include tools for regridding. See the Regridding section of the NCO User Guide for more information.
3. Jenny Fisher provided an IDL script regridh_restart_nc.pro which is now included with GAMAP. Jenny wrote:
The one caveat that I cannot figure out is something weird with the units. I have done this in such a way that it reads in whatever attributes (including units) are in the original file and writes the same units back to the regridded file. But when GEOS-Chem reads the new file it gives an “incompatible units” error — although printing them in the code shows something identical to what the case statement is searching for, as far as I can tell. If I use ncatted to overwrite the units (with identical name) they are fine. So it must be something to do with the way IDL encodes the units string, but I have no idea what. The relevant nco command is ncatted -O -a units,,m,c,"mol mol-1" INFILENAME.nc
4. Regridding routines written in NCL are also available in the NCL4GC package available on Bitbucket.
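
As a rough illustration of what the distance-weighted regridding mentioned in item 1 means, the sketch below computes a generic inverse-distance-weighted average in Python. It is not CDO's implementation: the function name, the choice of neighbors, the distance values, and the mixing ratios are all hypothetical, and the way CDO selects neighbors and measures distance on the sphere is not reproduced here.

```python
def inverse_distance_average(neighbors):
    """Inverse-distance-weighted mean of (distance, value) pairs."""
    weighted = []
    for distance, value in neighbors:
        if distance == 0.0:
            # Target point coincides with a source point: use it directly
            return value
        weighted.append((1.0 / distance, value))
    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight


# Hypothetical source cells: (distance to target cell center, mixing ratio)
neighbors = [(1.0, 40.0e-9), (2.0, 50.0e-9), (2.0, 30.0e-9), (4.0, 60.0e-9)]
regridded = inverse_distance_average(neighbors)
print(regridded)
```

Closer source cells receive larger weights, so the regridded value is pulled toward the nearest source value rather than toward the plain mean of all neighbors.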

--Melissa Sulprizio (talk) 22:06, 19 January 2017 (UTC)

Cropping netCDF restart files to nested domains

If needed, regrid a coarse netCDF restart file to the nested grid resolution using GAMAP routine regridh_restart_nc.pro.

Cropping netCDF files can be done with tools such as CDO or NCO. For example, CDO's SELBOX module provides the sellonlatbox operator, which selects a box by specifying the lon/lat bounds:

     cdo sellonlatbox,lon1,lon2,lat1,lat2 infile outfile

See page 44 of the CDO guide for more information.
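
To make the idea of a lon/lat box concrete, here is a minimal Python sketch that finds which grid-cell centers fall inside a box. It only illustrates the concept and is not how CDO is implemented; CDO's own selection rules may differ in detail (for example at box edges). The grid spacing, the box bounds, and the helper name box_indices are assumptions made up for this example.

```python
def box_indices(centers, lower, upper):
    """Return indices of grid-cell centers lying within [lower, upper]."""
    return [i for i, c in enumerate(centers) if lower <= c <= upper]


# Hypothetical regular grid: 5-degree longitude and 4-degree latitude spacing
lons = [-180.0 + 5.0 * i for i in range(72)]
lats = [-88.0 + 4.0 * j for j in range(45)]

# Hypothetical box bounds, in the same order as lon1,lon2,lat1,lat2
lon_idx = box_indices(lons, -140.0, -40.0)
lat_idx = box_indices(lats, 10.0, 70.0)

print(len(lon_idx), "longitudes from", lons[lon_idx[0]], "to", lons[lon_idx[-1]])
print(len(lat_idx), "latitudes from", lats[lat_idx[0]], "to", lats[lat_idx[-1]])
```

Checking the index ranges this way before cropping can help confirm that the box bounds line up with the cell centers of the nested grid.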

--Melissa Sulprizio (talk) 19:04, 3 May 2017 (UTC)

GEOS-Chem v11-02

Planned updates for this version include:

  1. wetscav_mod.F module-level variables H2O2s and SO2s will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both H2O2s and SO2s were initialized to the H2O2 and SO2 tracer concentrations at the start of every simulation. This change will impact multi-segmented runs only.
  2. get_ndep_mod.F variables DRY_TOTN and WET_TOTN, which represent total deposited nitrogen, will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both variables were initialized to zero at the start of every simulation. Storing them in the restart file may improve accuracy of soil NOx emissions over multi-segmented runs. This change will impact multi-segmented runs only.

GEOS-Chem v11-01

Impact of Output File Format Change

While migrating from binary punch to netCDF file formats is a structural change, several changes are also being introduced which will result in output differences. These changes include the following:

  1. Starting in v11-01f, species will always be written to the GEOS-Chem netCDF restart file and will always be initialized to restart file values if present in the file. In contrast, when using binary punch format restart files, species are initialized to background values stored in globchem.dat by default unless the option to read/write the species restart file is turned on in input.geos. This change will only impact multi-segmented full chemistry runs where users previously did not enable the species restart file.
  2. Starting in v11-01f, species units will be [mol/mol] in the netCDF restart file, whereas they are [molec/cm3/box] in the binary punch restart file. This change will facilitate the future replacement of tracers with species. However, because the unit conversion from [molec/cm3] to [mol/mol] depends on the meteorological state, there will be small differences in [molec/cm3] species values between binary punch and netCDF restart files in multi-segmented runs. This is because the meteorology used when the restart file is written is that of the timestep before the simulation end time, while the meteorology used when the restart file is read is that of the simulation start time, which is the same as the previous end time. This change will only impact multi-segmented full chemistry runs where users previously enabled the species restart file.
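
The meteorology dependence described in item 2 can be illustrated with the ideal gas law. The Python sketch below is not GEOS-Chem's conversion routine; it assumes an ideal gas, and the function names, pressure, temperatures, and concentration are hypothetical values chosen only to show the effect. A concentration is converted to a mixing ratio with one temperature (restart write) and converted back with a slightly different temperature (restart read).

```python
# Boltzmann constant [J/K] (exact SI value)
K_B = 1.380649e-23


def air_number_density(pressure_pa, temperature_k):
    """Ideal-gas number density of air in [molec/cm3]."""
    per_m3 = pressure_pa / (K_B * temperature_k)
    return per_m3 * 1.0e-6  # convert m-3 to cm-3


def to_mixing_ratio(conc_molec_cm3, pressure_pa, temperature_k):
    """Convert [molec/cm3] to [mol/mol] for the given met state."""
    return conc_molec_cm3 / air_number_density(pressure_pa, temperature_k)


def to_number_density(mixing_ratio, pressure_pa, temperature_k):
    """Convert [mol/mol] back to [molec/cm3] for the given met state."""
    return mixing_ratio * air_number_density(pressure_pa, temperature_k)


# Hypothetical species concentration and met states (assumptions)
conc_written = 1.0e12                                   # [molec/cm3]
vmr = to_mixing_ratio(conc_written, 101325.0, 288.0)    # met at restart write
conc_read = to_number_density(vmr, 101325.0, 288.5)     # met at restart read

rel_diff = (conc_read - conc_written) / conc_written
print(f"relative difference: {rel_diff:.3%}")
```

With these made-up numbers, a 0.5 K temperature change between write and read shifts the recovered [molec/cm3] value by roughly 0.17%, while the stored [mol/mol] value itself is unchanged.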

Output files in v11-01

Output files in GEOS-Chem v11-01 are listed below.

File Data Format Description
trac_avg.{met}_{grid}_{sim}.YYYYMMDDhh BPCH GEOS-Chem output diagnostic data
GEOSChem_restart.YYYYMMDDhhmm NetCDF GEOS-Chem output restart file containing advected tracer concentrations in units of [mol/mol]. If running a full chemistry simulation, then the restart file will also contain species concentrations, also in units of [mol/mol]. If running a mercury simulation, then the restart file will also contain ocean mercury data.
HEMCO_restart.YYYYMMDDhhmm NetCDF HEMCO output restart file
HEMCO_diagnostics.YYYYMMDDhhmm NetCDF HEMCO output diagnostics data
Optional output files:
stations.YYYYMMDDhh BPCH Diagnostic output file that contains ND48 station timeseries output as specified in input.geos.
tsYYYYMMDDhh.bpch BPCH Diagnostic output file that contains ND49 instantaneous timeseries output as specified in input.geos.
ts_24hr_avg.YYYYMMDDhh.bpch BPCH Diagnostic output file that contains ND50 24-hour average timeseries output as specified in input.geos.
ts_satellite.YYYYMMDDhh.bpch BPCH Diagnostic output file that contains ND51 local-time-average (satellite) timeseries output as specified in input.geos.
paranox_ts.YYYYMMDDhh.bpch BPCH Diagnostic output file that contains ND63 ship timeseries output as specified in input.geos.
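
The YYYYMMDDhhmm suffix used by the restart files in the table above maps directly onto a date/time format string. The following Python sketch, using only the standard library, builds and parses such names. The helper names restart_filename and parse_restart_timestamp are hypothetical and are not part of GEOS-Chem or any of its tools.

```python
from datetime import datetime


def restart_filename(prefix, when):
    """Build a name of the form <prefix>.YYYYMMDDhhmm."""
    return f"{prefix}.{when:%Y%m%d%H%M}"


def parse_restart_timestamp(filename):
    """Extract the YYYYMMDDhhmm suffix of a restart file name as a datetime."""
    stamp = filename.rsplit(".", 1)[1]
    return datetime.strptime(stamp, "%Y%m%d%H%M")


name = restart_filename("GEOSChem_restart", datetime(2013, 8, 1, 0, 0))
print(name)
print(parse_restart_timestamp("HEMCO_restart.201308010000"))
```

A helper like this can be handy when scripting multi-segmented runs, where the restart file written at the end of one segment is the input to the next.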

Legacy files made obsolete in v11-01

The following files will be rendered obsolete starting in GEOS-Chem v11-01.

File: smv2.log
Description: SMVGEAR II log file containing information about reactions and species used by the ND65 prod-loss diagnostic.
Reason for removal: This file was removed when SMVGEAR was replaced by FlexChem.

File: trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.mp or trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.sp
Description: Tracer restart file that contains instantaneous tracer concentrations at simulation end time.
Reason for removal: The new netCDF restart file (prefix GEOSChem_restart) contains concentrations for all species (whether advected or not).

File: spec_rst.geosfp_4x5_fullchem.YYYYMMDDhh
Description: Species restart file that contains instantaneous species concentrations at simulation end time. This file saves instantaneous concentrations of all chemical species (listed in globchem.dat) on all levels and may be used to initialize another GEOS-Chem simulation.
Reason for removal: The new netCDF restart file (prefix GEOSChem_restart) contains concentrations for all species (whether advected or not).

File: ocean_rst.YYYYMMDDhhmm.nc
Description: Ocean mercury restart file containing several quantities pertaining to the ocean mercury module saved at the end of the GEOS-Chem simulation. Like the tracer restart file, this file may be used to initialize another GEOS-Chem mercury simulation.
Reason for removal: The new netCDF restart file (prefix GEOSChem_restart) contains concentrations for all species (whether advected or not).

GEOS-Chem v10-01 and earlier versions

After a GEOS-Chem simulation has completed successfully, you will see several new files in the run directory. Some of these are generated for all simulations while others may be simulation-dependent.

For full-chemistry simulations, the following files are saved out:

File Description
trac_avg.geosfp_4x5_fullchem.YYYYMMDDhh.mp or
trac_avg.geosfp_4x5_fullchem.YYYYMMDDhh.sp
Diagnostic output file that contains time-averaged output for diagnostics enabled in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.mp or
trac_rst.geosfp_4x5_fullchem.YYYYMMDDhhmm.sp
Tracer restart file that contains instantaneous tracer concentrations at simulation end time. This file saves instantaneous concentrations of all transported tracers (listed in the Tracer Menu of input.geos) on all levels and may be used to initialize another GEOS-Chem simulation.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

HEMCO_restart.YYYYMMDDhh.nc.mp or
HEMCO_restart.YYYYMMDDhh.nc.sp
HEMCO restart file containing several quantities tracked by HEMCO for use in initializing another GEOS-Chem simulation. This file includes data used to calculate soil NOx emissions, PARANOX ship plume chemistry, and MEGAN biogenic emissions.

Data in this file are in netCDF format.

v10-01.geosfp_4x5_fullchem.mp or
v10-01.geosfp_4x5_fullchem.sp
GEOS-Chem simulation log file.

Open this file in a text editor to diagnose compile and run errors.

HEMCO.log.mp or
HEMCO.log.sp
HEMCO log file containing detailed information about the emissions read from disk into HEMCO.

Open this file in a text editor to diagnose HEMCO errors.

smv2.log SMVGEAR II log file containing information about reactions and species used by the ND65 prod-loss diagnostic. This file is only produced when SMVGEAR is used, and thus it will not be created for the specialty simulations.

Open this file in a text editor to see if SMVGEAR read the globchem.dat file properly.

diaginfo.dat File generated by GEOS-Chem that contains diagnostic quantities for use with GAMAP.
tracerinfo.dat File generated by GEOS-Chem that contains tracer name metadata for use with GAMAP.


Additional files may be created by GEOS-Chem, depending on your simulation type and the options specified in the input.geos file. Other optional GEOS-Chem output files include:

File Description
spec_rst.geosfp_4x5_fullchem.YYYYMMDDhh Species restart file that contains instantaneous species concentrations at simulation end time. This file saves instantaneous concentrations of all chemical species (listed in globchem.dat) on all levels and may be used to initialize another GEOS-Chem simulation.

NOTE: If you are going to be running a very long GEOS-Chem simulation and must split your simulation into several stages (i.e. in order to stay within the computational time limits of your system), then you should set LSVCSPEC to T in the Chemistry Menu of input.geos. That will make sure that the chemical species concentrations are preserved when the next run stage starts. Otherwise, GEOS-Chem uses the default species concentrations specified in globchem.dat at the beginning of each run.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

stations.YYYYMMDDhh Diagnostic output file that contains ND48 station timeseries output as specified in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. For more information on working with ND48 timeseries output, see our timeseries tutorial.

tsYYYYMMDDhh.bpch Diagnostic output file that contains ND49 instantaneous timeseries output as specified in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP. For more information on working with ND49 timeseries output, see our timeseries tutorial.

ts_24hr_avg.YYYYMMDDhh.bpch Diagnostic output file that contains ND50 24-hour average timeseries output as specified in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

ts_satellite.YYYYMMDDhh.bpch Diagnostic output file that contains ND51 local-time-average (satellite) timeseries output as specified in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

paranox_ts.YYYYMMDDhh.bpch Diagnostic output file that contains ND63 ship timeseries output as specified in input.geos.

Data in this file are in "binary punch" format and may be viewed and manipulated using GAMAP.

ocean_rst.YYYYMMDDhhmm.nc Ocean mercury restart file containing several quantities pertaining to the ocean mercury module saved at the end of the GEOS-Chem simulation. Like the tracer restart file, this file may be used to initialize another GEOS-Chem mercury simulation.

Data in this file are in netCDF format.

plane.log.YYYYMMDD Diagnostic output file containing planeflight information scheduled via the Planeflight.dat file. This diagnostic is turned on in the Planeflight Menu of input.geos.

The data in this text file can be read and plotted using GAMAP routines CTM_READ_PLANEFLIGHT and PLANE_PLOT.

--Melissa Sulprizio 15:42, 3 April 2015 (EDT)

Previous issues that are now resolved

Enable compression in netCDF-4 output files

This update will be included in v11-02a.

For more information about this issue, please see this post on our NcdfUtilities package wiki page.

--Bob Yantosca (talk) 19:04, 8 March 2017 (UTC)

Improve write speed of netCDF output files

This update was included in the GEOS-Chem v11-01 public release.

For more information about this issue, please see this post on our NcdfUtilities package wiki page.

--Bob Yantosca (talk) 19:05, 8 March 2017 (UTC)

GAMAP can now read GEOS-Chem restart files in netCDF format

This update was included in the GEOS-Chem v11-01 public release.

Starting in GEOS-Chem v11-01, all restart files are now saved in COARDS-compliant netCDF format. We have had to make some minor modifications to both GEOS-Chem and GAMAP in order to allow GAMAP to read these files. The table below gives a summary of these modifications.

Package: GEOS-Chem
File: GeosCore/gamap_mod.F
Modification: In routine INIT_TRACERINFO, we now write metadata for all species (advected or not) to the tracerinfo.dat file under the ND45 tracer concentration diagnostic section. Because the netCDF restart file contains concentrations for both advected and non-advected species, we need to make sure that the tracerinfo.dat file created by GEOS-Chem contains metadata for all species.

Package: GAMAP
File: internals/ctm_open_file.pro
Modification: The prior algorithm always assumed that a netCDF file name would end in either .nc or .nc4. We have now removed this restriction: the filename string is split on . and the resulting substrings are examined for nc or nc4 (case-insensitive).

Package: GAMAP
File: internals/ctm_read_coards.pro
Modification: Added some minor modifications to read netCDF restart files:
  1. Determine vertical grid from the number of layers (if the Model global attribute is not specified).
  2. Assign category DXYP to variable AREA, which is included in the netCDF restart file.
  3. Assign category IJ-AVG-$ to variables beginning with either SPC_ or TRC_. Also remove the SPC_ and TRC_ from the variable name internally so that the variable name will match the metadata in tracerinfo.dat.
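
The file-name check and the variable-name handling described above can be sketched in a few lines. The actual GAMAP routines are written in IDL; the Python below is only an illustrative transliteration of the described logic, and the function names looks_like_netcdf and restart_variable_category are hypothetical.

```python
def looks_like_netcdf(filename):
    """Split the name on '.' and look for an 'nc' or 'nc4' piece (case-insensitive)."""
    parts = filename.lower().split(".")
    return "nc" in parts or "nc4" in parts


def restart_variable_category(varname):
    """Return (category, name) following the rules listed for restart variables."""
    if varname == "AREA":
        return "DXYP", varname
    for prefix in ("SPC_", "TRC_"):
        if varname.startswith(prefix):
            # Drop the prefix so the name matches the tracerinfo.dat metadata
            return "IJ-AVG-$", varname[len(prefix):]
    return None, varname  # no special handling in this sketch


print(looks_like_netcdf("HEMCO_restart.2013070100.nc.mp"))
print(restart_variable_category("SPC_NO"))
```

Splitting on the dot rather than testing only the final extension is what lets names with trailing suffixes such as .mp or .sp still be recognized as netCDF.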

--Bob Yantosca (talk) 19:29, 23 January 2017 (UTC)