GEOS-Chem restart files

Revision as of 19:21, 29 November 2018 by Bmy (Talk | contribs) (New fields in GEOS-Chem restart file)


This page contains information about files produced by GEOS-Chem simulations, including diagnostic data and restart files used for initial conditions. For information about the input files that ship with the GEOS-Chem run directories, please see our GEOS-Chem Input Files wiki page.

Restart files in GEOS-Chem 12

These updates were included in GEOS-Chem 12.1.0, which was released on 26 Nov 2018.

In GEOS-Chem 12.1.0, several restart file updates were introduced. These include:

New fields and other modifications to the GEOS-Chem restart file

Several modifications have been added to the GEOS-Chem restart file in an attempt to remove differences between single and multi-segmented GEOS-Chem simulations.

Item Description
1 GEOS-Chem restart files are now named: GEOSChem.Restart.YYYYMMDD_hhmmz.nc4.

For example, the restart file that was created at 00:00 UTC on 20160801 is named: GEOSChem.Restart.20160801_0000z.nc4. The z indicates that the timestamp is in Coordinated Universal Time (UTC), which is sometimes referred to as "Zulu" or "Z" time.

2 Module-level variables H2O2s and SO2s from wetscav_mod.F have been added to State_Chm (as State_Chm%H2O2AfterChem, State_Chm%SO2AfterChem). These fields will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both H2O2s and SO2s were initialized to the H2O2 and SO2 tracer concentrations at the start of every simulation. This change will impact multi-segmented runs only.
3 Module-level variables DRY_TOTN and WET_TOTN from get_ndep_mod.F have been added to State_Chm (as State_Chm%DryDepNitrogen, State_Chm%WetDepNitrogen). These fields will be output to the GEOS-Chem restart file and then initialized to the values saved in the restart file at the start of the next simulation. Prior to this update, both variables were initialized to zero at the start of every simulation. Storing them in the restart file may improve accuracy of soil NOx emissions over multi-segmented runs. This change will impact multi-segmented runs only.
4 Move State_PSC from the HEMCO restart file to the GEOS-Chem restart file.
5 Save out instantaneous met fields TMPU1, SPHU1, PS1DRY, PS1WET, DELPDRY to the GEOS-Chem restart file. These will be used to initialize the met fields at the start of the timestep, otherwise they will be set to the values of those fields at the end of the timestep.

While this update was added to GEOS-Chem Classic in 12.1.0, it will be added to GCHP in 12.2.0.
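
The naming convention in item 1 can be sketched in Python. This is only an illustration and not GEOS-Chem code: the helper names restart_filename and restart_timestamp are hypothetical, and the sketch assumes the datetime passed in is already in UTC.

```python
import re
from datetime import datetime

# Pattern for GEOSChem.Restart.YYYYMMDD_hhmmz.nc4 (timestamp is in UTC)
_RESTART_RE = re.compile(r"^GEOSChem\.Restart\.(\d{8}_\d{4})z\.nc4$")


def restart_filename(when):
    """Return the restart file name for a UTC datetime (hypothetical helper)."""
    return "GEOSChem.Restart.{}z.nc4".format(when.strftime("%Y%m%d_%H%M"))


def restart_timestamp(name):
    """Parse the UTC timestamp back out of a restart file name."""
    match = _RESTART_RE.match(name)
    if match is None:
        raise ValueError("not a GEOS-Chem 12 restart file name: " + name)
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M")


# The example from the table: 00:00 UTC on 20160801
print(restart_filename(datetime(2016, 8, 1, 0, 0)))
```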

Restart collection in HISTORY.rc

GEOS-Chem restart files are now saved out via the history component. A new Restart collection has been defined in HISTORY.rc and fields saved out to the restart file can be modified in that file.

Read restart file via HEMCO

GEOS-Chem restart files are now read in via HEMCO. The entries listed below have been added to HEMCO_Config.rc (and may vary slightly for different simulation types). These fields are obtained from HEMCO and copied to the appropriate State_Chm and State_Met fields in the new routine Get_GC_Restart (found in GeosCore/hcoi_gc_main_mod.F90).

  # --- GEOS-Chem restart file ---
  # PSC state only needed for UCX
  * SPC_           ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 SpeciesRst_?ALL?    $YYYY/$MM/$DD/$HH CS xyz 1 * - 1 1
  * TMPU1          ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_TMPU1           $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * SPHU1          ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_SPHU1           $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * PS1DRY         ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_PS1DRY          $YYYY/$MM/$DD/$HH E  xy  1 * - 1 1
  * PS1WET         ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_PS1WET          $YYYY/$MM/$DD/$HH E  xy  1 * - 1 1
  * DELPDRY        ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_DELPDRY         $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * KPP_HVALUE     ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_KPPHvalue      $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * WETDEP_N       ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_WetDepNitrogen $YYYY/$MM/$DD/$HH E  xy  1 * - 1 1
  * DRYDEP_N       ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_DryDepNitrogen $YYYY/$MM/$DD/$HH E  xy  1 * - 1 1
  * SO2_AFTERCHEM  ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_SO2AfterChem   $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * H2O2_AFTERCHEM ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_H2O2AfterChem  $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1
  * STATE_PSC      ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Chem_StatePSC       $YYYY/$MM/$DD/$HH E  xyz count       * - 1 1

The CS cycle flag was added as an option to HEMCO in GEOS-Chem 12.1.0 to tell HEMCO to skip fields that aren't found in the provided file. This is useful when certain species aren't found in the restart file and, in that case, GEOS-Chem will initialize that species to the background concentrations specified in the species database.
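
To see at a glance which restart entries use the CS flag, the entries can be split into columns. The Python sketch below is an illustration only. The column labels are an assumption based on the usual HEMCO base-emissions layout (they are not identifiers from the HEMCO source code), and the function names are hypothetical.

```python
# Column labels are assumed for illustration; they are not taken from HEMCO code.
COLUMNS = ("ExtNr", "Name", "sourceFile", "sourceVar", "sourceTime", "CRE",
           "SrcDim", "SrcUnit", "Species", "ScalIDs", "Cat", "Hier")


def parse_entry(line):
    """Split one whitespace-delimited entry into a dict keyed by column label."""
    fields = line.split()
    if len(fields) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(fields)))
    return dict(zip(COLUMNS, fields))


def entries_skipped_if_missing(lines):
    """Return the names of entries whose cycle flag is CS."""
    names = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # skip blank lines and comments
        entry = parse_entry(text)
        if entry["CRE"] == "CS":
            names.append(entry["Name"])
    return names


# Two entries copied from the listing above
EXAMPLE = [
    "# --- GEOS-Chem restart file ---",
    "* SPC_  ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 SpeciesRst_?ALL? $YYYY/$MM/$DD/$HH CS xyz 1 * - 1 1",
    "* TMPU1 ./GEOSChem.Restart.$YYYY$MM$DD_$HH$MNz.nc4 Met_TMPU1 $YYYY/$MM/$DD/$HH E  xyz 1 * - 1 1",
]
print(entries_skipped_if_missing(EXAMPLE))
```

In the listing above only the species entry (SPC_) uses CS; the other fields use E.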

--Melissa Sulprizio (talk) 16:30, 7 November 2018 (UTC)

Viewing and manipulating restart files in netCDF format

There are many free and open-source software packages readily available for visualizing and manipulating netCDF files. These tools will reduce the need for the GEOS-Chem user community to rely on IDL (and GAMAP), which can be prohibitively expensive for some user groups. Some recommended tools are listed below.

1. ncdump: This command-line tool generates a text representation of netCDF data and can be used to quickly view the variables contained in a netCDF file. For example:
     # Display header information (dimensions, variables, attributes) only
     ncdump -h GEOSChem_restart.201308010000 | less	

     # Display header information followed by data values for variable(s) specified
     ncdump -v SPC_NO GEOSChem_restart.201308010000 | less
2. ncview: Visualization package for netCDF files, recommended for a quick and easy look at netCDF files.
3. Panoply: Data viewer for netCDF files. This package offers an alternative to ncview. From our experience, Panoply works nicely when installed on the desktop, but is slow to respond in the Linux environment.
4. NCO and CDO: Command-line tools for manipulating and analyzing netCDF files.
5. Other programming languages may be used for viewing or manipulating netCDF files, including but not limited to:

Some of the tools listed above, such as ncdump and ncview, may come pre-installed on your system. Others like NCO, CDO, and Panoply may need to be installed or loaded (e.g. via the module load command). Check with your system administrator or IT staff to see what is available on your system.
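
As an illustration of the programming-language route, the Python sketch below lists the variables in a restart file, roughly what ncdump -h shows, and groups them by name prefix. It assumes the third-party netCDF4 package is installed (the same package used by the script further down this page). The function names are hypothetical.

```python
from collections import defaultdict


def group_by_prefix(names):
    """Group variable names by the text before the first underscore.

    Names without an underscore (e.g. AREA, lat) go under the empty prefix.
    """
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("_", 1)[0] if "_" in name else ""
        groups[prefix].append(name)
    return dict(groups)


def restart_variables(path):
    """Return the sorted variable names in a netCDF file (needs netCDF4)."""
    import netCDF4  # imported here so group_by_prefix works without it

    with netCDF4.Dataset(path) as ds:
        return sorted(ds.variables)


# Usage, with a file name taken from the ncdump example above:
#   names = restart_variables("GEOSChem_restart.201308010000")
#   for prefix, members in group_by_prefix(names).items():
#       print(prefix or "(no prefix)", len(members))
```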

--Melissa Sulprizio (talk) 15:58, 17 January 2017 (UTC)

Regridding netCDF restart files

The same netCDF tools listed above can also be used to regrid netCDF restart files produced by GEOS-Chem.

1. CDO: The Climate Data Operators include tools for regridding netCDF files. For example, the following command will apply distance-weighted regridding:
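     cdo remapdis,gridfile infile outfile
Here gridfile is a CDO grid description file for the target grid (for example geos.2x25.grid in the report below), and infile and outfile are placeholders for the input and regridded restart files. See the Interpolation section in the CDO Guide for more information.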
Bram Maasakkers wrote:
I have noticed a problem regridding a 4x5 restart file to 2x2.5 using cdo 1.9.4. When I use:
cdo remapdis,geos.2x25.grid
The last latitudinal band (-89.5) remains empty and gets filled with the standard missing value of cdo, which is really large. This leads to immediate problems in the methane simulation as enormous concentrations enter the domain from the South Pole. For now I’ve solved this problem by just using bicubic interpolation:
cdo remapbic,geos.2x25.grid
2. NCO: The netCDF Operators also include tools for regridding. See the Regridding section of the NCO User Guide for more information.
3. Jenny Fisher provided an IDL script which is now included with GAMAP. Jenny wrote:
The one caveat that I cannot figure out is something weird with the units. I have done this in such a way that it reads in whatever attributes (including units) are in the original file and writes the same units back to the regridded file. But when GEOS-Chem reads the new file it gives an “incompatible units” error — although printing them in the code shows something identical to what the case statement is searching for, as far as I can tell. If I use ncatted to overwrite the units (with identical name) they are fine. So it must be something to do with the way IDL encodes the units string, but I have no idea what. The relevant nco command is ncatted -O -a units,,m,c,"mol mol-1"
4. Regridding routines written in NCL are also available in the NCL4GC package available on Bitbucket.
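
Given the empty polar band described above, it can be worth checking a regridded field for rows that contain only missing-value-like numbers before starting a run. The Python sketch below is one possible check and is not part of any of the tools listed. The function name is hypothetical, and the 1e20 cutoff is an assumption chosen to be far larger than any realistic mixing ratio.

```python
def suspicious_latitude_rows(field, threshold=1.0e20):
    """Return indices of latitude rows in which every value looks missing.

    field is a 2-D nested list indexed [lat][lon]. A value counts as
    missing-like if it is NaN or its magnitude exceeds threshold
    (the threshold is an assumed cutoff, not a CDO constant).
    """
    bad = []
    for j, row in enumerate(field):
        if row and all(v != v or abs(v) > threshold for v in row):
            bad.append(j)
    return bad


# Toy field: the first latitude band holds only a huge fill value
toy = [
    [-9.0e33, -9.0e33, -9.0e33],
    [1.8e-6, 1.8e-6, 1.7e-6],
]
print(suspicious_latitude_rows(toy))
```

With real data, the 2-D slice would be read from the regridded file for one species and one level, for example with the netCDF4 package.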

--Melissa Sulprizio (talk) 22:06, 19 January 2017 (UTC)

Creating a netCDF restart file or adding new species to a netCDF restart file

You have a few options for adding new species to a netCDF restart file:

1. In GEOS-Chem v11-01, if the model cannot find a species in the netCDF restart file, it should set the initial concentration to the background concentration. The default background concentration is 1e-20 v/v. If you want to change that to a different value, you can add BackgroundVV = {VALUE}_fp to the call to Spc_Create for that species in Headers/species_database_mod.F90. To confirm your species is getting set to the background value you can turn on ND70 debug output, which will print the initial species concentrations (min/max) read from the restart file to the log file. At the end of your simulation, a new restart file will be saved out with the new species included. We recommend spinning up GEOS-Chem for an appropriate duration to achieve reasonable concentrations for the new species.
2. You can use CDO and NCO to copy the restart field for one species to a new species. For example:
   module load nco
   module load cdo

   # Extract field SPC_PMN from the original restart file
   # (infile and tmpfile are placeholder file names)
   cdo selvar,SPC_PMN infile tmpfile

   # Rename selected field to SPC_NPMN
   ncrename -h -v SPC_PMN,SPC_NPMN tmpfile

   # Append new species to existing restart file
   ncks -h -A -M tmpfile infile
3. Sal Farina wrote a simple Python script for adding a new species to the netCDF restart file:
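   #!/usr/bin/env python
   # Add a new species SPC_SOAP (initialized to zero) to each netCDF restart
   # file named on the command line, using SPC_OCPI as a template.
   import sys

   import netCDF4 as nc

   for nam in sys.argv[1:]:
       f = nc.Dataset(nam, mode='a')
       try:
           o = f['SPC_OCPI']
       except (KeyError, IndexError):
           print("SPC_OCPI not defined in " + nam)
           f.close()
           continue

       # Create the new variable with the same type and dimensions as the
       # template (skipped if an earlier run already created it)
       if 'SPC_SOAP' not in f.variables:
           f.createVariable('SPC_SOAP', o.datatype, o.dimensions)
       soap = f['SPC_SOAP']
       soap[:] = 0.0
       soap.long_name = 'SOAP tracer'
       soap.units = o.units
       soap.add_offset = 0.0
       soap.scale_factor = 1.0
       soap.missing_value = 1.0e30
       f.close()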


--Melissa Sulprizio (talk) 23:33, 7 November 2017 (UTC)

Cropping netCDF restart files to nested domains

If needed, regrid a coarse netCDF restart file to the nested grid resolution using GAMAP routine

Cropping netCDF files can be easily achieved with tools such as CDO or NCO. For example, CDO's sellonlatbox operator (part of the SELBOX operator group) selects a box by specifying the lat/lon bounds:

cdo sellonlatbox,lon1,lon2,lat1,lat2 infile outfile

See page 44 of the CDO guide for more information.
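
Conceptually, cropping to a lon/lat box means keeping the grid points whose coordinates fall inside the bounds. The Python sketch below illustrates that idea on plain lists. It is not how CDO implements sellonlatbox, the function names are hypothetical, and it ignores longitude wrap-around.

```python
def box_indices(coords, lower, upper):
    """Return the indices of coordinate values within [lower, upper]."""
    return [i for i, c in enumerate(coords) if lower <= c <= upper]


def crop(field, lons, lats, lon1, lon2, lat1, lat2):
    """Crop a 2-D field indexed [lat][lon] to the given lon/lat box.

    Returns the cropped field together with the retained lons and lats.
    """
    ii = box_indices(lons, lon1, lon2)
    jj = box_indices(lats, lat1, lat2)
    cropped = [[field[j][i] for i in ii] for j in jj]
    return cropped, [lons[i] for i in ii], [lats[j] for j in jj]


# Toy grid: 3 latitudes x 4 longitudes, field value = 10*j + i
lons = [-180.0, -175.0, -170.0, -165.0]
lats = [-90.0, -86.0, -82.0]
field = [[10 * j + i for i in range(len(lons))] for j in range(len(lats))]
sub, sub_lons, sub_lats = crop(field, lons, lats, -176.0, -168.0, -87.0, -80.0)
print(sub, sub_lons, sub_lats)
```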

--Melissa Sulprizio (talk) 19:04, 3 May 2017 (UTC)

Vertical coordinates in netCDF files produced by GEOS-Chem

All netCDF files produced by GEOS-Chem (i.e. diagnostic files and restart files) adhere to the COARDS netCDF convention for the lon, lat, and time dimensions.

For the vertical dimension, we have chosen to use the following coordinate variables, emulating the file format of the NCAR Community Earth System Model (CESM):

     double lev(lev) ;
         lev:long_name = "hybrid level at midpoints (1000*(A+B))" ;
         lev:units = "level" ;
         lev:positive = "down" ;
         lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
         lev:formula_terms = "a: hyam b: hybm p0: P0 ps: PS" ;
     double hyam(lev) ;
         hyam:long_name = "hybrid A coefficient at layer midpoints" ;
     double hybm(lev) ;
         hybm:long_name = "hybrid B coefficient at layer midpoints" ;
     double ilev(ilev) ;
         ilev:long_name = "hybrid level at interfaces (1000*(A+B))" ;
         ilev:units = "level" ;
         ilev:positive = "down" ;
         ilev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
         ilev:formula_terms = "a: hyai b: hybi p0: P0 ps: PS" ;
     double hyai(ilev) ;
         hyai:long_name = "hybrid A coefficient at layer interfaces" ;
     double hybi(ilev) ;
         hybi:long_name = "hybrid B coefficient at layer interfaces" ;
     double P0 ;
         P0:long_name = "reference pressure" ;

The lev variable is used for data that is placed on the midpoints between vertical levels. This is an "approximate" eta coordinate, which is close to 1 at the surface and close to zero at the atmosphere top.

 lev = 0.99250002413, 0.97749990013, 0.962499776, 0.947499955, 0.93250006, 
    0.91749991, 0.90249991, 0.88749996, 0.87249996, 0.85750006, 0.842500125, 
    0.82750016, 0.8100002, 0.78750002, 0.762499965, 0.737500105, 0.7125001, 
    0.6875001, 0.65625015, 0.6187502, 0.58125015, 0.5437501, 0.5062501, 
    0.4687501, 0.4312501, 0.3937501, 0.3562501, 0.31279158, 0.26647905, 
    0.2265135325, 0.192541016587707, 0.163661504087706, 0.139115, 0.11825, 
    0.10051436, 0.085439015, 0.07255786, 0.06149566, 0.05201591, 0.04390966, 
    0.03699271, 0.03108891, 0.02604911, 0.021761005, 0.01812435, 0.01505025, 
    0.01246015, 0.010284921, 0.008456392, 0.0069183215, 0.005631801, 
    0.004561686, 0.003676501, 0.002948321, 0.0023525905, 0.00186788, 
    0.00147565, 0.001159975, 0.00090728705, 0.0007059566, 0.0005462926, 
    0.0004204236, 0.0003217836, 0.00024493755, 0.000185422, 0.000139599, 
    0.00010452401, 7.7672515e-05, 5.679251e-05, 4.0142505e-05, 2.635e-05, 
    1.5e-05 ;

The lev variable may be used for quick plotting. To compute the actual pressure at the midpoint of the grid box (I,J,L), you will need to supply your own 2-D surface pressure field (e.g. saved from another diagnostic file):

 Pmid = hyam(L) + ( hybm(L) * PS(I,J) )

The ilev variable is used for data that is placed on vertical level edges or "interfaces" (hence the "i" in ilev). This is also an "approximate" eta coordinate.

 ilev = 1, 0.98500004826, 0.969999752, 0.9549998, 0.94000011, 0.92500001, 
    0.90999981, 0.89500001, 0.87999991, 0.86500001, 0.85000011, 0.83500014, 
    0.82000018, 0.80000022, 0.77499982, 0.75000011, 0.7250001, 0.7000001, 
    0.6750001, 0.6375002, 0.6000002, 0.5625001, 0.5250001, 0.4875001, 
    0.4500001, 0.4125001, 0.3750001, 0.3375001, 0.28808306, 0.24487504, 
    0.208152025, 0.176930008175413, 0.150393, 0.127837, 0.108663, 0.09236572, 
    0.07851231, 0.06660341, 0.05638791, 0.04764391, 0.04017541, 0.03381001, 
    0.02836781, 0.02373041, 0.0197916, 0.0164571, 0.0136434, 0.0112769, 
    0.009292942, 0.007619842, 0.006216801, 0.005046801, 0.004076571, 
    0.003276431, 0.002620211, 0.00208497, 0.00165079, 0.00130051, 0.00101944, 
    0.0007951341, 0.0006167791, 0.0004758061, 0.0003650411, 0.0002785261, 
    0.000211349, 0.000159495, 0.000119703, 8.934502e-05, 6.600001e-05, 
    4.758501e-05, 3.27e-05, 2e-05, 1e-05 ;

To compute the actual pressure at the bottom and top edges of the grid box (I,J,L), you will need to supply your own 2-D surface pressure field (e.g. saved from another diagnostic file):

 Pbot = hyai(L  ) + ( hybi(L  ) * PS(I,J) )
 Ptop = hyai(L+1) + ( hybi(L+1) * PS(I,J) )
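
The pressure calculation can be sketched in Python using the hybrid sigma-pressure definition p = A + B * PS, where the A coefficient and PS are in the same pressure units. The sketch assumes A (hyam, hyai) is in hPa with P0 = 1000 hPa, which is consistent with the lev and ilev values listed above being equal to A/1000 + B. The function names and the two-level coefficient arrays are illustrative assumptions, not values read from a file.

```python
P0 = 1000.0  # reference pressure [hPa]


def eta(a, b, p0=P0):
    """Approximate eta coordinate stored in lev/ilev: A/P0 + B."""
    return a / p0 + b


def pressure(a, b, ps):
    """Pressure [hPa] from hybrid coefficients: p = A + B * PS."""
    return a + b * ps


def box_edges(hyai, hybi, ps, level):
    """Return (Pbot, Ptop) for a grid box; level is a 0-based index."""
    pbot = pressure(hyai[level], hybi[level], ps)
    ptop = pressure(hyai[level + 1], hybi[level + 1], ps)
    return pbot, ptop


# Illustrative coefficients for the two lowest interfaces (assumed values,
# chosen so that eta reproduces the first two ilev entries listed above)
hyai = [0.0, 0.04804826]
hybi = [1.0, 0.984952]

print(eta(hyai[1], hybi[1]))
print(box_edges(hyai, hybi, ps=1000.0, level=0))
```

The surface pressure PS still has to come from another source, as noted above.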

Seb Eastham wrote:

Daniel Rothenberg has been in touch regarding how vertical co-ordinates are set up in CESM. It's mostly the same (as COARDS), with the exception that EVERY file which includes a vertical co-ordinate also includes both lev and ilev dimensions. In this case, lev is the cell mid-points, and ilev is cell edges ("i"nterfaces). Both lev and ilev are set up just as we discussed before, although CESM doesn't include the formula_terms attribute, which I think we should use. In this case formula_terms would be ap: hyai b: hybi p0: P0 ps: ps for the ilev dimension and ap: hyam b: hybm p0: P0 ps: ps for the lev dimension.
  1. For lev and ilev, they should be set to A/1000 + B, where P0 = 1000.
  2. Both ilev and lev should be in all 3D files (even though only one of the two dimensions is used in a given file).
  3. I think we can leave PS out of the files. It will make them non-compliant but I think it's one of those "accepted" non-compliances. If it turns out to be a problem we can revisit it later, but for plotting the expectation is that it's up to the user to figure out where they're getting PS from, rather than automatically archiving it in every file.

--Bob Yantosca (talk) 19:27, 5 September 2018 (UTC)

Previous issues that are now resolved

Enable compression in netCDF-4 output files

This update was included in v11-02a (approved 12 May 2017).

For more information about this issue, please see this post on our NcdfUtilities package wiki page.

--Bob Yantosca (talk) 17:50, 24 May 2018 (UTC)

Improve write speed of netCDF output files

This update was included in the GEOS-Chem v11-01 public release.

For more information about this issue, please see this post on our NcdfUtilities package wiki page.

--Bob Yantosca (talk) 19:05, 8 March 2017 (UTC)

GAMAP can now read GEOS-Chem restart files in netCDF format

This update was included in the GEOS-Chem v11-01 public release.

Starting in GEOS-Chem v11-01, all restart files are now saved in COARDS-compliant netCDF format. We have had to make some minor modifications to both GEOS-Chem and GAMAP in order to allow GAMAP to read these files. The table below gives a summary of these modifications.

GEOS-Chem or GAMAP? File Modification
GEOS-Chem GeosCore/gamap_mod.F In routine INIT_TRACERINFO, we now write metadata for all species (advected or not) to the tracerinfo.dat file under the ND45 tracer concentration diagnostic section. Because the netCDF restart file contains concentrations for both advected and non-advected species, we need to make sure that the tracerinfo.dat file created by GEOS-Chem contains metadata for all species.
GAMAP internals/ The prior algorithm always assumed that a netCDF file would end in either .nc or .nc4. We now have removed this restriction. We now split the filename string on . and then examine the substrings for nc or nc4 (case-insensitive).
GAMAP internals/ Added some minor modifications to read netCDF restart files:
  1. Determine vertical grid from the number of layers (if the Model global attribute is not specified).
  2. Assign category DXYP to variable AREA, which is included in the netCDF restart file.
  3. Assign category IJ-AVG-$ to variables beginning with either SPC_ or TRC_. Also remove the SPC_ and TRC_ from the variable name internally so that the variable name will match the metadata in tracerinfo.dat.
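
The file-name test and the prefix stripping described in the table can be illustrated with a short Python sketch. GAMAP itself is written in IDL, so this is not the GAMAP code, and the function names are hypothetical.

```python
def looks_like_netcdf(filename):
    """Split the name on '.' and look for an nc or nc4 part (case-insensitive)."""
    parts = filename.lower().split(".")
    return any(part in ("nc", "nc4") for part in parts)


def strip_species_prefix(varname):
    """Drop a leading SPC_ or TRC_ so the name matches tracerinfo.dat."""
    for prefix in ("SPC_", "TRC_"):
        if varname.startswith(prefix):
            return varname[len(prefix):]
    return varname


for name in ("GEOSChem_restart.201308010000.nc", "restart.NC4", "ctm.bpch"):
    print(name, looks_like_netcdf(name))
print(strip_species_prefix("SPC_NO"), strip_species_prefix("AREA"))
```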

--Bob Yantosca (talk) 19:29, 23 January 2017 (UTC)