Developing GCHP

From Geos-chem
Revision as of 17:26, 20 October 2020 by Lizzie Lundgren (Talk | contribs)



  1. Hardware and Software Requirements
  2. Downloading Source Code and Data Directories
  3. Obtaining a Run Directory
  4. Setting Up the GCHP Environment
  5. Compiling
  6. Running GCHP: Basics
  7. Running GCHP: Configuration
  8. Output Data
  9. Developing GCHP
  10. Run Configuration Files


Please note that documentation on this page primarily reflects the latest GCHP public release which is currently the GCHP 12 series. The documentation will be updated for the GCHP 13.0.0 release over the coming months.

Overview

GCHP works as a layer around GEOS-Chem, simulating the more complex environment of a full atmospheric general circulation model (AGCM). Most model updates will involve editing GEOS-Chem source code as you would with GEOS-Chem Classic (GCC). However, certain updates, such as specifying output variables or adding new input fields, may require development within the GCHP-specific source code. In addition, debugging will sometimes lead you into the MAPL source code. This page provides an overview of the code structure to help you navigate and debug GCHP.

This page is a work in progress. Please send feedback and comments to the GEOS-Chem Support Team, specifying what you would like to see clarified or added.

GCHP Architecture

High-level Execution of GEOS-Chem Classic

GCC primarily consists of a single, monolithic code. When running the GEOS-Chem executable geos, the main routine in GeosCore/main.F performs the following functions:

  1. Read in simulation settings from input.geos
  2. Set up arrays to hold data such as species concentrations and meteorological data
  3. Loop through the following steps until the simulation is complete:
    1. Read meteorological data into the State_Met object from pre-determined locations
    2. Calculate emissions in each grid box via HEMCO
    3. Calculate chemistry in each grid box via FlexChem
    4. Calculate transport between grid boxes

Although the code for each of these functions is found in different files (e.g. chemistry_mod.F, transport_mod.F), all of the routines are called from main.F.
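The control flow described above can be sketched in a few lines of Python. This is only an illustrative toy, not GEOS-Chem code: the class ClassicDriver and its method names are hypothetical stand-ins for the Fortran routines called from main.F, and each step simply records what it would do.

```python
from dataclasses import dataclass, field


@dataclass
class ClassicDriver:
    """Hypothetical stand-in for the role main.F plays in GEOS-Chem Classic."""

    n_steps: int
    log: list = field(default_factory=list)

    # One-time setup steps
    def read_settings(self):
        self.log.append("read input.geos")

    def allocate_arrays(self):
        self.log.append("allocate state arrays")

    # Per-timestep steps, all called directly by the single driver
    def read_met(self, step):
        self.log.append(f"step {step}: read met data into State_Met")

    def emissions(self, step):
        self.log.append(f"step {step}: emissions (HEMCO)")

    def chemistry(self, step):
        self.log.append(f"step {step}: chemistry (FlexChem)")

    def transport(self, step):
        self.log.append(f"step {step}: transport")

    def run(self):
        self.read_settings()
        self.allocate_arrays()
        for step in range(self.n_steps):
            self.read_met(step)
            self.emissions(step)
            self.chemistry(step)
            self.transport(step)
        return self.log


if __name__ == "__main__":
    for line in ClassicDriver(n_steps=2).run():
        print(line)
```

The point of the sketch is that one driver owns the whole sequence: every science step is a direct call from the same loop.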

High-level Execution of GCHP

The primary difference with GCHP is that main.F is replaced by the GMAO MAPL framework. MAPL provides an interface to ESMF that allows the different components to be entirely unaware of each other's existence, with all communication between them standardized. The functional flow now looks more like this:

  1. Initialize the MAPL_Cap process. MAPL will:
    1. Establish a generic input component, called ExtData
    2. Establish a generic output component, called History
    3. Establish a generic CTM component, called GCHP
  2. Determine which modules will be performing which function. In GCHP, GEOS-Chem will calculate chemistry, FV3Dycore will calculate transport, and HEMCO will calculate emissions.
  3. For each component, send an “Initialize” command
  4. Send the “Run” command to the CTM component. The CTM component will loop through the following steps:
    1. Request input data from ExtData
    2. Send a “Run” command to each component in the CTM
    3. Send output data to History
  5. Once the CTM component is done, send a “Finalize” command to all components and exit
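The Initialize, Run, and Finalize sequence in the list above can be illustrated with a small Python toy. It is a sketch under stated assumptions, not the MAPL or ESMF API: the Component class and the run_cap function are hypothetical, the three child names simply mirror step 2 of the list rather than actual GridComp names, and the order in which children run is arbitrary here.

```python
class Component:
    """Hypothetical gridded component with the three standard phases."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def initialize(self, log):
        # A parent initializes itself, then passes the command to its children
        log.append(f"Initialize {self.name}")
        for child in self.children:
            child.initialize(log)

    def run(self, log):
        # The parent sends a "Run" command to each child component
        for child in self.children:
            log.append(f"Run {child.name}")

    def finalize(self, log):
        # Children are finalized before their parent
        for child in self.children:
            child.finalize(log)
        log.append(f"Finalize {self.name}")


def run_cap(n_steps):
    """Drive the toy hierarchy in the order the list above describes."""
    log = []
    ext_data = Component("ExtData")
    history = Component("History")
    ctm = Component("GCHP", children=[
        Component("emissions (HEMCO)"),
        Component("chemistry (GEOS-Chem)"),
        Component("transport (FV3Dycore)"),
    ])
    components = [ext_data, history, ctm]

    for comp in components:
        comp.initialize(log)
    for _ in range(n_steps):
        log.append("Request input from ExtData")
        ctm.run(log)
        log.append("Send output to History")
    for comp in components:
        comp.finalize(log)
    return log


if __name__ == "__main__":
    print("\n".join(run_cap(n_steps=1)))
```

Unlike the Classic driver, no component here calls another one directly; each only responds to the commands it receives.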

All ancillary operations, such as data regridding and parallelization, are handled by MAPL. Each core is unaware of the existence of every other core. This means that, in a 6-CPU run, there are six distinct instances of the GEOS-Chem component running; each one sees ⅙ of the available domain and is fed data as if that domain were all that existed. Components can request data from other domains (e.g. the transport core will request data from adjacent domains), but this communication is all handled through MAPL.
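The idea that each instance sees only its own share of the domain, with neighbor data supplied by an intermediary, can be sketched as follows. This is a deliberately simplified assumption: a one-dimensional, periodic domain split evenly across ranks, whereas GCHP decomposes the faces of a cubed sphere. The names decompose and Mediator are hypothetical and do not correspond to MAPL routines.

```python
def decompose(n_cells, n_ranks):
    """Return one (start, stop) index range per rank, covering n_cells."""
    base, extra = divmod(n_cells, n_ranks)
    ranges = []
    start = 0
    for rank in range(n_ranks):
        # The first `extra` ranks take one additional cell each
        stop = start + base + (1 if rank < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


class Mediator:
    """Hypothetical stand-in for the framework that owns the global field."""

    def __init__(self, global_field, n_ranks):
        self.global_field = list(global_field)
        self.ranges = decompose(len(self.global_field), n_ranks)

    def local_view(self, rank):
        """The only data a component instance on this rank gets to see."""
        start, stop = self.ranges[rank]
        return self.global_field[start:stop]

    def halo(self, rank):
        """Edge values of the two neighbouring subdomains (periodic domain)."""
        n = len(self.global_field)
        start, stop = self.ranges[rank]
        left = self.global_field[(start - 1) % n]
        right = self.global_field[stop % n]
        return left, right


if __name__ == "__main__":
    mediator = Mediator(range(12), n_ranks=6)
    for rank in range(6):
        print(rank, mediator.local_view(rank), mediator.halo(rank))
```

A component instance never indexes into the global field itself; anything outside its local view arrives through the mediator, which is the role the text assigns to MAPL.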

ESMF/MAPL Gridded Component Hierarchy

Figure 1. A basic hierarchy of Gridded Components in the GCHP.

The presence and structure of the GCHP configuration files are due to the ESMF and MAPL structure on which GCHP is built. The basic element of an ESMF program is the "component", of which there are two types: gridded and coupler. Components are organized and interact with one another hierarchically, as parent and child. GCHP is built exclusively of gridded components, often denoted "GC" or "GridComp", with the top-level, or Cap, component simply denoted "Cap" (Figure 1). The Cap is a simple MAPL program that initializes ESMF, MAPL, and associated resources. It has three children: Root, History, and ExtData. Below are brief descriptions of each:

  • Root: The Root component controls the operation and interaction of all of the components comprising the model system. Hierarchically, it is the parent or ancestor of all scientific operations. The only operations that occur outside of Root are the initialization by Cap, ExtData, and History. The Root component can be given a name, specified in GCHP configuration file Cap.rc. The Root name for GCHP is simply "GCHP".
  • ExtData: ExtData stands for "External Data" and is an internal MAPL gridded component used to read data from NetCDF files on disk. More specifically, ExtData populates fields in the "Import" states within the MAPL hierarchy. Only fields designated as part of a component's import state can be filled with ExtData. The GCHP configuration file ExtData.rc gives the ExtData component information about the input data, such as file path and read frequency.
  • History: The History component is an internal MAPL gridded component used to manage output streams from a MAPL hierarchy. For GCHP, it is used for writing output to NetCDF files. History is able to write to file any variables that exist in the "Export" states of any component in the GCHP hierarchy. It also has some limited capability to interpolate the fields horizontally before writing them. Information about which variables to write is specified in the GCHP configuration file HISTORY.rc.
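The relationship between import states, export states, ExtData, and History described above can be sketched in Python. This is a conceptual toy rather than the MAPL API: GridComp, ext_data_fill, and history_collect are hypothetical names, and the field names in the usage example are chosen purely for illustration.

```python
class GridComp:
    """Hypothetical gridded component with import and export states."""

    def __init__(self, name, imports=(), exports=(), children=()):
        self.name = name
        self.imports = {field: None for field in imports}
        self.exports = {field: None for field in exports}
        self.children = list(children)

    def walk(self):
        """Yield this component and all of its descendants."""
        yield self
        for child in self.children:
            yield from child.walk()


def ext_data_fill(root, files):
    """Fill only fields that some component declares in its import state."""
    filled = []
    for comp in root.walk():
        for field in comp.imports:
            if field in files:
                comp.imports[field] = files[field]
                filled.append((comp.name, field))
    return filled


def history_collect(root, requested):
    """Collect requested fields found in the export state of any component."""
    found = {}
    for comp in root.walk():
        for field, value in comp.exports.items():
            if field in requested:
                found[field] = value
    return found


# Usage: one child component under a root named "GCHP"
chem = GridComp("chem", imports=["T"], exports=["SpeciesConc_O3"])
root = GridComp("GCHP", children=[chem])

# "UNUSED" is on disk but no component imports it, so it is never read in
filled = ext_data_fill(root, {"T": 280.0, "UNUSED": 1.0})

# The component computes a value and places it in its export state
chem.exports["SpeciesConc_O3"] = 4.0e-8

# "NotExported" is requested but no component exports it, so it is not found
output = history_collect(root, {"SpeciesConc_O3", "NotExported"})
```

In this toy a requested field that is missing from every export state is silently skipped; real History behavior for that case may differ.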

Source Code Structure

GCHP source code can be sub-divided into five parts:

  1. GEOS-Chem
  2. GCHP wrapper interface routines (GCHP)
  3. Cubed-Sphere Finite Volume Dynamical Core (FVDycore)
  4. Earth System Modeling Framework (ESMF)
  5. NASA-GMAO Mapping, Analysis and Prediction Layer (MAPL)

The GEOS-Chem source code is the same as you would download for running GCC. It contains C preprocessor directives specifying which parts of the code should be compiled in a high performance computing (HPC) environment. You can enable HPC in GEOS-Chem by additionally downloading the GCHP wrapper and storing it in the top-level GEOS-Chem source code directory as a sub-directory called GCHP. The GCHP directory is designed to use GEOS-Chem source code within a set of wrapper functions that interface GEOS-Chem's routines to the ESMF using, in part, MAPL and the HPC-capable cubed-sphere dynamics core FVDycore.

The GCHP directory contains four subdirectories and several Fortran-90 and header files. Descriptions of each are as follows:

  • FVdycoreCubed_GridComp: The HPC-capable cubed-sphere dynamics core. GEOS-Chem's serial dynamical core is not capable of operating in a distributed environment, requiring that an HPC-capable version be included within the GCHP system. In 2014, NASA GMAO made available a stand-alone version of the Finite-Volume Cubed-Sphere Dynamical Core used in GEOS, which was adapted to the GCHP and resides within the FVdycoreCubed_GridComp subdirectory. The FV dycore is able to read in meteorological fields in either lat-lon or cubed sphere formats.
  • Shared: Contains NASA GMAO's MAPL and Shared library packages used to facilitate coupling between components and provide the primary interface with ESMF. Problems with MAPL will lead you into this directory. However, be aware that the error traceback for many run directory problems will lead you to MAPL code even though the problem is usually not in the code itself. Carefully check that your configuration files (*.rc) are properly set before attempting to change MAPL code to fix the issue.
  • ESMF: Contains ESMF infrastructure source code for version v5.2.0rp2. See ESMF/README in the source code for more information.
  • Registry: Contains information used by MAPL at compile time to generate the Fortran interface between the various quantities needed by GEOS-Chem and ESMF, MAPL, and FVdycore.
  • *.F90 and *.H files in the GCHP directory: These Fortran routines and header files replace the GEOS-Chem Classic main.F functionality. They also include ESMF and MAPL interface code that couples GEOS-Chem routines in an ESMF environment. gigc_chunk_mod.F90 calls the various methods within GEOS-Chem necessary to input, initialize, calculate, and output GEOS-Chem scientific quantities. gigc_history_exports_mod.F90 handles the GCHP diagnostics. Files that contain GridComp in the name are ESMF gridded components, which can be thought of as the building blocks of an ESMF application, each with imports, exports, and an internal state.

GCHP Updates Required with GEOS-Chem Classic Updates

GEOS-Chem Update: Add 3D State_Met field
  • Implications for GCHP: GCHP compatibility with gfortran 6+ requires nullifying rather than deallocating 3D State_Met fields.
  • GCHP Update Required: Edit subroutine cleanup_state_met_mod in Headers/state_met_mod.F90 to use NULLIFY for GCHP and DEALLOCATE for GEOS-Chem. A C-preprocessor ifdef is already in the subroutine to do this for existing fields.
  • Notes: If you deallocate rather than nullify a 3D State_Met field in GCHP then your run will hang during finalization if using gfortran 6+.

GEOS-Chem Update: Add or remove external met-field of any dimension
  • Implications for GCHP: GCHP State_Met fields that are read from file are specified as imports in configuration file ExtData.rc and in source code file GCHP/Includes_Before_Run.H.
  • GCHP Update Required:
    1. Add or remove the meteorology source to ExtData.rc in the run directory
    2. Add or remove setting the State_Met field to the import in GCHP/Includes_Before_Run.H
    3. Add or remove the meteorology field from the Import State in GCHP/Registry/Chem_Registry.rc
  • Notes: If you do not import the met-field via ExtData.rc then later references to it will cause a run fail. If you omit the met-field from both ExtData.rc and Includes_Before_Run.H then you will introduce a silent bug where the State_Met field is always zero.

GEOS-Chem Update: Add or remove field in State_Met, State_Chm, or State_Diag
  • Implications for GCHP: All state fields are individually listed in HISTORY.rc as potential diagnostic outputs since wildcards are not used in GCHP History.
  • GCHP Update Required: Update configuration file HISTORY.rc based on your changes to state variables.
  • Notes: Omitting an existing state variable from HISTORY.rc will not cause an error but the field will not be included in any output collections. Including a state variable that no longer exists will cause an error during run-time when metadata for the field is not found in GEOS-Chem.

GEOS-Chem Update: Add or remove advected species
  • Implications for GCHP: All advected species are listed in HISTORY.rc for diagnostic output since GCHP does not use wildcards.
  • GCHP Update Required: Update the SpeciesConc collections in configuration file HISTORY.rc when changing the set of advected species.
  • Notes: Omitting an existing species from HISTORY.rc will not cause an error but the species concentration will not be output in the diagnostics. Including an obsolete species in the HISTORY.rc species list will result in an error during run-time.

GEOS-Chem Update: Change arguments list passed to subroutine called in main.F
  • Implications for GCHP: GCHP file gigc_chunk_mod.F90 is the equivalent of GEOS-Chem Classic main.F and therefore contains many of the same calls to subroutines.
  • GCHP Update Required: Check if the subroutine you are modifying is called within GCHP and update accordingly.
  • Notes: Failing to update the arguments passed to a subroutine called in GCHP will result in a compile error.

GEOS-Chem Update: Add or remove subroutines called in main.F
  • Implications for GCHP: GCHP file gigc_chunk_mod.F90 is the equivalent of GEOS-Chem Classic main.F and therefore contains many of the same calls to subroutines.
  • GCHP Update Required: Add or remove the call to the subroutine within GCHP module gigc_chunk_mod.F90, where appropriate. Be sure to understand whether the functionality is relevant or required for GCHP prior to adding or removing a subroutine call.
  • Notes: Omitting a new GEOS-Chem functionality in GCHP may introduce a silent bug and result in diverging model output. Removal of a GEOS-Chem subroutine that is still called in GCHP will result in a compile error.

GEOS-Chem Update: Add, remove, or modify emissions in HEMCO_Config.rc
  • Implications for GCHP: GEOS-Chem Classic uses HEMCO for emissions I/O, regridding, and computation and gets all its information from HEMCO_Config.rc alone. In contrast, GCHP uses two configuration files for emissions: HEMCO_Config.rc and ExtData.rc. The former is used for HEMCO names, scaling, category, and hierarchy and the latter is used for file I/O and regridding handled by MAPL/ESMF.
  • GCHP Update Required: For consistency, all GEOS-Chem emissions updates should be incorporated into the GCHP HEMCO_Config.rc file, even for fields that are not used such as filename, dimensions, units, and time information. HEMCO_Config.rc fields not used in GCHP may still be checked for errors in HEMCO and therefore must conform to HEMCO rules. Units, dimensions, filename, HEMCO name, and file variable name, among other settings, must be incorporated into ExtData.rc.
  • Notes: Inconsistencies between HEMCO_Config.rc and ExtData.rc, and format issues in either, will result in GCHP run-time errors. Issues in HEMCO_Config.rc will cause GCHP to crash immediately, while issues in ExtData.rc will occur later on during initialization. Isolate errors by turning off or commenting out emissions in HEMCO_Config.rc which will prompt GCHP to ignore them in ExtData.rc.

GEOS-Chem Update: Update GEOS-Chem initial restart files
  • Implications for GCHP: GCHP restart files are GEOS-Chem Classic restart files regridded to cubed sphere. For reasonable GEOS-Chem Classic versus GCHP benchmark comparisons, the GCHP and GEOS-Chem initial restarts should be updated in sync.
  • GCHP Update Required: Regrid GEOS-Chem Classic 4x5 standard restart file to c24, c48, c90, c180, and c360 using csregridtool.
  • Notes: Diverging GEOS-Chem Classic and GCHP restart files will result in benchmark results that are not comparable.
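Because the entries above for state fields and advected species describe two distinct failure modes (a field silently missing from output versus a run-time error), a quick pre-run comparison of names can help. The following is a minimal sketch, assuming you have already extracted the two name lists yourself. It does not parse HISTORY.rc, and the function name check_history_fields and the example field names are hypothetical.

```python
def check_history_fields(model_fields, history_fields):
    """Compare field names known to the model with those listed for output.

    model_fields:   names that currently exist in the model (assumed input)
    history_fields: names listed in HISTORY.rc (assumed already extracted)
    """
    model = set(model_fields)
    listed = set(history_fields)
    return {
        # Exists in the model but not listed: no error, but never written
        "not_output": sorted(model - listed),
        # Listed but no longer in the model: expected to fail at run time
        "runtime_error": sorted(listed - model),
    }


if __name__ == "__main__":
    report = check_history_fields(
        model_fields=["SpeciesConc_O3", "SpeciesConc_NO2", "SpeciesConc_CO"],
        history_fields=["SpeciesConc_O3", "SpeciesConc_CO", "SpeciesConc_OLD"],
    )
    print(report)
```

An empty "runtime_error" list is the condition to aim for before launching a run; entries under "not_output" are only a problem if you wanted those fields in your diagnostics.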

--Lizzie Lundgren (talk) 20:53, 28 August 2018 (UTC)
