Getting Started with GCHP

Last major update: April 6th, 2016

Objectives

GEOS-Chem High Performance (GCHP) represents the next generation of GEOS-Chem and is currently in development. The GEOS-Chem community mission is to advance understanding of human and natural influences on the environment through a comprehensive, state-of-the-science, readily accessible global model of atmospheric chemistry. We are releasing the GCHP development kit (Dev Kit) to (1) promote user engagement by encouraging users to explore the future of GEOS-Chem early in its development, and (2) encourage active community involvement in GCHP development as it advances. If you are getting started with GCHP, please take the time to join the GCHP Working Group mailing list so that you can stay up-to-date on the latest developments.

Engagement

The first objective of the GCHP Dev Kit release is to give users the opportunity to get familiar with GCHP’s capabilities. GCHP is intended to replicate all the capabilities of GEOS-Chem, from coarse-resolution runs to "transport-only" sensitivity analyses, while offering enhanced performance, such as running multi-thousand-CPU jobs that simulate full chemistry at 7 km resolution. Capabilities that distinguish GCHP from GEOS-Chem "Classic" include:

  • Flexible-resolution simulations, from ~4⨉5 (C24) to ~0.25⨉0.3125 (C360) without the need for code edits or recompilation
  • Use of the “cubed-sphere” (CS) grid, which eliminates the need for polar filtering and brings GEOS-Chem into line with the GMAO GEOS AGCM
  • MPI parallelization, allowing users to spread a single job across multiple machines

Whether or not users are interested in developing GCHP, familiarity with the Dev Kit release will provide an opportunity to be ahead of the curve.

Development

GEOS-Chem Classic performs parallelization using the OpenMP model, which implements shared-memory parallelization for a single machine. This allows GEOS-Chem Classic to be easily parallelized, but means that it can only run on a single machine. By contrast, GCHP uses the MPI model which implements distributed-memory parallelization. This model is what allows GCHP to run on more than one machine at a time.
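
As a concrete illustration of the difference (a minimal sketch; the thread and core counts are arbitrary examples), a GEOS-Chem Classic run is confined to the cores of a single machine, while a GCHP run can be launched across several:

# GEOS-Chem Classic: OpenMP shared-memory parallelism on one machine
export OMP_NUM_THREADS=8    # 8 threads, all on the local machine
./geos

# GCHP: MPI distributed-memory parallelism, optionally spanning machines
mpirun -n 12 ./geos         # 12 MPI processes, e.g. 6 per node on 2 nodes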

The only people to use GCHP thus far have been the small team developing it. We need people who don’t spend all day thinking about these things to try their hand at installing and running GCHP, so that we can learn which parts of the process are easy and which are problematic. We also want to encourage people to try modifying the installation; we know that we can build and run GCHP with a very narrow set of libraries, but the larger GEOS-Chem community will likely have (or want to use) other libraries such as MVAPICH2 and the PGIFORTRAN compiler.

Furthermore, GCHP is still a somewhat immature product. Of the sub-modules that have been implemented, many have only undergone the most basic testing, and we are already aware of a host of minor errors (see GCHP Development Roadmap). The more users who interact with GCHP, the more likely we are to pick up problems before the release of GCHP v1.0. If you need help getting set up or run into any issues, please let us know by e-mailing Lizzie Lundgren at elundgren@seas.harvard.edu.

Obtaining GCHP Source Code

GCHP is built to wrap around a standard GEOS-Chem installation. As such, the first step to obtaining GCHP is to download a standard GEOS-Chem code and run directory. Upgrading to GCHP is simply a case of:

  • Adding another directory into the GEOS-Chem code directory; and
  • Adding an additional set of files to the GEOS-Chem run directory.

Once this has been done, the same run and code directories can be used for either GEOS-Chem Classic or GCHP. GCHP is under active development and is still only an alpha release. As such, if you find that something is wrong, please consult the GCHP Development Roadmap to see if the issue you have found has a fix or workaround. If it does, re-downloading GCHP using the instructions below should resolve the issue.

Note: On the GEOS-Chem wiki, some of the git clone targets are shown as “git://git.as.harvard.edu/bmy/…”, whereas throughout these instructions clones are made from “https://bitbucket.org/gcst/…”. These two addresses previously mirrored one another, but the git.as.harvard.edu addresses are now being phased out. As such, these instructions use the new BitBucket repositories instead of the Harvard ones.

Step 1: Downloading GEOS-Chem Classic branch GCHP_master

GCHP is built off of a preliminary version of GEOS-Chem Classic v11-01 in branch GCHP_master. The code directory is maintained by the GEOS-Chem Support Team on BitBucket. To get a clean copy of GEOS-Chem v11-01 source code, go somewhere where you have a lot of free space - let’s say, '/mypath/GCHP' - and run the following command:

git clone -b GCHP_master https://bitbucket.org/gcst/gc_bleeding_edge.git Code.GCHP

Take note of where the code was downloaded to; in this example the source code is saved to directory /mypath/GCHP/Code.GCHP.

Step 2: Enabling GCHP within GEOS-Chem Classic

Now, move into your GEOS-Chem code directory (/mypath/GCHP/Code.GCHP) and run the following command:

git clone https://bitbucket.org/gcst/GCHP.git GCHP

This will clone the GCHP subdirectory into your primary GEOS-Chem source code directory. You now have all the code you need to run GCHP. Note that downloading the GCHP subdirectory does not change anything in the standard GEOS-Chem code; you can use /mypath/GCHP/Code.GCHP to run regular non-GCHP GEOS-Chem simulations and can generate run directories for them using your downloaded GEOS-Chem v11-01 Unit Tester (/mypath/UT).

Obtaining a GCHP Run Directory

Step 1: Downloading the GCHP Run Directory

After downloading the source code, you will need to set up a run directory for GCHP. The GCHP Dev Kit run directory is set up for the tropchem simulation with 4x5 degree output resolution. Note that, unlike in GEOS-Chem Classic, the input meteorology can be at a different resolution than the GCHP output. For the Dev Kit runs, we recommend using 2x2.5 input meteorology, which you can specify during run directory set-up.

Run directories can be created by the GEOS-Chem unit tester, as outlined on the GEOS-Chem wiki. First, download the preliminary GEOS-Chem v11-01 unit tester to /mypath/UT by cloning the GCHP_master branch of the GEOS-Chem Unit Test development repository ut_bleeding_edge from BitBucket. This branch is synced to work with the GCHP_master branch of the source code repository.

git clone -b GCHP_master https://bitbucket.org/gcst/ut_bleeding_edge.git UT

If you already have a clone of the GEOS-Chem ut_bleeding_edge repository, then from within that directory you can simply check out the GCHP_master branch and pull the latest updates:

git checkout GCHP_master
git pull

Next, complete the following steps to create a GCHP 4x5 tropchem run directory:

  1. Change directories into /mypath/UT/perl
  2. Copy the configuration text file CopyRunDirs.input to a new file called CopyRunDirs.GCHP
  3. Make the following edits to the new configuration file (see the sketch following these steps):
    1. Update DATA_ROOT as needed to point to your system's shared data directory
    2. Change variable UNIT_TEST_ROOT as needed to match the directory to which you cloned the unit tester (/mypath/UT)
    3. Change variable COPY_PATH to be the target directory where you want to store your run directories (e.g. /mypath/GCHP/rundirs)
    4. Uncomment the GCHP 4x5 tropchem run directory ("gchp" will be in the MET column)
  4. Run the following perl program to create the run directory /mypath/GCHP/rundirs/gchp_4x5_tropchem:
./gcCopyRunDirs CopyRunDirs.GCHP
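
For orientation, the edited settings in CopyRunDirs.GCHP would look something like the sketch below. The values are illustrative only; the exact layout and comment syntax follow whatever appears in your copy of CopyRunDirs.input:

DATA_ROOT      : /mypath/ExtData        # your system's shared data directory
UNIT_TEST_ROOT : /mypath/UT             # where you cloned the unit tester
COPY_PATH      : /mypath/GCHP/rundirs   # target directory for run directories
# ...
# In the list of run directories, uncomment the GCHP tropchem entry
# ("gchp" appears in the MET column):
   gchp   4x5   -   tropchem   ...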

Step 2: Setting up the GCHP Run Directory

Additional steps are required after downloading the GCHP run directory to set it up for GCHP compilation and runs. These steps include setting symbolic links to your source code, executable, and input data (restart, meteorology, emissions (HEMCO), and chemistry). Note that data paths in input.geos are ignored by GCHP, and paths set in HEMCO_Config.rc must be consistent with those in the ExtData.rc configuration file. ExtData.rc is unique to GCHP and is automatically set during this stage. See the GEOS-5 wiki for more information about ExtData.rc.

Run directory setup is done by the initialSetup.sh shell script included in the GCHP run directory, which guides you through the setup process. To start, note the full path of your GCHP code directory, navigate to your GCHP run directory, and then execute the following command:

./initialSetup.sh

You will first enter the path of your code directory, which will be stored as the symbolic link CodeDir. Note that you must write out the full path, with no symbolic links in it. Subsequent steps depend on whether you are on the Harvard Odyssey compute cluster or elsewhere.
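
If you are unsure whether your path contains symbolic links, you can first resolve it to a fully physical path (assuming GNU coreutils is available):

readlink -f /mypath/GCHP/Code.GCHP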

Odyssey Users

  1. Enter y when asked if you are on Odyssey.
  2. Enter 2 for the input meteorology field resolution. This will set up the symbolic link MetDir, which points to the default Odyssey GEOS-FP 2⨉2.5 meteorological data directory. Recall that the 4x5 degree resolution in the run directory name indicates GCHP output resolution only and is therefore independent of the resolution you specify here.
  3. Based on your entry in the previous step, the ExtData.rc file will automatically be updated with the input data paths needed for your GCHP run.

Non-Odyssey Users

  1. Enter n when asked if you are on Odyssey.
  2. Enter the path to the source code you downloaded.
  3. You will go through a series of prompts asking you to set up the paths to your data directories. You should have the primary path for your group's shared data directories handy for this step. Within that directory should be subdirectories called HEMCO, CHEM_INPUTS, and NC_RESTARTS, as well as several GEOS met data subdirectories of format GEOS_{res}.d. If you are missing any of these subdirectories, exit initialSetup.sh and find out where they are located before continuing.
    1. For the path containing met data, specify the location of the folder containing GEOS-FP formatted meteorological data at the input resolution you want to use. For example, enter /{shared_data_path}/GEOS_2x2.5.d/GEOS_FP if you want to input 2x2.5 met data and that directory is where the data is stored. Note that the input resolution can differ from the output resolution; the default output resolution of the GCHP run directory is 4x5.
    2. For the path to the HEMCO data directory, enter {shared_data_path}/HEMCO.
    3. For the path to GEOS-Chem restart files, enter {shared_data_path}/NC_RESTARTS.
    4. For the path to the CHEM_INPUTS directory, enter {shared_data_path}/CHEM_INPUTS.
  4. After the script has completed, you will need to set up your ExtData.rc file manually. You can do this by simply renaming the ExtData_{res}.rc file for the input resolution you are using to ExtData.rc, as shown below. That file uses the symbolic links you just created as well as the correct resolution-dependent input filenames.
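
For example, for 2x2.5 input meteorology (the exact {res} label in the filename is an assumption; check which ExtData_*.rc files your run directory actually contains):

mv ExtData_2x25.rc ExtData.rc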

Setting Up Your Environment

You should now have the following GEOS-Chem Dev Kit directories:

  1. Source code from gc_bleeding_edge repository branch GCHP_master at /mypath/GCHP/Code.GCHP
  2. Source code from GCHP repository branch master at /mypath/GCHP/Code.GCHP/GCHP
  3. Unit Tester from ut_bleeding_edge repository branch GCHP_master at /mypath/UT
  4. GCHP tropchem run directory at /mypath/GCHP/rundirs/gchp_4x5_tropchem

You are now in possession of a fully functioning copy of GCHP, provided that you have the right libraries. More on that in the Library and Resource Requirements section below.

Library and Resource Requirements

GCHP requires the following libraries to compile and run:

  • FORTRAN and C++ compiler (IFORT, PGIFORTRAN, ...)
  • NetCDF for FORTRAN and C++
  • MPI implementation (OpenMPI, MVAPICH2, ...)

Furthermore, to compile GCHP you need to meet all the following requirements:

  • Have your compiler, NetCDF and MPI implementation loaded
  • Be in your GCHP run directory
  • If working on Odyssey, be in an interactive SLURM session
  • Have at least 1 GB of memory available
  • [Optional] Have 4+ processors available

Information on how to set up your environment with the required libraries is given on the required software page. If you are working on Odyssey, see instructions on loading your environment in the section for Odyssey users below.

Harvard Odyssey Users

Each time you compile or run GCHP interactively, you must have your environment set up for GCHP (see also Compiling GCHP). This involves setting up an interactive session and loading all libraries required for GCHP by sourcing a GCHP-specific bashrc file. Several sample bashrc files are included in the run directory, each customized for a particular combination of Fortran compiler, MPI implementation, and compute cluster. Sourcing the bashrc is done automatically for you if you are compiling with build.sh, running interactively using the local Makefile, or submitting a run to SLURM. See the later sections on compiling and running GCHP for more information.

Setting Up an Interactive Session

All CPU- or memory-intensive work on Odyssey must be performed within a SLURM session. This includes not only simulations but also compilation and even copying large files, including the git clone commands used above. Throughout this document, it is assumed that you are operating within an interactive SLURM session.

If you are using the env repository set up by the GEOS-Chem Support Team and available on Bitbucket for your environment, you can use the shortcut script interactive_gchp located in the /env/bin directory to simplify SLURM commands. Simply pass (1) # nodes, (2) # CPUs, (3) memory per CPU in MB, (4) requested session duration in minutes, and (5) partition name in that order to the interactive_gchp shell script. The last two arguments are optional, with the defaults being 60 minutes and the jacob partition.

For example, to request 2 nodes, 12 CPUs total, and 1500 MB per CPU on the jacob partition for 3 hours, use the following command:

interactive_gchp 2 12 1500 180 jacob

CPUs will be evenly spread across nodes, so be sure to specify a number of CPUs divisible by the number of nodes. You will get an error message reminding you about this if you forget.

Alternatively, you may use the srun command directly rather than use the shortcut utility script. To create a 3-hour interactive session on the Jacob queue with 6 cores on 1 node and 3000 MB of RAM per CPU, run the following command:

srun -p jacob --pty --x11=first --mem-per-cpu=3000 -N 1 -n 6 -t 00-03:00 /bin/bash

If you want to run on more than one node, say 12 cores, distributed evenly across 2 nodes, use the following format:

srun -p jacob --pty --x11=first --ntasks-per-node=6 --mem-per-cpu=1500 -N 2 -n 12 -t 0-03:00 /bin/bash

The new argument “--ntasks-per-node=6” guarantees that the cores are evenly distributed over the nodes (a requirement for GCHP). Note also that the memory per CPU has been decreased to yield the same total memory request.

Loading GCHP Libraries

Once you have an interactive session, you can load all libraries required for GCHP. On Odyssey this can be achieved by changing directories into your GCHP run directory and sourcing one of the Odyssey-specific bashrc files that are included. For example:

cd gchp_4x5_tropchem
source GCHP.ifort13_openmpi_odyssey.bashrc

This will replace your current loaded Odyssey modules with IFORT 13, NetCDF, and OpenMPI implementations which were built together and have been tested with GCHP. Note that if you are compiling with build.sh, using the Makefile for compiling and running interactively, or submitting a GCHP job to SLURM, the GCHP bashrc file is sourced automatically. However, you MUST update the BASHRC variable within each of these files prior to use. You must make sure that the MPI implementation you choose is consistent with the GCHP source code. By default, the run directory and source code are set up to use ifort13 and OpenMPI.

Compiling GCHP

Once you have obtained the GCHP source code and run directory and have set up your environment, you are ready to compile GCHP. First-time compilation differs from subsequent compilations because it requires building large modules that will generally remain static. This is due to the structure of GCHP.

GCHP is made up of many different modules, all communicating via a system called MAPL. MAPL is itself built on the Earth System Modeling Framework (ESMF), and both systems are highly flexible but highly complex, with significant compilation times. Fortunately, MAPL and ESMF (as well as other components such as FV3 dycore) do not need to be recompiled unless parts of their code have been modified.

In this section we provide instructions on how to perform a full ("clean") compilation as well as a partial ("standard") compilation. The clean compilation must be done first since it cleans and compiles all modules, while the standard compilation may be used for subsequent recompilation as long as only GEOS-Chem code has changed.

For Odyssey users, the build.sh script within the run directory will perform basic compilation tasks; it may be invoked from the command line directly or through make. For all other users, see the Manual Compilation section below.

Odyssey Users

There is a shell script called build.sh available in the top-level GCHP run directory to facilitate source code cleaning and compiling. You may use this script by passing it an argument indicating clean and/or compile options upon execution, or you may use several make commands that invoke it and include additional functionalities such as writing to a log file. To view all available build.sh options, type the following in the terminal:

./build.sh help

This will display the following:

Arguments:
  Accepts single argument indicating clean and/or compile settings.
  Currently implemented arguments include:
     clean_gc         - classic only
     clean_nuclear    - GCHP, ESMF, MAPL, FVdycore (be careful!)
     clean_all        - classic, GCHP, ESMF, MAPL, FVdycore (be careful!)
     clean_mapl       - mapl and fvdycore only
     compile_debug    - turns on debug flags, no cleaning
     compile_standard - no cleaning
     compile_mapl     - includes fvdycore
     compile_clean    - cleans and compiles everything (be careful!)
Example usage:
  ./build.sh compile_standard

The build.sh script is for interactive compilation, meaning you must have an interactive session open to clean and compile with it. The more cores you have for your interactive session, the faster the compilation will be. With one core, a complete clean compile can take over an hour, so multiple cores are highly recommended if you are running ./build.sh compile_clean. Recompilations using ./build.sh compile_standard will have much shorter duration.

Prior to executing the build.sh script, you must open the file and specify (1) your bashrc, (2) your compiler for export as ESMF_COMPILER, and (3) your MPI implementation for export as ESMF_COMM. The default settings in the file are as follows (note the warning about source code):

###############################
###  Configurable Settings  ###
###############################

# Set bashrc (see run directory for samples)
BASHRC=GCHP.ifort13_openmpi_odyssey.bashrc

# Set compiler
export ESMF_COMPILER=intel

# Set MPI implementation
export ESMF_COMM=openmpi
#export ESMF_COMM=mvapich2

# WARNING: Code changes are necessary if switching from openmpi to MVAPICH2
#  To run GCHP with MVAPICH2, you must have the following updates:
#    (1) In GCHP/GIGC.mk, the MVAPICH2 lines for setting MPI_LIB are
#        uncommented and the OpenMPI lines are commented out
#    (2) In GCHP/Makefile, "export ESMF_COMM=mvapich2" is uncommented
#        and "export ESMF_COMM=openmpi" is commented out
#    (3) In build.sh within the run directory, BASHRC is set to a
#        bashrc that includes "mvapich2" in the filename and the
#        ESMF_COMM export is set to mvapich2
#   NOTE: eventually these changes will be automatic

To perform a clean build for the first time in your run directory, consisting of a total clean and compile of all GCHP code including MAPL, ESMF, and FVdycore, navigate to your run directory and type the command:

make compile_clean

This command runs ./build.sh compile_clean and sends stderr and stdout to both the terminal window and a log file called compile.log. At the very end of the log file, git information for both source code repositories is recorded for future reference of the code that was built. Note that you can check the source code status at any time by running make printbuildinfo within the top-level GCHP run directory.

The default compile commands for compile_clean, compile_standard, and compile_mapl are as follows:

   make -j${SLURM_NTASKS} NC_DIAG=y   CHEM=tropchem EXTERNAL_GRID=y   \
                          DEBUG=y     DEVEL=y       TRACEBACK=y       \
                          MET=geos-fp GRID=4x5      NO_REDUCED=y      \
                          UCX=n       hpc

The compile command for compile_debug is the same as above with BOUNDS=yes and FPEX=yes added for debugging purposes.

If you have only changed GEOS-Chem code and have not touched the GCHP structural files (e.g. MAPL), then you can recompile with ./build.sh compile_standard to save time. Running this will recompile only changed files (excluding MAPL, FVdycore, and ESMF). To run it and also output stderr and stdout to log file compile.log, run the following make command:

make compile_standard

If you find that you need to fully recompile GEOS-Chem, however, you can still avoid recompiling the MAPL framework code (as well as components such as the FV3 dycore). Do this simply by calling make realclean within your top-level source code directory prior to running make compile_standard. This will perform a clean build of the core GEOS-Chem components without necessitating a full recompilation of the more static components of GCHP.
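
For example, using the paths from this page:

cd /mypath/GCHP/Code.GCHP                   # top-level source code directory
make realclean                              # clean the GEOS-Chem core only
cd /mypath/GCHP/rundirs/gchp_4x5_tropchem   # back to the run directory
make compile_standard                       # recompile, leaving MAPL, ESMF, and FVdycore intact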

Manual Compilation

If you are working outside of the Odyssey compute cluster then you must compile manually. Non-Odyssey users must also set up several environment variables. Manual compilation involves cleaning both the GEOS-Chem Classic and GCHP source code, and compiling using make from within both code directories.

First, check that your environment is properly set up for GCHP. If your compiler is not IFORT or your MPI implementation is not OpenMPI, you must adjust the environment variables below accordingly.

Next, make sure that you have the following environment variables set (an example follows the list):

  • Required:
    • ESMF_COMPILER=intel
    • ESMF_COMM=openmpi
  • Optional:
    • ESMF_BOPT=O
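
In a bash shell, for the default IFORT/OpenMPI combination these would be set as follows:

export ESMF_COMPILER=intel
export ESMF_COMM=openmpi
export ESMF_BOPT=O    # optional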

Once you are certain that your environment is correctly set up, navigate to your code directory (e.g. /mypath/GCHP/Code.GCHP) and run the following command to clean the base GEOS-Chem Classic directory:

 make HPC=yes realclean 

After the process completes, enter the GCHP/ subdirectory and clean it using the following command:

 make EXTERNAL_GRID=yes DEVEL=yes the_nuclear_option 

Calling make with the_nuclear_option removes all compiled code in the directory. As implied by the name, this is not to be used lightly! The nuclear option is an alias, calling three commands in turn: wipeout_esmf, wipeout_mapl, and wipeout_fvdycore. Each of these components can take a long time (up to an hour) to compile, and all are structural components that are not affected by small changes in GEOS-Chem.

To compile GCHP, navigate back to the GEOS-Chem Classic directory and run:

 make -j4 {options} hpc 2>&1 | tee compile.log 

where you include the following options (the full assembled command is shown below):

NC_DIAG=yes 
CHEM=tropchem 
EXTERNAL_GRID=yes 
DEVEL=yes 
TRACEBACK=yes 
MET=geos-fp 
GRID=4x5 
NO_REDUCED=yes 
UCX=no 
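
Putting it all together, the assembled compile command is:

make -j4 NC_DIAG=yes CHEM=tropchem EXTERNAL_GRID=yes DEVEL=yes \
     TRACEBACK=yes MET=geos-fp GRID=4x5 NO_REDUCED=yes UCX=no \
     hpc 2>&1 | tee compile.log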

Note that we assume you are running with at least four processors available, as indicated by the inclusion of -j4 in the make command. If you do not have four processors available, omit the "-j4" argument from the compile commands.

Also note that the GRID=4x5 argument has no effect on the resolution at which the model runs since GCHP functions on the cubed-sphere grid. However, passing GRID is required so that components of GEOS-Chem Classic are functional.

All Users

Compiling GCHP produces a GCHP executable called geos that is stored in the bin/ sub-directory of your GEOS-Chem source code directory. Compiler output, including errors, will be stored in the log file if you choose to use one. If you run into problems and decide to contact the GEOS-Chem Support Team for help, please include the log file!
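
To verify that the executable was produced (using the source code path from this page):

ls -l /mypath/GCHP/Code.GCHP/bin/geos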

Note that even a successful compilation of the MAPL structural component of GCHP will usually throw a large number of warnings and even some non-fatal errors. As long as the compilation continues, these errors can almost always be ignored, and they will have no effect on the operation of GCHP. Many of these errors are due to leftover components in MAPL which are no longer used but which still try to compile. In future versions of MAPL, we hope to resolve most or all of these issues.

If at any time during compilation you run into an error specifying that the mpicxx command does not exist, then you do not have the proper libraries loaded. See the section above on setting up your environment. If you are an Odyssey user, you can check if you have the necessary libraries loaded by typing module list at the prompt and looking for the openmpi library in the printed list. If you do not have it loaded, simply source the GCHP bashrc file that comes with the run directory.

Running GCHP

Now that you have GCHP downloaded and compiled, you’re ready to start running things. First, however, it is a good idea to double-check that everything is set up properly. To start, check that your compilation was successful and that you have an executable by looking for the geos file in your run directory. Next, check that your input meteorology resolution is set in ExtData.rc as you intend. The meteorological data entries are collected at the top of the file, and each should end with a filename that includes the resolution. For example, when using 2x2.5 inputs your file should include:

./MetDir/%y4/%m2/GEOSFP.%y4%m2%d2.A1.2x25.nc

The resolution entry on each MetDir line in ExtData.rc must be consistent with the input met resolution you would like to use. Edit ExtData.rc if they do not match or, if you are an Odyssey user, repeat the steps to set up your run directory using the initialSetup.sh script provided.
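
A quick way to inspect all of the met entries at once is to search for the MetDir symbolic link in the file:

grep MetDir ExtData.rc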

Lastly, check that you have the proper modules loaded on your system. Forgetting to source your GCHP bashrc in a new session when running from the command line is a common error. Odyssey users can check that the GCHP bashrc has been sourced by running module list at the prompt and confirming that the preferred MPI implementation (e.g. openmpi) is loaded. If you intend to use the Makefile options or a SLURM run script, you can skip this step; instead, open the file and set your bashrc and other environment variables there.

There are several options in the GCHP run directory Makefile for running GCHP interactively, or you may directly call the run command from the command line. If you already have an executable, you can use the following command to run interactively and send output to both the terminal window and a log file called MAPL.log.

make run_interactive

If you want to recompile non-MAPL code and then run, you can use

make gchp_standard

Similarly, gchp_debug is available to recompile with floating point error and out-of-bounds checking turned on. See the Makefile for additional options.

Alternatively, you may submit your run to SLURM using the GCHP.run file if you are on the Odyssey compute cluster. Note that if you are using the GCHP Makefile or run script, open the file first and edit the configurable settings to reflect the environment you want to run in.

The rest of this section specifies how to run at the command line rather than use the Makefile. Once you understand the basics, you may switch to using the make options to streamline your work.

Quick start: 1-hr tropchem

The default GCHP run directory is set up for a 1-hour tropchem simulation at resolution C24, with 2x2.5 input met resolution and 4x5 output concentration resolution. C24 is approximately the cubed-sphere equivalent of 4x5. If you followed the above instructions for setting up your run directory, you should have specified 2x2.5 input resolution for the meteorology fields for your initial Dev Kit run and edited ExtData.rc to include paths that reflect this resolution. For Odyssey users, the ExtData.rc update occurred automatically, while for non-Odyssey users it required manual editing.

For this quick test, you will need an environment with the following:

  • 6 CPUs (minimum - see model description)
  • 1 node
  • At least 2500 MB of memory per CPU
  • Your compiler, NetCDF and MPI implementation loaded

Odyssey users should refer to the Harvard Odyssey Users Environment Setup section of this page for a refresher on how to set up your environment prior to running GCHP.

WARNING! It appears that GFED might not be handled correctly by GCHP. We recommend that users disable fire emissions (GFED) in HEMCO_Config.rc for now. You will also need to disable the BOND emissions dataset as this interacts with the GFED dataset.

Once your run directory is all set up, start the simulation by typing the following:

mpirun -n 6 ./geos 2>&1 | tee 1hr_mapl.log

This command can be broken down as follows:

  • mpirun executes an MPI-enabled executable and associates the necessary resources. This is a version-dependent executable; some MPI implementations use other commands, such as mpiexec.
  • -n 6 specifies how many individual CPU cores are requested for the run. The number given here should always be the total number of cores, regardless of how many nodes they are spread over, and must be a multiple of 6 (at least one for each of the cubed sphere faces, and the same number for each face).
  • ./geos is the local GEOS-Chem executable, as with GEOS-Chem Classic
  • 2>&1 | tee 1hr_mapl.log is a bash shell-specific means of collecting all MAPL output (standard and error) that is written to the screen into a file.

Note that the output collected to the log file specified when invoking mpirun is created by MAPL. The more traditional GEOS-Chem log output is automatically sent to a file defined in configuration file GCHP.rc. By default, its format is PET%%%%%.GEOSCHEMchem.log, where %%%%% is replaced at run-time with a processor ID (typically 00000). PET stands for persistent execution thread. Unlike MAPL, which sends output to the log from ALL threads, GEOS-Chem only outputs from a single thread.

Once the simulation is complete, there should be two netCDF output files in the OutputDir sub-directory. To get started with manipulating output data, see GCHP output data.

Basic test cases

The next step is to try running some things of your own! These cases will require minor modifications to various elements of GCHP’s run directory, and should help to familiarize you with how exactly GCHP works. If you run into problems, please e-mail Lizzie Lundgren (elundgren@seas.harvard.edu).

Basic case 1: Changing resolution and moving to multiple nodes

Re-run the quick start case at C48 (cube face side length N = 48 grid cells) resolution, with a shorter timestep (Δt = 600 s). This time, use a larger number of CPUs: say 12 cores, spread evenly across 2 nodes (C = 12, M = 2). Note that GCHP requires that cores always be distributed evenly across nodes. You will then need to change the following files to complete the change of resolution, timestep, and core layout:

File (.rc)      Changes for grid CN        Changes for timestep Δt    Changes for core layout CxM
GCHP            IM=N, JM=6N,               HEARTBEAT_DT=Δt, *_DT=Δt   NX=M, NY=C/M
                GRIDNAME=PE{N}x{6N}-CF
CAP             -                          HEARTBEAT_DT=Δt            -
fvcore_layout   npx=N, npy=N               dt=Δt                      -
HISTORY         -                          -                          CoresPerNode=C/M

For the specific case of C48 with a timestep of 600 s, distributed as shown, using 12 cores, you should have:

File (.rc)      Changes for grid C48       Changes for timestep 600 s   Changes for core layout 12x2
GCHP            IM=48, JM=288,             HEARTBEAT_DT=600, *_DT=600   NX=2, NY=6
                GRIDNAME=PE48x288-CF
CAP             -                          HEARTBEAT_DT=600             -
fvcore_layout   npx=48, npy=48             dt=600                       -
HISTORY         -                          -                            CoresPerNode=6

Finally, use mpirun as normal (mpirun -n 12 ./geos 2>&1 | tee log).

A note regarding NX and NY: NX and NY specify the domain decomposition; that is, how the surface of the cubed sphere will be split up between the cores. NX corresponds to the number of processors to use per N cells in the X direction, where N is the cube side length. NY corresponds to the number of processors per N cells in the Y direction, but must also include an additional factor of 6, corresponding to the number of cube faces. Therefore any multiple of 6 is a valid value for NY, and the only other rigid constraint is that (NX*NY) = NP, where NP is the total number of processors assigned to the job. However, if possible, specifying NX = NY/6 will provide an optimal distribution of cores as it minimizes the amount of communication required. The number of cores requested should therefore ideally be 6*C*C, where C is an integer factor of N. For example, C=4 would give:

  • NX = C = 4
  • NY = 6*C = 24
  • NP = 6*C*C = 96

This layout would be valid for any simulation where N is a multiple of 4. The absolute minimum case, C=1, provides NX=1, NY=6 and NP=6.

Basic case 2: Running a coarse simulation with high-resolution met data

So far, all the examples have used coarse (2x2.5) meteorological data, for the purposes of getting things started quickly. However, a major feature of GCHP is that it can read native-resolution meteorological data without regridding or preprocessing. To allow tests of this feature, a small archive of native-resolution meteorological data covering 2015-07-01 through 2015-07-10 can be found at

/n/seasasfs01/gcgrid/data/GEOS_0.25x0.3125.d

To use this data in place of the standard GEOS-Chem meteorological data, you need to perform two changes. First, you need to change the target of your MetDir link. To remove your existing link, type

 unlink MetDir 

Then, to establish a new link, type (on Odyssey)

 ln -s /n/seasasfs01/gcgrid/data/GEOS_0.25x0.3125.d/GEOS_FP MetDir 

This will establish a link to the native-resolution meteorological data. Now, open ExtData.rc and perform a find-and-replace, changing 2x25.nc to Native.nc for all the meteorological data input files (collected at the top of ExtData.rc); one way to do this is shown below. Your GCHP run should now use the higher-resolution meteorological data. Note that this comes at a computational cost; however, it significantly reduces the artifacts associated with using coarse-resolution meteorological data on a foreign grid.
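
One way to perform the find-and-replace from the command line (a sketch assuming GNU sed; back the file up first to be safe):

cp ExtData.rc ExtData.rc.bak
sed -i 's/2x25\.nc/Native.nc/g' ExtData.rc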

Advanced test cases

The following cases are deliberately light on setup information, to see how easy or difficult users find it to modify the GCHP run and code directories to accomplish what is needed. If you succeed in running any of these cases (or, perhaps more importantly, if you find that you can’t), please e-mail Lizzie Lundgren at elundgren@seas.harvard.edu with details. The more detail the better, but please include at least the following:

  • The test case name (if applicable)
  • The resolution(s) you ran it at
  • Whether the run completed or not

Advanced case 1: Run GCHP with a restart file

Run GCHP once for at least ten days in any chemically-active configuration, generate a restart file, and run GCHP again from that restart file. To help you get started, Odyssey users can find some non-zero restart files at

/n/regal/jacob_lab/seastham/GCHP_Restarts

Copy one of the files to your run directory and change GCHP.rc to read

GIGCchem_INTERNAL_RESTART_FILE: +gcchem_internal_checkpoint_c24.nc
GIGCchem_INTERNAL_CHECKPOINT_FILE: gcchem_internal_checkpoint_c24.nc

The + means that any missing values will be ignored rather than causing the simulation to fail. Note that the restart file has no date or time markers and will be overwritten at the end of the run, so be sure to back up your restart files if you wish to reuse them, as shown below.
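
For example, to keep a copy before launching a run that would overwrite it:

cp gcchem_internal_checkpoint_c24.nc gcchem_internal_checkpoint_c24.nc.bak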

Advanced case 2: GCHP speedrun

Initialize GCHP with a non-zero restart file. Run with 66 tracers but with no processes except advective transport. Reduce run time as much as possible by setting unnecessary filepaths in ExtData.rc to “/dev/null”. This will result in them being set to default values without spending time reading files.

Advanced case 3: Changing tracer count

Add a new, passive tracer to GCHP. To do this, you will need to:

  • Remove all tracers in input.geos and replace them with one tracer called PASV. Set the listed number of tracers to 1
  • Modify the file Chem_Registry.rc in CodeDir/GCHP/Registry so that all tracers (TRC_XYZ) are removed, and replace them with the entry TRC_PASV. NOTE: You must perform a clean compile after changing Chem_Registry.rc or the changes will not take effect!
  • Disable chemistry, deposition and emissions in input.geos
  • Remove all tracer outputs in HISTORY.rc and replace them with one output giving TRC_PASV

If you find that you can run GCHP with these modifications, all that remains is to obtain a non-zero restart file. This will be available soon - if you reach this point, please contact Lizzie Lundgren.

Return to GCHP MainPage