Getting Started with GCHP
GEOS-Chem High Performance (GCHP) represents the next generation of GEOS-Chem. Capabilities that distinguish GCHP from standard GEOS-Chem (GEOS-Chem "classic" or GCC) include:
- Flexible-resolution simulations, from ~4° x 5° (C24) to ~0.25° x 0.3125° (C360) without the need for code edits or recompilation
- Use of the cubed-sphere (CS) grid, which eliminates the need for polar filtering and brings GEOS-Chem into line with the NASA GEOS AGCM
- MPI distributed-memory parallelization which enables running a single job across multiple machines
This page guides you through the wiki pages needed to get started with the latest release of GCHP. Whether or not you plan to do near-term research with GCHP, we encourage you to set up the model, experiment with it, and join the GCHP community by signing up for the GCHP Working Group mailing list. GCHP is very much a team effort, and we invite you to contribute by testing it out and telling us how it goes via the GEOS-Chem Support Team (geos-chem-support [at] as.harvard.edu). We also welcome you to join our GCHP Slack workspace for easy contact with other GCHP users; contact Lizzie Lundgren (elundgren [at] seas.harvard.edu) to join.
GCHP Quick Start Guide
- Hardware and Software Requirements
- Downloading Source Code and Data Directories
- Obtaining a Run Directory
- Setting Up the GCHP Environment
- Running GCHP: Basics
- Running GCHP: Configuration
- Output Data
- Developing GCHP
- Run Configuration Files
The GCHP Quick Start Guide always reflects the most recent release of GCHP. If you are an experienced GCHP user, however, it is unlikely you will reread the entire manual for every update. Instead, check this section regularly to see what is new for getting GCHP up and running, then use the Quick Start Guide to home in on the details. These notices will periodically be archived elsewhere (TBD) so that only the most recent updates are featured here.
New in 12.5.0
This version includes a major update to MAPL and associated changes in FV3 advection, along with a few usability improvements in the run directory.
- This version retires external tile files, replacing them with online calculation of regridding weights using ESMF. This means the TileFiles symbolic link is gone from the run directory, and you no longer need to generate tile files for new input files.
- Compiling MAPL requires the Goddard Fortran Template Library (gFTL) which contains various utility types. Downloading and building this library is a one-time step. You will be guided on how to do it when you create a run directory with this version for the first time.
- A new environment variable, gFTL, containing the path to the library is now expected. Setting it automatically is included in all sample environment scripts; see those files for how to set the variable in your own environment files.
- A previous bug preventing GCHP from running after being built with the GNU Fortran compiler (gfortran) is now fixed. Beware, however, that there is a performance penalty in advection when using gfortran rather than ifort.
- The final output restart file (checkpoint) is now renamed by all sample run scripts to include the datetime and the string 'restart' in the filename. While renaming the file has the benefit of not overwriting restart files across multiple runs, this was not the primary motivation: in the new version of MAPL, if the checkpoint file is already present at the start of the run, MAPL crashes when it tries to write the file at the end.
- Domain size parameters NX and NY are now calculated automatically in run directory file runConfig.sh based on the total number of cores you set in that file.
- Timesteps are now automatically updated to values recommended by the GCHP Working Group. This includes decreasing the timestep when running at c180 or higher.
- Diagnostics can be regridded online to lat-lon prior to output. You may define your own lat-lon grid, specifying polar edge/center, dateline edge/center, grid resolution, and regridding scheme, either bilinear interpolation or conservative. See the HISTORY.rc file in the run directory for details.
- Diagnostics may be output on a regional grid if a lat-lon output grid is specified. See the documentation in the HISTORY.rc file in the run directory for more information.
- If you are interested in memory usage during a GCHP run, you can enable additional memory prints per timestep using MEMORY_DEBUG_LEVEL in runConfig.sh. Value 0 corresponds to minimal prints, while 1 corresponds to maximum. MAPL_DEBUG_LEVEL has been updated to also have a maximum of 1.
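Several of the items above concern settings in runConfig.sh. As a sketch of the NX/NY calculation described there (illustrative only; the variable name TOTAL_CORES and the exact factoring logic are our assumptions, not the actual contents of runConfig.sh), one valid way to decompose a core count is:

```shell
# Illustrative sketch of deriving NX and NY from a total core count.
# On the cubed-sphere grid each of the six faces is decomposed separately,
# so NY must be a multiple of 6 and NX * NY must equal the core total.
TOTAL_CORES=96   # hypothetical value; match your job's core request

# Choose NY = 6 * floor(sqrt(TOTAL_CORES / 6)), then solve for NX.
NY=$(( 6 * $(awk -v n="$TOTAL_CORES" 'BEGIN { printf "%d", sqrt(n / 6) }') ))
NX=$(( TOTAL_CORES / NY ))

if [ $(( NX * NY )) -ne "$TOTAL_CORES" ]; then
  echo "ERROR: cannot evenly decompose $TOTAL_CORES cores as NX x NY" >&2
  exit 1
fi
echo "NX=$NX NY=$NY"   # 96 cores -> NX=4 NY=24
```

Here 96 cores factor into NX=4 and NY=24; the minimum configuration is 6 cores (NX=1, NY=6), one core per cube face.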
There are a few open issues with this version:
- MAPL fails with an ESMF domain decomposition error when coarse-resolution input files are used at high core counts. The issue occurs when using more than 600 cores with 4x5 input files. As a work-around, all 4x5 input emissions files were removed from the run directory and replaced with files at 2x2.5 or finer resolution. However, users may still run into problems when running GCHP with default emissions and thousands of cores due to the 2x2.5 files. This will be fixed in the next MAPL version update.
- The new version of MAPL contains a slow memory leak. This issue is being investigated by NASA GMAO. Beware that you may run out of memory if doing full-chemistry simulations with a duration greater than a few months. To avoid this you can use the GCHP multi-run option, which breaks up a single run into multiple smaller jobs. Please note, however, that differences have been observed between single-run and multi-run output. This is an open issue we are looking into. We are documenting all single- versus multi-run differences at http://ftp.as.harvard.edu/gcgrid/geos-chem/validation/multi_vs_single_run/. Look there to assess whether the scale of the differences would be a problem for you.
Frequently Asked Questions
I do not have a high performance compute cluster. Can I run GCHP?
Yes, you can run GCHP on as little as a single node as long as there are at least 6 cores available. We are working to make serial runs on a single core possible in the future.
Is GCHP only for high resolution runs?
No! You can run GCHP at the cubed-sphere grid equivalent of 4x5 (c24) and 2x2.5 (c48). The GCHP standard simulation run directory is set up for c48 by default starting in GCHP v11-02e but can easily be changed to whatever resolution you would like to run at by updating the runConfig.sh bash script in the run directory. Unlike GEOS-Chem classic, a single GCHP run directory can be used for any run resolution.
What are the different cubed-sphere resolutions and how do they compare to lat/lon grids?
Below we summarize the standard resolution specifications and the recommended timesteps (based on the timesteps used in GEOS-Chem "classic").
| Standard resolution specification | Approximate CS equivalent(s) | Suggested timestep (s) |
| --- | --- | --- |
| 4° x 5° | C24 | 1200 |
| 2° x 2.5° | C48, C45 | 900 |
| 1° x 1.25° | C96, C90 | 600 |
| 0.5° x 0.625° | C192, C180 | 300 |
| 0.25° x 0.3125° | C384, C360 | 300 |
| 0.125° x 0.15625° | C720 | 180 |
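For scripting run setup, the table can be expressed as a simple lookup (a sketch; the function name suggested_timestep is hypothetical, and the values are copied directly from the table):

```shell
# Hypothetical helper returning the suggested timestep (in seconds) for a
# given cubed-sphere resolution, using the values from the table above.
suggested_timestep() {
  case "$1" in
    24)       echo 1200 ;;
    45|48)    echo 900  ;;
    90|96)    echo 600  ;;
    180|192)  echo 300  ;;
    360|384)  echo 300  ;;
    720)      echo 180  ;;
    *) echo "no suggestion for C$1" >&2; return 1 ;;
  esac
}

suggested_timestep 48    # prints 900
```

This mirrors the automatic timestep update noted in the 12.5.0 release notes above, where the timestep decreases at c180 and higher.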
I changed my model resolution and my run completed successfully. Are there any scientific considerations I should be aware of before drawing conclusions from my results?
Yes, there are several resolution-dependent settings or inputs that will result in degraded output quality if the model resolution increases from the default c24. These include lightning scale factors, dust parameterization, regridded cloud variables, and use of the Voronoi timezones. The GCHP Development Team is working to resolve these dependencies. Until then, please contact the GEOS-Chem Support Team for more information.
I don't have met fields at the lat/lon equivalent of the cubed-sphere resolution I want to run at. What should I use?
As a general rule of thumb, you may run GCHP at cubed-sphere resolutions up to a factor of two higher than the equivalent of your input met resolution. For example, you could use 2x2.5 input met for both c24 and c48 runs. GCHP will generally run successfully with even higher ratios, such as 2x2.5 input met for c180; however, in these cases you should use caution when interpreting results. In general it is always best to use the finest met resolution available, and 0.25x0.3125 is the default input met resolution starting in v11-02d. See the Shared Data Directories wiki page for instructions on how to access fine-resolution meteorology files.
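As a quick way to apply this rule of thumb, note from the resolution table above that an N°-latitude grid corresponds to roughly C(96/N). That approximation is our reading of the table, not an official formula, so treat this as a sketch:

```shell
# Rough check of the factor-of-two rule of thumb described above.
# From the resolution table, an N-degree-latitude met grid corresponds to
# roughly C(96/N): 4x5 ~ C24, 2x2.5 ~ C48, 1x1.25 ~ C96, 0.25x0.3125 ~ C384.
met_lat_deg="2"   # latitude spacing of your input met, e.g. 2 for 2x2.5

cs_equiv=$(awk -v d="$met_lat_deg" 'BEGIN { printf "%d", 96 / d }')
cs_cautious_max=$(( 2 * cs_equiv ))

echo "C${cs_equiv} equivalent; factor-of-two guideline allows up to ~C${cs_cautious_max}"
```

For 2x2.5 met this gives a C48 equivalent and roughly C96 as the cautious upper limit.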
Should I change my simulation timestep if I run at a very coarse resolution?
You may increase your timestep to improve speed, but this is not required. A 30-minute dynamics timestep and 60-minute chemistry timestep may be adequate for your purposes. We recommend that you never reduce your timestep below the defaults of 10 minutes for dynamics and 20 minutes for chemistry/emissions.
How can I get a restart file for a different cubed-sphere resolution?
All GCHP run directories come with symbolic links to five resolutions of restart files stored in ExtData/SPC_RESTARTS: c24, c48, c96, c180, and c360. You can generate restart files for other resolutions by regridding an existing restart file using the Fortran tool csregridtool created by Sebastian Eastham.
Python regridding tools are also available and will be included in the gcpy package currently in development. Contact GCST member Lizzie Lundgren for more information.
I can now compile and run GCHP. Do you have tips that might help me optimize my use of GCHP?
Below are several time-saving tips to keep in mind when you start regularly using GCHP. This is by no means an exhaustive list. Please submit your own tips to the GCST if you think something should be included!
- Only recompile code you changed. Use make help to see your options for compiling.
- Changing simulations does not require recompilation. You can copy the geos executable to different run directories for the same code version.
- runConfig.sh overwrites settings in other config files. Beware!
- Do not simply run the executable; use or adapt the run scripts provided. Use gchp.local.run to run from the command line with 6 cores for quick tests.
- Run-time MAPL/ESMF errors are nearly always caused by bad configuration. Check your run directory's *.rc files and runConfig.sh.
- Errors in CAP usually indicate bad simulation dates; check runConfig.sh.
- ExtData errors indicate a problem with input files. Set MAPL_DEBUG_LEVEL to its maximum value of 1 in runConfig.sh to maximize ExtData debug prints to the log file.
- A MAPL History error indicates a problem with diagnostics; check HISTORY.rc.
- Search for “making install” in compile.log to find MAPL build errors.
- Create a GCHP GitHub issue if you think you need to edit MAPL or ESMF.
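For the compile.log tip above, a minimal sketch of the search (the log contents below are fabricated stand-ins so the example is self-contained; real MAPL build output will differ):

```shell
# Sketch of scanning compile.log for MAPL build errors, per the tip above.
# A tiny stand-in log is created here so the example runs on its own.
printf '%s\n' \
  'making install in MAPL_Base' \
  'gfortran -c MAPL_Generic.F90' \
  'Error 1' > compile.log

# -n prints line numbers; -A 5 shows the lines following each match,
# which is where build errors typically appear.
grep -n -A 5 'making install' compile.log
```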