GEOS-Chem coding and debugging

On this page we provide information about GEOS-Chem coding and debugging, as well as the procedure by which new features are added.

How GEOS-Chem Code Development works

The GEOS-Chem Steering Committee (GCSC) encourages updates to the GEOS-Chem model. Updates benefit both the developer and the entire GEOS-Chem community; the developer benefits through coauthorship and citations. Priority development needs are identified at GEOS-Chem users' meetings. Between meetings, priorities are updated based on GCSC input through the working groups.

When should you submit updates to the GEOS-Chem code? Bug fixes should be submitted as soon as possible and take precedence over all other priorities. Code related to model developments should be submitted when it is mature. Your working group chair can offer guidance on the timing of submitting code to the GEOS-Chem Support Team (GCST).

The practical aspects of submitting updates to the GEOS-Chem Support Team are outlined below.

Submitting updates for inclusion into GEOS-Chem

  1. First download the GEOS-Chem source code and run directories
  2. Add your modifications into the code
  3. Test your code thoroughly and make sure that it works.
  4. Contact the GEOS-Chem Support Team and your Working Group leaders to request that your changes be included in the standard code.
  5. Send the GEOS-Chem Support Team your code updates.
    • We ask that you provide the Support Team with a Git patch file with your revisions. This will minimize errors in the code transfer process.
    • Be sure to include the following:
      1. The Git patch file
        • Please include the name of the parent commit. This is the commit on which your changes are based, i.e. the commit that immediately precedes your modifications.
      2. Data files in COARDS-compliant netCDF format
      3. Documentation such as
        • A brief summary of the update for posting on the GEOS-Chem wiki
        • A description of how the update affects model output
        • The reference paper
        • README files
  6. The GEOS-Chem Support Team will add your changes to the standard code.
    • Your update will be benchmarked with a 1-month full-chemistry simulation.
    • Upon completion of the 1-month full-chemistry benchmark, you will be asked to view the output from the simulation and fill in a Benchmark Assessment Form.
    • The 1-month full-chemistry benchmark results must be approved by Model Scientist Daniel Jacob and the GEOS-Chem Steering Committee before the benchmark simulation is approved.
  7. If the update is for an offline chemistry simulation (e.g. CO2, CH4, Hg), then a further benchmark may be conducted by the appropriate Working Group.
  8. 1-year full-chemistry benchmark simulations will be conducted by the GEOS-Chem Support Team before the version is officially released to the GEOS-Chem user community:
    • The GEOS-Chem Support Team may run one or more 1-year full-chemistry benchmarks (global 4° x 5° grid). The GEOS-Chem Steering Committee must approve the 1-year benchmark results before the version is released.
    • The Nested Model Working Group will perform a North American nested-grid benchmark simulation to ensure that the software updates made for the forward model do not affect the nested model.
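
The patch file requested in step 5 can be sketched as follows. This is only an illustration: it builds a throwaway repository so that the commands can be run anywhere, and the file name, commit messages, and author identity are placeholders rather than GEOS-Chem conventions. It uses git format-patch to write the patch file and git rev-parse to record the parent commit.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository standing in for a GEOS-Chem source code directory.
workdir=$(mktemp -d)
cd "$workdir"
git init -q .
git config user.name  "Example Developer"       # placeholder identity
git config user.email "developer@example.org"

# A commit standing in for the unmodified standard code.
echo "original line" > example_mod.F
git add example_mod.F
git commit -q -m "Standard code"

# Your modification, committed on top of the standard code.
echo "modified line" > example_mod.F
git commit -q -a -m "Example update"

# Write the most recent commit to a patch file in the current directory.
patch_file=$(git format-patch -1 HEAD)

# Record the parent commit, i.e. the commit that the patch was made against.
parent_commit=$(git rev-parse HEAD~1)

echo "Patch file:    $patch_file"
echo "Parent commit: $parent_commit"
```

In your own source code directory only the last two commands are needed. Send the resulting patch file together with the parent commit ID.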

--Bob Yantosca (talk) 17:02, 3 November 2016 (UTC)

GEOS-Chem coding tips

As a new GEOS-Chem user, you will more than likely be adding your own modifications to GEOS-Chem. The following sections contain some helpful hints pertaining to source code style and management.

Coding style

To improve the readability and appearance of the GEOS-Chem source code, we ask all users to conform to the GEOS-Chem Style Guide. The readability of your code can be greatly enhanced by simple steps, like adding white space between words and indenting DO loops and IF statements.

The more comments, the better!

It is good practice to add copious comments to your source code. This will make sure that current and future GEOS-Chem users will be able to understand the modifications that you have added.

Keep in mind that many GEOS-Chem users (including the GEOS-Chem Support Team) will probably not be as familiar with your specific area of research as you are. While most users will probably get the "big picture" of what you are doing, they will be less knowledgeable about the little details (e.g. why does this reaction rate have a value of X, why is this parameter set to Y, why was this IF statement deleted, etc.). Providing sufficient source code documentation will eliminate any such guesswork.

Automatic documentation with ProTeX

ProTeX is a useful Perl script that can extract information from a standard Fortran document header and save it to a LaTeX file. The LaTeX file can then be converted into both PostScript and PDF output. (ProTeX was developed at the Goddard Space Flight Center by Arlindo Da Silva, Will Sawyer, and others.)

Since GEOS-Chem v8-01-03, the GEOS-Chem Support Team has been actively replacing the older GEOS-Chem source code documentation with ProTeX documentation headers. This will allow a "GEOS-Chem Reference Manual" (i.e. a list of all GEOS-Chem routines and their inputs and outputs) to be generated automatically when you invoke the make doc Makefile option.

For more information, please see:

  1. GEOS-Chem wiki: Automatic documentation with protex
  2. GEOS-Chem User's Guide: A7.2 Headers, declarations, indentations, and white space

Get familiar with Git

We use the Git source code management software for GEOS-Chem version control. Git allows us to easily track code contributions from GEOS-Chem developers and merge them into the "mainline" standard source code repository.

When submitting source code updates for GEOS-Chem, we ask that you send us your update in Git-ready form.

For more information about Git, please see the following pages:

  1. Version control with Git
  2. Using Git with GEOS-Chem

GEOS-Chem debugging tips

If your GEOS-Chem simulation dies unexpectedly with an error or takes much longer to execute than it should, the most important thing is to try to isolate the source of the error or bottleneck right away. Here are some of our favorite debugging tips that you can use:

Use profiling tools to determine the source of computational bottlenecks

If you think your GEOS-Chem simulation is taking too long to run, consider using profiling tools to generate a list of the time that is spent in each routine. This can help you identify poorly written or poorly parallelized code that is causing GEOS-Chem to slow down. For more information, please see our Profiling GEOS-Chem wiki page.

--Bob Yantosca (talk) 18:38, 14 December 2016 (UTC)

Have you looked at the log files?

If your GEOS-Chem simulation stopped with an error, but you cannot tell where, turn on the ND70 diagnostic (debug output) in input.geos and rerun your simulation. The ND70 diagnostic will print debug output at several locations in the code (after transport, chemistry, emissions, dry deposition, etc.). This should let you pinpoint the location of the error.

If the log file indicates your run stopped in emissions, you can check the HEMCO.log file for additional information (GEOS-Chem v10-01 and later versions only). We recommend setting both the Verbose and Warnings options in HEMCO_Config.rc to 3 to print all debug statements and warning messages to your HEMCO.log file.
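
When a run stops without an obvious message, the end of each log file is usually the quickest place to look. Here is a minimal sketch, assuming that your simulation log is named GC.log (a placeholder; use whatever file you redirected the output to) and that HEMCO.log sits in the same run directory. The helper function and its search pattern are illustrative only and are not part of GEOS-Chem.

```shell
#!/usr/bin/env bash
# show_log_end LOGFILE [NLINES]
# Print the last lines of a log file, then list lines that look like errors.
# The search pattern is a generic guess, not a list of actual GEOS-Chem messages.
show_log_end() {
    logfile=$1
    nlines=${2:-20}
    if [ ! -f "$logfile" ]; then
        echo "$logfile: not found"
        return 1
    fi
    echo "=== Last $nlines lines of $logfile ==="
    tail -n "$nlines" "$logfile"
    echo "=== Lines in $logfile mentioning errors or warnings ==="
    grep -n -i -E 'error|warning' "$logfile" || echo "(none found)"
}

# Run from the run directory (file names are placeholders).
for f in GC.log HEMCO.log; do
    show_log_end "$f" || true
done
```

The last few lines of the simulation log tell you which operation was in progress when the run stopped, which is the point where the ND70 debug output becomes useful.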

--Bob Yantosca (talk) 17:30, 3 November 2016 (UTC)

Did your run stop with an error message?

Check if someone else has already reported the bug

Before trying to debug your code, we recommend that you check our Bugs and fixes wiki page to see if your error is a known issue, and if someone has already submitted a fix.

--Bob Yantosca (talk) 17:30, 3 November 2016 (UTC)

Recompile GEOS-Chem with debug options turned on

Check for common problems like array-out-of-bounds errors, floating-point exceptions, and parallelization issues by turning on debug compiler switches:

Debugging flag Description
DEBUG=yes This option turns off all optimization. It also prepares GEOS-Chem so that it can be run in a debugger like TotalView.
BOUNDS=yes This option turns on runtime array-out-of-bounds checking, which looks for instances of invalid array indices (e.g. if the A array has only 10 elements but you try to reference A(11)).
TRACEBACK=yes This option turns on the -traceback option (ifort only) and will print a list of routines that were called when the error occurred. NOTE: This option is always turned on by default in GEOS-Chem v11-01 and newer versions.
FPEX=yes or FPE=yes This option turns on error checking for floating-point exceptions (e.g. div-by-zero, NaN, floating-invalid, and similar errors).
OMP=no This option turns off OpenMP parallelization (which is turned on by default). If an error disappears when parallelization is turned off, it is likely caused by a parallelization issue.
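
The flags in the table are passed to make on the command line. The sketch below assumes that several flags can be combined in one invocation, in the same way that DEBUG=yes and OMP=no are combined elsewhere on this page. It only prints the commands (a dry run) so that you can inspect them before building; the helper name is hypothetical.

```shell
#!/usr/bin/env bash
# debug_build_cmds [FLAG=value ...]
# Print the commands for a clean rebuild with the given debug flags.
# With no arguments, a strict default set of flags from the table is used.
debug_build_cmds() {
    flags="$*"
    if [ -z "$flags" ]; then
        flags="DEBUG=yes BOUNDS=yes TRACEBACK=yes FPEX=yes OMP=no"
    fi
    echo "make clean"            # discard objects that were built with other flags
    echo "make -j4 $flags"
}

debug_build_cmds                         # all checks on, parallelization off
debug_build_cmds DEBUG=yes BOUNDS=yes    # only bounds checking in a debug build
```

Builds with these flags run much more slowly than optimized builds, so a short test simulation is usually enough.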

--Bob Yantosca (talk) 16:17, 3 November 2016 (UTC)

GEOS-Chem Unit Tester

The GEOS-Chem Unit Tester is an external package that can run several test GEOS-Chem simulations with a set of very strict debugging options. The debugging options are designed to detect issues such as floating-point math errors, array-out-of-bounds errors, inefficient subroutine calls, and parallelization errors. You can use this tool to find many common numerical errors and programming issues in your GEOS-Chem code.

For complete instructions on how the GEOS-Chem Unit Tester can assist your debugging efforts, please see our Debugging with the GEOS-Chem unit tester wiki page.

--Bob Yantosca (talk) 16:44, 3 November 2016 (UTC)

Run GEOS-Chem in a debugger to find the source of error

If you have access to a debugger (e.g. GDB, IDB, DBX, TotalView), you can save a lot of time and hassle by learning the basic commands, such as how to:

  • Examine data when a program stops
  • Navigate the stack when a program stops
  • Set break points

To run GEOS-Chem in a debugger, add the DEBUG=yes option to the make command. This compiles GEOS-Chem with the -g flag, which tells the compiler to generate symbolic debug information. The DEBUG=yes option also uses the -O0 flag, which switches off compiler optimization that can otherwise change the sequence in which individual instructions occur. To apply these options, type:

     make -j4 DEBUG=yes OMP=no    # Without parallelization
     make -j4 DEBUG=yes           # With parallelization
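
Once the debug build finishes, a debugger session typically sets a break point, runs the program, and then inspects the stack and the data. The sketch below writes a GDB command file that covers those three basics. The executable name ./geos and the break point location main.F:100 are placeholders for illustration; substitute the names from your own build.

```shell
#!/usr/bin/env bash
# Write a GDB command file covering the three basic tasks listed above.
cmdfile=$(mktemp)
cat > "$cmdfile" <<'EOF'
# Set a break point (file name and line number are placeholders)
break main.F:100
# Start the program; it stops at the break point or where it crashes
run
# Navigate the stack: list the chain of routine calls
backtrace
# Examine data: print the local variables of the current routine
info locals
EOF

echo "GDB commands written to $cmdfile"
echo "To use them, type: gdb -x $cmdfile ./geos"
```

The same four commands can also be typed one at a time at the (gdb) prompt, which is the easier way to explore when you do not yet know where the error occurs.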

--Bob Yantosca (talk) 16:24, 3 November 2016 (UTC)

Did you max out your allotted time or use too much memory?

If you are running GEOS-Chem on a shared computer system, chances are you will have used a scheduler (such as LSF, PBS, Grid Engine, or SLURM) to submit your GEOS-Chem job to a computational queue. You should be aware of the run time and memory limits for each of the queues on your system.

If your GEOS-Chem job uses more memory or run time than the computational queue allows, your job can be cancelled by the scheduler. You will usually get an error message printed out to the stderr stream. Be sure to check all of the log files created by your GEOS-Chem jobs for such error messages.

The solution will usually be to submit your GEOS-Chem simulation to a queue with a longer run-time limit, or larger memory limit. You can also split up your GEOS-Chem simulation into several smaller stages that take less time to complete.
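
Splitting a long simulation into stages comes down to computing consecutive start and end dates and running one stage per job. Here is a minimal sketch, assuming that GNU date is available and that each stage covers one calendar month. The helper name is hypothetical. How you enter each pair of dates into your run directory and carry the restart file from one stage to the next depends on your setup and is not shown.

```shell
#!/usr/bin/env bash
# stage_dates START NSTAGES
# Print "start end" (YYYYMMDD) for consecutive one-month stages beginning
# at START. Requires GNU date for the -d option.
stage_dates() {
    start=$1
    nstages=$2
    i=0
    while [ "$i" -lt "$nstages" ]; do
        end=$(date -u -d "$start +1 month" +%Y%m%d)
        echo "$start $end"
        start=$end          # the next stage begins where this one ended
        i=$((i + 1))
    done
}

# Three one-month stages starting on 1 July 2013
stage_dates 20130701 3
```

Each line of output gives the start and end date of one stage, so the list can be read by a loop that submits one job per stage.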

--Bob Yantosca (talk) 16:23, 3 November 2016 (UTC)

Have you modified the standard GEOS-Chem code?

If you have made modifications to a "fresh out-of-the-box" GEOS-Chem version, then you should look over your changes to search for the source of error.

You can also use Git to revert to the last known error-free state of GEOS-Chem, and use that as a reference.
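
Both steps, reviewing your changes and returning to a known error-free state, map onto standard Git commands. The sketch below is one possible approach, not the only one. It builds a throwaway repository so that it can be run anywhere, and the file name and contents are placeholders. It uses git diff to list uncommitted modifications and git stash to set them aside, which returns the working directory to the last commit without discarding your work.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway repository standing in for a GEOS-Chem source code directory.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name  "Example Developer"       # placeholder identity
git config user.email "developer@example.org"

echo "X = 1.0" > example_mod.F
git add example_mod.F
git commit -q -m "Last known error-free state"

# A modification suspected of introducing the error.
echo "X = 1.0 / 0.0" > example_mod.F

# 1. Review what changed relative to the last commit.
git diff

# 2. Set the changes aside; the working directory now matches the last commit.
git stash -q
restored=$(cat example_mod.F)
echo "Restored contents: $restored"

# 3. After testing the clean code, bring the modifications back.
git stash pop -q
```

If the clean code runs without the error, the problem lies somewhere in the lines that git diff reported.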

--Bob Yantosca (talk) 16:26, 3 November 2016 (UTC)

Can you isolate the error to a particular operation?

Can you tell if the error happens in transport, chemistry, emissions, dry deposition, etc.? Try turning off these operations one at a time in input.geos to see if you get past the error.

Also try turning on the ND70 diagnostic, which will add additional debug print statements to the output. This will help you to see the last subroutine that was exited before the error occurred.

--Bob Yantosca (talk) 16:33, 3 November 2016 (UTC)

Does the error happen consistently?

If the error happens at the same model date & time, it could indicate bad input data. Please see our List of reprocessed met fields to make sure there is not a known issue with the met fields or emissions for that date.

If the error happened only once, it could be caused by a network problem or other such transient condition.

--Bob Yantosca (talk) 16:38, 3 November 2016 (UTC)

Check our list of reprocessed met fields

If you suspect a problem with one of the met field data files that GEOS-Chem reads as input, check our List of reprocessed met fields wiki page. This is a list of met field data files that had to be regenerated due to known issues (e.g. incomplete data or other such problems). You might be able to fix your problem by simply re-downloading the affected file or files.

--Bob Yantosca (talk) 17:33, 3 November 2016 (UTC)

Check for math errors

If you suspect a floating-point math error, such as:

  • Division by zero
  • Logarithm of a negative number
  • Numerical overflow or underflow
  • Infinity

then run make clean and recompile with the FPEX=yes flag. This will turn on additional error checking that will stop your GEOS-Chem run if a floating-point error is encountered.

You can often detect numerical errors by adding debugging print statements into your source code:

  • Check the minimum and maximum values of an array with the MINVAL and MAXVAL intrinsic functions:
     PRINT*, '### Min, Max: ', MINVAL( ARRAY ), MAXVAL( ARRAY )
     CALL FLUSH( 6 )
  • Check the sum of an array with the SUM intrinsic function:
     PRINT*, '### Sum of ARRAY: ', SUM( ARRAY )
     CALL FLUSH( 6 )

See our Floating point math issues wiki page for information on how to avoid some common pitfalls.
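
If you prefix your debugging print statements with ### as in the examples above, you can filter the log file for suspicious values afterwards. Here is a minimal sketch, assuming that your compiler writes non-finite numbers as text containing NaN, Inf, or Infinity (the exact spelling varies between compilers). The helper name is hypothetical and the log file name is a placeholder.

```shell
#!/usr/bin/env bash
# find_bad_values LOGFILE
# List the '###' debug lines that contain NaN, Inf, or Infinity as a whole word.
# Matching lines are printed with their line numbers; returns 1 if none are found.
find_bad_values() {
    grep -n '^ *###' "$1" | grep -i -w -E 'nan|inf|infinity'
}

# Usage (log file name is a placeholder):
#   find_bad_values GC.log
```

The first line reported is usually the most informative one, because a NaN tends to propagate into every quantity that is computed from it.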

--Bob Yantosca (talk) 16:32, 3 November 2016 (UTC)

When in doubt, print it out!

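Print out the values of variables in the area where you suspect the error lies. You can also add CALL FLUSH( 6 ) to flush the output buffer after writing, so that the values appear in the log even if the run dies shortly afterwards. An unexpected value in the output often points directly to the problem.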

--Bob Yantosca (talk) 16:38, 3 November 2016 (UTC)

When all else fails, use the brute force method

If the bug is difficult to locate, then comment out a large section of code and run GEOS-Chem. If the error does not occur, then uncomment some more code and run GEOS-Chem again. Repeat the process until you find the location of the error. The brute force method may be tedious, but it will usually lead you to the source of the problem.

--Bob Yantosca (talk) 16:41, 3 November 2016 (UTC)

Reporting GEOS-Chem bugs to the GEOS-Chem Support Team

If you have tried to solve your code problem but cannot, then please report it to the GEOS-Chem Support Team. Please include the following information:

Item Description
GEOS-Chem Version Number (e.g. v10-01)
Met field type (e.g. GEOS-5, GEOS-FP, MERRA, MERRA2)
Horizontal Resolution (e.g. 4° x 5°, 2° x 2.5°, 0.5° x 0.666°, 0.25° x 0.3125°)
Type of Simulation
  • Standard
  • Tropchem
  • UCX
  • SOA
  • SOA-SVPOA
  • Specialty simulations
    • Aerosols only
    • CH4
    • CO2
    • Hg
    • POPs
    • Tagged CO
    • Tagged O3
Platform & OS (e.g. CentOS, MacOS, etc.)
CPU type Type cat /proc/cpuinfo at the Unix prompt and look for the "model name" field, e.g.
model name      : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Compiler and version Use the --version option to your compiler. For example, type:

     ifort --version

which will print

     ifort (IFORT) 15.0.0 20140723
     Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

and similarly for the gfortran and pgfortran compilers.

Number of Processors  
Diagnostics Requested (e.g. ND28, ND43, ND45, etc.)
Description of problem  
The commands that are required to produce/reproduce the problem  
Error Message  
GEOS-Chem logfile output  
HEMCO logfile output  

Make sure that you send the GEOS-Chem Support Team the log files from the simulation. It is very difficult to diagnose the problem without seeing the log file output. Also include the steps that are necessary to produce (or reproduce) the problem. This will help us debug the issue.
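
Several of the items in the table can be collected automatically. Here is a minimal sketch, assuming a Linux system with /proc/cpuinfo. The helper name and the output file name are placeholders, and compilers that are not installed are simply skipped. The remaining items (version number, simulation type, description of the problem, and so on) still need to be filled in by hand.

```shell
#!/usr/bin/env bash
# collect_system_info
# Print the platform, CPU type, processor count, and compiler versions.
collect_system_info() {
    echo "Platform & OS: $(uname -sr)"

    # CPU type: the "model name" field of /proc/cpuinfo (Linux only)
    if [ -r /proc/cpuinfo ]; then
        cpu=$(grep -m 1 'model name' /proc/cpuinfo | sed 's/^[^:]*: *//')
        echo "CPU type: ${cpu:-unknown}"
        echo "Processors listed: $(grep -c '^processor' /proc/cpuinfo)"
    else
        echo "CPU type: (no /proc/cpuinfo on this system)"
    fi

    # Compiler and version: try each compiler named in the table
    for fc in ifort gfortran pgfortran; do
        if command -v "$fc" > /dev/null 2>&1; then
            # First non-empty line of the --version output
            echo "Compiler: $("$fc" --version 2>&1 | grep -m 1 .)"
        fi
    done
}

# Usage: save the details to a file that you can attach to your report
#   collect_system_info > bug_report_info.txt
collect_system_info
```

Note that the processor count printed here is the number available on the machine, which may differ from the number that your GEOS-Chem job actually used.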

--Bob Yantosca (talk) 15:13, 13 April 2017 (UTC)