GEOS-Chem coding and debugging
- 1 GEOS-Chem coding
- 2 Have you looked at the log file?
- 3 Did your run stop with an error message?
- 4 Recompile GEOS-Chem with debug options turned on
- 5 Use a debugger
- 6 Did you run out of time in the queue?
- 7 Did you modify the standard code?
- 8 Can you isolate the error to a particular operation?
- 9 Does the error happen consistently?
- 10 Check for math errors
- 11 When in doubt, print it out!
- 12 When all else fails, use the brute force method
As a new GEOS-Chem user, you will more than likely be adding your own modifications to GEOS-Chem. The following sections contain some helpful hints pertaining to source code style and management.
To improve the readability and appearance of the GEOS-Chem source code, we ask all users to conform to the GEOS-Chem Style Guide. Simple steps, such as adding white space between words and indenting DO loops and IF statements, greatly enhance the readability of your code.
The more comments, the better!
It is good practice to add copious comments to your source code. This will make sure that current and future GEOS-Chem users will be able to understand the modifications that you have added.
Keep in mind that many GEOS-Chem users (including the GEOS-Chem Support Team) will probably not be as familiar with your specific area of research as you are. While most users will probably get the "big picture" of what you are doing, they will be less knowledgeable about the little details (e.g. why does this reaction rate have a value of X, why is this parameter set to Y, why was this IF statement deleted). Sufficient source code documentation eliminates this guesswork.
Automatic documentation with ProTeX
ProTeX is a Perl script that extracts information from a standard Fortran document header and saves it to a LaTeX file. The LaTeX file can then be converted into both PostScript and PDF output. (ProTeX was developed at the Goddard Space Flight Center by Arlindo Da Silva, Will Sawyer, and others.)
Since GEOS-Chem v8-01-03, the GEOS-Chem Support Team has been actively replacing the older GEOS-Chem source code documentation with ProTeX documentation headers. This will allow a "GEOS-Chem Reference Manual" (i.e. a list of all GEOS-Chem routines and their inputs and outputs) to be generated automatically when you invoke the make doc Makefile option.
For more information, please see:
- GEOS-Chem wiki: Automatic documentation with protex
- GEOS-Chem User's Guide: A7.2 Headers, declarations, indentations, and white space
Get familiar with Git
We use the Git source code management software for GEOS-Chem version control. Git lets us easily track code changes from GEOS-Chem developers and merge them into the "mainline" standard source code repository.
When submitting source code updates for GEOS-Chem, we ask that you send us your update in Git-ready form.
For more information about Git, please see the following pages:
When your GEOS-Chem run dies with an error, the most important thing is to isolate the place where the error occurs (e.g. does it die in chemistry, in transport, in dry deposition?).
Here we list several things you can try if your run dies with an error.
Have you looked at the log file?
- Can you tell where the run stopped?
- If not, turn on the ND70 diagnostic (debug output) in input.geos and rerun.
- ND70 will print debug output at several locations in the code (after transport, chemistry, emissions, dry deposition, etc.). This should let you pinpoint the location of the error.
- If the log file indicates your run stopped in emissions, you can check the HEMCO.log file for additional information (GEOS-Chem v10-01 and later versions only). We recommend setting both the Verbose and Warnings options in HEMCO_Config.rc to 3 to print all debug statements and warning messages to your HEMCO.log file.
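As a rough sketch of the log check described above, the following shell function prints the end of a log file and any lines that look like error reports. The function name `scan_log` and the search patterns are assumptions for illustration; adjust the patterns to the messages your compiler and system actually produce.

```shell
#!/bin/sh
# Hypothetical helper (not part of GEOS-Chem): show where a run stopped.
# The search patterns below are assumptions, not an exhaustive list.
scan_log() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "scan_log: cannot find $log_file" >&2
        return 1
    fi
    echo "=== Last 20 lines of $log_file ==="
    tail -n 20 "$log_file"
    echo "=== Lines that look like errors ==="
    # -n prints line numbers, -i ignores case, -E allows alternation
    grep -n -i -E 'error|severe|segmentation fault' "$log_file" \
        || echo "(no matching lines)"
}
```

You might call it as `scan_log GC.log` (the log file name here is a placeholder), and then run it again on HEMCO.log if the run stopped in emissions.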
Did your run stop with an error message?
- See our Common GEOS-Chem error messages wiki page for detailed information on common errors and how to resolve them.
Recompile GEOS-Chem with debug options turned on
Check for common problems like array-out-of-bounds errors, floating-point exceptions, and parallelization issues by turning on debug compiler switches:
- BOUNDS=yes: Turns on runtime array-out-of-bounds checking, which looks for invalid array indices (e.g. the A array has only 10 elements but you try to reference A(11)).
- TRACEBACK=yes: Turns on the -traceback option (ifort only), which prints the list of routines that were called when the error occurred.
- FPE=yes: Turns on error checking for floating-point exceptions (e.g. div-by-zero, NaN, floating-invalid, and similar errors).
- OMP=no: Turns off OpenMP parallelization (which is turned on by default). If the error disappears with parallelization off, a parallelization issue is the likely cause.
Use a debugger
- If you have access to a debugger (e.g. IDB, DBX, TotalView), you can save a lot of time and hassle by learning the basic commands, such as how to:
- Examine data when a program stops
- Navigate the stack when a program stops
- Set break points
- To run GEOS-Chem in a debugger, add the DEBUG=yes option to the make command. This compiles GEOS-Chem with the -g flag, which tells the compiler to generate symbolic debug information. The DEBUG=yes option also uses the -O0 flag, which switches off compiler optimizations that can change the sequence in which individual instructions occur. To apply these options, type:
    make -j4 DEBUG=yes OMP=no   # Without parallelization
    make -j4 DEBUG=yes          # With parallelization
Did you run out of time in the queue?
- Did you submit your job to a queue that only has 1 hour or less of wall-clock time?
- PBS error #143 is usually the tell-tale sign of an "out-of-time" error.
- If so, then submit to a queue with a longer time limit.
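One possible reading of the number 143, assuming your scheduler reports it as a shell-style exit status: by common shell convention, a status above 128 means the process was terminated by signal number (status minus 128), and 143 minus 128 is 15, which is SIGTERM. The function below is a minimal sketch of that arithmetic; `describe_exit` is a hypothetical name, and whether your batch system follows this convention is an assumption you should check against its documentation.

```shell
#!/bin/sh
# Sketch: interpret an exit status using the common shell convention
# that a status above 128 means "terminated by signal (status - 128)".
# Whether your scheduler reports statuses this way is an assumption.
describe_exit() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        signal=$((status - 128))
        echo "exit status $status: terminated by signal $signal"
    elif [ "$status" -eq 0 ]; then
        echo "exit status 0: normal completion"
    else
        echo "exit status $status: program reported an error"
    fi
}

describe_exit 143
```

If the job was terminated by a signal rather than by an error inside GEOS-Chem, the queue's time limit is a more likely culprit than the model code.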
Did you modify the standard code?
- If so, then focus on your most recent changes.
- You should keep a **clean** (unmodified) version for comparison.
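If you do keep a clean copy, a recursive diff is one simple way to list what you have changed. The sketch below assumes two directory trees, one clean and one modified; the function name `list_changed_files` and the directory names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: list files that differ between a clean copy of the source code
# and a modified copy. The function name is hypothetical.
list_changed_files() {
    clean_dir="$1"
    work_dir="$2"
    status=0
    # -r recurses into subdirectories; -q reports only which files differ.
    # diff returns 1 when differences exist, which is not a failure here.
    diff -rq "$clean_dir" "$work_dir" || status=$?
    if [ "$status" -gt 1 ]; then
        echo "list_changed_files: diff failed" >&2
        return "$status"
    fi
    return 0
}
```

For example, `list_changed_files Code.clean Code.mine` prints one line per differing file. You can then inspect a single file in detail with `diff -u` on the two copies.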
Can you isolate the error to a particular operation?
- Can you tell if the error happens in transport, chemistry, emissions, dry dep, etc?
- You can try turning off these operations one at a time in input.geos to see if you get past the error.
Does the error happen consistently?
- If the error happens at the same model date & time, it could indicate bad input data.
- Check our List of reprocessed met fields to make sure there is not a known issue with the met fields for that date.
- If it happened only once, it could be caused by a network problem or other such transient condition.
Check for math errors
- Is there a division by zero, logarithm of negative number, etc?
- Add calls to routine CHECK_STT to check for NaN, infinity, and negative values in the tracer concentrations. For example:
    ! Put this at the top of the subroutine where you are calling CHECK_STT
    USE TRACER_MOD, ONLY: CHECK_STT
    . . .
    ! Check for NaN, infinity, negatives:
    CALL CHECK_STT( 'Where I think there is a problem' )
- Check the minimum and maximum values of an array with the MINVAL and MAXVAL intrinsic functions:
    PRINT*, '### Min, Max: ', MINVAL( ARRAY ), MAXVAL( ARRAY )
    CALL FLUSH( 6 )
- Check the sum of an array with the SUM intrinsic function:
    PRINT*, '### Sum of X : ', SUM( ARRAY )
    CALL FLUSH( 6 )
- See our Floating point math issues wiki page for information on how to avoid some common pitfalls.
When in doubt, print it out!
- Print out the values of variables in the area where you suspect the error lies.
- Also use "call flush(6)" to flush the output buffer after writing.
- Maybe you will see something wrong in the output.
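If you prefix your debug PRINT statements with a marker such as the `###` used in the examples on this page, you can pull just those lines out of a long log file. The following is a minimal sketch; `show_debug_lines` is a hypothetical name, and the pattern allows for the leading blanks that Fortran list-directed output normally adds.

```shell
#!/bin/sh
# Sketch: extract debug lines marked with '###' from a log file.
# The function name is hypothetical; '###' is just the marker convention
# used in this page's PRINT examples.
show_debug_lines() {
    log_file="$1"
    # -n prints the line number so you can find the context in the log
    grep -n '^ *###' "$log_file" \
        || echo "(no ### lines found in $log_file)"
}
```

Calling `show_debug_lines` on your log file then lists only your own debug output, with line numbers, in the order it was written.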
When all else fails, use the brute force method
- Comment out code until you find where the failure occurs.