Understanding the different categories of errors

From Geos-chem
Revision as of 18:25, 20 September 2022 by Bmy (Talk | contribs) (Overview)

We have migrated bug reports and support requests to our Github issue tracker.



Previous | Next | Guide to GEOS-Chem error messages

  1. Understanding the different categories of errors
  2. Compile-time warnings and errors
  3. Run-time crashes and abnormal exits
  4. Segmentation faults
  5. Other less-common errors


Overview

On this page, we provide information about the different types of errors that your GEOS-Chem simulation might encounter.

GEOS-Chem died with an error. What can I do?

You should try to understand the type of error that has occurred before taking action.

Categories of errors

There are several different classes of errors, such as:

  1. File I/O errors
    • Errors caused by incorrect input options
    • Errors caused by missing data files
    • Errors caused by corrupted data files
    • Running GEOS-Chem at the wrong resolution for the data files
  2. Abnormal exits
  3. Technical errors

Once you have understood the type of GEOS-Chem error that has occurred, you can take steps to fix it.

In some cases, the fix will be simple (e.g. selecting the proper option and starting over, or replacing missing or corrupted data files). In other cases, the error may be more difficult to diagnose (such as an error in the chemical solver). In that case, you may have to "dig in" to the code so that you can modify its behavior (e.g. modify convergence criteria or add error checks).

Here are a few useful recommendations that you can use in determining the cause of your error:

Try to isolate the error to a particular routine

Try to isolate the error to a particular GEOS-Chem routine. You can use a debugger such as idb or Totalview, or you can turn on the ND70 debug printout option. (ND70 will print debug messages to the log file after key operations have been completed.) You may also try turning off operations (e.g. wet deposition, dry deposition, chemistry) one at a time in the input.geos file to isolate the error. Once you know where the error is occurring, try to print out values for a given grid box and tracer, either in the debugger or by adding PRINT statements to the code. You will gain great insight into what is happening by using this technique.

Determine if the error is persistent

Try to determine if the error is persistent (i.e. if it always occurs at the same model time/date or if it occurs at different times and dates in the simulation). A persistent error could indicate a missing or corrupted data file, or a flaw in the scientific algorithm being used. Non-persistent errors (i.e. those that don't happen at the same model time and date) may indicate memory problems, such as out-of-bounds array accesses, segmentation faults, or the code using more memory than is available.

Compile with debugging options

If you still cannot determine the error from the traceback output, recompile with the BOUNDS=yes compiler option. This turns on run-time checks that flag incorrect array accesses, such as an index that is out of bounds.

For further assistance

Please see Debugging GEOS-Chem, which contains strategies for recognizing and fixing several commonly-encountered error conditions.

If you still cannot determine the source of your error, please contact the GEOS-Chem Support Team for assistance.

--Bob Yantosca (talk) 14:34, 17 June 2019 (UTC)

Where does GEOS-Chem error output get printed?

GEOS-Chem, like all Unix programs, sends its output to two streams:

  1. stdout
  2. stderr

The stdout stream

Most GEOS-Chem output will go to the stdout stream, which takes I/O from the Fortran WRITE and PRINT commands. If you run GEOS-Chem by just typing the executable name at the Unix prompt:

geos

then the stdout stream will be printed to the terminal window. You can also redirect the stdout stream to a log file with the Unix redirect command:

geos > log

We recommend that you create GEOS-Chem log files so that you can reexamine the output from your run at a later time.

Most GEOS-Chem errors will be printed to stdout (and hence, to the log file). Most errors flagged by GEOS-Chem use a standard error message format, such as:

==============================================================
GEOS-CHEM ERROR: No output scheduled on last day of run!
STOP at IS_LAST_DAY_GOOD ("input_mod.f")
==============================================================
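Because flagged errors share this banner format, a quick text search of the log file will usually locate them. The following is a minimal sketch, assuming the error text begins with the string GEOS-CHEM ERROR as in the sample message above. The file name sample.log is a stand-in created only so that the commands can be run on their own; with a real run you would search the log file you created with the redirect command.

```shell
# Stand-in log file for illustration only (hypothetical name and contents,
# apart from the error banner, which is copied from the sample above).
cat > sample.log <<'EOF'
... normal model output ...
==============================================================
GEOS-CHEM ERROR: No output scheduled on last day of run!
STOP at IS_LAST_DAY_GOOD ("input_mod.f")
==============================================================
EOF

# Print each flagged error with its line number, plus the line that follows,
# which names the routine and source file where the run stopped.
grep -n -A 1 "GEOS-CHEM ERROR" sample.log
```

If the search returns nothing, the run may have died from a system-level problem whose message went to stderr instead of the log file.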

The stderr stream

The stderr stream takes I/O from various Unix system commands, including exit. If your GEOS-Chem run died as a result of a system problem (e.g. you ran up against a system time or memory limit, or you are over disk quota), then the error message will more than likely go to stderr instead of stdout. As a result, these error messages will not be printed to the GEOS-Chem log file.

If you use a queue system then the stderr output may be printed to a file. For example, if you submit a GEOS-Chem job to the SGE queue system, and your job script is named run.geos, your job scheduler will send the stderr output to a file.
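If you run GEOS-Chem directly from the Unix prompt, standard shell redirection can save the stderr stream as well. The sketch below uses a hypothetical shell function, fake_geos, as a stand-in for the GEOS-Chem executable so that the example can be run anywhere; the output file names are also only illustrative.

```shell
# Stand-in for the GEOS-Chem executable (hypothetical, for illustration only):
# it writes one line to stdout and one line to stderr.
fake_geos() {
  echo "normal model output"
  echo "system-level error message" >&2
}

# Send stdout and stderr to separate files.
fake_geos > log.out 2> log.err

# Or send both streams to the same file. The order matters:
# "2>&1" must come after "> log.both".
fake_geos > log.both 2>&1
```

With a real run you would replace fake_geos with the executable name, for example geos > log 2>&1. Note that this captures only what the program itself writes to stderr; messages produced by the shell or by a job scheduler may still appear elsewhere.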

--Bob Yantosca (talk) 14:34, 17 June 2019 (UTC)


