Run-time crashes and abnormal exits
- 1 Overview
- 2 Run-time errors originating in GEOS-Chem code
- 2.1 No output scheduled on last day of run
- 2.2 List-directed I/O syntax error
- 2.3 Error reading the input.geos file
- 2.4 Errors reading the GEOS-Chem restart file
- 2.5 NetCDF: HDF Error
- 2.6 Permission denied error
- 2.7 UCX not defined at compile time
- 2.8 Floating invalid or floating-point exception error
- 2.9 KPP "Step size too small" error
- 2.10 Mixed file access modes error
- 2.11 Negative tracer found in WETDEP
- 3 Run time errors originating in the HEMCO emissions component
We have been migrating bug reports to our GEOS-Chem issue tracker, which is located in our GitHub repository: https://github.com/geoschem/geos-chem/issues/. We recommend that you look through both the open and closed issues on that page, as your problem may already be listed there.
In this section, we provide information about some commonly-reported run-time errors that cause GEOS-Chem to halt executing.
Run-time errors originating in GEOS-Chem code
No output scheduled on last day of run
If you encounter this error at the start of your GEOS-Chem simulation:
==========================================================================
GEOS-CHEM ERROR: No output scheduled on last day of run!
STOP at IS_LAST_DAY_GOOD ("input_mod.f")
==========================================================================
This means that you have not told GEOS-Chem to save out diagnostic data on the day that your simulation ends. GEOS-Chem adds this error check in order to prevent you from running a long simulation only to have no diagnostics printed out at the end of the run.
For more information on how to schedule diagnostic output in GEOS-Chem, please see the OUTPUT MENU section of the input.geos file on the GEOS-Chem wiki.
List-directed I/O syntax error
When this error occurs, the traceback reports the source line at which the simulation crashed. In the case documented here it was line 871 in GeosCore/input_mod.F, which means there was a problem reading the input.geos file (located in the run directory). For example, GEOS-Chem might have expected numeric input but read a character from input.geos instead, thus causing a read error. This type of error can occur if the input.geos file corresponds to a version of GEOS-Chem that is different from yours.
This error is not limited to input.geos; it can happen for any text file that is being read from disk (both in GEOS-Chem and in any other Fortran programs you may write).
Error reading the input.geos file
If you should encounter this type of error:
READ_INPUT_FILE: Reading input.geos
SPLIT_ONE_LINE: error at ___
Expected __ substrs but found __
STOP in SPLIT_ONE_LINE (input_mod.F)
then you are probably using an input.geos file that does not correspond to the same version as your GEOS-Chem source code directory. Please check the git history for both your code directory and unit tester directory to make sure they are for the same version (marked by tags). If necessary, update your unit tester directory and create a new run directory.
Errors reading the GEOS-Chem restart file
Please see the following posts for more information about errors that may occur when reading GEOS-Chem restart files:
NetCDF: HDF Error
If you should encounter this error message:
NetCDF: HDF error
Then this usually means GEOS-Chem was trying to read an incomplete or corrupted netCDF file. The quickest solution is to re-download the netCDF file from the original source.
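Before re-downloading, a quick first check is to look at the leading bytes of the file. The sketch below is an illustration only and is not part of GEOS-Chem; the function name nc_signature is hypothetical. It relies on the fact that netCDF-4 files are HDF5 files and normally begin with the 8-byte HDF5 signature, while classic-format netCDF files begin with the characters CDF. A file that passes can still be truncated further in, so treat this only as a way to catch files whose header is plainly wrong (for example, an error page saved by a failed download).

```shell
#!/bin/bash
# nc_signature FILE
# Print "hdf5" for a netCDF-4/HDF5 header, "classic" for a classic
# netCDF header, or "unknown" otherwise. (Hypothetical helper name.)
nc_signature() {
  local magic
  # Read the first 8 bytes as one lowercase hex string without spaces
  magic=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    894844460d0a1a0a) echo "hdf5" ;;     # \x89 H D F \r \n \x1a \n
    434446*)          echo "classic" ;;  # C D F followed by a version byte
    *)                echo "unknown" ;;
  esac
}

# Usage: list files in the current directory whose header looks wrong
for f in *.nc *.nc4; do
  [ -e "$f" ] || continue              # skip glob patterns that matched nothing
  if [ "$(nc_signature "$f")" = "unknown" ]; then
    echo "Suspect file: $f"
  fi
done
```

If a file is reported as suspect, re-download it; if it passes but GEOS-Chem still reports the HDF error, the file may be truncated and should be re-downloaded as well.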
Permission denied error
If you receive this error:
v11-01.run: Permission denied.
after having submitted a run script to a queue system (such as SLURM or Grid Engine), then double-check the Unix permissions of your v11-01.run script. If the script does not have the Unix "execute" permission, then the queue system will not be able to run it.
Use the Unix chmod command to make your script executable:
chmod 755 v11-01.run
and then re-submit the script to the queue system.
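The check and the fix can be combined in a small helper that you run before submitting. This is a minimal sketch, assuming bash; the function name ensure_executable is hypothetical, and chmod u+x is used here as a narrower alternative to chmod 755 that only adds the execute permission for the owner of the file.

```shell
#!/bin/bash
# ensure_executable SCRIPT
# Add the owner "execute" bit if SCRIPT is missing it.
# Returns non-zero if SCRIPT does not exist. (Hypothetical helper name.)
ensure_executable() {
  local script=$1
  if [ ! -f "$script" ]; then
    echo "No such file: $script" >&2
    return 1
  fi
  if [ ! -x "$script" ]; then
    echo "Adding execute permission to $script"
    chmod u+x "$script"
  fi
}

# Usage (v11-01.run is the run script named in the error message):
#   ensure_executable v11-01.run
# and then re-submit the script to your queue system.
```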
UCX not defined at compile time
This error may occur if you are compiling GEOS-Chem (v11-01 and prior versions) directly from the source code directory rather than from a run directory. Whenever you compile GEOS-Chem in the source code directory, you must remember to use the following Makefile options:
CHEM=Standard UCX=y   # Standard simulation (used for benchmarking GC)
CHEM=UCX      UCX=y   # UCX simulation (i.e. Standard simulation w/o SOA species)
To avoid these errors, we STRONGLY RECOMMEND that you always compile GEOS-Chem from a run directory. This will ensure that the Makefile switches relevant to your simulation will always be activated. For more information, please see these wiki posts:
NOTE: In GEOS-Chem v11-02 and higher versions, the Makefile option UCX=y will automatically be set whenever you select CHEM=Standard or CHEM=UCX.
Floating invalid or floating-point exception error
You can check for several common floating-point math errors by compiling with the FPEX=y option. This will halt the simulation with an error message such as:
forrtl: error (65): floating invalid      # Error message from Intel Fortran Compiler
Floating point exception (core dumped)    # Error message from GNU Fortran Compiler
This error typically means that a division-by-zero occurred, or a NaN value was encountered in one of your variables.
A common way to prevent these types of errors is to ensure "safe" divisions (i.e. to make sure that the denominator is nonzero). You can do this manually with an IF statement, or use the routines SAFE_DIV or IS_SAFE_DIV in GeosCore/error_mod.F.
KPP "Step size too small" error
The following abnormal exit from the KPP chemical solver:
Forced exit from Rosenbrock due to the following error:
--> Step size too small: T + 10*H = T or H < Roundoff
T= 3044.21151383269 and H= 1.281206877135470E-012
... 1
Forced exit from Rosenbrock due to the following error:
--> Step size too small: T + 10*H = T or H < Roundoff
T= 3044.21151383269 and H= 1.281206877135470E-012
failed twice !!!
indicates that the chemistry could not converge to a solution in the given grid box. Possible reasons for this could be:
- A particular tracer has numerically underflowed or overflowed. This can happen especially in the aerosol chemistry and equilibrium routines, where many exponentials and logarithms are used in the algorithms.
- The restart file is not appropriate for the given simulation. For example, if the restart file was created using the Synoz O3 flux boundary condition, but you have turned on the Linoz stratospheric O3 chemistry, then this mismatch can cause the solver not to converge. You can try switching to a restart file generated from a simulation with the same input options as the simulation that you wish to perform.
You may have to manually adjust the convergence criteria in the GEOS-Chem code to fix this condition.
--Bob Y. 11:30, 9 November 2010 (EST)
Mixed file access modes error
This error is particular to the Intel Fortran Compiler. You might encounter this in conjunction with the ND49 timeseries diagnostic.
According to the Intel website:
severe (31): Mixed file access modes
FOR$IOS_MIXFILACC. An attempt was made to use any of the following combinations:
* Formatted and unformatted operations on the same unit
* An invalid combination of access modes on a unit, such as direct and sequential
* An Intel® Fortran RTL I/O statement on a logical unit that was opened by a program coded in another language
Here are a few suggestions to try if you haven’t already:
- Make sure you don’t already have a ND49 file in the location that you’re writing. In other words, make sure GEOS-Chem isn’t trying to write to a file that already exists.
- Do “make realclean”, recompile, and run again to see if the error is persistent.
- Are you using an out-of-the-box version of the code or a modified version? If the latter, do you still get this error with a “clean” copy of the code?
- Is there a particular READ/WRITE format statement that is causing the problem? You could try compiling with BOUNDS=y TRACEBACK=y FPE=y DEBUG=y and running in totalview (do “module load totalview” first on Odyssey) to locate the problematic line.
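The first suggestion in the list above can be automated before each run. The following is a minimal sketch, assuming bash; move_aside and the ND49_FILE variable are hypothetical names, and the actual ND49 output file name is whatever your input.geos file specifies.

```shell
#!/bin/bash
# move_aside FILE
# If FILE exists, rename it to FILE.bak (or FILE.bak.1, FILE.bak.2, ...)
# so that a new run does not write to a file that is already there.
# (Hypothetical helper name.)
move_aside() {
  local file=$1 backup n=0
  [ -e "$file" ] || return 0       # nothing to do
  backup="$file.bak"
  while [ -e "$backup" ]; do       # find a backup name that is not in use
    n=$((n + 1))
    backup="$file.bak.$n"
  done
  mv "$file" "$backup"
  echo "Moved $file to $backup"
}

# Usage (set ND49_FILE to the ND49 output path from your input.geos):
#   move_aside "$ND49_FILE"
```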
Negative tracer found in WETDEP
If your simulation encounters negative (or NaN) tracer concentrations in the WETDEP routine, then this can be an indication of a problem further upstream, perhaps in the aerosol routines (highly probable if the tracer is SO4, SO4s, HNO3, SO2, or NH3). We have fixed some of these bugs by making the code more robust. If you are using a GEOS-Chem version prior to v8-01-01, then you should get these patches: ftp://ftp.as.harvard.edu/pub/geos-chem/patches/v8-01-01/ (These patches have been added to the standard GEOS-Chem code in versions higher than v8-01-01.) Please see the following links for more information:
- Values of F_PRIME > 1 in routine WETDEP
- Negative tracer due to negative RH values in the met field data
- Negative tracer in routine WETDEP
- Negative tracer in routine WETDEP #2
- A bug in RPMARES was also leading to a crash in WETDEP.
If the fixes above do not solve your problem, you will need to debug. The first step is to add a few calls to CHECK_STT (from tracer_mod.f) to isolate the part of the code where the negative tracer values are created. This can be done quite quickly if the code dies early enough in the run.
--Bob Y. 12:50, 15 July 2011 (EDT)
Run time errors originating in the HEMCO emissions component
HEMCO Error: Cannot find file for current simulation time
If you see an error such as this in your HEMCO.log file:
HEMCO ERROR: Cannot find file for current simulation time: ./GEOSChem.Restart.17120701_0000z.nc4 - Cannot get field SPC_NO. Please check file name and time (incl. time range flag) in the config. file
Then this can have a couple of causes:
- HEMCO cannot find the file because it is missing on disk. HEMCO will try to look back in time, starting with the current year and going all the way back to the year 1712 or 1713. So if you see 1712 or 1713 in the error message, that is a tip-off that the file is missing.
- HEMCO cannot find an expected variable name within a file.
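To see at a glance which files HEMCO gave up on, you can pull the file names out of the log. This is a minimal sketch, assuming bash and assuming that each error message appears on a single line of HEMCO.log in the form shown above; the function name hemco_missing_files is hypothetical. Paths whose date stamp starts with 1712 or 1713 are flagged, since that indicates HEMCO searched all the way back in time without finding a file.

```shell
#!/bin/bash
# hemco_missing_files LOGFILE
# Print each file path named in a "Cannot find file for current
# simulation time" error and flag paths that contain the year
# 1712 or 1713. (Hypothetical helper name.)
hemco_missing_files() {
  grep 'Cannot find file for current simulation time:' "$1" |
    sed -e 's/.*simulation time: *//' -e 's/ - .*//' |
    sort -u |
    while IFS= read -r path; do
      # Match 1712/1713 only at the start of a number, so that a
      # date such as 20171201 is not flagged by mistake.
      if printf '%s\n' "$path" | grep -Eq '(^|[^0-9])171[23]'; then
        echo "$path  <-- year 1712/1713: file is probably missing on disk"
      else
        echo "$path"
      fi
    done
}

# Usage:
#   hemco_missing_files HEMCO.log
```

For paths that are not flagged, the file may well exist, in which case the variable name or the time range given in the HEMCO configuration file is the more likely culprit.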
HEMCO Run Error
Error messages containing "HCO" originate in the HEMCO emissions component. For example:
===============================================================================
GEOS-CHEM ERROR: HCO_RUN
STOP at HCOI_GC_RUN (hcoi_gc_main_mod.F90)
===============================================================================
Additional helpful diagnostic information can be found in the HEMCO log file, which is usually named HEMCO.log.
Updated error message for v11-01
In GEOS-Chem v11-01 and higher versions, additional text instructs the user to also check the HEMCO log file.
===============================================================================
GEOS-CHEM ERROR: HCO_RUN
HEMCO ERROR: Please check the HEMCO log file for error messages!
STOP at HCOI_GC_RUN (GeosCore/hcoi_gc_main_mod.F90)
===============================================================================
HEMCO time stamps may be wrong
HEMCO reads the files but gives zero emissions and shows the following time step error:
HEMCO WARNING: ncdf reference year is prior to 1901 - time stamps may be wrong!
--> LOCATION: GET_TIMEIDX (hco_read_std_mod.F90)
Lizzie Lundgren wrote:
That HEMCO error occurs if the reference time for the netCDF file time dimension is prior to 1901. If you do ncdump -c filename you will be able to see the metadata for the time dimension as well as the time variable values. The time units should include the reference date.
You can get around this issue by changing the reference time within the file. You can do this with CDO (Climate Data Operators) using the setreftime command.
Here is a bash script example (by GCST member Melissa Sulprizio) that updates the calendar and reference time for all files ending in .nc within a directory. It was developed recently for a user who ran into the same issue. In that case the first file was for Jan 1, 1950, so that date was made the new reference time. I would recommend doing the same for your dataset, so that the first time variable value would be 0. This script also compresses each file, which we recommend doing.
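#!/bin/bash
# Reset the calendar and reference time of every .nc file in this
# directory, then compress each file.
for file in *.nc; do
  echo "Processing $file"
  cdo setcalendar,standard "$file" tmp.nc
  mv tmp.nc "$file"
  cdo setreftime,1950-01-01,0 "$file" tmp.nc
  mv tmp.nc "$file"
  nccopy -d1 -c "time/1" "$file" tmp.nc
  mv tmp.nc "$file"
done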
After you update the file you can again do ncdump -c filename to check the time dimension. For the case above it looks like this after processing:
double time(time) ;
    time:standard_name = "time" ;
    time:long_name = "time" ;
    time:bounds = "time_bnds" ;
    time:units = "days since 1950-01-01 00:00:00" ;
    time:calendar = "standard" ;
. . .
time = 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365, 396,
    424, 455, 485, 516, 546, 577, 608, 638, 669, 699, 730, 761, 790, 821,
    851, 882, 912, 943, 974, 1004, 1035, 1065, 1096, 1127, 1155, 1186,
    1216, 1247, etc
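If you have many files to check, the test that triggers the warning can be scripted. This is a minimal sketch, assuming bash and a time units string in the "<unit> since YYYY-MM-DD ..." form shown above; the function names ref_year and ref_year_ok are hypothetical, and 1901 is the threshold named in the HEMCO warning.

```shell
#!/bin/bash
# ref_year UNITS
# Extract the reference year from a time units string such as
# "days since 1950-01-01 00:00:00". Prints nothing if no year is found.
# (Hypothetical helper name.)
ref_year() {
  printf '%s\n' "$1" | sed -n 's/.* since \([0-9][0-9]*\)-.*/\1/p'
}

# ref_year_ok UNITS
# Succeed if the reference year is 1901 or later.
# (Hypothetical helper name.)
ref_year_ok() {
  local year
  year=$(ref_year "$1")
  [ -n "$year" ] && [ "$year" -ge 1901 ]
}

# Usage, taking the units string from the ncdump header output:
#   units=$(ncdump -h file.nc | sed -n 's/.*time:units = "\(.*\)" ;/\1/p')
#   ref_year_ok "$units" || echo "file.nc: reference year is before 1901"
```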