Other less-common errors


Guide to GEOS-Chem error messages

  1. Understanding the different categories of errors
  2. Compile-time warnings and errors
  3. Run-time crashes and abnormal exits
  4. Segmentation faults
  5. Other less-common errors


Overview

We have been migrating bug reports to the GEOS-Chem issue tracker on our GitHub repository: https://github.com/geoschem/geos-chem/issues/. We recommend that you look through both the open and closed issues there, as your issue may already be listed.


The errors listed below occur infrequently and are related to invalid memory operations. They are most likely to occur with POINTER-based variables.

Bus Error

A bus error means that your program tried to access a memory address that cannot be valid (for example, one that does not physically exist or is misaligned). The website StackOverflow.com has a definition of bus error and how it differs from a segmentation fault.

One possible cause of a bus error is calling a subroutine with the wrong number of arguments (usually too many).
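One way to let the compiler catch such a mismatch is to place routines in a module, which gives every call an explicit interface. The following is a minimal sketch under that assumption; the module Demo_Mod and the routine Scale_Value are hypothetical names for illustration and are not part of GEOS-Chem.

```fortran
MODULE Demo_Mod
  IMPLICIT NONE
CONTAINS
  ! Hypothetical routine that takes exactly two arguments
  SUBROUTINE Scale_Value( Factor, Value )
    REAL, INTENT(IN)    :: Factor
    REAL, INTENT(INOUT) :: Value
    Value = Factor * Value
  END SUBROUTINE Scale_Value
END MODULE Demo_Mod

PROGRAM Check_Args
  USE Demo_Mod, ONLY : Scale_Value
  IMPLICIT NONE
  REAL :: X

  X = 3.0
  CALL Scale_Value( 2.0, X )        ! Correct: two arguments

  ! Because Scale_Value lives in a module, the compiler rejects the
  ! call below at compile time instead of letting it fail at run time:
  ! CALL Scale_Value( 2.0, X, 1.0 )

  PRINT*, 'X = ', X
END PROGRAM Check_Args
```

Routines that are called without an explicit interface (e.g. external routines that are not in a module) do not get this check, so an extra argument may go unnoticed until the program runs.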

--Bob Y. 12:27, 19 October 2012 (EDT)

Dwarf subprogram entry error

The error message:

 Dwarf subprogram entry L_ROUTINE-NAME__LINE-NUMBER__par_loop2_2_576 has high_pc < low_pc. 
 This warning will not be repeated for other occurrences.

can occur when you use an unassociated pointer variable (i.e. one that is not currently pointing to any other variable) from within an OpenMP parallel loop. In the error message:

  1. ROUTINE-NAME is the name of the routine where the error occurred, and
  2. LINE-NUMBER is the line where the error occurred.

We recently discovered that this error can be caused if you have a pointer declaration such as this:

 TYPE(Species), POINTER :: ThisSpc => NULL()

where the pointer ThisSpc is later used to point to another variable from within an OpenMP parallel loop. In Fortran, initializing a variable in its declaration statement implicitly gives it the SAVE attribute, so the above declaration inadvertently makes ThisSpc a saved variable. This can cause a segmentation fault, because all pointers used within an OpenMP parallel region must be created and destroyed on the same thread.

This type of problem can usually be fixed by removing the nullification from the declaration statement and nullifying the pointer in a separate executable statement. In other words, you can rewrite the above line of code as:

 TYPE(Species), POINTER :: ThisSpc
 . . .
 ThisSpc => NULL()

For more information, please see this article.
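The fix described above can be sketched in a small, self-contained example. This is only an illustration under stated assumptions: the Species type here is a hypothetical one-field stand-in for the GEOS-Chem type, and the module Species_Demo_Mod and routine Double_Conc are made-up names. The pointer is declared without an initializer, listed in the PRIVATE clause, and both associated and nullified inside the loop.

```fortran
MODULE Species_Demo_Mod
  IMPLICIT NONE

  ! Hypothetical stand-in for the GEOS-Chem Species type
  TYPE :: Species
     REAL :: Conc
  END TYPE Species

CONTAINS

  SUBROUTINE Double_Conc( SpcList, Doubled )
    TYPE(Species), TARGET, INTENT(INOUT) :: SpcList(:)
    REAL,                  INTENT(OUT)   :: Doubled(:)

    ! No "=> NULL()" in the declaration, so no implicit SAVE attribute
    TYPE(Species), POINTER :: ThisSpc
    INTEGER                :: N

    ! Nullify in an executable statement instead
    ThisSpc => NULL()

    !$OMP PARALLEL DO DEFAULT( SHARED ) PRIVATE( N, ThisSpc )
    DO N = 1, SIZE( SpcList )
       ThisSpc    => SpcList(N)         ! Associate on this thread
       Doubled(N) =  2.0 * ThisSpc%Conc
       ThisSpc    => NULL()             ! Release on the same thread
    ENDDO
    !$OMP END PARALLEL DO
  END SUBROUTINE Double_Conc

END MODULE Species_Demo_Mod

PROGRAM Pointer_In_Loop
  USE Species_Demo_Mod
  IMPLICIT NONE
  TYPE(Species) :: SpcList(4)
  REAL          :: Doubled(4)

  SpcList(:)%Conc = (/ 1.0, 2.0, 3.0, 4.0 /)
  CALL Double_Conc( SpcList, Doubled )
  PRINT*, Doubled
END PROGRAM Pointer_In_Loop
```

The example also compiles without OpenMP, in which case the directives are treated as ordinary comments and the loop runs serially.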

--Bob Yantosca (talk) 19:27, 29 April 2016 (UTC)

IFORT error: Relocation truncated to fit

Please see this wiki post on our Intel Fortran Compiler page, which describes how to work around a "Relocation truncated to fit" error message.

--Bob Y. 10:46, 24 February 2012 (EST)

IFORT error: Out of memory asking for NNNNN

This is not a common error message, but it may occur if you are compiling a version of GEOS-Chem for a high-resolution horizontal grid, or with one of the available microphysics packages (i.e. APM or TOMAS). Please see this wiki post on our Intel Fortran Compiler page, which describes this error in detail.

--Bob Y. 10:42, 26 July 2013 (EDT)

Memory error: "munmap_chunk: invalid pointer"

The following error is not common but can happen:

Error:

 *** glibc detected *** ./geos: munmap_chunk(): invalid pointer: 0x00000000059aac30 ***

Reference: http://stackoverflow.com/questions/6199729/how-to-solve-munmap-chunk-invalid-pointer-error-in-c

Explanation: This happens when the pointer passed to the C library routine free() (which a Fortran DEALLOCATE statement typically calls to release memory) is not valid or has been modified somehow. The bottom line is that the pointer passed to free() must be the same one that was returned by the C library routines malloc(), calloc(), or realloc(). The malloc(3) manual page (GNU, 2012-05-10) states:

 The free() function frees the memory space pointed to by ptr, which
 must have been returned by a previous call to malloc(), calloc() or
 realloc(). Otherwise, or if free(ptr) has already been called before,
 undefined behavior occurs. If ptr is NULL, no operation is performed.

Simpler explanation: This can happen if you try to deallocate a pointer variable that has already been deallocated, or that has been modified since it was allocated.
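A common way to end up deallocating the same memory twice is to have two pointers that refer to the same allocation. The sketch below, which uses the hypothetical names Owner and Alias, shows one possible convention: release the memory once through the pointer that allocated it, and only nullify any other pointer that refers to it.

```fortran
PROGRAM Owner_And_Alias
  IMPLICIT NONE
  REAL, POINTER :: Owner(:)   ! Pointer that allocates the memory
  REAL, POINTER :: Alias(:)   ! Second pointer to the same memory

  ! Start with both pointers in a known (disassociated) state
  Owner => NULL()
  Alias => NULL()

  ALLOCATE( Owner(10) )
  Owner =  1.0
  Alias => Owner

  ! Release the memory exactly once, through the pointer that allocated it
  IF ( ASSOCIATED( Owner ) ) DEALLOCATE( Owner )

  ! Alias now dangles, so only nullify it.  Calling DEALLOCATE( Alias )
  ! here would hand an already-freed address back to the C library.
  Alias => NULL()

  PRINT*, 'Owner associated? ', ASSOCIATED( Owner )
  PRINT*, 'Alias associated? ', ASSOCIATED( Alias )
END PROGRAM Owner_And_Alias
```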

--Bob Yantosca (talk) 21:32, 6 January 2017 (UTC)

Memory error: "free: invalid size"

The following error is not common but can happen:

Error:

 *** Error in `./geos': free(): invalid size: 0x000000000662e090 ***

Reference: http://stackoverflow.com/questions/4729395/error-free-invalid-next-size-fast

Explanation: It means that you have a memory error. You may be trying to free a pointer that wasn't allocated (or delete an object that wasn't created), or you may be trying to nullify/delete such an object more than once. You may also be overflowing a buffer or otherwise writing to memory to which you shouldn't be writing, causing heap corruption.

Any number of programming errors can cause this problem. You need to use a debugger, get a backtrace, and see what your program is doing when the error occurs. If that fails and you determine you have corrupted the heap at some previous point in time, you may be in for some painful debugging (it may not be too painful if the project is small enough that you can tackle it piece by piece).
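Writing past the end of an allocated array is one of the ways that the heap can become corrupted. As a minimal sketch (the array name Buffer and the loop limits are made up for illustration), the program below deliberately loops one element too far and uses an explicit index test to skip the write that would otherwise land outside the allocation. Compilers can also insert such tests for you at run time, e.g. with the -fcheck=bounds option of GNU Fortran or the -check bounds option of Intel Fortran.

```fortran
PROGRAM Guarded_Write
  IMPLICIT NONE
  REAL, ALLOCATABLE :: Buffer(:)
  INTEGER           :: I, N

  N = 5
  ALLOCATE( Buffer(N) )
  Buffer = 0.0

  ! The loop deliberately runs one element too far
  DO I = 1, N + 1
     IF ( I < LBOUND( Buffer, 1 ) .OR. I > UBOUND( Buffer, 1 ) ) THEN
        ! Without this test, Buffer(N+1) would be written past the end
        ! of the allocation and could silently corrupt the heap
        PRINT*, 'Skipping out-of-bounds index: ', I
        CYCLE
     ENDIF
     Buffer(I) = REAL( I )
  ENDDO

  DEALLOCATE( Buffer )
END PROGRAM Guarded_Write
```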

--Bob Yantosca (talk) 21:32, 6 January 2017 (UTC)

Memory error: "double free or corruption"

The following error is not common, but can occur under some circumstances:

Error:

 *** glibc detected *** ./geos: double free or corruption (out):

Reference: http://stackoverflow.com/questions/2902064/how-to-track-down-a-double-free-or-corruption-error-in-c-with-gdb

Explanation: There are at least two possible situations:
  1. You are deleting the same entity twice
  2. You are deleting something that wasn't allocated

For the first one I strongly suggest NULL-ing all deleted pointers.

You have [some] options:

  1. Overload new and delete and track the allocations
  2. Use a debugger -- then you'll get a backtrace from your crash, and that'll probably be very helpful

Three basic rules:

  1. Set the pointer to NULL after freeing it.
  2. Check for NULL before freeing.
  3. Initialize the pointer to NULL at the start.

Combination of these three works quite well.
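The three rules above are phrased for C, but they carry over to Fortran pointers. The following sketch shows one possible translation, assuming the pointer only ever refers to memory obtained with ALLOCATE. The module Cleanup_Mod and the helper Safe_Free are hypothetical names, not GEOS-Chem routines.

```fortran
MODULE Cleanup_Mod
  IMPLICIT NONE
CONTAINS
  ! Hypothetical helper: safe to call any number of times, provided
  ! that Ptr only ever refers to memory obtained with ALLOCATE
  SUBROUTINE Safe_Free( Ptr )
    REAL, POINTER, INTENT(INOUT) :: Ptr(:)
    IF ( ASSOCIATED( Ptr ) ) THEN    ! Rule 2: check before freeing
       DEALLOCATE( Ptr )
    ENDIF
    Ptr => NULL()                    ! Rule 1: null after freeing
  END SUBROUTINE Safe_Free
END MODULE Cleanup_Mod

PROGRAM Three_Rules
  USE Cleanup_Mod, ONLY : Safe_Free
  IMPLICIT NONE
  REAL, POINTER :: Work(:)

  Work => NULL()                     ! Rule 3: start from a known null state
  ALLOCATE( Work(100) )
  Work = 0.0

  CALL Safe_Free( Work )
  CALL Safe_Free( Work )             ! Second call is a harmless no-op
  PRINT*, 'Work associated? ', ASSOCIATED( Work )
END PROGRAM Three_Rules
```

Rule 3 matters because ASSOCIATED gives a meaningful answer only for a pointer whose association status is defined; a pointer that has never been nullified or associated has undefined status.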

This error can also occur if a library that GEOS-Chem needs (e.g. netCDF or netCDF-Fortran) is not installed on your system. GEOS-Chem will try to make function calls to the missing library, which will result in this error. In this case, the solution is to install the missing library.

--Bob Yantosca (talk) 21:02, 10 June 2019 (UTC)
