Bugs and fixes

Revision as of 20:23, 11 April 2008 by Bmy (Talk | contribs) (11-Apr-2008)

On this page we list the GEOS-Chem bugs that users have recently encountered, and how to fix them.

ISORROPIA and RPMARES

11-Apr-2008

Please see the discussion about the bugs & fixes for ISORROPIA and RPMARES on the Code Developer's Forum for Aerosol Thermodynamical Equilibrium.

NaN's in SMVGEAR

05-Jul-2007

From Lok Lamsal (lok.lamsal@fizz.phys.dal.ca)

I ran into a problem while running GEOS-4, version v7-04-09, at 2x2.5. The simulation stops on 15th July 2006 with different error messages on two of our machines, tuque and beret. One of the error messages on tuque looks like this:

  sum of rrate =  Infinity
  SMVGEAR: CNEW is NaN!
  Species index :            1
  Grid Box      :          121          15           1
  STOP in smvgear.f!
      - CLEANUP: deallocating arrays now...
 forrtl: severe (174): SIGSEGV, segmentation fault occurred

And on beret the message is like this:

      - CLEANUP: deallocating arrays now...
  forrtl: severe (174): SIGSEGV, segmentation fault occurred
  Image              PC                Routine            Line        Source
  geos_4_nv          400000000059EA10  Unknown               Unknown  Unknown
  libguide.so        200000000039C1A0  Unknown               Unknown  Unknown
  etc.

The message that is repeated in both cases looks like this:

  SMVGEAR: DELT= 9.96E-16 TOO LOW DEC YFAC. KBLK, KTLOOP, NCS, TIME, TIMREMAIN, YFAC, EPS =

Could you suggest what the problem could be? For reference: while trying to track down the problem, I learned from Bastien that he had no problem on that day with version v7-04-10, which instead stopped on September 13 2006.

Response by Bob Yantosca (yantosca@seas.harvard.edu):

I think there is a division by zero somewhere that is causing SMVGEAR to choke. It could be a couple of things:

(1) Make sure that in a6_read_mod.f (routine READ_A6) you have the following code to prevent Q from going negative or to zero. Otherwise logarithms can blow up in other places in the code:

          !--------------------------------
          ! Q: 6-h avg specific humidity
          ! (GEOS-4 only)
          !--------------------------------
          CASE ( 'Q' )
             READ( IU_A6, IOSTAT=IOS ) XYMD, XHMS, Q3
             IF ( IOS /= 0 ) CALL IOERROR( IOS, IU_A6, 'read_a6:16' )
     
             IF ( CHECK_TIME( XYMD, XHMS, NYMD, NHMS ) ) THEN
                IF ( PRESENT( Q ) ) CALL TRANSFER_3D( Q3, Q )
                NFOUND = NFOUND + 1
     
                ! NOTE: Now set negative Q to a small positive #
                ! instead of zero, so as not to blow up logarithms
                ! (bmy, 9/8/06)
                WHERE ( Q < 0d0 ) Q = 1d-32
             ENDIF
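
As a minimal standalone sketch of the floor applied above (not part of GEOS-Chem; the program name, the array values, and the Q_FLOOR parameter are illustrative assumptions), the following free-form Fortran 90 program shows how the WHERE mask replaces negative values so that a later LOG call stays finite. Because the test is strictly Q < 0d0, an exact zero is left unchanged, just as in the excerpt above.

```fortran
PROGRAM clamp_humidity
  IMPLICIT NONE

  ! Hypothetical floor value, chosen to match the 1d-32 used in READ_A6
  REAL*8, PARAMETER :: Q_FLOOR = 1d-32

  ! Hypothetical 2 x 2 x 1 humidity field (values are illustrative only)
  REAL*8 :: Q(2,2,1)

  ! Column-major fill: Q(1,1,1)=1d-3, Q(2,1,1)=-2d-7, Q(1,2,1)=0, Q(2,2,1)=5d-4
  Q(:,:,1) = RESHAPE( (/ 1d-3, -2d-7, 0d0, 5d-4 /), (/ 2, 2 /) )

  ! Replace negative values with a tiny positive number.
  ! Exact zeros do not satisfy Q < 0d0 and are therefore left alone.
  WHERE ( Q < 0d0 ) Q = Q_FLOOR

  ! Sanity check: no negative values may remain after the clamp
  IF ( ANY( Q < 0d0 ) ) STOP 'negative Q remains'

  ! The formerly negative box now has a finite logarithm
  PRINT *, 'clamped value     = ', Q(2,1,1)
  PRINT *, 'LOG(clamped value) = ', LOG( Q(2,1,1) )
END PROGRAM clamp_humidity
```

If exact zeros can also occur in the input, a mask such as Q <= 0d0 would catch them too; whether that is appropriate for READ_A6 is not stated in the original message.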


(2) In fvdas_convect_mod.f, make SMALLEST a smaller number (i.e. 1d-60):

    !=================================================================
    ! MODULE VARIABLES
    !=================================================================
     
    ! Variables
    INTEGER            :: LIMCNV
     
    ! Constants
    LOGICAL, PARAMETER :: RLXCLM   = .TRUE.
    REAL*8,  PARAMETER :: CMFTAU   = 3600.d0
    REAL*8,  PARAMETER :: EPS      = 1.0d-13       
    REAL*8,  PARAMETER :: GRAV     = 9.8d0
    !-------------------------------------------------------
    ! Prior to 12/19/06:
    ! Make SMALLEST smaller (bmy, 12/19/06)
    !REAL*8,  PARAMETER :: SMALLEST = 1.0d-32
    !-------------------------------------------------------
    REAL*8,  PARAMETER :: SMALLEST = 1.0d-60
    REAL*8,  PARAMETER :: TINYALT  = 1.0d-36           
    REAL*8,  PARAMETER :: TINYNUM  = 2*SMALLEST


(3) In "fvdas_convect_mod.f", avoid division by zero in routine CONVTRAN:


             IF ( CDIFR > 1.d-6 ) THEN
     
                ! If the two layers differ significantly.
                ! use a geometric averaging procedure
                CABV = MAX( CMIX(I,KM1), MAXC*TINYNUM, SMALLEST )
                CBEL = MAX( CMIX(I,K),   MAXC*TINYNUM, SMALLEST )
  !-----------------------------------------------------------------
  !  Prior to 12/19/06:
  ! Avoid division by zero (bmy, 12/19/06)
  !                  CHAT(I,K) = LOG( CABV / CBEL)
  !     &                       /   ( CABV - CBEL)
  !     &                       *     CABV * CBEL
  !-----------------------------------------------------------------
     
                ! If CABV-CBEL is zero then set CHAT=SMALLEST
                ! so that we avoid div by zero (bmy, 12/19/06)
                IF ( ABS( CABV - CBEL ) > 0d0 ) THEN
                   CHAT(I,K) = LOG( CABV / CBEL )
       &                     /    ( CABV - CBEL )
       &                     *      CABV * CBEL
                ELSE
                   CHAT(I,K) = SMALLEST
                ENDIF
     
             ELSE                           
                ! Small diff, so just arithmetic mean
                CHAT(I,K) = 0.5d0 * ( CMIX(I,K) + CMIX(I,KM1) )
             ENDIF
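
The guarded expression in item (3) can be exercised in isolation. The sketch below is a hypothetical standalone program in free-form Fortran 90 (the function name GUARDED_CHAT and the test values are assumptions for illustration, not GEOS-Chem code). It evaluates the same formula when the two inputs differ and falls back to SMALLEST when they are identical, so the division by ( CABV - CBEL ) is never attempted with a zero denominator.

```fortran
PROGRAM guarded_chat_demo
  IMPLICIT NONE

  ! Same value as the updated SMALLEST in fvdas_convect_mod.f
  REAL*8, PARAMETER :: SMALLEST = 1.0d-60

  ! Distinct inputs: the logarithmic formula is used
  PRINT *, 'distinct inputs: ', GUARDED_CHAT( 2.0d-9, 1.0d-9 )

  ! Identical inputs: the fallback branch is used, no division by zero
  PRINT *, 'equal inputs   : ', GUARDED_CHAT( 1.0d-9, 1.0d-9 )

CONTAINS

  ! Hypothetical helper wrapping the guarded expression from CONVTRAN
  FUNCTION GUARDED_CHAT( CABV, CBEL ) RESULT( CHAT )
    REAL*8, INTENT(IN) :: CABV, CBEL
    REAL*8             :: CHAT

    IF ( ABS( CABV - CBEL ) > 0d0 ) THEN
       ! Same expression as in the excerpt above
       CHAT = LOG( CABV / CBEL ) / ( CABV - CBEL ) * CABV * CBEL
    ELSE
       ! Denominator would be zero, so return the fallback value
       CHAT = SMALLEST
    ENDIF
  END FUNCTION GUARDED_CHAT

END PROGRAM guarded_chat_demo
```

The sketch assumes both inputs are strictly positive, which the MAX( ..., SMALLEST ) calls guarantee in the original routine.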


(4) I also had to rewrite the parallel DO loops in routine HACK_CONV, since they were causing some kind of memory fault.

You may just want to get the most recent version of fvdas_convect_mod.f, which has all of these fixes installed. See:

  ftp ftp.as.harvard.edu
  cd pub/exchange/bmy
  get fvdas_convect_mod.f

So I would recommend implementing these fixes and seeing whether they solve your problem.

NOTE: These fixes have been introduced into GEOS-Chem v7-04-10.

regrid_1x1_mod.f

16 Oct 2007

From Mike Barkley (mbarkley@staffmail.ed.ac.uk)

I think I've found an error in the regrid_1x1_mod.f subroutine (attached in the text file):

SUBROUTINE REGRID_MASS_TO_2x25( I1, J1, L1, IN, I2, J2, OUT )

There is a DO loop over longitude whose upper limit is defined as the input latitude (J1) instead of what should (?) be the output longitude (I2). I've indicated where this is in the program. Which is correct? We didn't notice this until we were running multi-processor 2x2.5 simulations on different servers.

The bug was:

  !-----------------------
  ! Non-polar latitudes
  !-----------------------
  DO J = 2, J2-1    
     ...          
     DO I = 1, J1

which needs to be replaced by:

  !-----------------------
  ! Non-polar latitudes
  !-----------------------
  DO J = 2, J2-1    
     ...          
     DO I = 1, I1

This bug has now been fixed in GEOS-Chem v7-04-13.

smvgear.f

02 Nov 2007

Bob Yantosca (yantosca@seas.harvard.edu) wrote

Some of you have reported a weird error in SMVGEAR that causes GEOS-Chem simulations to die unexpectedly. The main symptom of this error is that concentrations of some species (e.g. CO) appear to go to zero, while other species (e.g. Ox) seem to reach unphysically high values, all within a single chemistry timestep. The simulation then dies shortly thereafter.

May Fu and Philippe Le Sager have isolated the cause of the problem. They found that in some instances it is possible (e.g. due to locally low OH) to get into a regime where the first derivative of a species goes very negative during SMVGEAR's internal iteration loop. This then causes the new species concentration to be negative. This can sometimes happen even if the local & global error tolerance checks have passed. Then upon exiting the internal iteration loop, SMVGEAR would automatically reset any negative species concentrations to zero (actually a small positive number like 1e-99). A species with zero concentration can adversely affect other species within the SMVGEAR solver process. Furthermore, sometimes these zero concentrations were propagating out of SMVGEAR and into the STT tracer array, which caused problems in other areas of the code.

May & Philippe implemented a fix in the file "smvgear.f" which does the following: If a negative species concentration value is found during an internal iteration, then we don't set it to zero. We instead reduce the internal iteration timestep and do another iteration (i.e. re-evaluate the Jacobian matrix) to solve for the new species concentration. This process is repeated until SMVGEAR converges onto a non-negative solution. May & Philippe also added an extra error trap to stop the simulation if any negative species concentrations still persist upon exiting the subroutine. So the entire process should now be more robust.
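
The retry logic described above can be illustrated with a toy problem. The sketch below is not SMVGEAR's Gear solver: it assumes a single species with first-order loss, uses an explicit Euler step in place of the implicit iteration, and every name and value in it is hypothetical. It only shows the control flow, namely rejecting a step that yields a negative concentration, shrinking the timestep, trying again, and trapping any negative value on exit.

```fortran
PROGRAM retry_on_negative
  IMPLICIT NONE

  REAL*8  :: C, CNEW, K, T, TEND, DT
  INTEGER :: NRETRY

  C      = 1.0d0    ! initial concentration (arbitrary units)
  K      = 5.0d0    ! hypothetical first-order loss rate
  T      = 0.0d0    ! current time
  TEND   = 1.0d0    ! end of the (toy) chemistry timestep
  DT     = 1.0d0    ! deliberately too large: K*DT > 1 gives CNEW < 0
  NRETRY = 0

  DO WHILE ( T < TEND )

     ! Never step past the end of the interval
     DT = MIN( DT, TEND - T )

     ! Trial step (explicit Euler stands in for the real solver)
     CNEW = C - K * C * DT

     IF ( CNEW < 0d0 ) THEN
        ! Do not reset to zero: reject the step, halve DT, and retry
        DT     = 0.5d0 * DT
        NRETRY = NRETRY + 1
        CYCLE
     ENDIF

     ! Accept the step
     C = CNEW
     T = T + DT
  ENDDO

  ! Extra error trap on exit, mirroring the one described in the text
  IF ( C < 0d0 ) STOP 'negative concentration on exit'

  PRINT *, 'final concentration = ', C
  PRINT *, 'rejected steps      = ', NRETRY
END PROGRAM retry_on_negative
```

In this toy case the solution stays non-negative because steps are rejected rather than clipped; how the actual patch selects the reduced timestep is documented in the README that accompanies "smvgear.f".
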

You may download the updated "smvgear.f" file from our anonymous FTP site:

  ftp ftp.as.harvard.edu
  cd pub/geos-chem/patches/v7-04-12
  get README
  get smvgear.f

Then copy the "smvgear.f" file to your own source code directory and recompile. Please see the README file for more information on how to locate the places in "smvgear.f" that were modified.

This is not really a "bug" but more of a "design flaw" in the original SMVGEAR package.


This bug has now been fixed in GEOS-Chem v7-04-13.