Difference between revisions of "Bugs and fixes"

From Geos-chem

Revision as of 20:16, 26 March 2008

NaN's in SMVGEAR

05 Jul 2007

From Lok Lamsal (lok.lamsal@fizz.phys.dal.ca)

I ran into a problem while running GEOS-4, version v7-04-09, at 2x2.5. The simulation stops on 15 July 2006 with different error messages on two of our machines, tuque and beret. One of the error messages on tuque looks like this:

  sum of rrate =  Infinity
  Species index :            1
  Grid Box      :          121          15           1
  STOP in smvgear.f!
      - CLEANUP: deallocating arrays now...
 forrtl: severe (174): SIGSEGV, segmentation fault occurred

And on beret the message is like this:

       - CLEANUP: deallocating arrays now...
 forrtl: severe (174): SIGSEGV, segmentation fault occurred
 Image              PC                Routine   Line      Source
 geos_4_nv          400000000059EA10  Unknown   Unknown   Unknown
 libguide.so        200000000039C1A0  Unknown   Unknown   Unknown
 etc.

The message which is repeated in either case is like this:


Could you suggest what the problem could be? Just to inform you: while trying to figure out the problem, I learned from Bastien that he did not have a problem on that day with version v7-04-10, although that run stopped on September 13, 2006.

Response by Bob Yantosca (yantosca@seas.harvard.edu):

I think there is a division by zero somewhere that is causing SMVGEAR to choke. It could be a couple of things:

(1) Make sure that in a6_read_mod.f (routine READ_A6) you have the following code to prevent Q from going to zero. A zero value of Q can make logarithms blow up in other places in the code:

 !--------------------------------
 ! Q: 6-h avg specific humidity
 ! (GEOS-4 only)
 !--------------------------------
 CASE ( 'Q' )
    READ( IU_A6, IOSTAT=IOS ) XYMD, XHMS, Q3
    IF ( IOS /= 0 ) CALL IOERROR( IOS, IU_A6, 'read_a6:16' )

       ! NOTE: Now set negative Q to a small positive #
       ! instead of zero, so as not to blow up logarithms
       ! (bmy, 9/8/06)
       WHERE ( Q < 0d0 ) Q = 1d-32
    ENDIF

(2) In fvdas_convect_mod.f, make SMALLEST a smaller number (i.e. 1d-60):

 !=================================================================
 ! MODULE VARIABLES
 !=================================================================

 ! Variables
 INTEGER            :: LIMCNV

 ! Constants
 LOGICAL, PARAMETER :: RLXCLM   = .TRUE.
 REAL*8,  PARAMETER :: CMFTAU   = 3600.d0
 REAL*8,  PARAMETER :: EPS      = 1.0d-13
 REAL*8,  PARAMETER :: GRAV     = 9.8d0
 !-------------------------------------------------------
 ! Prior to 12/19/06:
 ! Make SMALLEST smaller (bmy, 12/19/06)
 !REAL*8,  PARAMETER :: SMALLEST = 1.0d-32
 !-------------------------------------------------------
 REAL*8,  PARAMETER :: SMALLEST = 1.0d-60
 REAL*8,  PARAMETER :: TINYALT  = 1.0d-36
 REAL*8,  PARAMETER :: TINYNUM  = 2*SMALLEST

(3) In "fvdas_convect_mod.f", avoid division by zero in routine CONVTRAN:

 IF ( CDIFR > 1.d-6 ) THEN

    ! If the two layers differ significantly.
    ! use a geometric averaging procedure
    CABV = MAX( CMIX(I,KM1), MAXC*TINYNUM, SMALLEST )
    CBEL = MAX( CMIX(I,K),   MAXC*TINYNUM, SMALLEST )

    !-----------------------------------------------------------------
    ! Prior to 12/19/06:
    ! Avoid division by zero (bmy, 12/19/06)
    ! CHAT(I,K) = LOG( CABV / CBEL)
    ! &           /   ( CABV - CBEL)
    ! &           *     CABV * CBEL
    !-----------------------------------------------------------------

    ! If CABV-CBEL is zero then set CHAT=SMALLEST
    ! so that we avoid div by zero (bmy, 12/19/06)
    IF ( ABS( CABV - CBEL ) > 0d0 ) THEN
       CHAT(I,K) = LOG( CABV / CBEL )
      &          /   ( CABV - CBEL )
      &          *     CABV * CBEL
    ELSE
       CHAT(I,K) = SMALLEST
    ENDIF

 ELSE
    ! Small diff, so just arithmetic mean
    CHAT(I,K) = 0.5d0 * ( CMIX(I,K) + CMIX(I,KM1) )
 ENDIF

(4) Also I had to rewrite the parallel DO loops in the routine HACK_CONV since this was causing some kind of a memory fault.

You may just want to get the most recent version of fvdas_convect_mod.f, which has all of these fixes installed. See:

 ftp ftp.as.harvard.edu
 cd pub/exchange/bmy
 get fvdas_convect_mod.f

So I would recommend implementing these fixes and seeing whether they solve your problem.

NOTE: These fixes have been introduced into GEOS-Chem v7-04-10.
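
As a rough illustration of fixes (1) and (3) above, the following Python sketch mimics the two guards outside of GEOS-Chem. It is not model code: the function names `clamp_negative_q` and `chat_log_mean` are made up for this example, and only the constants 1d-32 and 1d-60 and the expression for CHAT are taken from the Fortran shown above.

```python
import math

Q_FLOOR = 1.0e-32    # replacement value used in READ_A6, fix (1)
SMALLEST = 1.0e-60   # new value of SMALLEST in fvdas_convect_mod.f, fix (2)


def clamp_negative_q(q_values, floor=Q_FLOOR):
    """Mimic WHERE ( Q < 0d0 ) Q = 1d-32: only negative entries are replaced."""
    return [floor if q < 0.0 else q for q in q_values]


def chat_log_mean(cabv, cbel, smallest=SMALLEST):
    """Mimic the guarded CHAT expression from CONVTRAN, fix (3).

    Assumes cabv and cbel are positive, as they are after the MAX(...) floor.
    """
    diff = cabv - cbel
    if abs(diff) > 0.0:
        return math.log(cabv / cbel) / diff * cabv * cbel
    # Identical values: the unguarded expression would divide by zero here
    return smallest


if __name__ == "__main__":
    print(clamp_negative_q([-1.0e-5, 2.0e-3]))
    print(chat_log_mean(2.0, 1.0))
    print(chat_log_mean(SMALLEST, SMALLEST))
```

Like the WHERE statement it imitates, `clamp_negative_q` replaces only strictly negative values and leaves an exact zero unchanged.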


16 Oct 2007

From Mike Barkley (mbarkley@staffmail.ed.ac.uk)

I think I've found an error in the regrid_1x1_mod.f subroutine (attached in the text file):

 SUBROUTINE REGRID_MASS_TO_2x25( I1, J1, L1, IN, I2, J2, OUT )

There is a DO loop over longitude whose upper limit is defined as the input latitude (J1) instead of what should (?) be the output longitude (I2). I've indicated where this is in the program. Which is correct? We didn't notice this until we were running multi-processor 2x2.5 simulations on different servers.

The bug was:

 ! Non-polar latitudes
 DO J = 2, J2-1    
    DO I = 1, J1

which needs to be replaced by:

 ! Non-polar latitudes
 DO J = 2, J2-1    
    DO I = 1, I2

This bug has now been fixed in GEOS-Chem v7-04-13.
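
To see why the wrong upper bound matters, here is a small Python sketch. It is a toy, not the regridding routine itself: the grid sizes (360 x 181 for 1x1 and 144 x 91 for 2x2.5) and the helper names `fill_nonpolar` and `count_filled` are assumptions made for illustration. The inner loop writes into an output array dimensioned by the output longitude count (I2 in the report above), so an upper bound of J1 runs past the end of that array. Python raises an IndexError in that case, whereas Fortran compiled without bounds checking may silently write outside the array, which might explain why the problem surfaced only on some servers.

```python
# Illustrative grid sizes (assumed): 1x1 input grid and 2x2.5 output grid
I1, J1 = 360, 181
I2, J2 = 144, 91


def fill_nonpolar(i_upper):
    """Write to the non-polar boxes of an I2 x J2 output array.

    i_upper plays the role of the upper bound of the inner DO loop.
    """
    out = [[0.0] * J2 for _ in range(I2)]
    for j in range(1, J2 - 1):      # Fortran: DO J = 2, J2-1
        for i in range(i_upper):    # Fortran: DO I = 1, <upper bound>
            out[i][j] = 1.0
    return out


def count_filled(out):
    """Count the boxes that were written."""
    return sum(value != 0.0 for column in out for value in column)


if __name__ == "__main__":
    try:
        fill_nonpolar(J1)           # buggy bound: input latitude count
    except IndexError:
        print("bound J1 runs past the output array")
    print(count_filled(fill_nonpolar(I2)))  # bound matches the output array
```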


02 Nov 2007

From Bob Yantosca (yantosca@seas.harvard.edu)

Some of you have reported a weird error in SMVGEAR that causes GEOS-Chem simulations to die unexpectedly. The main symptom of this error is that concentrations of some species (e.g. CO) appear to go to zero, while other species (e.g. Ox) seem to reach unphysically high values, all within a single chemistry timestep. Then the simulation dies shortly thereafter.

May Fu and Philippe Le Sager have isolated the cause of the problem. They found that in some instances it is possible (e.g. due to locally low OH) to get into a regime where the first derivative of a species goes very negative during SMVGEAR's internal iteration loop. This then causes the new species concentration to be negative. This can sometimes happen even if the local & global error tolerance checks have passed. Then upon exiting the internal iteration loop, SMVGEAR would automatically reset any negative species concentrations to zero (actually a small positive number like 1e-99). A species with zero concentration can adversely affect other species within the SMVGEAR solver process. Furthermore, sometimes these zero concentrations were propagating out of SMVGEAR and into the STT tracer array, which caused problems in other areas of the code.

May & Philippe implemented a fix into the file "smvgear.f" which does the following: If a negative species concentration value is found during an internal iteration, then we don't set it to zero. We instead reduce the internal iteration timestep and do another iteration (i.e. re-evaluate the Jacobian matrix) to solve for the new species concentration. This process is repeated until SMVGEAR converges onto a non-negative solution. May & Philippe also added an extra error trap to stop the simulation if any negative species concentrations still persist upon exiting the subroutine. So the entire process should now be more robust.

You may download the updated "smvgear.f" file from our anonymous FTP site:

 ftp ftp.as.harvard.edu
 cd pub/geos-chem/patches/v7-04-12
 get README
 get smvgear.f

Then copy the "smvgear.f" file to your own source code directory and recompile. Please see the README file for more information on how to locate the places in "smvgear.f" that were modified.

This is not really a "bug" but more of a "design flaw" in the original SMVGEAR package.

This bug has now been fixed in GEOS-Chem v7-04-13.
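
The retry strategy described in the message above (reject a step that yields a negative concentration and repeat it with a smaller timestep, instead of resetting the value to zero) can be illustrated with a toy Python sketch. This is not SMVGEAR: it swaps in a plain forward Euler step for a single decaying species in place of the Gear solver and its Jacobian, and the function name, the halving factor and the minimum timestep are all assumptions chosen for the example.

```python
def integrate_nonnegative(y0, rate, t_end, dt0, shrink=0.5, dt_min=1.0e-12):
    """Forward Euler for dy/dt = -rate * y that rejects negative results.

    Returns the final value and the last timestep that was used.
    """
    t, y, dt = 0.0, y0, dt0
    while t < t_end:
        dt = min(dt, t_end - t)
        y_new = y - dt * rate * y
        if y_new < 0.0:
            # Do not clip to zero: shrink the step and redo this iteration
            dt *= shrink
            if dt < dt_min:
                raise RuntimeError("timestep underflow, no non-negative solution")
            continue
        t += dt
        y = y_new
    # Extra error trap, analogous to the check on exit described above
    if y < 0.0:
        raise RuntimeError("negative concentration on exit")
    return y, dt


if __name__ == "__main__":
    # A timestep of 1.0 overshoots to a negative value, so it is reduced
    final_value, final_dt = integrate_nonnegative(1.0, 10.0, 1.0, 1.0)
    print(final_value, final_dt)
```

In this toy the rejected step costs only one extra evaluation; in SMVGEAR the repeated iteration also re-evaluates the Jacobian matrix, as the message notes.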