Developing GEOS-Chem

From Geos-chem
Jump to navigation Jump to search

NOTE: We will be updating this coding style guide in the near future. The recent efforts to interface GEOS-Chem with ESMs like NASA GEOS, CESM, BCC, and WRF necessitate a departure from past programming practice. We will keep you updated.


Background

We have written GEOS-Chem in the Fortran-90 (F90) language. F90 introduces several new features over its predecessor, Fortran-77 (F77), including allocatable arrays, modules, derived types, and new programming statements. We invite you to consult this list of Fortran tutorials.

F90 contains all of the features of F77. Legacy code adhering to the F77 standard should build with any F90 compiler. Although we have written GEOS-Chem source code to the F90 standard, you should be aware that many 3rd-party routines included in GEOS-Chem—such as, [[Photolysis_mechanism|FAST-JX], and ISORROPIA—were originally written to the F77 standard. In the past several GEOS-Chem releases, we have started to replaced older F77 code with F90 code, especially to facilitate our GEOS-Chem High Performance (aka GCHP) project.

You should also be aware that several other updates to the Fortran standard (F95, F2000, F2003) have been introduced in recent years. These new Fortran versions contain the same functionality as F90, but generally omit the obsolete F77-style features. Most modern Fortran compilers include support for all of these standards.

Code layout

Fixed format vs. free format

We previously wrote GEOS-Chem source code using fixed-format layout, which has the following requirements:

  • A line continuation character (usually &) is placed in column 6
  • Numeric labels can occupy column 2 through column 5
  • Executable statements begin in column 7 and extend to column 72

We chose the fixed-format layout for compatibility with the F77-style include files that were contained in the Headers/ directory. Starting with GEOS-Chem v9-01-03, we converted all header files to Fortran modules. This allows the use of free-format layout, which does not restrict where you place executable statements. (To continue a free-format statement across several lines of code, you must place an ampersand (&;) character at the end of each line.)

Modules in GEOS-Chem that are still written in F77 fixed-format layout are denoted with the .F extension, while modules written in F90 free-format layout are denoted with the .F90 extension. We recommend that you write all new GEOS-Chem code using the free format layout.

Modules

Place all of your new GEOS-Chem code into Fortran 90 modules, if possible. These allow you to save both and routines and variables within a single program unit.

An example module is shown below.

!------------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !MODULE: MY\_MODULE
!
! !DESCRIPTION: \subsection*{Overview}
! Module MY\_MODULE contains variables and routines to do something.
! (bmy, 5/1/03)
!\\
!\\
! !INTERFACE
!
       MODULE MY_MODULE
!
! !USES:
! 
     IMPLICIT NONE

     ! Make everything private...
     PRIVATE
!
! !PUBLIC MEMBER FUNCTIONS:
!
     PUBLIC :: DRIVER
     PUBLIC :: INIT_MY_MODULE
     PUBLIC :: CLEANUP_MY_MODULE
!
! !REVISION HISTORY:
!  (1 ) Bug fix in MY_SUBROUTINE (bmy, 5/1/03)
!EOP
!------------------------------------------------------------------------------
!BOC
!
! !PRIVATE DATA MEMBERS:
!
     ! Argument to the SIN function
     REAL*8, ALLOCATABLE :: B(:)

     CONTAINS
!EOC 
!------------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: DRIVER
!
! !DESCRIPTION: \subsection*{Overview}
! Subroutine DRIVER is the driver routine for MY_MODULE. (bmy, 5/1/03)
!\\
!\\
! !INTERFACE:
!
      SUBROUTINE DRIVER
!
! !REVISION HISTORY:
!
!EOP
!------------------------------------------------------------------------------
!BOC
!
! !LOCAL VARIABLES:
!
      LOGICAL, SAVE :: FIRST = .TRUE.
      INTEGER, :: I
      REAL*8 :: X, Y1, Y2
      REAL*8, PARAMETER :: PI180 = 180d0 / 3.14159265358979323d0
 
      !=================================================================
      ! DRIVER begins here!
      !=================================================================

      ! Initialize 
      IF ( FIRST ) THEN
         CALL INIT_MY_MODULE()
      ENDIF

      !=================================================================
      ! Loop over degrees
      !=================================================================
      DO I = 1, 360

         ! Convert to radians
         B = DBLE( I ) / PI180

         ! Call subroutine
         CALL MY_SUBROUTINE( B(I), Y1 )

         ! Call function
         Y2 = MY_FUNCTION( B(I) )

         ! Write output
         WRITE( 6, '(i3, 1x, 3(f13.6,1x))' ) I, B(I), Y1, Y2
      ENDDO 
 
      ! Return to calling program
      END SUBROUTINE DRIVER
!EOC
!-----------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: MY\_SUBROUTINE
!
! !DESCRIPTION: \subsection*{Overview}
! Function MY_SUBROUTINE returns the sine of a number. (bmy, 5/1/03)
!\\
!\\
! !INTERFACE:
!
      SUBROUTINE MY_SUBROUTINE( X, Y )
!
! !INPUT PARAMETERS: 
!
      ! Argument for the sine function
      REAL*8, INTENT(IN) :: X
!
! !RETURN VALUE:
!     
      REAL*8, INTENT(OUT) :: Y 
!
! !REVISION HISTORY:
! (1 ) Corrected typographic error (bmy, 5/1/03)
!EOP
!------------------------------------------------------------------------------
!BOC
      !=================================================================
      ! MY_SUBROUTINE begins here!
      !=================================================================
      Y = SIN( X )
 
      ! Return to calling program
      END SUBROUTINE MY_SUBROUTINE
!EOC
!-----------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: MY\_FUNCTION
!
! !DESCRIPTION: \subsection*{Overview}
! Function MY_FUNCTION returns the sine of a number. (bmy, 5/1/03)
!\\
!\\
! !INTERFACE:
!
      FUNCTION MY_FUNCTION( X ) RESULT( Y )
!
! !INPUT PARAMETERS: 
!
      ! Argument for the sine function
      REAL*8, INTENT(IN) :: X
!
! !RETURN VALUE:
!     
      REAL*8, INTENT(OUT) :: Y 
!
! !REVISION HISTORY:
!
!EOP
!------------------------------------------------------------------------------
!BOC
 
      !=================================================================
      ! MY_FUNCTION begins here!
      !=================================================================
      Y = SIN( X )

      ! Return to calling program
      END FUNCTION MY_FUNCTION
!EOC
!------------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: INIT\_MY\_MODULE
!
! !DESCRIPTION: \subsection*{Overview}
! Subroutine INIT\_MY\_MODULE allocates and zeroes the B array. (bmy, 5/1/03)
!\\
!\\ 
! !INTERFACE:
! 
     SUBROUTINE INIT_MY_MODULE
!
! !USES
!
      USE ERROR_MOD, ONLY : ALLOC_ERR
!
! !REVISION HISTORY:
!
!EOP
!------------------------------------------------------------------------------
!BOC
!
! !LOCAL VARIABLES
!
     INTEGER :: AS

      !=================================================================
      ! INIT_MY_MODULE begins here!
      !=================================================================

      ! Allocate B, check status
      ALLOCATE( B(360), STAT=AS )

      ! Error check
      IF ( AS /= 0 ) CALL ALLOC_ERR( 'B' )

      ! Zero B
      B(:) = 0d0

      ! Return to calling program
      END SUBROUTINE INIT_MY_MODULE
!EOC
!-----------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: CLEANUP\_MY\_MODULE
!
! !DESCRIPTION: \subsection*{Overview}
! Subroutine CLEANUP_MY_MODULE deallocates all module arrays. (bmy, 5/1/03)
!\\
!\\
! !INTERFACE:
!
     SUBROUTINE CLEANUP_MY_MODULE()
!
! !REVISION HISTORY:
!
!EOP
!------------------------------------------------------------------------------
!BOC

      !=================================================================
      ! CLEANUP_MY_MODULE begins here!
      !=================================================================
      IF ( ALLOCATED( B ) ) DEALLOCATE( B )

      ! Return to calling program
      END SUBROUTINE CLEANUP_MY_MODULE
 !EOC

 END MODULE MY_MODULE<

Fortran 90 modules written for GEOS-Chem contain the following features:

Item Description
Module header This is similar to the subroutine header. Contains a list of all module variables, routines, and modification notes, in Automatic documentation with ProTex
PRIVATE declarations You can "hide" certain routines or variables within a F90 module from other routines or modules. Ideally your module will be like a "black box" with only a few routines or variables that are publicly accessible. By convention, in GEOS-Chem, we declare everything PRIVATE first then indicate the public routines.
IMPLICIT NONE If you declare IMPLICIT NONE once at the top of the module, then you don't have to declare it in the individual subroutines; the declaration "carries through" below.
Module variables and arrays Any variables declared above the CONTAINS statement are as if they were stored in F77 common blocks; that is, they are preserved between calls to the various module routines. Also, you should declare module arrays as ALLOCATABLE whenever possible. This allows you to only set aside memory for that array if a certain routine is called. This results in more efficient memory management.
CONTAINS statement All module variables and PRIVATE declarations go above the CONTAINS statement. Module routines and functions go below the CONTAINS statement
Subroutines and functions
Your module will contain various subroutines and functions depending on the particular needs at hand.
An INIT routine Your module should have a subroutine named INIT_modulename, which initializes and allocates all module arrays.
A CLEANUP routine Your module should have a subroutine named CLEANUP_modulename, which deallocates module arrays. (You only need this if you have allocatable arrays).

F90 modules are very similar to classes in Java and C++; they allow you to group data and the routines which work on the data in a single package.

Theoretically, each GEOS-Chem routine and/or variable should belong to a module. In practice, this is not true, as we do have separate subroutine and function files from some of the older routines which were written by 3rd party sources. However, you should write all new GEOS-Chem code in module form.

Also, it is OK for modules to reference other modules. If Module B references Module A, all that you have to do is to make sure that Module B declaration in the Makefile refers to Module A. The GEOS-Chem Makefiles have been prepared for you by the GEOS-Chem Support Team, so in most instances, you will not have to concern yourself with this.

Automatic documentation with ProTeX

We generate detailed reference documentation with ProTeX. ProTeX is a perl script developed by NASA/GMAO that generates ProTeX files from GEOS-Chem's subroutine, function, and module comment headers. The LaTeX file can then be compiled to produce both PostScript and PDF output.

To use ProTeX, you need to add a standard header at the top of each subroutine, function, and module. The header looks like this:

!------------------------------------------------------------------------------
!          Harvard University Atmospheric Chemistry Modeling Group            !
!------------------------------------------------------------------------------
!BOP
!
! !IROUTINE: metero
!
! !DESCRIPTION: Subroutine METERO calculates meteorological constants needed
!  for the dry deposition velocity module. (lwh, gmg, djj, 1989, 1994; bmy,
!  10/3/05)
!\\
!\\
! !INTERFACE:
!
      SUBROUTINE METERO( State_Met, CZ1,    TC0,  OBK,       CFRAC,  
     &                   RADIAT,    AZO,    USTR, ZH,        LSNOW, 
     &                   RHB,       PRESSU, W10,  SUNCOS_MID        )
!
! !USES:
!
      USE GIGC_State_Met_Mod, ONLY : MetState 
      IMPLICIT NONE
!
! !INPUT PARAMETERS:
!
      TYPE(MetState), INTENT(IN)  :: State_Met   ! Meteorology State object
!
! !OUTPUT PARAMETERS:
!
      LOGICAL, INTENT(OUT) :: LSNOW (MAXIJ)  ! Flag for denoting snow/ice
      REAL*8,  INTENT(OUT) :: CZ1   (MAXIJ)  ! Midpt ht of 1st model level [m]
      REAL*8,  INTENT(OUT) :: TC0   (MAXIJ)  ! Grid box sfc temperature [K]
      REAL*8,  INTENT(OUT) :: OBK   (MAXIJ)  ! Monin-Obhukov length [m]
      REAL*8,  INTENT(OUT) :: CFRAC (MAXIJ)  ! Column cloud fraction [unitless]
      REAL*8,  INTENT(OUT) :: RADIAT(MAXIJ)  ! Solar radiation @ ground [W/m2]
      REAL*8,  INTENT(OUT) :: RHB   (MAXIJ)  ! Rel humidity at sfc [unitless]
      REAL*8,  INTENT(OUT) :: USTR  (MAXIJ)  ! Friction velocity [m/s]
      REAL*8,  INTENT(OUT) :: ZH    (MAXIJ)  ! PBL height [m]
      REAL*8,  INTENT(OUT) :: PRESSU(MAXIJ)  ! Local surface pressure [Pa]
      REAL*8,  INTENT(OUT) :: W10   (MAXIJ)  ! 10 meter windspeed [m/s]
      REAL*8,  INTENT(OUT) :: SUNCOS_MID(MAXIJ)  ! COS(SZA) @ midpt of current
                                                 !  chemistry timestep
     ! Dimension AZO for GCAP or GEOS met fields (swu, bmy, 5/25/05)
#if  defined( GCAP )
      REAL*8, INTENT(OUT)  :: AZO(NTYPE)     ! Roughness heights, by landtype
#else
      REAL*8, INTENT(OUT)  :: AZO(MAXIJ)     ! Roughness heights, by grid box
#endif
!
! !REMARKS:
!  NOTE: We save into arrays of dimension MAXIJ=IIPAR*JJPAR for compatibility
!  with the legacy drydep routine DEPVEL.
!                                                                             .
!  References (see full citations above):
!  ============================================================================
!  (1 ) Wesely, M. L., 1989. 
!  (2 ) Jacob, D.J., and S.C. Wofsy, 1990
!
! !REVISION HISTORY: 
!  (1 ) Now reference GET_PEDGE from "pressure_mod.f".  Now reference T from 
!        "dao_mod.f".  Removed obsolete code & comments, and added new 
!         documentation header.  Now force double precision with "D" 
!         exponents.  Now compute OBK here as well.  Bundled into F90 module
!         "drydep_mod.f" (bmy, 11/20/02)
!  (2 ) Now reference CLDFRC, RADSWG, ZO, USTAR from "dao_mod.f".  Also now 
!         pass CFRAC, RADIAT, AZO, USTR back to the calling routine 
!         via the arg list. (bmy, 12/9/03)
!  (3 ) Now use explicit formula for IJLOOP to allow parallelization
!        (bmy, 7/20/04)
!  (4 ) Now compute ZH and LSNOW here instead of w/in DO_DRYDEP.  Parallelize
!        DO-loops.  Now use BXHEIGHT from "dao_mod.f" instead of computing 
!        the thickness of the 1st level here.  Remove reference to 
!        "pressure_mod.f".  Remove reference to T from "dao_mod.f".  Now
!        reference ALBD from "dao_mod.f" (bmy, 2/22/05)
!  (5 ) Now references RH from "dao_mod.f".  Now passes relative humidity
!        from the surface layer back via RHB argument. (bec, bmy, 4/13/05)
!  (6 ) Now call GET_OBK from "dao_mod.f" to get the M-O length for both
!        GEOS or GCAP met fields.  Remove local computation of M-O length
!        here.  Also now dimension AZO appropriately for GCAP or GEOS met
!        fields.  Remove obsolete variables. (swu, bmy, 5/25/05)
!  (7 ) Now make sure all USE statements are USE, ONLY (bmy, 10/3/05)
!  (8 ) Move XLTMMP function to module MEGANUT_MOD. (ccc, 11/20/09)
!  (9 ) Add sea level pressure and 10m windspeed as arguments (jaegle 5/11/11)
!  22 Dec 2011 - M. Payer    - Added ProTeX headers
!  10 Jan 2012 - M. Payer    - Added local surface pressure
!  09 Nov 2012 - M. Payer    - Replaced all met field arrays with State_Met
!                              derived type object
!  28 Nov 2012 - R. Yantosca - Add SUNCOS_MID to the argument list and 
!                              populate that with State_Met%SUNCOSmid
!EOP
!------------------------------------------------------------------------------
!BOC
!
! !LOCAL VARIABLES:
!
      INTEGER               :: I,  J,  IJLOOP
      REAL*8                :: THIK
      REAL*8                :: SP
!
! !EXTERNAL FUNCTIONS:
!
      REAL*8, EXTERNAL      :: SFCWINDSQR ! Surface wind speed (jaegle 5/11/11)

      !=================================================================
      ! METERO begins here!
      !=================================================================

      ... code goes here ...

     END SUBROUTINE METERO
!EOC

Note that the ProTeX header contains the following elements:

  1. Indicators (!BOP, !EOP) as to where ProTeX should look for source code header comments<
  2. Indicators (!BOC, !EOC) as to where the source code begins and ends. ProTeX will ignore code between !BOC and !EOC unless you tell it otherwise.
  3. A description of the routine with the original date
  4. A list of input & output arguments
  5. A list of references (if applicable)
  6. References to F90 modules (if any) after the USE keyword followed by the IMPLICIT NONE declaration
  7. Each time the file is modified the REVISION HISTORY section should be updated.
  8. Declarations for local variables
  9. Declarations for Fortran parameters (aka constants)
  10. Declarations for external functions (if necessary)

Fortran 90 allows you to declare an argument to a subroutine or function with one of the following descriptors:

  1. INTENT(IN) (read-only),
  2. INTENT(OUT) (write-only) or
  3. INTENT(INOUT) (read-and-write)

This helps you to prevent overwriting arguments which should not be overwritten.

Style Guide

Denoting code authors

Generally, we denote code authors by a 3-letter abbreviation in source code comments (bmy, 5/1/03).

The more comments, the better!

It is good practice to add copious comments to your source code. This will make sure that current and future GEOS-Chem users will be able to understand the modifications that you have added.

Keep in mind that many GEOS-Chem users (including the GEOS-Chem Support Team) will probably not be as familiar with your specific area of research as you are. While most users will probably get the "big picture" of what you are doing, they will be less knowledgeable about the little details (i.e. why does this reaction rate have a value of X, why is this parameter set to Y, why was this IF statement deleted, etc.). Providing sufficient source code documentation will eliminate any such guesswork.

Section headers

We recommend that you separate sections of source code with one or more divider comments:

!=================================================================
! 1st level header
!=================================================================

   !--------------------------------------------------------------
   ! 2nd level header
   !--------------------------------------------------------------

      !-----------------------
      ! 3rd-level header
      !-----------------------

Line up each header with the the code that follows immediately below it. You can mix and match these styles as you wish. You don't always have to extend the header all the way across the screen.

If you are modifying a fixed-format source code file, end the major section headers at column #72. This will remind your reader where the limits of the allowable source code region occurs. If you are modifying a free-format source code file, feel free to extend the ends of the major section headers to about column 76 or 77.

Capitals vs. lower-case

We used to write GEOS-Chem source code in all capitals. We chose this approach to avoid mistaking similar characters, such as the lowercase l and the number 1, which in some fonts look remarkably alike. You may at your discretion mix capitals and lowercase in variable names and subroutine names, particularly if you want to improve readability or preserve scientific abbreviations ( e.g. Rn_Pb_Be_mod.F).

We still recommend that you write Fortran reserved words in all capitals: SUBROUTINE, CALL, IF/THEN/ENDIF, DO/ENDDO, WHERE/ENDWHERE. This distinguishes Fortran language elements from your variable and routine names.

Because GEOS-Chem contains a significant amount of 3rd-party code, you will see many routines where the code is written in all lowercase letters. Feel free to leave this code as-is. Reformatting a fixed-format source code file can be a cumbersome task (and maybe not the best use of your precious time)

Good editors such as emacs can "colorize" the different words in a program (e.g. statements, variables, and quoted text are rendered in different colors), thus making the code infinitely more readable.

Indent by 3 spaces

Indent each new block of code 3 columns deeper than the last block:

         1         2         3         4
1234567890123456789012345678901234567890

     IF ( X == 0 ) THEN
        PRINT*, 'Hello, X is 0!'

        IF ( Y == 0 ) THEN
           PRINT*, 'Hello, X is 0 and Y is also 0!'
        ENDIF

     ENDIF

A good editor such as emacs will indent for you automatically. You can specify the level of indent in your editor setup file (e.g. .emacs).

Indenting nested DO loops

Omit indentation for each line of a multiple DO loop. Use this syntax:

DO N = 1, NNPAR
DO L = 1, LLPAR
DO J = 1, JJPAR
DO I = 1, IIPAR
   A(I,J,L,N) = A(I,J,L,N) / 2.0d0
ENDDO
ENDDO
ENDDO
ENDDO

instead of letting the DO loops "creep" unnecessarily across the screen:

DO N = 1, NNPAR
   DO L = 1, LLPAR
      DO J = 1, JJPAR
         DO I = 1, IIPAR
            A(I,J,L,N) = A(I,J,L,N) / 2.0d0
         ENDDO
      ENDDO
   ENDDO
ENDDO

Enhance readability with white space

Use LOTS of white space to separate code. For example instead of

IF(X.eq.0)CALL DOLOOP(X,Y,Z,A,B,C,1,2,3)

type:

IF ( X == 0 ) CALL DOLOOP( X, Y, Z, A, B, C, 1, 2, 3 )

This makes the code much more readable, which prevents bugs.. If you can't read what you've written, you will never find your mistake! Don't be afraid to continue the statement to the next line if necessary.

In the above example, we also have used the new F90 equality test operator == (see below for more information).

Line up code into columns for better readability

Always line up quantities in assignment statements. Instead of:

A=1
THISLOOP=2
C=3
D=THISLOOP*C - A

add white space so that all of the equals signs fall in the same column.

A        = 1
THISLOOP = 2
C        = 3 
D        = ( THISLOOP * C ) - A

Your eyes can make more sense out of text that is lined up into vertical columns.

Group multiple terms of an equation with parentheses (as we have done for the D equation above). The compiler evaluates evaluate expressions in the order the parentheses are placed, from innermost to outermost.

Use of white space for arrays and functions

Do not leave white space between array indices:

A = X(I,J,L)

Leave white space between arguments of functions and WRITE statements:

A = MYFUNCTION( I, J, L )

WRITE( 6, '(a)' )

Line up arguments into columns

Line up arguments in subroutine or function calls. If there are an even number of arguments, break them up symmetrically:

Fortran-77 fixed-file format:

     CALL MYSUB( THIS_A, THIS_B, THIS_C,
    &            THIS_D, THIS_E, THIS_F )

     X = MYFUNCTION( THIS_A, THIS_B, THIS_C,
    &                THIS_D, THIS_E, THIS_F )

Fortran-90 free-format:

     CALL MYSUB( THIS_A, THIS_B, THIS_C, &
                 THIS_D, THIS_E, THIS_F )

     X = MYFUNCTION( THIS_A, THIS_B, THIS_C, &

Use language features of Fortran-90

Operators

Use the new F90-style operators instead of the older F77-style operators:

>== instead of .EQ.

/= instead of .NE.

>  instead of .GT.

>= instead of .GE.

<  instead of .LT. 

>= instead of .LE.

.EQV. (or .eqv, it's case-insensitive) when comparing two logical values, i.e.
 
   IF ( MY_LOGICAL .eqv .FALSE. ) THEN ...

The new operators take up only one or two spaces (instead of four spaces) and are easier to read.

Some operators (.AND., .OR.</code, and .NOT.) are the same in F90 as in F77. This is also true of the boolean values (.TRUE. and .FALSE.).

Comment characters

Use the F90 ! comment character. This replaces the F77 comment character (a C placed in the first column of fixed-format code). You can can create very legible comments by lining up the ! with your source code.

Continuation characters

To continue an instruction from one line to the next, place an ampersand (&) at the end of the line (for free-format layout).

If you are working with a source code file that has not yet been converted to free-format, place an ampersand in column 6 of each continued line of code.

Array assignment statements

Use the F90 array assignment functionality to assign a value to all elements of an array without using array indices. For example:

INTEGER  :: ARRAY(10,10)
REAL(fp) :: ARRAY_F(10,10)

ARRAY(:,:)   = 0
ARRAY_F(:,:) = 0.0_fp

You can also leave off the default array mask (:,:), so

ARRAY   = 0
ARRAY_F = 0.0_fp

is also acceptable.

PARAMETER declaration statements

Initialize PARAMETER constants on the same line where they are declared:

REAL(fp), PARAMETER :: PI  = 3.14159265358979323_fp
REAL(fp), PARAMETER :: AVO = 6.022140857e+23_fp

Array constructors

Use F90 direct assignment statements to assign values to arrays:

REAL*8 :: A(2) = (/ 1.0d0, 2.0d0 /)

Avoid using obsolete F77 syntax:

REAL*8 A(2)
DATA A / 1.0, 2.0 /

CAVEAT! In the above example the A array will be automatically given the SAVE attribute. This means that its values will be stored from one subroutine call to the next. This may not be what you had intended to do, so just be aware of this.

The SAVE attribute

Include the SAVE attribute on the same line where the variable is declared:

LOGICAL, SAVE :: FIRSTTIME = .TRUE.

Avoid using obsolete F77 syntax:

LOGICAL FIRSTTIME 
SAVE FIRSTTIME
DATA FIRSTTIME / .TRUE. /

Note: SAVEd variables within a subroutine or function keep their values from one call to the next. This allows you to do some initialization only on the first call to a subroutine, for example:

LOGICAL, SAVE :: FIRSTTIME = .TRUE.

IF ( FIRSTTIME ) THEN
  PRINT*, 'This is the first time this routine is called!'
  FIRSTTIME = .FALSE.
ENDIF

Since the value of FIRSTTIME is saved between calls, the sentence above will only be printed out on the first call to the subroutine.

The EXTERNAL attribute

Include the EXTERNAL attribute on the same line where a function's type is declared. Use this syntax:

INTEGER, EXTERNAL :: MYFUNC

Avoid using obsolete F77 syntax:

INTEGER MYFUNC
EXTERNAL MYFUNC

NOTE: If you are referencing functions contained within a module file, then you do not need to declare it with EXTERNAL. The EXTERNAL statement is a hangover from F77.

The SELECT CASE statement

Use the F90 SELECT CASE statement to pick from a list of options. In F77 you would have had to write:

IF ( X .EQ. 1 ) THEN
 
  CALL MYSUB( X1 )

ELSE IF ( X .EQ. 2 ) THEN 

  CALL MYSUB( X2 )

ELSE

  CALL MYSUB( X0 )

ENDIF

but with F90, you can write this instead:

SELECT CASE ( X )

   CASE( 1 )
      CALL MYSUB( X1 )

   CASE( 2 ) 
      CALL MYSUB( X2 )

   CASE DEFAULT
      CALL MYSUB( X0 )

END SELECT

SELECT CASE also works with character constants:

SELECT CASE ( TRIM( TRACERNAME ) )

   CASE( 'NOx' )
      CALL MYSUB( 1 )

   CASE( 'Ox' )
      CALL MYSUB( 2 )

   CASE DEFAULT
      CALL MYSUB( 3 )

END SELECT

A note of warning: You cannot specify CASE( X ) where X is a variable, or else the compiler will balk. Then you must resort to the IF - ELSE IF block structure as shown above.

Logarithms

Use the F90 intrinsic functions LOG( X ) and LOG10( X ) to take the natural and common logarithms of X. Here X may be declared either as REAL(f4) or REAL(f8).

Avoid using the F77 intrinsic functions ALOG( X ) and ALOG10( X ). Some compilers do not support these functions.

Keyword arguments

Use F90's keyword argument capability to clarify subroutine calls. You can replace this:

 CALL READ_A1( NYMD,     NHMS, 
&              ALBEDO,   CLDTOT,   EFLUX,    EVAP,    
&              ... etc ...                         )

with:

 CALL READ_A1( NYMD     = NYMD, 
&              NHMS     = NHMS,
&              ALBEDO   = ALBD,
&              CLDTOT   = CLDTOT
&              EFLUX    = EFLUX
&              EVAP     = EVAP
&             ... etc ...      )

In the call above, you see several pairs of variable names separated by an equals sign: ALBEDO = ALBD

The name on the left of the equals sign is the name of the argument as defined in the subroutine (i.e. ALBEDO). This is often called the "dummy argument" because it is just a name for the memory that is getting passed to it from outside of the subroutine.

The name on the right of the equals sign is the name of the variable that we are passing down to the subroutine. Here we are using the ALBD variable (from GEOS-Chem module dao_mod.F and passing that down to READ_A1 as the ALBEDO argument.

                     THIS_D, THIS_E, THIS_F )

Numeric and character data types

Data types

GEOS-Chem uses the following basic data types:

Data type Fortran type Flexible precision definintion Range
TRUE or FALSE values LOGICAL
Whole numbers INTEGER -2 x 109 up to +2 x 109
4-byte (aka 32-bit) floating point real values REAL*4 REAL(f4) -1 x 1038 up to +1 x 1038
8-byte (aka 64-bit) floating point real values REAL*8 REAL(f8) -1 x 10312 up to +1 x 10312
Character strings CHARACTER(LEN=255)

However, you should use 4-byte floating point instead of 8-byte floating point in the following instances:

  1. Diagnostic arrays: GEOS-Chem diagnostic values rarely exceed 1038, since they are usually in units such as kg, v/v, ppbv, or molecules/cm2/s.
  2. Arrays and scalars required for binary or netCDF file I/O: Declaring these types of variables REAL*4 will help to keep file sizes as small as possible, thus saving disk space.

Some third-party routines used by GEOS-Chem (such as TPCORE) use the REAL datatype. Most compilers (but not all) interpret REAL as a 4-byte floating point real. If you want to force the compiler to interpret REAL as an 8-byte floating point real, you must use a special compiler switch (usually -r8, but check your compiler manual to be sure). The GEOS-Chem Makefiles already take care of this for you.

NOTE: See our Floating point math issues wiki page for more information.

Flexible precision

GEOS-Chem v10-01 and later versions use a flexible precision definition. Floating point variables can be defined with a Fortran KIND parameter that can be switched between 4-byte and 8-bytes at compile time.

Here is an example of how you can use flexible precision in GEOS-Chem:

USE PRECISION_MOD          ! Contains the fp, f4, f8 parameters used to define real values
 
REAL*4   :: VAR_REAL4      ! This was the old way to define a  4-byte real (from Fortran-77)
REAL*8   :: VAR REAL8      ! This was the old way to define an 8-byte real (from Fortran-77) 

REAL(f4) :: VAR_REAL4_NEW  ! This is the new way to define a  4-byte real (introduced in Fortran 90)
REAL(f8) :: VAR_REAL8_NEW  ! This is the new way to define an 8-byte real (introduced in Fortran 90)

REAL(fp) :: VAR_FLEXIBLE   ! This is an 8-byte real by default, but can be switched to a 4-byte real at compile time 

Declaring floating point real variables with REAL(fp) allows you to change from the default 8-byte precision to 4-byte precision by compiling GEOS-Chem with the PRECISION=4 switch. (NOTE: As described in the prior section, some floating point variables should always be declared as 4-byte reals, such as those variables that get written to disk.)

We have not yet made full use of the flexible precision in GEOS-Chem as of GEOS-Chem v11-01. This will happen in a later version.

Use the Fortran-90 exponent syntax

Use _fp (double-precision) suffixes when assigning a constant value to a REAL(fp) variable:

REAL(fp) :: PI, ONE_THOUSAND
PI           = 3.14159265358979323_fp
ONE_THOUSAND = 1.0e+3_fp

This will ensure that the value will be declared with sufficient precision.

Similarly, if you have used REAL(f4) to declare 4-byte floating point variables, then use this syntax:

REAL(f4) :: PI, ONE_THOUSAND
PI           = 3.14159265358979323_f4
ONE_THOUSAND = 1.0e+3_f4

and similarly for REAL(f8):

REAL(f9) :: PI, ONE_THOUSAND
PI           = 3.14159265358979323_f8
ONE_THOUSAND = 1.0e+3_f9

Default string length

Declare strings with 255 characters, even if you don't know a priori how long the string will be. Use the TRIM statement to strip excess white space. For example:

CHARACTER(LEN=255) :: STR

STR = 'I am the very model of a modern major general...'

WRITE( 6, '(a)' ) TRIM( STR )

Converting from number to character (and vice-versa)

Internal write statements

Several GEOS-Chem routines convert a numeric variable to a character variable (and vice-versa). This usually happens when a number representing a date or time has to be incorporated into a file path.

Use a Fortran internal write statement to convert from from INTEGER to CHARACTER. This looks like a regular WRITE statement, only that the unit number is replaced by a character variable. Think of it as "writing" your number into a character variable instead of to a file.

For example, the following code creates a string containing the date 20110701:

! Declare variables
CHARACTER(LEN=8) :: DATE_STR
INTEGER          :: DATE = 20110701

! FORTRAN internal write -- converts number to string
WRITE( DATE_STR, '(i8.8)' ) DATE

You can incorporate DATE_STR into a directory or file path. The format '(i8.8)' tells the WRITE command to make the string 8 characters long, and to change any spaces (if they exist) into preceding zeroes.

Internal read statements

Use a Fortran internal read to extract numbers from a character string. This is the reverse of the Fortran internal write described above:

! Declare variables
CHARACTER(LEN=8) :: DATE_STR = '20010101'
INTEGER :: DATE

! FORTRAN internal read -- converts string to number
READ( DATE_STR, '(i8.8)' ) DATE

Math optimizations

Write polynomials efficiently

Logarithms and exponentiation are the most computationally expensive mathematical functions. Avoid using the traditional formulation:

Y = A0 + A1*X + A2*X**2 + A3*X**3 + A4*X**4 + A5*X**5

but instead use parentheses to group the polynomial terms.

Y = A0 + X* ( A1 + X* ( A2 + X* ( A3 + X* ( A4 + X* ( A5 )))))

This modified polynomial expression replaces the costly exponentiations with more efficient multiplication operations. A Nth order polynomial will now only require N multiplications.

Exercise care with exponential approximations

Although it is tempting to replace EXP( -x ) with the first-order approximation 1 - x, this can result in negative values for certain values of x. This can cause negative tracer concentrations.

DO loops

The CYCLE statement

Always use the CYCLE statement to skip to the next iteration in a DO loop. Use this syntax:

DO I = 1, 1000

  ! Skip to next iteration
  IF ( X(I) == 0 ) CYCLE

  ! Print
  PRINT*, X(I)

ENDDO

instead of this obsolete F77 syntax:

DO 100 I = 1, 1000

   IF ( X(I) .EQ. 0 ) GOTO 100
   PRINT*, X(I)

100 CONTINUE

Notice that the F90 ENDDO statement replaces the labeled CONTINUE statement. This results in cleaner code.

The EXIT statement

Always use the EXIT statement to completely exit from a DO loop. Use this syntax:

DO I = 1, 1000

  ! Break out of this loop if X==0!
  IF ( X(I) == 0 ) EXIT

  ! Print info if we have not exited
  WRITE( 6, '(''Iteration: '', i5, f13.6)' ) I, X(I)

ENDDO 

PRINT*, 'outside loop'

instead of this obsolete F77 syntax, which requires a GOTO statement:

C Do-loop
DO 100 I = 1, 1000

   IF ( X(I) .EQ. 0 ) GOTO 200
   PRINT*, 'inside loop'

100 CONTINUE

C Continue outside loop
200 CONTINUE
PRINT*, 'outside loop'

Ordering of DO loops

Structure your DO loops so that they access memory in the most optimal manner. Fortran is a column-major language, which means that arrays are stored in memory by columns first, then by rows. If you have declared an array such as:

INTEGER :: I, J, L, N
REAL*8  :: ARRAY(IIPAR,JJPAR,LLPAR,NNPAR)

then for optimal efficiency, the leftmost dimension (I) needs to vary the fastest, and needs to be accessed by the innermost DO loop. Then the next leftmost dimension (J) should be accessed by the next innermost DO loop, and so on.

Therefore, the proper way to loop over this array is:

DO N = 1, NNPAR
DO L = 1, LLPAR
DO J = 1, JJPAR
DO I = 1, IIPAR
   ARRAY(I,J,L,N) = ARRAY(I,J,L,N) * 2.0d0
ENDDO
ENDDO
ENDDO
ENDDO

Note that the I index is varying most often, since it is the innermost DO loop, then J, L, and N. This is opposite to how a car's odometer reads.

If you loop through an array in this fashion, with leftmost indicies varying fastest, then the code minimizes the number of times it has to load subsections of the array into cache memory. In this optimal manner of execution, all of the array elements sitting in the cache memory are read in the proper order before the next array subsection needs to be loaded into the cache. But if you step through array elements in the wrong order, the number of cache loads is proportionally increased. Because it takes a finite amount of time to reload array elements into cache memory, the more times you have to access the cache, the longer it will take the code to execute. This can slow down the code dramatically.

Infinite DO loop with EXIT

Use "infinite" DO loops in situations where you need to iterate for an unknown number of iterations before some exit criterion is reached. This is most often used to read data from a file whose length you do not know a priori.

Here is an example of an infinite DO loop used to read data from disk:

INTEGER :: I, IOS, IUNIT
REAL*8 :: A(10000) 

! Open file
OPEN( IUNIT, FILE='myfile.txt', IOSTAT=IOS )

! Quit if we can't open the file
IF ( IOS /= 0 ) THEN
   PRINT*, 'Error opening the file!'
   STOP
ENDIF

! Infinite DO loop for reading data from "myfile.txt"
DO

   ! Read I and A(I) from the file
   READ( IUNIT, '(i3,f11.3)', IOSTAT=IOS ) I, A(I)
 
   ! IOS < 0 is end-of-file, so we exit the loop
   IF ( IOS < 0 ) EXIT
	
   ! If IOS < 0, then this is an I/O error,
   ! so we print an error msg and quit
   IF ( IOS > 0 ) THEN
      PRINT*, 'I/O error reading from file!
      STOP
   ENDIF
ENDDO

! Close the file after we exit the loop
CLOSE( IUNIT )

In this example, the DO loop wants to execute forever, but it is stopped by an external criterion—the end of the file. The OPEN, READ, and CLOSE functions can return the I/O status to a variable if you specify via the IOSTAT keyword. If the I/O status variable (in this case, named IOS) is negative, then this is a normal end-of-file condition, and so we can just exit the loop and close the file. However, if IOS is a nonzero, positive number, then this means that we have encountered a real error condition.

Reserved DO loop index variable names

Reserve the following variable names for DO loop and array indices:

  1. Use I for the longitude index
  2. Use J for the latitude index
  3. Use L for the altitude index
  4. Use N for the quantity index (tracer, species, etc.)

Therefore, you can assume that any DO loop that uses I, J, and L is looping over grid box longitude, latitude, and altitude dimensions, and that any DO loop which uses N is looping over species or some other quantity. These special indices should NOT be used to refer other quantities. In this way, if you should get an error message such as:

Error at grid box (I,J) = (23,34)!

you will immediately understand (I,J) are the longitude and latitude indices of the box where the error happened.

NOTE: In some 3rd-party routines, you will see K used instead of L to denote the altitude index. We recommend leaving these as-is, so as to avoid having to rewrite entire sections of 3rd-party code. However, you should use L for the altitude index in any new GEOS-Chem code that you write.

Programming techniques to promote running in HPC environments

Restrict screen output to the root CPU

When GEOS-Chem connects to an external GCM—such as NASA's GEOS-5 GCM—it will utilize MPI parallelization. Each CPU will perform all of GEOS-Chem's operations, but on a much smaller geographical domain (i.e. on a single vertical column or several vertical columns). This also means that each CPU will print informational messages to the screen (or log file, if you redirect screen output there) simultaneously. All of these I/O operations can seriously impact performance.

We have now started the process of bracketing WRITE and PRINT statements with IF blocks. In GEOS-Chem v9-01-03 and higher versions, you will see code like this:

 IF ( am_I_Root ) THEN 
    WRITE( 6, '(a  )' ) REPEAT( '=', 79 )           
    WRITE( 6, '(a,/)' ) 'G E O S - C H E M   U S E R   I N P U T'
    WRITE( 6, 100   ) TRIM( FILENAME )
100 FORMAT( 'READ_INPUT_FILE: Reading ', a )        
 ENDIF

When we use MPI parallelization, the am_I_Root variable, which is passed as an argument to subroutines and functions, determines if we are on the root CPU. This will restrict the printing of informational messages to just the root CPU. We encourage you to use this approach as you write new GEOS-Chem code.

For more information, please see this wiki post.

Do not use fixed parameters for file unit numbers

GEOS-Chem v9-01-02 and prior versions used fixed parameters (stored in module GeosUtil/file_mod.F) for Fortran logical unit numbers. Logical unit numbers, or LUN's, are numeric values that are used in Fortran OPEN, WRITE, READ, and CLOSE statements. A typical GEOS-Chem file read operation looks like this:

USE FILE_MOD, ONLY : IU_FILE

...

! Open file
OPEN( UNIT=IU_FILE, FILE=TRIM( FILENAME ) ... )
  
! Read data
READ( IU_FILE, IOSTAT=IOS ) ...

! Close file
CLOSE( IU_FILE )

But when we connect GEOS-Chem to an external GCM, we can no longer rely on pre-defined LUNs. The GCM may have already assigned the LUN that we are trying to use to a different file. In GEOS-Chem v9-01-03 and higher versions, all LUNs are determined dynamically. We now call a function (inquireMod/findFreeLUN) to return the next unused LUN. You will now see code that looks like this:

USE inquireMod, ONLY : findFreeLUN

...

INTEGER :: IU_FILE

...

! Find a free file LUN
IU_FILE = findFreeLUN()

! Open file
OPEN( UNIT=IU_FILE, FILE=TRIM( FILENAME ) ... )

! Read data
READ( IU_FILE, IOSTAT=IOS ) ...
      
! Close file
CLOSE( IU_FILE )

For more information, please see this wiki post.

Do not refer to public variables via USE statements

We used to refer to variables in modules by means of the USE statement, such as:

! Reference variables from dao_mod.F
USE DAO_MOD, ONLY : UWND, VWND, T

While many routines in GEOS-Chem still use this syntax, we are abandoning this approach. Instead of referring to module variables with USE, we shall (wherever expedient) use derived type objects to contain variables and arrays, which then can be passed from one routine to another via the argument list. We are doinFile:G this to facilitate our GEOS-Chem HP, which seeks to embed GEOS-Chem within the NASA GEOS-5 GCM.

The only things that you should obtain from modules via the USE statement will be:

  1. Subroutines
  2. Functions
  3. Derived type definitions

such as:

! Refer to derived type definition from gigc_state_met_mod.F90
USE State_Met_Mod, ONLY : MetState

! Refer to subroutines from dao_mod.F
USE DAO_MOD, ONLY: AIRQNT, COSSZA

Here are some tips for moving arrays or adding new arrays to the derived type objects:

  • Instead of finding and replacing every instance of an array with State_Chm&array throughout a module, create pointers at the top of each routine (e.g. localarray => State_Chm%array). This is the method we used when replacing the species array (Spc => State_Chm%Species).
  • Where possible, clean up routine arguments. If State_Chm is already passed, then there is no need to also pass State_Chm%array.
  • Consider adding arrays used for bpch diagnostics into State_Diag instead of State_Chm.
  • Try to avoid confusing fields names in the State_Chm object. For example, in DO_DIAG_OH State_Chm%AIR_MASS may be confused with fields in State_Met. If we use pointers at the top of routines, we can still use the existing local array name to avoid having to change code deep in subroutines (e.g. AIR_MASS => State_Chm%Sum_Air_Mass).
  • Allocate State_Chm fields in routine Init_State_Chm instead of local modules.
  • The GCST highly recommends running unit tests and difference tests often as you make these updates. Unit tests will ensure you don't break any simulations and difference tests will ensure you aren't introducing any changes to model output.

More on derived-type objects

We are now moving away from including module variables into subroutines or functions via USE statements. Please see this wiki post on our Derived type objects used by the Grid-Independent GEOS-Chem wiki page for more information about how we are passing data between subroutines via derived type objects.