Developing GEOS-Chem
NOTE: We will be updating this coding style guide in the near future. The recent efforts to interface GEOS-Chem with ESMs like NASA GEOS, CESM, BCC, and WRF necessitate a departure from past programming practice. We will keep you updated.
Contents
- 1 Background
- 2 Code layout
- 3 Style Guide
- 3.1 Denoting code authors
- 3.2 The more comments, the better!
- 3.3 Section headers
- 3.4 Capitals vs. lower-case
- 3.5 Indent by 3 spaces
- 3.6 Indenting nested DO loops
- 3.7 Enhance readability with white space
- 3.8 Line up code into columns for better readability
- 3.9 Use of white space for arrays and functions
- 3.10 Line up arguments into columns
- 3.11 Use language features of Fortran-90
- 3.11.1 Operators
- 3.11.2 Comment characters
- 3.11.3 Continuation characters
- 3.11.4 Array assignment statements
- 3.11.5 PARAMETER declaration statements
- 3.11.6 Array constructors
- 3.11.7 The SAVE attribute
- 3.11.8 The EXTERNAL attribute
- 3.11.9 The SELECT CASE statement
- 3.11.10 Logarithms
- 3.11.11 Keyword arguments
- 3.12 Numeric and character data types
- 3.13 Converting from number to character (and vice-versa)
- 3.14 Math optimizations
- 3.15 DO loops
- 4 Programming techniques to promote running in HPC environments
Background
We have written GEOS-Chem in the Fortran-90 (F90) language. F90 introduces several new features over its predecessor, Fortran-77 (F77), including allocatable arrays, modules, derived types, and new programming statements. We invite you to consult this list of Fortran tutorials.
F90 contains all of the features of F77. Legacy code adhering to the F77 standard should build with any F90 compiler. Although we have written GEOS-Chem source code to the F90 standard, you should be aware that many 3rd-party routines included in GEOS-Chem—such as, [[Photolysis_mechanism|FAST-JX], and ISORROPIA—were originally written to the F77 standard. In the past several GEOS-Chem releases, we have started to replaced older F77 code with F90 code, especially to facilitate our GEOS-Chem High Performance (aka GCHP) project.
You should also be aware that several other updates to the Fortran standard (F95, F2000, F2003) have been introduced in recent years. These new Fortran versions contain the same functionality as F90, but generally omit the obsolete F77-style features. Most modern Fortran compilers include support for all of these standards.
Code layout
Fixed format vs. free format
We previously wrote GEOS-Chem source code using fixed-format layout, which has the following requirements:
- A line continuation character (usually
&
) is placed in column 6 - Numeric labels can occupy column 2 through column 5
- Executable statements begin in column 7 and extend to column 72
We chose the fixed-format layout for compatibility with the F77-style include files that were contained in the Headers/
directory. Starting with GEOS-Chem v9-01-03, we converted all header files to Fortran modules. This allows the use of free-format layout, which does not restrict where you place executable statements. (To continue a free-format statement across several lines of code, you must place an ampersand (&;
) character at the end of each line.)
Modules in GEOS-Chem that are still written in F77 fixed-format layout are denoted with the .F extension, while modules written in F90 free-format layout are denoted with the .F90 extension. We recommend that you write all new GEOS-Chem code using the free format layout.
Modules
Place all of your new GEOS-Chem code into Fortran 90 modules, if possible. These allow you to save both and routines and variables within a single program unit.
An example module is shown below.
!------------------------------------------------------------------------------ ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !MODULE: MY\_MODULE ! ! !DESCRIPTION: \subsection*{Overview} ! Module MY\_MODULE contains variables and routines to do something. ! (bmy, 5/1/03) !\\ !\\ ! !INTERFACE ! MODULE MY_MODULE ! ! !USES: ! IMPLICIT NONE ! Make everything private... PRIVATE ! ! !PUBLIC MEMBER FUNCTIONS: ! PUBLIC :: DRIVER PUBLIC :: INIT_MY_MODULE PUBLIC :: CLEANUP_MY_MODULE ! ! !REVISION HISTORY: ! (1 ) Bug fix in MY_SUBROUTINE (bmy, 5/1/03) !EOP !------------------------------------------------------------------------------ !BOC ! ! !PRIVATE DATA MEMBERS: ! ! Argument to the SIN function REAL*8, ALLOCATABLE :: B(:) CONTAINS !EOC !------------------------------------------------------------------------------ ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: DRIVER ! ! !DESCRIPTION: \subsection*{Overview} ! Subroutine DRIVER is the driver routine for MY_MODULE. (bmy, 5/1/03) !\\ !\\ ! !INTERFACE: ! SUBROUTINE DRIVER ! ! !REVISION HISTORY: ! !EOP !------------------------------------------------------------------------------ !BOC ! ! !LOCAL VARIABLES: ! LOGICAL, SAVE :: FIRST = .TRUE. INTEGER, :: I REAL*8 :: X, Y1, Y2 REAL*8, PARAMETER :: PI180 = 180d0 / 3.14159265358979323d0 !================================================================= ! DRIVER begins here! !================================================================= ! Initialize IF ( FIRST ) THEN CALL INIT_MY_MODULE() ENDIF !================================================================= ! Loop over degrees !================================================================= DO I = 1, 360 ! Convert to radians B = DBLE( I ) / PI180 ! Call subroutine CALL MY_SUBROUTINE( B(I), Y1 ) ! Call function Y2 = MY_FUNCTION( B(I) ) ! Write output WRITE( 6, '(i3, 1x, 3(f13.6,1x))' ) I, B(I), Y1, Y2 ENDDO ! Return to calling program END SUBROUTINE DRIVER !EOC !----------------------------------------------------------------------------- ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: MY\_SUBROUTINE ! ! !DESCRIPTION: \subsection*{Overview} ! Function MY_SUBROUTINE returns the sine of a number. (bmy, 5/1/03) !\\ !\\ ! !INTERFACE: ! SUBROUTINE MY_SUBROUTINE( X, Y ) ! ! !INPUT PARAMETERS: ! ! Argument for the sine function REAL*8, INTENT(IN) :: X ! ! !RETURN VALUE: ! REAL*8, INTENT(OUT) :: Y ! ! !REVISION HISTORY: ! (1 ) Corrected typographic error (bmy, 5/1/03) !EOP !------------------------------------------------------------------------------ !BOC !================================================================= ! MY_SUBROUTINE begins here! !================================================================= Y = SIN( X ) ! Return to calling program END SUBROUTINE MY_SUBROUTINE !EOC !----------------------------------------------------------------------------- ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: MY\_FUNCTION ! ! !DESCRIPTION: \subsection*{Overview} ! Function MY_FUNCTION returns the sine of a number. (bmy, 5/1/03) !\\ !\\ ! !INTERFACE: ! FUNCTION MY_FUNCTION( X ) RESULT( Y ) ! ! !INPUT PARAMETERS: ! ! Argument for the sine function REAL*8, INTENT(IN) :: X ! ! !RETURN VALUE: ! REAL*8, INTENT(OUT) :: Y ! ! !REVISION HISTORY: ! !EOP !------------------------------------------------------------------------------ !BOC !================================================================= ! MY_FUNCTION begins here! !================================================================= Y = SIN( X ) ! Return to calling program END FUNCTION MY_FUNCTION !EOC !------------------------------------------------------------------------------ ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: INIT\_MY\_MODULE ! ! !DESCRIPTION: \subsection*{Overview} ! Subroutine INIT\_MY\_MODULE allocates and zeroes the B array. (bmy, 5/1/03) !\\ !\\ ! !INTERFACE: ! SUBROUTINE INIT_MY_MODULE ! ! !USES ! USE ERROR_MOD, ONLY : ALLOC_ERR ! ! !REVISION HISTORY: ! !EOP !------------------------------------------------------------------------------ !BOC ! ! !LOCAL VARIABLES ! INTEGER :: AS !================================================================= ! INIT_MY_MODULE begins here! !================================================================= ! Allocate B, check status ALLOCATE( B(360), STAT=AS ) ! Error check IF ( AS /= 0 ) CALL ALLOC_ERR( 'B' ) ! Zero B B(:) = 0d0 ! Return to calling program END SUBROUTINE INIT_MY_MODULE !EOC !----------------------------------------------------------------------------- ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: CLEANUP\_MY\_MODULE ! ! !DESCRIPTION: \subsection*{Overview} ! Subroutine CLEANUP_MY_MODULE deallocates all module arrays. (bmy, 5/1/03) !\\ !\\ ! !INTERFACE: ! SUBROUTINE CLEANUP_MY_MODULE() ! ! !REVISION HISTORY: ! !EOP !------------------------------------------------------------------------------ !BOC !================================================================= ! CLEANUP_MY_MODULE begins here! !================================================================= IF ( ALLOCATED( B ) ) DEALLOCATE( B ) ! Return to calling program END SUBROUTINE CLEANUP_MY_MODULE !EOC END MODULE MY_MODULE<
Fortran 90 modules written for GEOS-Chem contain the following features:
Item | Description |
---|---|
Module header | This is similar to the subroutine header. Contains a list of all module variables, routines, and modification notes, in Automatic documentation with ProTex |
PRIVATE declarations
|
You can "hide" certain routines or variables within a F90 module from other routines or modules. Ideally your module will be like a "black box" with only a few routines or variables that are publicly accessible. By convention, in GEOS-Chem, we declare everything PRIVATE first then indicate the public routines.
|
IMPLICIT NONE
|
If you declare IMPLICIT NONE once at the top of the module, then you don't have to declare it in the individual subroutines; the declaration "carries through" below.
|
Module variables and arrays | Any variables declared above the CONTAINS statement are as if they were stored in F77 common blocks; that is, they are preserved between calls to the various module routines. Also, you should declare module arrays as ALLOCATABLE whenever possible. This allows you to only set aside memory for that array if a certain routine is called. This results in more efficient memory management.
|
CONTAINS statement
|
All module variables and PRIVATE declarations go above the CONTAINS statement. Module routines and functions go below the CONTAINS statement
|
An INIT routine | Your module should have a subroutine named INIT_modulename , which initializes and allocates all module arrays.
|
A CLEANUP routine | Your module should have a subroutine named CLEANUP_modulename , which deallocates module arrays. (You only need this if you have allocatable arrays).
|
F90 modules are very similar to classes in Java and C++; they allow you to group data and the routines which work on the data in a single package.
Theoretically, each GEOS-Chem routine and/or variable should belong to a module. In practice, this is not true, as we do have separate subroutine and function files from some of the older routines which were written by 3rd party sources. However, you should write all new GEOS-Chem code in module form.
Also, it is OK for modules to reference other modules. If Module B references Module A, all that you have to do is to make sure that Module B declaration in the Makefile refers to Module A. The GEOS-Chem Makefiles have been prepared for you by the GEOS-Chem Support Team, so in most instances, you will not have to concern yourself with this.
Automatic documentation with ProTeX
We generate detailed reference documentation with ProTeX. ProTeX is a perl script developed by NASA/GMAO that generates ProTeX files from GEOS-Chem's subroutine, function, and module comment headers. The LaTeX file can then be compiled to produce both PostScript and PDF output.
To use ProTeX, you need to add a standard header at the top of each subroutine, function, and module. The header looks like this:
!------------------------------------------------------------------------------ ! Harvard University Atmospheric Chemistry Modeling Group ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: metero ! ! !DESCRIPTION: Subroutine METERO calculates meteorological constants needed ! for the dry deposition velocity module. (lwh, gmg, djj, 1989, 1994; bmy, ! 10/3/05) !\\ !\\ ! !INTERFACE: ! SUBROUTINE METERO( State_Met, CZ1, TC0, OBK, CFRAC, & RADIAT, AZO, USTR, ZH, LSNOW, & RHB, PRESSU, W10, SUNCOS_MID ) ! ! !USES: ! USE GIGC_State_Met_Mod, ONLY : MetState IMPLICIT NONE ! ! !INPUT PARAMETERS: ! TYPE(MetState), INTENT(IN) :: State_Met ! Meteorology State object ! ! !OUTPUT PARAMETERS: ! LOGICAL, INTENT(OUT) :: LSNOW (MAXIJ) ! Flag for denoting snow/ice REAL*8, INTENT(OUT) :: CZ1 (MAXIJ) ! Midpt ht of 1st model level [m] REAL*8, INTENT(OUT) :: TC0 (MAXIJ) ! Grid box sfc temperature [K] REAL*8, INTENT(OUT) :: OBK (MAXIJ) ! Monin-Obhukov length [m] REAL*8, INTENT(OUT) :: CFRAC (MAXIJ) ! Column cloud fraction [unitless] REAL*8, INTENT(OUT) :: RADIAT(MAXIJ) ! Solar radiation @ ground [W/m2] REAL*8, INTENT(OUT) :: RHB (MAXIJ) ! Rel humidity at sfc [unitless] REAL*8, INTENT(OUT) :: USTR (MAXIJ) ! Friction velocity [m/s] REAL*8, INTENT(OUT) :: ZH (MAXIJ) ! PBL height [m] REAL*8, INTENT(OUT) :: PRESSU(MAXIJ) ! Local surface pressure [Pa] REAL*8, INTENT(OUT) :: W10 (MAXIJ) ! 10 meter windspeed [m/s] REAL*8, INTENT(OUT) :: SUNCOS_MID(MAXIJ) ! COS(SZA) @ midpt of current ! chemistry timestep ! Dimension AZO for GCAP or GEOS met fields (swu, bmy, 5/25/05) #if defined( GCAP ) REAL*8, INTENT(OUT) :: AZO(NTYPE) ! Roughness heights, by landtype #else REAL*8, INTENT(OUT) :: AZO(MAXIJ) ! Roughness heights, by grid box #endif ! ! !REMARKS: ! NOTE: We save into arrays of dimension MAXIJ=IIPAR*JJPAR for compatibility ! with the legacy drydep routine DEPVEL. ! . ! References (see full citations above): ! ============================================================================ ! (1 ) Wesely, M. L., 1989. ! (2 ) Jacob, D.J., and S.C. Wofsy, 1990 ! ! !REVISION HISTORY: ! (1 ) Now reference GET_PEDGE from "pressure_mod.f". Now reference T from ! "dao_mod.f". Removed obsolete code & comments, and added new ! documentation header. Now force double precision with "D" ! exponents. Now compute OBK here as well. Bundled into F90 module ! "drydep_mod.f" (bmy, 11/20/02) ! (2 ) Now reference CLDFRC, RADSWG, ZO, USTAR from "dao_mod.f". Also now ! pass CFRAC, RADIAT, AZO, USTR back to the calling routine ! via the arg list. (bmy, 12/9/03) ! (3 ) Now use explicit formula for IJLOOP to allow parallelization ! (bmy, 7/20/04) ! (4 ) Now compute ZH and LSNOW here instead of w/in DO_DRYDEP. Parallelize ! DO-loops. Now use BXHEIGHT from "dao_mod.f" instead of computing ! the thickness of the 1st level here. Remove reference to ! "pressure_mod.f". Remove reference to T from "dao_mod.f". Now ! reference ALBD from "dao_mod.f" (bmy, 2/22/05) ! (5 ) Now references RH from "dao_mod.f". Now passes relative humidity ! from the surface layer back via RHB argument. (bec, bmy, 4/13/05) ! (6 ) Now call GET_OBK from "dao_mod.f" to get the M-O length for both ! GEOS or GCAP met fields. Remove local computation of M-O length ! here. Also now dimension AZO appropriately for GCAP or GEOS met ! fields. Remove obsolete variables. (swu, bmy, 5/25/05) ! (7 ) Now make sure all USE statements are USE, ONLY (bmy, 10/3/05) ! (8 ) Move XLTMMP function to module MEGANUT_MOD. (ccc, 11/20/09) ! (9 ) Add sea level pressure and 10m windspeed as arguments (jaegle 5/11/11) ! 22 Dec 2011 - M. Payer - Added ProTeX headers ! 10 Jan 2012 - M. Payer - Added local surface pressure ! 09 Nov 2012 - M. Payer - Replaced all met field arrays with State_Met ! derived type object ! 28 Nov 2012 - R. Yantosca - Add SUNCOS_MID to the argument list and ! populate that with State_Met%SUNCOSmid !EOP !------------------------------------------------------------------------------ !BOC ! ! !LOCAL VARIABLES: ! INTEGER :: I, J, IJLOOP REAL*8 :: THIK REAL*8 :: SP ! ! !EXTERNAL FUNCTIONS: ! REAL*8, EXTERNAL :: SFCWINDSQR ! Surface wind speed (jaegle 5/11/11) !================================================================= ! METERO begins here! !================================================================= ... code goes here ... END SUBROUTINE METERO !EOC
Note that the ProTeX header contains the following elements:
- Indicators (
!BOP
,!EOP
) as to where ProTeX should look for source code header comments< - Indicators (
!BOC
,!EOC
) as to where the source code begins and ends. ProTeX will ignore code between!BOC
and!EOC
unless you tell it otherwise. - A description of the routine with the original date
- A list of input & output arguments
- A list of references (if applicable)
- References to F90 modules (if any) after the
USE
keyword followed by theIMPLICIT NONE
declaration - Each time the file is modified the
REVISION HISTORY
section should be updated. - Declarations for local variables
- Declarations for Fortran parameters (aka constants)
- Declarations for external functions (if necessary)
Fortran 90 allows you to declare an argument to a subroutine or function with one of the following descriptors:
INTENT(IN)
(read-only),INTENT(OUT)
(write-only) orINTENT(INOUT)
(read-and-write)
This helps you to prevent overwriting arguments which should not be overwritten.
Style Guide
Denoting code authors
Generally, we denote code authors by a 3-letter abbreviation in source code comments (bmy, 5/1/03)
.
The more comments, the better!
It is good practice to add copious comments to your source code. This will make sure that current and future GEOS-Chem users will be able to understand the modifications that you have added.
Keep in mind that many GEOS-Chem users (including the GEOS-Chem Support Team) will probably not be as familiar with your specific area of research as you are. While most users will probably get the "big picture" of what you are doing, they will be less knowledgeable about the little details (i.e. why does this reaction rate have a value of X, why is this parameter set to Y, why was this IF statement deleted, etc.). Providing sufficient source code documentation will eliminate any such guesswork.
Section headers
We recommend that you separate sections of source code with one or more divider comments:
!================================================================= ! 1st level header !================================================================= !-------------------------------------------------------------- ! 2nd level header !-------------------------------------------------------------- !----------------------- ! 3rd-level header !-----------------------
Line up each header with the the code that follows immediately below it. You can mix and match these styles as you wish. You don't always have to extend the header all the way across the screen.
If you are modifying a fixed-format source code file, end the major section headers at column #72. This will remind your reader where the limits of the allowable source code region occurs. If you are modifying a free-format source code file, feel free to extend the ends of the major section headers to about column 76 or 77.
Capitals vs. lower-case
We used to write GEOS-Chem source code in all capitals. We chose this approach to avoid mistaking similar characters, such as the lowercase l
and the number 1
, which in some fonts look remarkably alike. You may at your discretion mix capitals and lowercase in variable names and subroutine names, particularly if you want to improve readability or preserve scientific abbreviations ( e.g. Rn_Pb_Be_mod.F
).
We still recommend that you write Fortran reserved words in all capitals: SUBROUTINE
, CALL
, IF/THEN/ENDIF
, DO/ENDDO
, WHERE/ENDWHERE
. This distinguishes Fortran language elements from your variable and routine names.
Because GEOS-Chem contains a significant amount of 3rd-party code, you will see many routines where the code is written in all lowercase letters. Feel free to leave this code as-is. Reformatting a fixed-format source code file can be a cumbersome task (and maybe not the best use of your precious time)
Good editors such as emacs
can "colorize" the different words in a program (e.g. statements, variables, and quoted text are rendered in different colors), thus making the code infinitely more readable.
Indent by 3 spaces
Indent each new block of code 3 columns deeper than the last block:
1 2 3 4 1234567890123456789012345678901234567890 IF ( X == 0 ) THEN PRINT*, 'Hello, X is 0!' IF ( Y == 0 ) THEN PRINT*, 'Hello, X is 0 and Y is also 0!' ENDIF ENDIF
A good editor such as emacs
will indent for you automatically. You can specify the level of indent in your editor setup file (e.g. .emacs
).
Indenting nested DO loops
Omit indentation for each line of a multiple DO
loop. Use this syntax:
DO N = 1, NNPAR DO L = 1, LLPAR DO J = 1, JJPAR DO I = 1, IIPAR A(I,J,L,N) = A(I,J,L,N) / 2.0d0 ENDDO ENDDO ENDDO ENDDO
instead of letting the DO loops "creep" unnecessarily across the screen:
DO N = 1, NNPAR DO L = 1, LLPAR DO J = 1, JJPAR DO I = 1, IIPAR A(I,J,L,N) = A(I,J,L,N) / 2.0d0 ENDDO ENDDO ENDDO ENDDO
Enhance readability with white space
Use LOTS of white space to separate code. For example instead of
IF(X.eq.0)CALL DOLOOP(X,Y,Z,A,B,C,1,2,3)
type:
IF ( X == 0 ) CALL DOLOOP( X, Y, Z, A, B, C, 1, 2, 3 )
This makes the code much more readable, which prevents bugs.. If you can't read what you've written, you will never find your mistake! Don't be afraid to continue the statement to the next line if necessary.
In the above example, we also have used the new F90 equality test operator ==
(see below for more information).
Line up code into columns for better readability
Always line up quantities in assignment statements. Instead of:
A=1 THISLOOP=2 C=3 D=THISLOOP*C - A
add white space so that all of the equals signs fall in the same column.
A = 1 THISLOOP = 2 C = 3 D = ( THISLOOP * C ) - A
Your eyes can make more sense out of text that is lined up into vertical columns.
Group multiple terms of an equation with parentheses (as we have done for the D equation above). The compiler evaluates evaluate expressions in the order the parentheses are placed, from innermost to outermost.
Use of white space for arrays and functions
Do not leave white space between array indices:
A = X(I,J,L)
Leave white space between arguments of functions and WRITE statements:
A = MYFUNCTION( I, J, L ) WRITE( 6, '(a)' )
Line up arguments into columns
Line up arguments in subroutine or function calls. If there are an even number of arguments, break them up symmetrically:
Fortran-77 fixed-file format:
CALL MYSUB( THIS_A, THIS_B, THIS_C, & THIS_D, THIS_E, THIS_F ) X = MYFUNCTION( THIS_A, THIS_B, THIS_C, & THIS_D, THIS_E, THIS_F )
Fortran-90 free-format:
CALL MYSUB( THIS_A, THIS_B, THIS_C, & THIS_D, THIS_E, THIS_F ) X = MYFUNCTION( THIS_A, THIS_B, THIS_C, &
Use language features of Fortran-90
Operators
Use the new F90-style operators instead of the older F77-style operators:
>== instead of .EQ. /= instead of .NE. > instead of .GT. >= instead of .GE. < instead of .LT. >= instead of .LE. .EQV. (or .eqv, it's case-insensitive) when comparing two logical values, i.e. IF ( MY_LOGICAL .eqv .FALSE. ) THEN ...
The new operators take up only one or two spaces (instead of four spaces) and are easier to read.
Some operators (.AND.
, .OR.</code, and <code>.NOT.
) are the same in F90 as in F77. This is also true of the boolean values (.TRUE.
and .FALSE.
).
Comment characters
Use the F90 !
comment character. This replaces the F77 comment character (a C
placed in the first column of fixed-format code). You can can create very legible comments by lining up the !
with your source code.
Continuation characters
To continue an instruction from one line to the next, place an ampersand (&
) at the end of the line (for free-format layout).
If you are working with a source code file that has not yet been converted to free-format, place an ampersand in column 6 of each continued line of code.
Array assignment statements
Use the F90 array assignment functionality to assign a value to all elements of an array without using array indices. For example:
INTEGER :: ARRAY(10,10) REAL(fp) :: ARRAY_F(10,10) ARRAY(:,:) = 0 ARRAY_F(:,:) = 0.0_fp
You can also leave off the default array mask (:,:),
so
ARRAY = 0 ARRAY_F = 0.0_fp
is also acceptable.
PARAMETER declaration statements
Initialize PARAMETER
constants on the same line where they are declared:
REAL(fp), PARAMETER :: PI = 3.14159265358979323_fp REAL(fp), PARAMETER :: AVO = 6.022140857e+23_fp
Array constructors
Use F90 direct assignment statements to assign values to arrays:
REAL*8 :: A(2) = (/ 1.0d0, 2.0d0 /)
Avoid using obsolete F77 syntax:
REAL*8 A(2) DATA A / 1.0, 2.0 /
CAVEAT! In the above example the A
array will be automatically given the SAVE
attribute. This means that its values will be stored from one subroutine call to the next. This may not be what you had intended to do, so just be aware of this.
The SAVE attribute
Include the SAVE attribute on the same line where the variable is declared:
LOGICAL, SAVE :: FIRSTTIME = .TRUE.
Avoid using obsolete F77 syntax:
LOGICAL FIRSTTIME SAVE FIRSTTIME DATA FIRSTTIME / .TRUE. /
Note: SAVE
d variables within a subroutine or function keep their values from one call to the next. This allows you to do some initialization only on the first call to a subroutine, for example:
LOGICAL, SAVE :: FIRSTTIME = .TRUE. IF ( FIRSTTIME ) THEN PRINT*, 'This is the first time this routine is called!' FIRSTTIME = .FALSE. ENDIF
Since the value of FIRSTTIME
is saved between calls, the sentence above will only be printed out on the first call to the subroutine.
The EXTERNAL attribute
Include the EXTERNAL
attribute on the same line where a function's type is declared. Use this syntax:
INTEGER, EXTERNAL :: MYFUNC
Avoid using obsolete F77 syntax:
INTEGER MYFUNC EXTERNAL MYFUNC
NOTE: If you are referencing functions contained within a module file, then you do not need to declare it with EXTERNAL
. The EXTERNAL
statement is a hangover from F77.
The SELECT CASE statement
Use the F90 SELECT CASE
statement to pick from a list of options. In F77 you would have had to write:
IF ( X .EQ. 1 ) THEN CALL MYSUB( X1 ) ELSE IF ( X .EQ. 2 ) THEN CALL MYSUB( X2 ) ELSE CALL MYSUB( X0 ) ENDIF
but with F90, you can write this instead:
SELECT CASE ( X ) CASE( 1 ) CALL MYSUB( X1 ) CASE( 2 ) CALL MYSUB( X2 ) CASE DEFAULT CALL MYSUB( X0 ) END SELECT
SELECT CASE
also works with character constants:
SELECT CASE ( TRIM( TRACERNAME ) ) CASE( 'NOx' ) CALL MYSUB( 1 ) CASE( 'Ox' ) CALL MYSUB( 2 ) CASE DEFAULT CALL MYSUB( 3 ) END SELECT
A note of warning: You cannot specify CASE( X )
where X
is a variable, or else the compiler will balk. Then you must resort to the IF - ELSE IF
block structure as shown above.
Logarithms
Use the F90 intrinsic functions LOG( X )
and LOG10( X )
to take the natural and common logarithms of X
. Here X
may be declared either as REAL(f4)
or REAL(f8)
.
Avoid using the F77 intrinsic functions ALOG( X )
and ALOG10( X )
. Some compilers do not support these functions.
Keyword arguments
Use F90's keyword argument capability to clarify subroutine calls. You can replace this:
CALL READ_A1( NYMD, NHMS, & ALBEDO, CLDTOT, EFLUX, EVAP, & ... etc ... )
with:
CALL READ_A1( NYMD = NYMD, & NHMS = NHMS, & ALBEDO = ALBD, & CLDTOT = CLDTOT & EFLUX = EFLUX & EVAP = EVAP & ... etc ... )
In the call above, you see several pairs of variable names separated by an equals sign: ALBEDO = ALBD
The name on the left of the equals sign is the name of the argument as defined in the subroutine (i.e. ALBEDO
). This is often called the "dummy argument" because it is just a name for the memory that is getting passed to it from outside of the subroutine.
The name on the right of the equals sign is the name of the variable that we are passing down to the subroutine. Here we are using the ALBD
variable (from GEOS-Chem module dao_mod.F
and passing that down to READ_A1
as the ALBEDO
argument.
THIS_D, THIS_E, THIS_F )
Numeric and character data types
Data types
GEOS-Chem uses the following basic data types:
Data type | Fortran type | Flexible precision definintion | Range |
---|---|---|---|
TRUE or FALSE values | LOGICAL |
||
Whole numbers | INTEGER |
-2 x 109 up to +2 x 109 | |
4-byte (aka 32-bit) floating point real values | REAL*4 |
REAL(f4) |
-1 x 1038 up to +1 x 1038 |
8-byte (aka 64-bit) floating point real values | REAL*8 |
REAL(f8) |
-1 x 10312 up to +1 x 10312 |
Character strings | CHARACTER(LEN=255) |
However, you should use 4-byte floating point instead of 8-byte floating point in the following instances:
- Diagnostic arrays: GEOS-Chem diagnostic values rarely exceed 1038, since they are usually in units such as kg, v/v, ppbv, or molecules/cm2/s.
- Arrays and scalars required for binary or netCDF file I/O: Declaring these types of variables
REAL*4
will help to keep file sizes as small as possible, thus saving disk space.
Some third-party routines used by GEOS-Chem (such as TPCORE
) use the REAL
datatype. Most compilers (but not all) interpret REAL
as a 4-byte floating point real. If you want to force the compiler to interpret REAL
as an 8-byte floating point real, you must use a special compiler switch (usually -r8,
but check your compiler manual to be sure). The GEOS-Chem Makefiles already take care of this for you.
NOTE: See our Floating point math issues wiki page for more information.
Flexible precision
GEOS-Chem v10-01 and later versions use a flexible precision definition. Floating point variables can be defined with a Fortran KIND parameter that can be switched between 4-byte and 8-bytes at compile time.
Here is an example of how you can use flexible precision in GEOS-Chem:
USE PRECISION_MOD ! Contains the fp, f4, f8 parameters used to define real values REAL*4 :: VAR_REAL4 ! This was the old way to define a 4-byte real (from Fortran-77) REAL*8 :: VAR REAL8 ! This was the old way to define an 8-byte real (from Fortran-77) REAL(f4) :: VAR_REAL4_NEW ! This is the new way to define a 4-byte real (introduced in Fortran 90) REAL(f8) :: VAR_REAL8_NEW ! This is the new way to define an 8-byte real (introduced in Fortran 90) REAL(fp) :: VAR_FLEXIBLE ! This is an 8-byte real by default, but can be switched to a 4-byte real at compile time
Declaring floating point real variables with REAL(fp)
allows you to change from the default 8-byte precision to 4-byte precision by compiling GEOS-Chem with the PRECISION=4
switch. (NOTE: As described in the prior section, some floating point variables should always be declared as 4-byte reals, such as those variables that get written to disk.)
We have not yet made full use of the flexible precision in GEOS-Chem as of GEOS-Chem v11-01. This will happen in a later version.
Use the Fortran-90 exponent syntax
Use _fp
(double-precision) suffixes when assigning a constant value to a REAL(fp)
variable:
REAL(fp) :: PI, ONE_THOUSAND PI = 3.14159265358979323_fp ONE_THOUSAND = 1.0e+3_fp
This will ensure that the value will be declared with sufficient precision.
Similarly, if you have used REAL(f4)
to declare 4-byte floating point variables, then use this syntax:
REAL(f4) :: PI, ONE_THOUSAND PI = 3.14159265358979323_f4 ONE_THOUSAND = 1.0e+3_f4
and similarly for REAL(f8)
:
REAL(f9) :: PI, ONE_THOUSAND PI = 3.14159265358979323_f8 ONE_THOUSAND = 1.0e+3_f9
Default string length
Declare strings with 255 characters, even if you don't know a priori how long the string will be. Use the TRIM
statement to strip excess white space. For example:
CHARACTER(LEN=255) :: STR STR = 'I am the very model of a modern major general...' WRITE( 6, '(a)' ) TRIM( STR )
Converting from number to character (and vice-versa)
Internal write statements
Several GEOS-Chem routines convert a numeric variable to a character variable (and vice-versa). This usually happens when a number representing a date or time has to be incorporated into a file path.
Use a Fortran internal write statement to convert from from INTEGER
to CHARACTER
. This looks like a regular WRITE
statement, only that the unit number is replaced by a character variable. Think of it as "writing" your number into a character variable instead of to a file.
For example, the following code creates a string containing the date 20110701:
! Declare variables CHARACTER(LEN=8) :: DATE_STR INTEGER :: DATE = 20110701 ! FORTRAN internal write -- converts number to string WRITE( DATE_STR, '(i8.8)' ) DATE
You can incorporate DATE_STR
into a directory or file path. The format '(i8.8)'
tells the WRITE
command to make the string 8 characters long, and to change any spaces (if they exist) into preceding zeroes.
Internal read statements
Use a Fortran internal read to extract numbers from a character string. This is the reverse of the Fortran internal write described above:
! Declare variables CHARACTER(LEN=8) :: DATE_STR = '20010101' INTEGER :: DATE ! FORTRAN internal read -- converts string to number READ( DATE_STR, '(i8.8)' ) DATE
Math optimizations
Write polynomials efficiently
Logarithms and exponentiation are the most computationally expensive mathematical functions. Avoid using the traditional formulation:
Y = A0 + A1*X + A2*X**2 + A3*X**3 + A4*X**4 + A5*X**5
but instead use parentheses to group the polynomial terms.
Y = A0 + X* ( A1 + X* ( A2 + X* ( A3 + X* ( A4 + X* ( A5 )))))
This modified polynomial expression replaces the costly exponentiations with more efficient multiplication operations. A Nth order polynomial will now only require N multiplications.
Exercise care with exponential approximations
Although it is tempting to replace EXP( -x )
with the first-order approximation 1 - x,
this can result in negative values for certain values of x
. This can cause negative tracer concentrations.
DO loops
The CYCLE statement
Always use the CYCLE
statement to skip to the next iteration in a DO
loop. Use this syntax:
DO I = 1, 1000 ! Skip to next iteration IF ( X(I) == 0 ) CYCLE ! Print PRINT*, X(I) ENDDO
instead of this obsolete F77 syntax:
DO 100 I = 1, 1000 IF ( X(I) .EQ. 0 ) GOTO 100 PRINT*, X(I) 100 CONTINUE
Notice that the F90 ENDDO
statement replaces the labeled CONTINUE
statement. This results in cleaner code.
The EXIT statement
Always use the EXIT
statement to completely exit from a DO
loop. Use this syntax:
DO I = 1, 1000 ! Break out of this loop if X==0! IF ( X(I) == 0 ) EXIT ! Print info if we have not exited WRITE( 6, '(''Iteration: '', i5, f13.6)' ) I, X(I) ENDDO PRINT*, 'outside loop'</pre>
instead of this obsolete F77 syntax, which requires a GOTO
statement:
C Do-loop DO 100 I = 1, 1000 IF ( X(I) .EQ. 0 ) GOTO 200 PRINT*, 'inside loop' 100 CONTINUE C Continue outside loop 200 CONTINUE PRINT*, 'outside loop'
Ordering of DO loops
Structure your DO
loops so that they access memory in the most optimal manner. Fortran is a column-major language, which means that arrays are stored in memory by columns first, then by rows. If you have declared an array such as:
INTEGER :: I, J, L, N REAL*8 :: ARRAY(IIPAR,JJPAR,LLPAR,NNPAR)
then for optimal efficiency, the leftmost dimension (I
) needs to vary the fastest, and needs to be accessed by the innermost DO
loop. Then the next leftmost dimension (J
) should be accessed by the next innermost DO
loop, and so on.
Therefore, the proper way to loop over this array is:
DO N = 1, NNPAR DO L = 1, LLPAR DO J = 1, JJPAR DO I = 1, IIPAR ARRAY(I,J,L,N) = ARRAY(I,J,L,N) * 2.0d0 ENDDO ENDDO ENDDO ENDDO
Note that the I
index is varying most often, since it is the innermost DO
loop, then J
, L
, and N
. This is opposite to how a car's odometer reads.
If you loop through an array in this fashion, with leftmost indicies varying fastest, then the code minimizes the number of times it has to load subsections of the array into cache memory. In this optimal manner of execution, all of the array elements sitting in the cache memory are read in the proper order before the next array subsection needs to be loaded into the cache. But if you step through array elements in the wrong order, the number of cache loads is proportionally increased. Because it takes a finite amount of time to reload array elements into cache memory, the more times you have to access the cache, the longer it will take the code to execute. This can slow down the code dramatically.
Infinite DO loop with EXIT
Use "infinite" DO
loops in situations where you need to iterate for an unknown number of iterations before some exit criterion is reached. This is most often used to read data from a file whose length you do not know a priori.
Here is an example of an infinite DO
loop used to read data from disk:
INTEGER :: I, IOS, IUNIT REAL*8 :: A(10000) ! Open file OPEN( IUNIT, FILE='myfile.txt', IOSTAT=IOS ) ! Quit if we can't open the file IF ( IOS /= 0 ) THEN PRINT*, 'Error opening the file!' STOP ENDIF ! Infinite DO loop for reading data from "myfile.txt" DO ! Read I and A(I) from the file READ( IUNIT, '(i3,f11.3)', IOSTAT=IOS ) I, A(I) ! IOS < 0 is end-of-file, so we exit the loop IF ( IOS < 0 ) EXIT ! If IOS < 0, then this is an I/O error, ! so we print an error msg and quit IF ( IOS > 0 ) THEN PRINT*, 'I/O error reading from file! STOP ENDIF ENDDO ! Close the file after we exit the loop CLOSE( IUNIT )
In this example, the DO
loop wants to execute forever, but it is stopped by an external criterion—the end of the file. The OPEN
, READ
, and CLOSE
functions can return the I/O status to a variable if you specify via the IOSTAT
keyword. If the I/O status variable (in this case, named IOS
) is negative, then this is a normal end-of-file condition, and so we can just exit the loop and close the file. However, if IOS
is a nonzero, positive number, then this means that we have encountered a real error condition.
Reserved DO loop index variable names
Reserve the following variable names for DO
loop and array indices:
- Use
I
for the longitude index - Use
J
for the latitude index - Use
L
for the altitude index - Use
N
for the quantity index (tracer, species, etc.)
Therefore, you can assume that any DO
loop that uses I
, J
, and L
is looping over grid box longitude, latitude, and altitude dimensions, and that any DO
loop which uses N
is looping over species or some other quantity. These special indices should NOT be used to refer other quantities. In this way, if you should get an error message such as:
Error at grid box (I,J) = (23,34)!
you will immediately understand (I,J)
are the longitude and latitude indices of the box where the error happened.
NOTE: In some 3rd-party routines, you will see K
used instead of L
to denote the altitude index. We recommend leaving these as-is, so as to avoid having to rewrite entire sections of 3rd-party code. However, you should use L
for the altitude index in any new GEOS-Chem code that you write.
Programming techniques to promote running in HPC environments
Restrict screen output to the root CPU
When GEOS-Chem connects to an external GCM—such as NASA's GEOS-5 GCM—it will utilize MPI parallelization. Each CPU will perform all of GEOS-Chem's operations, but on a much smaller geographical domain (i.e. on a single vertical column or several vertical columns). This also means that each CPU will print informational messages to the screen (or log file, if you redirect screen output there) simultaneously. All of these I/O operations can seriously impact performance.
We have now started the process of bracketing WRITE
and PRINT
statements with IF
blocks. In GEOS-Chem v9-01-03 and higher versions, you will see code like this:
IF ( am_I_Root ) THEN WRITE( 6, '(a )' ) REPEAT( '=', 79 ) WRITE( 6, '(a,/)' ) 'G E O S - C H E M U S E R I N P U T' WRITE( 6, 100 ) TRIM( FILENAME ) 100 FORMAT( 'READ_INPUT_FILE: Reading ', a ) ENDIF
When we use MPI parallelization, the am_I_Root
variable, which is passed as an argument to subroutines and functions, determines if we are on the root CPU. This will restrict the printing of informational messages to just the root CPU. We encourage you to use this approach as you write new GEOS-Chem code.
For more information, please see this wiki post.
Do not use fixed parameters for file unit numbers
GEOS-Chem v9-01-02 and prior versions used fixed parameters (stored in module GeosUtil/file_mod.F
) for Fortran logical unit numbers. Logical unit numbers, or LUN's, are numeric values that are used in Fortran OPEN
, WRITE
, READ
, and CLOSE
statements. A typical GEOS-Chem file read operation looks like this:
USE FILE_MOD, ONLY : IU_FILE ... ! Open file OPEN( UNIT=IU_FILE, FILE=TRIM( FILENAME ) ... ) ! Read data READ( IU_FILE, IOSTAT=IOS ) ... ! Close file CLOSE( IU_FILE )
But when we connect GEOS-Chem to an external GCM, we can no longer rely on pre-defined LUNs. The GCM may have already assigned the LUN that we are trying to use to a different file. In GEOS-Chem v9-01-03 and higher versions, all LUNs are determined dynamically. We now call a function (inquireMod/findFreeLUN
) to return the next unused LUN. You will now see code that looks like this:
USE inquireMod, ONLY : findFreeLUN ... INTEGER :: IU_FILE ... ! Find a free file LUN IU_FILE = findFreeLUN() ! Open file OPEN( UNIT=IU_FILE, FILE=TRIM( FILENAME ) ... ) ! Read data READ( IU_FILE, IOSTAT=IOS ) ... ! Close file CLOSE( IU_FILE )
For more information, please see this wiki post.
Do not refer to public variables via USE statements
We used to refer to variables in modules by means of the USE
statement, such as:
! Reference variables from dao_mod.F USE DAO_MOD, ONLY : UWND, VWND, T
While many routines in GEOS-Chem still use this syntax, we are abandoning this approach. Instead of referring to module variables with USE
, we shall (wherever expedient) use derived type objects to contain variables and arrays, which then can be passed from one routine to another via the argument list. We are doinFile:G this to facilitate our GEOS-Chem HP, which seeks to embed GEOS-Chem within the NASA GEOS-5 GCM.</p>
The only things that you should obtain from modules via the USE statement will be:
- Subroutines
- Functions
- Derived type definitions
such as:
! Refer to derived type definition from gigc_state_met_mod.F90 USE State_Met_Mod, ONLY : MetState ! Refer to subroutines from dao_mod.F USE DAO_MOD, ONLY: AIRQNT, COSSZA
Here are some tips for moving arrays or adding new arrays to the derived type objects:
- Instead of finding and replacing every instance of an array with State_Chm&array throughout a module, create pointers at the top of each routine (e.g.
localarray => State_Chm%array
). This is the method we used when replacing the species array (Spc => State_Chm%Species
). - Where possible, clean up routine arguments. If State_Chm is already passed, then there is no need to also pass State_Chm%array.
- Consider adding arrays used for bpch diagnostics into State_Diag instead of State_Chm.
- Try to avoid confusing fields names in the State_Chm object. For example, in DO_DIAG_OH State_Chm%AIR_MASS may be confused with fields in State_Met. If we use pointers at the top of routines, we can still use the existing local array name to avoid having to change code deep in subroutines (e.g. AIR_MASS => State_Chm%Sum_Air_Mass).
- Allocate State_Chm fields in routine Init_State_Chm instead of local modules.
- The GCST highly recommends running unit tests and difference tests often as you make these updates. Unit tests will ensure you don't break any simulations and difference tests will ensure you aren't introducing any changes to model output.
More on derived-type objects
We are now moving away from including module variables into subroutines or functions via USE
statements. Please see this wiki post on our Derived type objects used by the Grid-Independent GEOS-Chem wiki page for more information about how we are passing data between subroutines via derived type objects.