Intel Fortran Compiler

From Geos-chem
Revision as of 14:48, 14 June 2019 by Bmy (talk | contribs) (→‎Overview)

This page contains information about the Intel Fortran compiler (aka "IFORT" compiler).

The Intel Fortran compiler is our recommended proprietary compiler for GEOS-Chem.

Overview

An Intel license is required

Intel Fortran requires the purchase of an expensive site license. If your institution does not have the resources to purchase the Intel Fortran Compiler, then we recommend that you use the GNU Fortran compiler—which is free and open source—instead.

Documentation

You can find more information about the Intel Fortran Compiler here:

  1. Fortran 19 documentation
  2. Intel Fortran 17

Also, when you install the Intel Fortran compiler, you will normally install the Intel C and C++ compilers as well.

--Bob Yantosca (talk) 19:44, 10 January 2019 (UTC)

Intel Fortran Compiler versions that have been tested with GEOS-Chem

The following platforms and compilers are currently supported.

Platform Compiler Status Tested by
Linux ifort 17.0.4 and later versions Supported (NOTE: GEOS-Chem v10-01 and prior versions will not compile with ifort 17 and higher.) GCST
Linux ifort 15.0.0 and similar builds Supported GCST
Linux ifort 13.0.079 and similar builds Supported GCST
Linux ifort 12 Supported GCST
Linux ifort 11.1.069 and similar builds Supported GCST

--Bob Yantosca (talk) 14:48, 14 June 2019 (UTC)

Environment settings for Intel Fortran

Here is some information about how you can customize your Unix environment to use the Intel Fortran compiler.

Using the module command to load Intel Fortran and related libraries

On Unix computer systems, you can use a module manager such as Lmod to load the Intel Fortran compiler (and its dependencies) into your Unix environment. For example, we use the following commands on the Harvard cluster (odyssey.rc.fas.harvard.edu):

 # These commands load Intel Fortran 17 on the Harvard "Odyssey" cluster
 module load intel/17.0.4-fasrc01
 module load openmpi/2.1.0-fasrc02
 module load netcdf/4.3.2-fasrc05
 module load netcdf-fortran/4.4.0-fasrc03

You can ask your IT staff what the corresponding commands would be on your particular cluster.

The module command should automatically define several environment variables for you:

Variable Expected setting Description
FC ifort Name of the Intel Fortran compiler
CC icc Name of the Intel C compiler
CXX icpc Name of the Intel C++ compiler
NETCDF_HOME System-dependent Path to the root netCDF folder
NETCDF_INCLUDE System-dependent Path to the netCDF include folder (e.g. $NETCDF_HOME/include)
NETCDF_LIB System-dependent Path to the netCDF library folder (e.g. $NETCDF_HOME/lib or $NETCDF_HOME/lib64)
NETCDF_FORTRAN_HOME System-dependent Path to the root netCDF Fortran folder
NETCDF_FORTRAN_INCLUDE System-dependent Path to the netCDF Fortran include folder (e.g. $NETCDF_FORTRAN_HOME/include)
NETCDF_FORTRAN_LIB System-dependent Path to the netCDF Fortran library folder (e.g. $NETCDF_FORTRAN_HOME/lib or $NETCDF_FORTRAN_HOME/lib64)

If these variables are not automatically set by the module command on your system (or if your system does not use the module command), then:

  • Set the FC, CC, and CXX variables manually in your startup script (e.g. .bashrc or .cshrc).
  • Ask your IT staff where the netCDF library paths are located, and set the NETCDF_HOME, NETCDF_INCLUDE, and NETCDF_LIB environment variables accordingly.

Depending on your system's configuration, you may find that the netCDF Fortran library is installed in a different folder than the netCDF C-language library. If this is the case, then the Lmod module manager should automatically define the NETCDF_FORTRAN_HOME, NETCDF_FORTRAN_INCLUDE, and NETCDF_FORTRAN_LIB environment variables. If not, then ask your IT staff what the proper paths are so that you can set these variables manually.
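
If you need to set these variables by hand, a minimal sketch for a bash startup file might look like the following. The install prefixes /opt/netcdf and /opt/netcdf-fortran are hypothetical placeholders (they do not come from this page); substitute the paths that your IT staff give you.

```shell
# Compiler names, as listed in the table above
export FC=ifort
export CC=icc
export CXX=icpc

# netCDF C library. /opt/netcdf is a HYPOTHETICAL prefix; use your own.
export NETCDF_HOME=/opt/netcdf
export NETCDF_INCLUDE="$NETCDF_HOME/include"
export NETCDF_LIB="$NETCDF_HOME/lib"          # may be lib64 on some systems

# netCDF Fortran library, only needed if it is installed separately.
# /opt/netcdf-fortran is also a HYPOTHETICAL prefix.
export NETCDF_FORTRAN_HOME=/opt/netcdf-fortran
export NETCDF_FORTRAN_INCLUDE="$NETCDF_FORTRAN_HOME/include"
export NETCDF_FORTRAN_LIB="$NETCDF_FORTRAN_HOME/lib"
```

If you use csh or tcsh, the same settings would be written with setenv instead of export.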

--Bob Yantosca (talk) 19:28, 10 January 2019 (UTC)

Requesting sufficient stack memory for GEOS-Chem

In order to run GEOS-Chem with Intel Fortran, you must request the maximum amount of stack memory in your Unix environment. (The stack memory is where local automatic variables and temporary !$OMP PRIVATE variables will be created.) Add the following lines to your system startup file:

If you use bash, add this to your .bashrc file:

 ulimit -s unlimited
 export OMP_STACKSIZE=500m

If you use csh or tcsh, add this to your .cshrc file:

 limit stacksize unlimited
 setenv OMP_STACKSIZE 500m

The ulimit -s unlimited (for bash) or limit stacksize unlimited (for csh or tcsh) commands tell the Unix shell to use the maximum amount of stack memory available.

The environment variable OMP_STACKSIZE must also be set to a very large number. In this example, we are nominally requesting 500 MB of memory. But in practice, this will tell the Intel Fortran compiler to use the maximum amount of stack memory available on your system. The value 500m is a good round number that is larger than the amount of stack memory on most computer clusters, but you can change this if you wish.

NOTE: Using the OMP_STACKSIZE environment variable will make it easier to switch between different compilers on your system. The KMP_STACKSIZE environment variable works only with the Intel Fortran Compiler, not with GNU Fortran.

--Bob Yantosca (talk) 21:35, 4 October 2016 (UTC)

Performance

Please see our GEOS-Chem performance wiki page for a summary of recent timing tests done with the Intel Fortran compiler.

--Bob Yantosca (talk) 19:29, 10 January 2019 (UTC)

Optimization

In this section we present information about the various optimization options available in the Intel Fortran Compiler.

Optimization options

Here is a quick reference table of optimization options (taken from the online Intel Fortran Compiler User and Reference Guides).

Option Description How invoked in GEOS-Chem?
-O0 Turns off all optimizations. Math expressions will be evaluated in the same order in which they are written, which is necessary for debugging. If you are using a debugger (such as Totalview), compile with -g -O0. DEBUG=yes or
OPT=-O0
-O1 Enables optimizations for speed and disables some optimizations that increase code size and affect speed. The -O1 option may improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops.

Setting -O1 automatically sets the following options:

  1. -funroll-loops0,
  2. -nofltconsistency (same as -mno-ieee-fp),
  3. -fomit-frame-pointer,
  4. -ftz
OPT=-O1
-O2 (aka -O) Enables optimizations for speed. This is the generally recommended optimization level.

This option also enables:

  1. Inlining of intrinsics
  2. Intra-file interprocedural optimizations, which include:
    • inlining
    • constant propagation
    • forward substitution
    • routine attribute propagation
    • variable address-taken analysis
    • dead static function elimination
    • removal of unreferenced variables
  3. The following capabilities for performance gain:
    • constant propagation
    • copy propagation
    • dead-code elimination
    • global register allocation
    • global instruction scheduling and control speculation
    • loop unrolling
    • optimized code selection
    • partial redundancy elimination
    • strength reduction/induction variable simplification
    • variable renaming
    • exception handling optimizations
    • tail recursions
    • peephole optimizations
    • structure assignment lowering and optimizations
    • dead store elimination

On Linux and Mac OS X systems, if -g is specified, -O2 is turned off and -O0 is the default unless -O2 (or -O1 or -O3) is explicitly specified in the command line together with -g.

Default setting
-O3 Enables -O2 optimizations plus more aggressive optimizations, such as prefetching, scalar replacement, and loop and memory access transformations.

Enables optimizations for maximum speed, such as:

  1. Loop unrolling, including instruction scheduling
  2. Code replication to eliminate branches
  3. Padding the size of certain power-of-two arrays to allow more efficient cache use.

On Linux and Mac OS X systems, the -O3 option sets option -fomit-frame-pointer.

The -O3 optimizations may not cause higher performance unless loop and memory access transformations take place. The optimizations may slow down code in some cases compared to -O2 optimizations. The -O3 option is recommended for applications that have loops that heavily use floating-point calculations and process large data sets.

OPT=-O3

--Bob Y. 11:14, 3 October 2013 (EDT)

Recommended compilation and optimization options for GEOS-Chem

In this section, we present information about the compilation and optimization options that are invoked when you compile a GEOS-Chem simulation.

List of commonly-used compilation options

Here are the IFORT compilation options currently used by GEOS-Chem:

Option Description How invoked in GEOS-Chem?
Normal compiler settings
-cpp Turns on the C-preprocessor, to evaluate #if and #define statements in the source code. Default setting
-w Suppresses all compiler warnings. This is mainly a convenience to prevent excessive output to the screen or log file.

NOTE: Most compiler warnings are harmless. Execution does not stop when a warning is displayed, unlike an error message, which causes program execution to halt at the point where the error occurred.

Default setting
-O2 Optimizes the source code for speed, without taking too many liberties with numerical precision. For more information, please see the optimization options section above. Default setting
-auto This option places local variables (scalars and arrays of all types), except those declared as SAVE, on the run-time stack. It is as if the variables were declared with the AUTOMATIC attribute. It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute, or variables that appear in an EQUIVALENCE statement or in a common block. Default setting
-noalign Prevents the compiler from padding bytes anywhere in common blocks and structures. Padding can affect numerical precision. Default setting
-convert big_endian Specifies that the format will be big endian for integer data and big endian IEEE floating-point for real and complex data. This only affects file I/O to/from binary files (such as binary punch files) but not ASCII, netCDF, or other file formats. Default setting
-vec-report0 Tells the compiler to suppress printing "LOOP HAS BEEN VECTORIZED" messages. This reduces the amount of output that is sent to the screen and/or GEOS-Chem log file. Default setting
-fp-model source Rounds intermediate results to source-defined precision and enables value-safe optimizations. Basically, this tells the compiler not to take too many liberties with how numerical expressions are evaluated. For more information about this option, please see our precision-safe optimization section below. This option can be disabled by compiling GEOS-Chem with the PRECISE=no Makefile option. Default setting
-traceback This option tells the compiler to generate extra information in the object file to provide source file traceback information when a severe error occurs at run time. When the severe error occurs, source file, routine name, and line number correlation information is displayed along with call stack hexadecimal addresses (program counter trace). This option increases the size of the executable program, but has no impact on run-time execution speeds. It functions independently of the debug option.
  • Default setting
    (v11-01 and higher)
  • TRACEBACK=yes
    (prior versions)
Special compiler settings
-r8 This option tells the compiler to treat variables that are declared as REAL as REAL*8 (as opposed to REAL*4).

NOTE: This option is not used globally, but is only applied to certain individual files (mostly from third-party codes like ISORROPIA). Current GEOS-Chem programming practice is to use either REAL*4 or REAL*8 instead of REAL, which avoids confusion.

Used as needed
-mcmodel=medium This option is used to tell IFORT to use more than 2GB of static memory. This avoids a specific type of memory error that can occur if you compile GEOS-Chem for use with an extremely high-resolution grid (e.g. 0.25° x 0.3125° nested grid). Default setting
-shared-intel
(formerly -i-dynamic)
This option needs to be used in conjunction with -mcmodel=medium. It causes Intel-provided libraries to be linked in dynamically instead of statically (which is the default). Default setting
-ipo This option enables interprocedural optimization between files. This is also called multifile interprocedural optimization (multifile IPO) or Whole Program Optimization (WPO). When you specify this option, the compiler performs inline function expansion for calls to functions defined in separate files.

NOTE: Yuxuan Wang found that this option was useful for certain nested-grid simulations. See this wiki post below for more information.

IPO=yes
-static This option prevents linking with shared libraries. It causes the executable to link all libraries statically.

NOTE: Yuxuan Wang found that this option was useful for certain nested-grid simulations. See this wiki post below for more information.

IPO=yes
Settings only used for debugging
-debug all Tells the compiler to turn on all debug error output. DEBUG=yes
-g Tells the compiler to generate full debugging information in the object file. This will cause a debugger (like Totalview) to display the actual lines of source code, instead of hexadecimal addresses (which is gibberish to anyone except hardware engineers). DEBUG=yes
-O0 Turns off all optimization. Source code instructions (e.g. DO loops, IF blocks) and numerical expressions are evaluated in precisely the order in which they are listed, without being internally rewritten by the optimizer. This is necessary for using a debugger (like Totalview). DEBUG=yes
-check bounds (aka -CB) Check for array-out-of-bounds errors. This is invoked when you compile GEOS-Chem with the BOUNDS=yes Makefile option. NOTE: Only use -CB for debugging, as this option will cause GEOS-Chem to execute more slowly! DEBUG=yes
-check arg_temp_created Checks to see if any array temporaries are created. Depending on how you write your subroutine and function calls, the compiler may need to create a temporary array to hold the values in the array before it passes them to the subroutine. For detailed information, please see our Passing array arguments efficiently in GEOS-Chem wiki page. DEBUG=yes
-fpe0 This option will cause GEOS-Chem to halt if a floating-point exception is encountered. This can happen if an equation produces an invalid or non-finite result, e.g. NaN or +/-Infinity. Common causes of floating-point errors are divisions where the denominator becomes zero.
NOTE: The default compiler setting is -fpe3, which disables floating-point exception trapping, so execution continues even after such values have been generated.
FPE=yes
-ftrapuv This option will assign a large numeric value to all local automatic variables. This makes it easier to identify numerical errors caused by improper initialization. FPE=yes

--Bob Y. 11:21, 3 October 2013 (EDT)

Typical settings for a GEOS-Chem simulation

The normal GEOS-Chem build uses the following IFORT compiler flags:

-cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -fp-model source -openmp

whereas a debugging run (meant to execute in a debugger such as TotalView) will typically use these flags:

-cpp -w -O0 -auto -noalign -convert big_endian -g -DDEBUG -check arg_temp_created -debug all -fp-model source -fpe0 -ftrapuv -check bounds

NOTE: In order to avoid running out of memory if you are compiling GEOS-Chem at extremely high resolution (e.g. the 0.25° x 0.3125° nested grids), we recommend adding the following flags:

-mcmodel=medium -shared-intel

These are automatically set when you compile with the NETCDF=yes or HDF=yes compiler options (in GEOS-Chem v9-01-03 and higher).
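
As a rough illustration of how the two flag sets above relate to each other, here is a small shell sketch. The helper name gc_flags is hypothetical. In a real build these flags are assembled by the GEOS-Chem Makefiles, not by a script like this, and the order of the flags here differs slightly from the lists above.

```shell
# Print the ifort flag set for a "normal" or "debug" build, using the
# flag lists quoted above. gc_flags is a HYPOTHETICAL helper name.
gc_flags() {
  # Flags shared by both build types
  common="-cpp -w -auto -noalign -convert big_endian -fp-model source"
  case "$1" in
    debug)
      echo "$common -O0 -g -DDEBUG -check arg_temp_created -debug all -fpe0 -ftrapuv -check bounds"
      ;;
    *)
      echo "$common -O2 -vec-report0 -openmp"
      ;;
  esac
}

# Normal build, with the large-memory flags appended for very
# high-resolution grids
echo "$(gc_flags normal) -mcmodel=medium -shared-intel"
```

Calling gc_flags debug prints the debugging set instead. Note that the debug set drops -openmp and -vec-report0, as in the list above.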

--Bob Y. 17:34, 29 February 2012 (EST)

Precision-safe optimization

You can use the following Intel Fortran Compiler options to select how aggressively you would like to optimize floating-point operations.

Default behavior

-fp-model fast

Example source code:

REAL T0, T1, T2
...
T0 = 4.0E0 + 0.1E0 + T1 + T2

When this option is specified, the compiler applies the following semantics:

  1. Additions may be performed in any order
  2. Intermediate expressions may use single, double, or extended precision
  3. The constant addition may be pre-computed, assuming the default rounding mode

Using these semantics, the following shows some possible ways the compiler may interpret the original code:

REAL T0, T1, T2
...
T0 = (T1 + T2) + 4.1E0

or

REAL T0, T1, T2
...
T0 = (T1 + 4.1E0) + T2

Preferred alternative

-fp-model source (aka -fp-model precise)

Example source code:

REAL T0, T1, T2
...
T0 = 4.0E0 + 0.1E0 + T1 + T2

When this option is specified, the compiler applies the following semantics:

  1. Additions are performed in program order, taking into account any parentheses
  2. Intermediate expressions use the precision specified in the source code
  3. The constant addition may be pre-computed, assuming the default rounding mode

Using these semantics, the following shows a possible way the compiler may interpret the original code:

REAL T0, T1, T2
...
T0 = ((4.1E0 + T1) + T2)

Summary

If you do not select any -fp-model option, the Intel Fortran Compiler will default to -fp-model fast. As you can see from the examples above, this may not optimize the code in the same way each time. This can lead to minor numerical noise in the output, as was seen in ISORROPIA II.

To avoid this situation, we recommend compiling all source code files with -fp-model source. This will be the new default in GEOS-Chem v9-01-02.
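
To see why the order of additions matters at all, the following shell sketch adds the same three numbers with two different groupings. It uses awk (which computes in double precision) only as a convenient calculator; it is not GEOS-Chem code, and the values and the helper name reassoc_demo are chosen purely for illustration.

```shell
# Sum the same three numbers with two different groupings.
# awk performs its arithmetic in double precision.
reassoc_demo() {
  awk 'BEGIN {
    a = 1e16; b = -1e16; c = 1.0
    # Left: cancel the large terms first. Right: add the small term to b first.
    printf "%.1f %.1f\n", (a + b) + c, a + (b + c)
  }'
}

reassoc_demo
```

On typical IEEE 754 double-precision hardware the two results differ (1.0 versus 0.0), because b + c rounds back to -1e16 before a is added. A compiler that is free to reorder additions can introduce differences of this kind, which is what -fp-model source is meant to prevent.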

Reference: Intel® Fortran Floating-point Operations; Document Number: 315892-003US

--Bob Y. 17:01, 25 August 2011 (EDT)

Optimization options for faster runs

Yuxuan Wang told us about the optimization options -ipo and -static, and said that these options would speed up simulations. I've tested these options on our system at Harvard. The run with the new options shows only tiny differences (much less than 1% over 1 month) compared to a run with the old options only. For a full-chemistry run (43 tracers) at 4x5 resolution on 4 processors, the run time is about 10% shorter than previously.

These options are especially effective for transport. So in simulations with faster chemistry (such as tagged-tracer simulations), we expect to see a larger reduction in run time. For example, the run time for a methane simulation is shortened by about 30%.

To use these options, compile GEOS-Chem with the IPO=yes Makefile option, e.g.

make -j4 IPO=yes

--Ccarouge 15:54, 8 September 2009 (EDT)
--Bob Y. 17:50, 29 February 2012 (EST)

Optimization level for debugging

If you would like to run your code in a debugger, such as Totalview, you must use the following compiler switches:

-g -O0

Using -O0 will ensure that the source code gets executed in the same order in which it is written (i.e. this disables all compiler optimizations). The -g switch will tell the debugger to display lines of source code instead of hexadecimal memory addresses (which are more or less gibberish unless you are a hardware engineer).

GEOS-Chem will add these switches automatically for you if you compile with the DEBUG=yes option.

--Bob Y. 15:28, 22 February 2012 (EST)

Caveat about optimizing for specific chipsets

The standard GEOS-Chem build sequence does not include any optimization flags that are specific to a certain type of CPU. If you are interested, you can certainly experiment for yourself. But be aware that this may invoke certain chip-level optimizations that could potentially change the simulation output.

Jenny Fisher wrote:

I have tested the new chips & compiler option. I found that there are small differences [in difference test output]...if I use exactly the same compile commands and number of processors between our old cores and our new Broadwell cores (E5-2690 v4). The differences are very small and I think nothing to worry about.

However, adding the preferred compiler flag -xCORE-AVX2 led to much bigger differences (e.g., up to 5% difference or 10 ppb in ozone…). I haven’t investigated the differences in detail. I did run a one month benchmark comparison, and see that the differences can be consequential after a month (i.e. not just differences in regions where values are low).

I have no idea what is causing these differences. So I guess for the moment, I would recommend *not* using the specific optimisation for Broadwell/Haswell cores. However, I think it probably is ok to use the Broadwell cores without this flag. I am not sure what impact this choice will have on performance.

--Bob Yantosca (talk) 14:50, 28 March 2017 (UTC)

Known issues

The following issues occur with certain versions of the Intel Fortran Compiler. Several of these issues have now been resolved in the most recent GEOS-Chem versions. We recommend upgrading to the most recent public release if possible.

Compilation issues with Intel Fortran 18

This update was included in GEOS-Chem 12.0.1, which was released on 24 Aug 2018.

The GEOS-Chem Support Team recently tested GEOS-Chem 12.0.0 with ifort 18, which is a recent release of the Intel Fortran Compiler. An "out-of-the-box" compilation with ifort 18 resulted in these errors:

Location Problem Solution
Makefile_header.mk The -openmp switch has been retired in Intel Fortran 18.

The new option to turn on OpenMP parallelization is now called -qopenmp.

Now issue a command to get the compiler version. This is saved into the COMPILER_VERSION variable in Makefile_header.mk.

If COMPILER_VERSION is 18 or higher, use -qopenmp to activate OpenMP. Otherwise use -openmp.
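
The version test described above can be sketched in shell as follows. This only illustrates the logic and is not the actual code in Makefile_header.mk. The helper name openmp_flag and the sample version string are hypothetical.

```shell
# Return the OpenMP switch for a given ifort major version.
# openmp_flag is a HYPOTHETICAL helper; the threshold of 18 comes from
# the text above.
openmp_flag() {
  if [ "$1" -ge 18 ]; then
    echo "-qopenmp"
  else
    echo "-openmp"
  fi
}

# Example: take the major version from a version string such as "18.0.3".
# In practice the string would be parsed from the compiler's version output.
version="18.0.3"
major="${version%%.*}"
echo "ifort $major -> $(openmp_flag "$major")"
```

With the sample string above, this prints "ifort 18 -> -qopenmp". Passing 17 to openmp_flag returns -openmp instead.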

--Bob Yantosca (talk) 15:51, 10 September 2018 (UTC)

Cannot compile GEOS-Chem v10-01 with Intel Fortran Compiler v17

This issue is now resolved in GEOS-Chem v11-01.

Myroslav Hordiichuk wrote:

When compiling with make -j2 MET=geosfp GRID=4x5 UCX=y CHEM=benchmark there is a problem, I can't fix. The screenshots are attached.

File:Ifort 17 error.png

File:Ifort 17 error2.png

Bob Yantosca wrote:

Thanks for writing. [Because...] you [are] using a very new version of the Intel compiler (v2017)... then that may be not able to parse the module interfaces in NcdfUtil/ncdf_mod.F90 and in HEMCO/Core/hco_diagn_mod.F90. We recently encountered this issue when trying to port GEOS-Chem v11-01 to the GNU Fortran compiler. I ended up rewriting code in these modules to avoid this issue.

What I think is happening is that, like GNU Fortran, Intel Fortran 2017 is by default using the newer Fortran 2003 or Fortran 2008 standard. This standard is more strict than the Fortran-90 language standard when it comes to the module interfaces. Long story short: I had to rewrite the DIAGN_UPDATE interface in HEMCO/Core/hco_diagn_mod.F90 to remove the OPTIONAL array arguments. This got the code to compile with GNU Fortran. Basically, instead of having only 2 subroutines contained in the DIAGN_UPDATE module interface, I had to have 6. The arguments Scalar, Array2D, Array3D cannot be OPTIONAL arguments in this case. This syntax used to be OK in the original Fortran-90 standard but evidently not in the newer F2003 or F2008 standards. We also had to modify a similar module interface in NcdfUtil/ncdf_mod.F90 accordingly.

If you go to this wiki link, you will see several updates that we had to make to GEOS-Chem v11-01 in order to get it to work with GNU Fortran in the various sub-folders of the GC source code. These updates are already included in our v11-01 development code, and thus should let v11-01 (when it is released) to compile with Intel Fortran 2017 as well.

--Bob Yantosca (talk) 20:47, 20 January 2017 (UTC)

Resetting stacksize for Linux

Overall machine memory limits are set with the Unix limit command. If you use csh or tcsh, you can set the following commands in your ~/.cshrc file:

   # Max out machine limits
   limit cputime      unlimited     # NOTE: "Unlimited" is not truly unlimited.  It 
   limit filesize     unlimited     # will set the given limit to the maximum value
   limit datasize     unlimited     # determined by your hardware configuration
   limit stacksize    unlimited     
   limit coredumpsize unlimited
   limit memoryuse    unlimited
   limit vmemoryuse   unlimited
   limit descriptors  unlimited
   limit memorylocked unlimited
   limit maxproc      unlimited

Or if you use bash, you can add these commands to your ~/.bashrc file:

   # Max out machine limits
   ulimit -t unlimited              # cputime
   ulimit -f unlimited              # filesize
   ulimit -d unlimited              # datasize
   ulimit -s unlimited              # stacksize
   ulimit -c unlimited              # coredumpsize
   ulimit -m unlimited              # memoryuse
   ulimit -v unlimited              # vmemoryuse
   ulimit -n unlimited              # descriptors
   ulimit -l unlimited              # memorylocked
   ulimit -u unlimited              # maxproc

NOTE: depending on your particular OS build (Linux, CentOS, Fedora, Ubuntu), not all of these limits will be used.
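
Because some of these limits may be unsupported or locked on a given system, one option in bash is to attempt each one and report the ones that cannot be raised. The sketch below assumes bash's ulimit builtin; the helper name raise_limits is hypothetical.

```shell
# Try to raise each limit to "unlimited" and report the ones that this
# system does not support or does not permit you to change.
# raise_limits is a HYPOTHETICAL helper name.
raise_limits() {
  for flag in t f d s c m v n l u; do
    if ! ulimit -"$flag" unlimited 2>/dev/null; then
      echo "ulimit -$flag: not raised (unsupported or not permitted)"
    fi
  done
  return 0
}

raise_limits
```

Any limit that is reported here can simply be left out of your ~/.bashrc file.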

It is important to set the stacksize memory to the maximum value, because this will determine the amount of memory available for temporary variables, which are:

  • variables that are not declared with the SAVE attribute
  • variables that are not located at the top of a module
  • variables that are not located in a common block.

However, one quirk is that the stack size for child threads (i.e. threads spawned within !$OMP PARALLEL DO loops) is not set by the stacksize limit, but instead by the OMP_STACKSIZE environment variable. If OMP_STACKSIZE is not set to a high enough value, then your GEOS-Chem simulation may run out of stack memory and die with a segmentation fault error.

The fix for this situation is to make sure that you set OMP_STACKSIZE to a high value. It's OK if the value you give to OMP_STACKSIZE is larger than the largest amount of memory on your system. As long as it's set to a high positive number it will work.

If you use csh or tcsh, you can add this command to your ~/.cshrc file.

# Reset the child stack size to a large positive number
# (It's OK if this is larger than the max value, it's just a kludge)
setenv OMP_STACKSIZE 500m

Or if you use bash, add this command to your ~/.bashrc file:

# Reset the child stack size to a large positive number
# (It's OK if this is larger than the max value, it's just a kludge)
export OMP_STACKSIZE=500m

Resetting the OMP_STACKSIZE environment variable in this manner usually will correct the following errors:

NOTE: We now recommend that you use the OMP_STACKSIZE variable instead of the KMP_STACKSIZE variable. This will make it easier to switch between compilers on your system. OMP_STACKSIZE works with all compilers, but KMP_STACKSIZE only works with Intel Fortran.

--Bob Y. 14:05, 5 November 2014 (EST)

Out of memory asking for NNNNN

Debra Weisenstein wrote:

I'm trying to compile GEOS-CHEM (the stratospheric beta version that Seb Eastham at MIT is working on) on hpc at Harvard, and though it compiled for me early before, I've now been getting an "out of memory" error every time it tries to compile GeosCore/diag49_mod.F. I changed the compile option for mcmodel to large. Here is the compile line and error.
   ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 
   -mcmodel=large -i-dynamic -fp-model source -openmp -Dmultitask 
   -I../Headers -module ../mod -I/home/dkweis/include -c diag49_mod.F 
   Fatal compilation error: Out of memory asking for 36864.
I've gone back and tried compiling older versions that I've compiled and run before and get a similar error:
   ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 
   -fp-model source -openmp -Dmultitask -DAPM 
   -I../Headers -module ../mod -c dao_mod.F 
   Fatal compilation error: Out of memory asking for 20480.
Any idea what is going on? The "top" command shows 301780k of free memory (out of 8GB) and 906164k of free swap.

Bob Yantosca wrote:

I googled the error message (“Out of memory asking for ….”) and I found this post online. The person who replied to this post gave 3 suggestions:
  1. Use -O2.
  2. Split up the source into smaller files if possible (may be difficult if it's a module.)
  3. Raise your datasize limit
I'd start with the last. If you can't resolve it, please submit a test case to Intel Premier Support.
So it may be that your datasize limit is not maxed out in your computational environment. You can check the setting of the system limits by typing:
   limit
at the Unix prompt. You’ll get a screen like this:
   [54 bmy]% limit
   datasize     unlimited
   stacksize    unlimited
   ... etc ...
Depending on your platform (i.e. combination of hardware + operating system) you may not have all of these limits available. The two most important limits for GEOS-Chem are datasize and stacksize. These should be set to unlimited. For good measure, you can also set the other available limits for your platform to unlimited as well. Follow the instructions in this wiki post. One caveat: if you try to set a limit that does not exist on your platform to unlimited, you will get a warning message from the Unix shell. You can then delete the offending limit setting from your .cshrc or .bashrc file.
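
The limit command shown above is for csh and tcsh. For bash users, a comparable check of the two most important limits might look like the sketch below. It assumes bash's ulimit builtin; the helper name check_limits is hypothetical.

```shell
# Report the two limits that matter most for GEOS-Chem (bash syntax).
# check_limits is a HYPOTHETICAL helper name.
check_limits() {
  for pair in "d:datasize" "s:stacksize"; do
    flag="${pair%%:*}"    # ulimit option letter
    name="${pair##*:}"    # name used by the csh "limit" command
    value="$(ulimit -"$flag")"
    if [ "$value" = "unlimited" ]; then
      echo "$name: unlimited"
    else
      echo "$name: $value (consider raising this to unlimited)"
    fi
  done
}

check_limits
```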

--Bob Y. 14:58, 25 July 2013 (EDT)

Bug fix for GEOS-Chem compiled with Intel Fortran Compiler 12

Prasad Kasibhatla wrote:

I am experiencing the IFORT 12 compile failure of GEOS-Chem v9-01-03 with nested grid options turned on in define.h. The compile seems to be failing in strat_chem_mod.F90 (see end of this msg). And to add - the compile succeeds for v9-01-03l with IFORT 12. I noticed that strat_chem_mod.F90 appeared post-v9-01-03l.
   ifort -cpp -w -O2 -auto -noalign -convert big_endian -vec-report0 -mcmodel=medium -i-dynamic 
   -fp-model source -openmp -Dmultitask -I../Headers -module ../mod -I/opt/geos-netcdf-4/include 
   -c -free strat_chem_mod.F90
   : catastrophic error: **Internal compiler error: segmentation violation signal raised** 
   Please report this error along with the circumstances in which it occurred in a Software Problem Report.  
   Note: File and line given may not be explicit cause of this error.
   compilation aborted for strat_chem_mod.F90 (code 1)
   make[3]: *** [strat_chem_mod.o] Error 1
   make[3]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
   make[2]: *** [lib] Error 2
   make[2]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
   make[1]: *** [all] Error 2
   make[1]: Leaving directory `/nfs/fire/psk9/Code.v9-01-03/GeosCore'
   make: *** [all] Error 2 

Bob Yantosca wrote:

I have this version of IFORT installed locally:
   [67 bmy Code.v9-02]% ifort -V
   Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.3.293 Build 20120212
   Copyright (C) 1985-2012 Intel Corporation.  All rights reserved.
   FOR NON-COMMERCIAL USE ONLY
And I got the exact same error as you did. I looked on the internet and it seemed that the issue is related to how IFORT 12 deals w/ OpenMP DO loops (click HERE and HERE).
A little brute-force debugging revealed the offending parallel DO loop in strat_chem_mod.F90, in routine CALC_STE:
   ! Determine mean tropopause level for the period
   !$OMP PARALLEL DO                               &
   !$OMP DEFAULT( SHARED )                         &
   !$OMP PRIVATE( I,  J  )
   DO J = 1,JJPAR
   DO I = 1,IIPAR
      LTP(I,J) = NINT( TPauseL(I,J) / TPauseL_Cnt )
   ENDDO
   ENDDO
   !$OMP END PARALLEL DO
I also noticed that this DO loop is only executed for global simulations. Immediately preceding it there is an #if statement that exits out of the routine if you are doing a nested grid run:
   #if defined( NESTED_NA ) || defined( NESTED_CH ) || defined( NESTED_EU )
       ! This method only works for a global domain.
       ! It could be modified for nested domains if the total mass flux across the
       ! boundaries during the period is taken into account.
       RETURN
   #endif
So given that this code doesn’t get used for nested runs, I added an #else block so that the code isn’t compiled in when you compile for nested grids:
   #if defined( NESTED_NA ) || defined( NESTED_CH ) || defined( NESTED_EU )
       ! This method only works for a global domain.
       ! It could be modified for nested domains if the total mass flux across the
       ! boundaries during the period is taken into account.
       RETURN
   !------------------------------------------------------------------------------
   ! Prior to 10/5/12:
   ! Since the rest of this code isn't needed for the nested grid, wrap it
   ! in an #else statement.  This might help the code to compile for IFORT 12.
   !#endif
   !------------------------------------------------------------------------------
   #else

       ! Determine mean tropopause level for the period
       !$OMP PARALLEL DO                               &
       !$OMP DEFAULT( SHARED )                         &
       !$OMP PRIVATE( I,  J  )
       DO J = 1,JJPAR
       DO I = 1,IIPAR
          LTP(I,J) = NINT( TPauseL(I,J) / TPauseL_Cnt )
       ENDDO
       ENDDO
       !$OMP END PARALLEL DO

       ... etc ...

   #endif

     END SUBROUTINE Calc_STE
I’ve compiled this with the NA nested grid, the CH nested grid, and for 2 x 2.5 and 4 x 5 global GEOS-5 simulations, and the code does not get that compilation error. I’ve pushed this to the GEOS-Chem repository as a last-minute bug fix. I’m not sure why the error happens in the first place w/ the IFORT 12 compiler, but in any case, this fixes it.

--Bob Y. 13:52, 5 October 2012 (EDT)

Error compiling with IFORT 12 and Mac OS X

David Lary wrote:

I am using Intel(R) 64, Version 12.1.3.289 Build 20120130 on Mac OS X 10.7.4. When I type make, I get the following error:
   ld: library not found for -lcrt1.10.6.o
   make[2]: *** [exe] Error 1
   make[1]: *** [all] Error 2
   make: *** [all] Error 2

Bob Yantosca replied:

I think this is an issue particular to IFORT running on Mac OS X. It is probably not finding the proper library. I found a couple of posts (post #1 and post #2) on the internet that describe this error.
Post #2 recommends adding the following path to your LD_LIBRARY_PATH environment variable:
   /Developer/SDKs/MacOSX10.4u.sdk/usr/lib
   (i.e. $(SDKROOT)/usr/lib)

--Bob Y. 10:43, 26 June 2012 (EDT)
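
The LD_LIBRARY_PATH change recommended in post #2 might be sketched as follows, assuming a Bourne-compatible shell. The helper name append_lib_dir is hypothetical (it is not part of GEOS-Chem or IFORT), and the SDK directory is simply the one quoted above; the right directory on your system may differ.

```shell
# Hypothetical helper: print a colon-separated search path with a
# directory appended, unless that directory is already listed.
#   $1 = current value of the path variable (may be empty)
#   $2 = directory to append
append_lib_dir() {
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;      # already listed: leave unchanged
    "::")     printf '%s\n' "$2" ;;      # variable was empty
    *)        printf '%s\n' "$1:$2" ;;   # append at the end
  esac
}

# SDK library directory quoted in post #2 (adjust for your system)
SDK_LIB=/Developer/SDKs/MacOSX10.4u.sdk/usr/lib
LD_LIBRARY_PATH=$(append_lib_dir "${LD_LIBRARY_PATH:-}" "$SDK_LIB")
export LD_LIBRARY_PATH
```

Checking for an existing entry first keeps the variable from growing each time a startup file is sourced.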

Compilation error with IFORT 12

Geert Vinken wrote:

I was trying to compile the GEOS-Chem v9-01-03k version, but I was getting a strange error when trying to do this with ifort 12.1.2:
   biofuel_mod.F(363): error #5082: Syntax error, found IDENTIFIER 'X1' when expecting one of: ( % [ : . = =>
                     GENERIC8_1x1 = GENERIC_1x1 
   ----------------------------^
   biofuel_mod.F(363): error #6404: This name does not have a type, and must have an explicit type.   [GENERIC8_1]
                     GENERIC8_1x1 = GENERIC_1x1 
   ------------------^
   biofuel_mod.F(363): error #6460: This is not a field name that is defined in the encompassing structure.   [X1]
                     GENERIC8_1x1 = GENERIC_1x1 
   ----------------------------^
   biofuel_mod.F(363): error #6366: The shapes of the array expressions do not conform.   [X1]
                     GENERIC8_1x1 = GENERIC_1x1 
   ----------------------------^ 
   compilation aborted for biofuel_mod.F (code 1)
Compiling with ifort 11.1 shows no problems with this module. Any of you ever heard of this problem or know a way around it? If I comment out this line the model compiles, so it's just this line (and apparently the combination of 8_1x1 =) that's causing the problem.
Also, renaming the GENERIC8_1x1 array to GENERIC8 seems to solve the problem.

Bob Yantosca replied:

I think the default behavior of IFORT 12.1 has changed to adopt one of the more modern Fortran standards (F2003 or F2008). I found this entry in the Intel Fortran Compiler 12.1 manual which describes how one specifies floating-point constants.
As shown in the entry above, the IFORT 12 compiler now interprets an underscore immediately preceded and followed by numbers as the kind suffix of a numeric constant written in exponential notation. For example, the traditional way of representing REAL*4 and REAL*8 constants:
   REAL*4, PARAMETER :: PI = 3.14159e0
   REAL*8, PARAMETER :: PI = 3.141592653589793d0
can now also be represented as:
   REAL*4, PARAMETER :: PI = 3.14159e0_4
   REAL*8, PARAMETER :: PI = 3.141592653589793e0_8
Therefore, the string:
   GENERIC8_1x1
is probably now parsed by the compiler as an exponential. This more than likely separates the string into substrings GENERIC8_1 and x1, which is what generates the error.
I have looked for a compiler switch in the IFORT 12.1 manual that would disable this feature, but have not been successful. The short-term solution is to rename the variable so that it does not have an underscore surrounded by two numbers (e.g. from GENERIC8_1x1 to GENERIC_8_1x1).

--Bob Y. 18:13, 23 May 2012 (EDT)

Relocation truncated to fit error

If your code uses many large arrays, or if you are compiling an ultra-fine resolution version of GEOS-Chem (e.g. a 0.25° x 0.3125° GEOS-FP nested grid), then you may see this type of error:

Relocation truncated to fit: R_X86_64_32S against `.bss' Error

The wording you get may differ slightly from the example shown above.

Long story short: IFORT is telling you that your program is trying to use more than 2GB of statically-allocated data (i.e. data space that is not declared with an ALLOCATABLE statement) at compile time. The default setting in IFORT is to expect to use less than 2GB of memory, so you are hitting the upper limit.

The solution is simple: recompile your code with the following compiler flags:

-mcmodel=medium -shared-intel

The -mcmodel=medium flag will tell IFORT that you expect to use more than 2GB of statically-allocated memory in your program. However, this also requires that you link using dynamic libraries instead of static libraries. Using the -shared-intel flag will turn on the dynamic library linking.

IMPORTANT NOTE! If your code links to any libraries such as HDF or netCDF, then you MUST rebuild each library, making sure that the Fortran and C compilers use the -mcmodel=medium option. Please see our Installing libraries for GEOS-Chem page for examples.

GEOS-Chem v9-01-03 and higher will automatically set these flags for you.
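
For older code versions or your own build scripts, a minimal sketch of adding the two flags by hand is shown here. The helper add_flag_once is hypothetical, and the starting FFLAGS value is only an illustration; your makefile may store its compiler options under a different name.

```shell
# Hypothetical helper: print a flag list with one flag appended,
# unless the flag is already present as a whole word.
#   $1 = existing flags (space-separated)
#   $2 = flag to add
add_flag_once() {
  case " $1 " in
    *" $2 "*) printf '%s\n' "$1" ;;      # already present: leave unchanged
    "  ")     printf '%s\n' "$2" ;;      # flag list was empty
    *)        printf '%s\n' "$1 $2" ;;   # append at the end
  esac
}

# Illustrative starting value (an assumption, not the official setting)
FFLAGS="-cpp -w -O2 -auto -noalign -convert big_endian"
FFLAGS=$(add_flag_once "$FFLAGS" "-mcmodel=medium")
FFLAGS=$(add_flag_once "$FFLAGS" "-shared-intel")
printf '%s\n' "$FFLAGS"
```

Remember that any HDF or netCDF libraries you link against still have to be rebuilt with -mcmodel=medium, as noted above.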

For more information, please see the following links:

  1. Typist vs. Programmer blog
  2. MITGCM support blog
  3. Software.intel.com blog

--Bob Y. (talk) 18:24, 25 September 2015 (UTC)

Problems with IFORT 11.0.xxx

You should use GEOS-Chem with IFORT 11.1.058 or higher versions. Please see the discussion below about problems in the earlier versions of IFORT 11.0.xxx:

Tzung-May Fu wrote:

I tested the Intel Fortran v11.0.074 compiler, but found that it is incompatible with the GC code. This is related to the partition.f bug that I reported earlier. (Actually, I'm not sure there is a bug in partition.f any more, unless you have also run into it with IFORT v10).
I ran a 1-day simulation, using Bob's v8-01-03 standard run release, with no change at all. Using Intel Fortran v10.1.015, I was able to replicate Bob's standard run. However, when I switched to Intel Fortran v11.0.074, I ran into the error in partition.f, due to the CONCNOX-SUM1 < 0d0 check. Here's the error message in log:
   ===============================
   GEOS-CHEM ERROR: STOP 30000
   STOP at partition.f
   ===============================
I then tried Bob's fix to partition.f. This time the run finishes, warning the user about the CONCNOX-SUM1 < 0d0 issue. But the output result is completely wacky!!! Below you can compare the surface Ox concentrations, using
The (B) spatial pattern is completely off. NOx is also affected and shows the similar weird pattern.
I'm pretty sure the problem is in the chemistry part. I've tried turning off the optimization but the problem persists. Perhaps there is some problem with the way IFORTv11 treats floating points? Also, I am not sure if IFORTv11 caused the weird model result, or if IFORTv11 caused some issues in chemistry, and the partition.f 'fix' subsequently led to the weird result.
Long story short, it seems like IFORTv11 is not a good choice for now, and that the 'fix' to partition.f should not be implemented.

Philippe Le Sager wrote:

Thanks for testing Ifort11. We did run into the partition bug with Ifort10 after fixing tpcore. So I doubt that the weird result is related to that partition fix, and it is probably just a problem with IFORT 11.

Bob Yantosca wrote:

You might have to go thru the IFORT 11 manuals to see if any default behavior has changed (i.e. optimization, compiler options, etc). It may not just be the concnox thing but something else in the numerics that is particular to IFORT 11.
There is usually a "What's new" document w/ every Intel compiler release. Maybe that has some more information, you could look at it.

Bob Yantosca wrote:

I've also heard from some folks @ NASA that IFORT 11.0 was problematic. They claim that IFORT 11.1 is much better. You may want to look into this in the meantime.

--Bob Y. 16:50, 7 October 2009 (EDT)

Eric Sofen wrote:

Both Becky Alexander and I have run into problems with IFORT 11.1. When either of us run offline aerosol simulations compiled on IFORT 11.1, the simulation compiles and runs without errors, but the sulfur budgets are way off. The problems seem to be occurring in the deposition code, as Becky's simulations end up with very little deposition, but at the same time, the S burdens are too low. In my case, the deposition ends up being an order of magnitude too high. Changing back to IFORT 10 fixed both of these problems.

--Eric Sofen 13:32, 22 October 2009

Yuxuan Wang wrote:

From our interaction with the Intel people, ifort 11.1.056 should work for GEOS-Chem. The GC version we tested at Tsinghua is v8-02-01 (nested-grid China with GEOS-5 meteorology). The platform we tested is Nehalem from Intel, with the following compilation options:
 -cpp -w -static -fno-alias -O2 -safe_cray_ptr -no-prec-sqrt -no-prec-div -auto -noalign -convert big_endian
Not sure whether these options will work for Mac OSX. From the testing, we found that codes compiled with ifort 11.1.056 ran 2% faster than ifort 10.1.008.

--Bob Y. 14:59, 4 November 2009 (EST)
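
To check an installed compiler against the 11.1.058 minimum recommended at the top of this section, something like the following sketch may help. Both helper functions are hypothetical names, the banner text is copied from the ifort -V output quoted earlier on this page, and sort -V is a GNU coreutils extension that may be missing on other systems.

```shell
# Hypothetical helper: pull the dotted version number out of an
# "ifort -V" banner read from standard input.
ifort_version_from_banner() {
  sed -n 's/.*Version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if version $1 >= version $2.
# Uses "sort -V" (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Banner line as printed by the IFORT 12 installation quoted above
banner='Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 12.1.3.293 Build 20120212'
found=$(printf '%s\n' "$banner" | ifort_version_from_banner)

if version_ge "$found" "11.1.058"; then
  echo "ifort $found: meets the 11.1.058 minimum"
else
  echo "ifort $found: older than 11.1.058"
fi
```

On a real system you would feed the function the compiler's own banner (for example ifort -V 2>&1, capturing both output streams) instead of the hard-coded string.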

Problem with IFORT 11 and GEOS-Chem adjoint

Nicolas Bousserez wrote:

We have been struggling for some time with the following problem when running GC adjoint (v8-02-01):
  "OMP abort: Initializing libguide.so, but found libguide.so already initialized".
After some investigations it seems like it is a linker error generated when different parts of the program try to link both static and dynamic versions of the OpenMP runtime. There is an option in ifort 11 to have openmp linked statically, which theoretically should fix this problem.
But using ifort 11 for GC seems to cause other problems and this compilation option doesn't exist with ifort 10. The fact is that Daven Henze, who is using ifort 10 and a linux platform similar to ours never got the above problem. Has anyone got this error before? My platform configuration is the following:
   Linux node9 2.6.9-89.0.23.ELsmp #1 SMP Wed Mar 17 06:49:21 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
If anyone running GC adjoint has a similar configuration please let me know what is your Makefile (using netcdf libraries) configuration and which version of ifort you're using so that I can do some testing.

--Bob Y. 09:43, 8 April 2011 (EDT)

Incompatibility between IFORT 11 and OS version

If you are using the Intel Fortran Compiler version 11, you may encounter some incompatibilities with your operating system, which might require an OS upgrade.

Nicolas Bousserez wrote:

We have been struggling for some time with the following problem when running GC adjoint (v8-02-01). We get this error:
   "OMP abort: Initializing libguide.so, but found libguide.so already initialized".
After some investigations it seems like it is a linker error generated when different parts of the program try to link both static and dynamic versions of the OpenMP runtime. There is an option in ifort 11 to have openmp linked statically, which theoretically should fix this problem. But using ifort 11 for GC seems to cause other problems and this compilation option doesn't exist with ifort 10. The fact is that Daven Henze, who is using ifort 10 and a linux platform similar to ours never got the above problem. Has anyone got this error before? My platform configuration is the following:
   Linux node9 2.6.9-89.0.23.ELsmp #1 SMP Wed Mar 17 06:49:21 EDT 2010
   x86_64 x86_64 x86_64 GNU/Linux

Nicolas Bousserez wrote:

For what it's worth, this is the oldest OS we're using:
   Linux terra-01.vpn.as.harvard.edu 2.6.18-194.3.1.el5_lustre.1.8.4 #1 SMP 
   Fri Jul 9 21:55:24 MDT 2010 x86_64 x86_64 x86_64 GNU/Linux
and this is the newest:
   Linux kvm-12.s.as.harvard.edu 2.6.18-194.32.1.el5 #1 SMP 
   Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
What is relevant is the 2.6.9 and the 2.6.18, not the compilation dates. It means you're running something equivalent to a RHEL4/CentOS-4 kernel instead of RHEL5/CentOS-5, which has implications for your libraries, compatibility, bugs, security, etc. I would guess that you've been updating RHEL4 (first released 2005) or equivalent for several years, and ifort11 was released during the era of RHEL5 (first released 2007), so it wouldn't be too surprising if there were a library incompatibility. I don't know whether that is the cause of your symptom, but it might be. (RHEL6 was released late last year and v12 of the Intel compilers has also been released. CentOS-6 will be out soon.)
You can continue to use the older compiler with the older OS, but I'd recommend upgrading the OS, which is worth doing anyway.

--Bob Y. 10:25, 13 April 2011 (EDT)
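
The kernel comparison described above can be sketched in shell. The helper names are hypothetical, the release strings are the ones quoted in this discussion, and the RHEL4/RHEL5 reading of 2.6.9 versus 2.6.18 is taken from the reply above. sort -V is a GNU coreutils extension.

```shell
# Hypothetical helper: reduce a "uname -r" style release string to its
# base kernel version by dropping everything after the first hyphen.
kernel_base_version() {
  printf '%s\n' "${1%%-*}"
}

# Hypothetical helper: succeed if version $1 >= version $2
# (uses "sort -V" to order the two strings).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Release strings quoted in the discussion above
for release in 2.6.9-89.0.23.ELsmp 2.6.18-194.32.1.el5; do
  base=$(kernel_base_version "$release")
  if version_ge "$base" "2.6.18"; then
    echo "$release: kernel $base, RHEL5/CentOS-5 era or newer"
  else
    echo "$release: kernel $base, older than the RHEL5/CentOS-5 kernel"
  fi
done
```

To test the machine you are logged into, pass "$(uname -r)" to kernel_base_version in place of the quoted strings.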

Speedup With Hyperthreading on Nehalem chips

Hyperthreading is when a job uses more threads than there are actual CPU cores. I've noticed that using 16 threads ($OMP_NUM_THREADS = 16) on an 8-core system (2 x quad core Intel Nehalem X5570's) leads to a 15% speedup over using 8 threads. These tests were with GEOS-Chem v8-02-03, full chemistry, 2x2.5, ifort 10.1.021, and

 FFLAGS = -cpp -w -O3 -auto -noalign -convert big_endian -g -traceback -CB -vec-report0

This does not have a positive impact when using earlier generations of Intel chips (Harpertown or Clovertown).

--Daven Henze 1:42, 16 December 2009 (MDT)
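
A minimal sketch of the thread setting used in this test, assuming a Bourne-compatible shell. The helper hyperthread_count and the variable PHYSICAL_CORES are hypothetical names; the factor of two threads per core matches the 16-thread, 8-core case described above and is not a general recommendation.

```shell
# Hypothetical helper: thread count for a given number of physical
# cores, using two threads per core.
#   $1 = physical core count
hyperthread_count() {
  echo $(( $1 * 2 ))
}

PHYSICAL_CORES=8    # e.g. 2 x quad-core Nehalem X5570
OMP_NUM_THREADS=$(hyperthread_count "$PHYSICAL_CORES")
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

As noted above, oversubscribing in this way did not help on the earlier Harpertown or Clovertown chips, so it is worth timing a short run both ways.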

Performance bottleneck caused by inefficient subroutine calls

Special care has to be taken when passing pointer arrays or sub-fields of derived type objects to subroutines. If this is done incorrectly, it can cause a huge performance slowdown. Please see the discussion on our Passing array arguments efficiently in GEOS-Chem wiki page for full details.

--Bob Y. 10:49, 10 June 2013 (EDT)

Bugs in the IFORT compiler cause HEMCO to segfault

The GEOS-Chem Support Team has recently determined that bugs in the Intel Fortran Compiler versions 14 and 15 have caused the Harvard-NASA Emissions Component (aka HEMCO) to halt with segmentation faults. For more information, please see these wiki posts:

  1. IFORT 15 error when using array-out-of-bounds error checking
  2. IFORT 13/IFORT 14 segmentation fault error

--Bob Y. (talk) 20:38, 25 August 2015 (UTC)