Difference between revisions of "Installing libraries for GEOS-Chem"

On this page we provide instructions on how to install netCDF-4 and related libraries for GEOS-Chem.

#REDIRECT [[Libraries and file formats used by GEOS-Chem]]

== A brief introduction to netCDF ==

GEOS-Chem reads and writes data using the netCDF file format. NetCDF is a self-describing file format that can store data fields as well as the relevant "metadata", or information about the contents of the file. Types of metadata include descriptive names, units, horizontal and vertical coordinates, file creation date/time, file history, etc.

The [http://www.unidata.ucar.edu/software/netcdf/docs/faq.html netCDF frequently asked questions (FAQ)] guide gives this short overview of netCDF:

<blockquote>NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.

NetCDF data is:

*Self-Describing. A netCDF file includes information about the data it contains.
*Portable. A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.
*Scalable. A small subset of a large dataset may be accessed efficiently.
*Appendable. Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.
*Sharable. One writer and multiple readers may simultaneously access the same netCDF file.
*Archivable. Access to all earlier forms of netCDF data will be supported by current and future versions of the software.
</blockquote>

Two major versions of netCDF are in common use today:

#netCDF-3, aka "netCDF classic".
#netCDF-4

The major difference between the two versions is that netCDF-4 relies on the HDF5 library "under the hood" whereas netCDF-3 does not. For this reason, netCDF-4 can be used to store more data per file than netCDF-3.
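One hedged way to see this difference in practice is to look at the first bytes of a file: netCDF-3 (classic family) files begin with the characters <tt>CDF</tt> followed by a version byte, whereas netCDF-4 files begin with the HDF5 signature. The function name <tt>netcdf_kind</tt> below is our own (it is not part of netCDF), and the sketch assumes that the HDF5 signature sits at the very start of the file, which is the usual case. If the netCDF utilities are installed, <tt>ncdump -k FILE</tt> reports the file kind more reliably.

```shell
# Classify a file as netCDF-3 or netCDF-4 from its first four bytes.
# NOTE: netcdf_kind is an illustrative helper, not a netCDF command.
netcdf_kind() {
    # Dump the first four bytes as hex and strip all whitespace
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case "$magic" in
        # "CDF" followed by version byte 1, 2, or 5
        43444601|43444602|43444605) echo "netCDF-3 (classic family)" ;;
        # 0x89 followed by "HDF": start of the HDF5 signature
        89484446) echo "netCDF-4 (HDF5)" ;;
        *) echo "unknown" ;;
    esac
}

# Usage: netcdf_kind /path/to/file.nc
```

This only inspects the file header; it does not verify that the rest of the file is readable.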
 
A netCDF installation contains library files (ending in <tt>.a</tt>), which hold compiled utility routines meant to be called from programs written in C or Fortran. In netCDF-4.1 and prior versions, the C-language library file (<tt>libnetcdf.a</tt>) and the Fortran-language library file (<tt>libnetcdff.a</tt>) were installed into the same folder by default. Starting with netCDF-4.2, the netCDF Fortran libraries must be built from a separate distribution package. As a result, you might find that the <tt>libnetcdff.a</tt> (Fortran) and <tt>libnetcdf.a</tt> (C) library files are stored in separate folders on your system. Ask your IT staff for more information about how netCDF is installed on your system.  See [[#netCDF 4.2 and later versions require a separate netCDF-Fortran installation|this section below]] for more information.
 
== Check to see if netCDF is already installed on your system ==

If you are going to use GEOS-Chem on a shared computer system, chances are that your IT staff will have already installed one or more netCDF library versions that you can use.  Depending on your system's setup, there are several ways that you can tell your computational environment where to find the netCDF library files, as described below.

=== Using modules ===

Many high-performance computing (HPC) clusters use the '''Lmod module software'''.  Lmod allows you to load different compilers and libraries with simple <tt>module load</tt> commands.  For example, on the Harvard Odyssey cluster, the compiler and netCDF libraries are initialized with commands such as these:

 module purge
 module load gcc/8.2.0-fasrc01
 module load openmpi/3.1.1-fasrc01
 module load netcdf/4.1.3-fasrc02

The first line removes all pre-loaded modules.  The second line loads the GNU C and Fortran compilers (version 8.2.0).  The third and fourth lines load openmpi 3.1.1 (which netCDF depends on) and netCDF 4.1.3 itself.  You can add these <tt>module load</tt> statements to your system startup files (e.g. <tt>.bashrc</tt>, <tt>.bash_aliases</tt>).

As a convenience, Lmod will also export the relevant folder paths to your Unix environment.  For example, issuing the above module statements on the Harvard Odyssey cluster will export the following environment variables:

 $GCC_HOME        # Home folder for gcc 8.2.0
 $GCC_INCLUDE     # Folder where include files of gcc 8.2.0 are stored
 $GCC_LIB         # Folder where library files of gcc 8.2.0 are stored
 $MPI_HOME        # Home folder for openmpi 3.1.1
 $MPI_INCLUDE     # Folder where include files (e.g. mpi.h) of openmpi 3.1.1 are stored
 $MPI_LIB         # Folder where library files (e.g. libmpi*.a) of openmpi 3.1.1 are stored
 $NETCDF_HOME     # Home folder for netcdf-4.1.3
 $NETCDF_INCLUDE  # Folder where include files (e.g. netcdf.h, netcdf.inc) are stored
 $NETCDF_LIB      # Folder where library files (e.g. libnetcdf.a, libnetcdff.a) for netCDF 4.1.3 are stored

You can then use these environment variables to tell GEOS-Chem where it can find the netCDF libraries on your system.  See [[Setting_Unix_environment_variables_for_GEOS-Chem|our ''Setting Unix environment variables for GEOS-Chem'' wiki page]] for more information.  NOTE: The names of these environment variables may be different on your system (ask your IT staff for more information).
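Because the variable names differ between systems, it can help to confirm which ones are actually defined after the <tt>module load</tt> commands have run. The following is a minimal sketch; the function name <tt>missing_env_vars</tt> is our own, and the variable names passed to it are simply the Odyssey examples listed above.

```shell
# Print the name of every listed environment variable that is unset or empty.
# NOTE: missing_env_vars is an illustrative helper, not part of Lmod.
missing_env_vars() {
    for name in "$@"; do
        # Indirect variable lookup that works in any POSIX shell
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Example: check the netCDF variables named in the listing above
missing_env_vars NETCDF_HOME NETCDF_INCLUDE NETCDF_LIB
```

If the command prints nothing, all of the listed variables are defined in the current shell.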
 
Lmod makes it very easy to switch between different compilers and libraries.  To load the netCDF libraries that were built with the Intel Fortran Compiler, use a different set of <tt>module load</tt> statements, such as:

 module purge
 module load intel/17.0.4-fasrc01
 module load openmpi/2.1.0-fasrc02
 module load netcdf/4.3.2-fasrc05
 module load netcdf-fortran/4.4.0-fasrc03

''NOTE: For an explanation of why netcdf-fortran is loaded as a separate module, [[#netCDF_4.2_and_later_versions_require_a_separate_netCDF-Fortran_installation|please see this section below]].''

One downside of using Lmod is that you are limited to the compiler and software versions that your IT staff have already installed on your system.  For example, an update to the computer model that you are using might also require updating to a new compiler version that is not yet available on your system.  In this case, you will need to request that your IT staff install the new compiler version for you (and wait for them to do it).  But in general, Lmod succeeds in ensuring that only well-tested compiler/software combinations are available to users.

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:58, 9 January 2019 (UTC)
 
=== Manual library installation ===

If your computer system does not use [[#Using modules|Lmod]], then the netCDF libraries may have already been installed by your IT staff in one of the usual Unix folder locations (such as <tt>/usr/lib</tt> or <tt>/usr/local/lib</tt>).  If this is the case, ask your IT staff where these libraries reside.

Once you know the location of the compiler and netCDF libraries, you can [[Setting_Unix_environment_variables_for_GEOS-Chem|set the proper environment variables for GEOS-Chem]].

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:08, 9 January 2019 (UTC)
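If the netCDF utilities are on your search path, the <tt>nc-config</tt> and <tt>nf-config</tt> scripts that ship with netCDF-C and netCDF-Fortran can report where a manually installed library lives. The sketch below assumes that both scripts accept the <tt>--prefix</tt> option (true of the releases we are aware of); the helper name <tt>report_prefix</tt> is our own.

```shell
# Print the installation prefix reported by a netCDF config script,
# or a short message if the script is not on the search path.
# NOTE: report_prefix is an illustrative helper.
report_prefix() {
    # $1: name of a config script, e.g. nc-config or nf-config
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $("$1" --prefix)"
    else
        echo "$1: not found"
    fi
}

report_prefix nc-config   # netCDF C-language library
report_prefix nf-config   # netCDF Fortran-language library
```

If the two scripts report different prefixes, the C and Fortran libraries are installed in separate folders.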
 
=== Library installation on the cloud ===

If you are using GEOS-Chem on the Amazon Web Services cloud computing platform, then the netCDF libraries will already be installed for you, either as part of the Amazon Machine Image (AMI) or software container (e.g. Docker or Singularity) that you used to initialize your computational environment.  The proper [[Setting Unix environment variables for GEOS-Chem|Unix environment variables]] will also be defined.

For more information, please see our comprehensive cloud computing tutorial: [http://cloud-geos-chem.org '''cloud.geos-chem.org''']

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:43, 9 January 2019 (UTC)
 
=== netCDF 4.2 and later versions require a separate netCDF-Fortran installation ===

In our section on [[#Using modules|the Lmod module system]] above, we used the following example commands to load libraries that are compatible with GNU Fortran 8.2.0:

 module purge
 module load gcc/8.2.0-fasrc01
 module load openmpi/3.1.1-fasrc01
 module load netcdf/4.1.3-fasrc02

But later on in that same section, we listed a different set of <tt>module load</tt> commands to load libraries that are compatible with Intel Fortran Compiler 17.0.4:

 module purge
 module load intel/17.0.4-fasrc01
 module load openmpi/2.1.0-fasrc02
 module load netcdf/4.3.2-fasrc05
 module load netcdf-fortran/4.4.0-fasrc03

You might have noticed that we loaded netcdf-fortran as a separate module for Intel Fortran but not for GNU Fortran.  What is the reason for this?
 
As it turns out, in all netCDF versions up to 4.1.3, the library files for the C-language interface (<tt>libnetcdf.a</tt>) and the Fortran-language interface (<tt>libnetcdff.a</tt>) were always stored in the same folder.  But in netCDF 4.2.0 (released in 2012) and later versions, the Fortran-language interface to netCDF was moved to a completely separate distribution, with its own version numbering system.  Therefore, if you are using netCDF 4.2.0 or later, you have to install netCDF-Fortran as a completely separate library.

If your computer system uses the [[#Using modules|Lmod module software]], then loading the netcdf-fortran module will also export the following environment variables to your Unix environment:

 NETCDF_FORTRAN_HOME     # Root folder of the netCDF Fortran-language interface
 NETCDF_FORTRAN_INCLUDE  # Folder where netCDF-Fortran include files (e.g. netcdf.inc) are stored
 NETCDF_FORTRAN_LIB      # Folder where netCDF-Fortran library files (e.g. libnetcdff.a) are stored

These are analogous to <tt>NETCDF_HOME</tt>, <tt>NETCDF_INCLUDE</tt>, and <tt>NETCDF_LIB</tt> [[#Using modules|as mentioned above]].

Long story short:

# If you are using netCDF-4.2.0 or later (which are the most recent versions of netCDF), look for a separate netCDF-Fortran installation.
# If you are using netCDF-4.1.3 or earlier, then there is no separate netCDF-Fortran installation.

GEOS-Chem is designed to work with any version of netCDF, regardless of whether the netCDF-Fortran installation is separate or not.

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:39, 9 January 2019 (UTC)
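The version rule summarized in this section can be written as a small shell test. This is only a sketch: the function name <tt>needs_separate_fortran</tt> is our own, and it assumes the version string has the usual <tt>MAJOR.MINOR</tt> or <tt>MAJOR.MINOR.PATCH</tt> form.

```shell
# Succeed (exit status 0) if the given netCDF C-library version is 4.2 or
# later, i.e. if netCDF-Fortran is distributed separately for that version.
# NOTE: needs_separate_fortran is an illustrative helper.
needs_separate_fortran() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 2 ]; }
}

# Example: the two netCDF versions from the module listings above
for v in 4.1.3 4.3.2; do
    if needs_separate_fortran "$v"; then
        echo "netCDF $v: look for a separate netCDF-Fortran installation"
    else
        echo "netCDF $v: the Fortran library is installed with the C library"
    fi
done
```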
 
== If you do not have netCDF on your system, use Spack to install it ==

The GEOS-Chem-Libraries installer contains library versions that might not be compatible with the most recent Linux/Ubuntu/Fedora operating systems and gcc/gfortran and icc/ifort compilers.  In particular, several users have noticed that the build fails for gcc/gfortran compiler versions 4.8 and higher.

If you do not already have a pre-built netCDF library on your system, we recommend using [https://spack.readthedocs.io/en/latest/ the Spack package manager] to install the required libraries.  Spack should be able to install netCDF and all required libraries for a variety of compiler/platform combinations (including Linux and MacOS).
 
=== Download Spack ===

Clone the Spack repository, which is hosted on GitHub, to your disk space:

 git clone https://github.com/spack/spack.git
 
=== Set up the environment for Spack ===

After cloning Spack, you need to run a script that initializes the environment for Spack.

If you use bash or zsh, then type:

 # For bash/zsh users
 $ export SPACK_ROOT=/path/to/spack
 $ . $SPACK_ROOT/share/spack/setup-env.sh

If you use csh or tcsh, then type:

 # For tcsh or csh users (note you must set SPACK_ROOT)
 $ setenv SPACK_ROOT /path/to/spack
 $ source $SPACK_ROOT/share/spack/setup-env.csh

In both examples above, <tt>/path/to/spack</tt> is the top-level Spack directory on your disk.

After running the setup-env script, the path to the Spack executable will be added to your Unix PATH variable.  You can then run Spack commands by typing <tt>spack COMMAND-NAME</tt>.
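If you place the bash/zsh commands in a startup file, you may want to guard them so that a missing or relocated Spack folder does not produce errors at every login. The following is a minimal sketch; the function name <tt>load_spack</tt> and the <tt>$HOME/spack</tt> location are our own assumptions, so substitute the folder where you cloned Spack.

```shell
# Source Spack's setup script only if it exists under the given folder.
# NOTE: load_spack is an illustrative helper, not a Spack command.
load_spack() {
    # $1: top-level Spack directory (the /path/to/spack placeholder above)
    SPACK_ROOT=$1
    if [ -f "$SPACK_ROOT/share/spack/setup-env.sh" ]; then
        export SPACK_ROOT
        . "$SPACK_ROOT/share/spack/setup-env.sh"
    else
        echo "Spack setup script not found under $SPACK_ROOT" >&2
        return 1
    fi
}

# Hypothetical location; replace with the path to your own clone
load_spack "$HOME/spack" || true
```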
 
=== Make sure that Spack can find your compiler ===

Before you execute Spack, check to see if Spack recognizes your compiler.  Type:

 spack compilers

You should get output similar to this:

 ==> Available compilers
 -- gcc ubuntu18.04-x86_64 ---------------------------------------
 gcc@7.3.0

The compilers should be listed in your <tt>~/.spack/compilers.yaml</tt> file (sometimes this is located at <tt>~/.spack/linux/compilers.yaml</tt>, depending on the operating system).  To view the compiler settings that Spack has found, type:

 spack config get compilers

which should give you output similar to this:

 compilers:
 - compiler:
     environment: {}
     extra_rpaths: []
     flags: {}
     modules: []
     operating_system: ubuntu18.04
     paths:
       cc: /usr/bin/gcc-7
       cxx: /usr/bin/g++-7
       f77: /usr/bin/gfortran-7
       fc: /usr/bin/gfortran-7
     spec: gcc@7.3.0
     target: x86_64

If the <tt>cc</tt>, <tt>cxx</tt>, <tt>f77</tt>, and <tt>fc</tt> entries are blank, then you can manually specify the paths to these compilers by typing:

 spack config edit compilers

Once you are sure that Spack "sees" your compiler, you may proceed to installing libraries.

For more information, please see the [https://spack.readthedocs.io/en/latest/getting_started.html#compiler-configuration ''Compiler configuration'' section of the Spack manual].
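The check described in this section can also be scripted. The sketch below scans the output of <tt>spack compilers</tt> for a given compiler spec and, if the spec is absent, runs <tt>spack compiler find</tt> so that Spack searches your PATH again. The helper name <tt>compiler_listed</tt> is our own, and <tt>gcc@7.3.0</tt> is just the example spec from the output above.

```shell
# Succeed if the compiler spec in $1 appears in the `spack compilers`
# output supplied on standard input.
# NOTE: compiler_listed is an illustrative helper, not a Spack command.
compiler_listed() {
    # Put each whitespace-separated word on its own line, then look for
    # an exact, whole-line match of the requested spec
    tr -s ' \t' '\n\n' | grep -Fqx -- "$1"
}

# Only talk to Spack if it is available in this shell
if command -v spack >/dev/null 2>&1; then
    if spack compilers | compiler_listed "gcc@7.3.0"; then
        echo "Spack already knows about gcc@7.3.0"
    else
        spack compiler find
    fi
fi
```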
 
=== Install libraries with Spack: basic usage ===

Once you have made sure that Spack has found the compiler, you can proceed to installing the libraries.  Here are the commands that you need:

{| border=1 cellspacing=0 cellpadding=5
|-bgcolor="#CCCCCC" valign="top"
!width="50px"|#
!width="400px"|Configuration
!width="525px"|Spack installation commands

|-valign="top"
|1
|Install netCDF for use with both GEOS-Chem "Classic" and [[GEOS-Chem HP|GCHP]] (also includes the MPI library)<br>'''''THIS IS OUR RECOMMENDED CONFIGURATION'''''
|<tt>spack install netcdf-fortran</tt>

|-valign="top"
|2
|Install netCDF for use with GEOS-Chem "Classic" only<br>(i.e. will NOT install the MPI library)
|<tt>spack install netcdf-fortran <span style="color:green">^netcdf</span><span style="color:orange">~mpi</span> <span style="color:green">^hdf5</span><span style="color:orange">~mpi</span></tt>

|}

These commands tell Spack to download and install the '''netCDF Fortran-language library''' along with all of its dependent libraries (such as the '''netCDF C-language library''', the '''HDF-5 library''', an '''MPI library''', etc.).  By default, Spack picks the most recent library versions that are available, but you can modify this behavior with the commands described below.  The installation can take about 30-60 minutes, depending on the options that you specify.

If you think you will be using GEOS-Chem in both its "Classic" mode and its high-performance mode (aka GCHP), we recommend that you install netCDF with MPI (Configuration #1).  If you are only going to use GEOS-Chem in "Classic" mode, you can omit installing MPI (Configuration #2).

''NOTE: For more information on why the netCDF C-language and Fortran-language libraries are installed separately, [[#netCDF_4.2_and_later_versions_require_a_separate_netCDF-Fortran_installation|please see this section]].''
 
=== Customize the installation process ===

Using Spack to install [[#Install libraries with Spack: basic usage|our recommended library configuration listed above]] should be sufficient for most GEOS-Chem applications.  But Spack also lets you customize certain aspects of the installation process.  The table below gives some common examples:
 
{| border=1 cellspacing=0 cellpadding=5
|-bgcolor="#CCCCCC" valign="top"
!width="525px"|Action
!width="550px"|Spack commands

|-valign="top"
|Tell Spack to build libraries without depending on other libraries<br>(e.g. build <span style="color:green">netCDF</span> and its <span style="color:green">HDF5</span> dependency <span style="color:orange">without MPI</span>)
|<tt>spack install netcdf-fortran <span style="color:green">^netcdf</span><span style="color:orange">~mpi</span> <span style="color:green">^hdf5</span><span style="color:orange">~mpi</span></tt>

|-valign="top"
|Tell Spack to install <span style="color:red">specific library versions</span> instead of the most recent versions:
|<tt>spack install netcdf-fortran<span style="color:red">@4.4.0</span> ^netcdf<span style="color:red">@4.6</span></tt>

|-valign="top"
|Tell Spack to install libraries using a <span style="color:blue">specific compiler version</span>:
*You might need to explicitly specify the compiler to use if Spack found multiple compilers on your system (e.g. Intel, GNU)
|<tt>spack install netcdf-fortran <span style="color:blue">%gcc@8.2.0</span></tt>

|}
 
For more information about customization, please see the Spack beginner's tutorial:

* [https://spack.readthedocs.io/en/latest/tutorial_basics.html Spack Tutorial 101]

'''IMPORTANT NOTE:''' You can use Spack to install multiple versions of each library (e.g. netCDF for Intel Fortran, netCDF for GNU Fortran, or different netCDF versions for a given compiler).  Spack keeps each library installation separate from the others, so you can simply pick the one that you would like to use with GEOS-Chem (see next section).

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 16:01, 11 June 2019 (UTC)
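The customizations in this section can be combined in a single spec. As a hedged illustration, the helper below assembles a spec string from an optional netCDF-Fortran version, an optional compiler, and an MPI switch. The function name <tt>gc_spack_spec</tt> and its argument order are our own invention; only the resulting spec syntax (<tt>@</tt>, <tt>%</tt>, <tt>^</tt>, <tt>~</tt>) comes from Spack.

```shell
# Assemble a netcdf-fortran spec from optional pieces.
#   $1: netcdf-fortran version (may be empty)
#   $2: compiler spec such as gcc@8.2.0 (may be empty)
#   $3: "mpi" to keep MPI support; anything else disables it
# NOTE: gc_spack_spec is an illustrative helper, not a Spack command.
gc_spack_spec() {
    spec="netcdf-fortran"
    [ -n "$1" ] && spec="$spec@$1"
    [ -n "$2" ] && spec="$spec %$2"
    [ "$3" = "mpi" ] || spec="$spec ^netcdf~mpi ^hdf5~mpi"
    echo "$spec"
}

# Show the install command for a "Classic"-only build with gcc 8.2.0
echo "spack install $(gc_spack_spec 4.4.0 gcc@8.2.0 nompi)"
```

Before starting a long build, you can pass the same spec to <tt>spack spec</tt> to preview the versions and dependencies that Spack would choose.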
 
=== Define GEOS-Chem environment variables that point to the Spack library paths ===

We recommend that you use the <tt>spack location</tt> command as shown below to automatically insert the paths to the netCDF and netCDF-Fortran libraries into one of your Unix environment startup scripts (such as <tt>.bashrc</tt> or <tt>.bash_aliases</tt>):

 # Environment variables for the netCDF C-language interface
 export NETCDF_HOME=<span style="color:red">$(spack location -i netcdf)</span>
 export GC_BIN=$NETCDF_HOME/bin
 export GC_INCLUDE=$NETCDF_HOME/include
 export GC_LIB=$NETCDF_HOME/lib
 
 # Environment variables for the netCDF Fortran-language interface
 export NETCDF_FORTRAN_HOME=<span style="color:red">$(spack location -i netcdf-fortran)</span>
 export GC_F_BIN=$NETCDF_FORTRAN_HOME/bin
 export GC_F_INCLUDE=$NETCDF_FORTRAN_HOME/include
 export GC_F_LIB=$NETCDF_FORTRAN_HOME/lib

If you have more than one version of the same library installed, then you can use specifiers in the <tt>spack location</tt> command to explicitly request a given version:

 # Get the path for <span style="color:red">netCDF 4.7.0</span> compiled with <span style="color:blue">GNU Fortran 7.3.0</span>
 export NETCDF_HOME=$(spack location -i <span style="color:red">netcdf@4.7.0</span><span style="color:blue">%gcc@7.3.0</span>)
 ... etc ...
 
 # Get the path for <span style="color:darkorange">netCDF-Fortran 4.4.0</span> compiled with <span style="color:blue">GNU Fortran 7.3.0</span>
 export NETCDF_FORTRAN_HOME=$(spack location -i <span style="color:darkorange">netcdf-fortran@4.4.0</span><span style="color:blue">%gcc@7.3.0</span>)
 ... etc ...
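If <tt>spack location</tt> fails (for example, because the library is not installed or the spec matches more than one installation), the <tt>export</tt> statements above would silently define paths such as <tt>/lib</tt>. One possible safeguard, sketched below, is to check the returned folder before exporting anything. The function name <tt>set_gc_netcdf_paths</tt> is our own; the variable names are the ones used in this section.

```shell
# Export NETCDF_HOME, GC_BIN, GC_INCLUDE and GC_LIB from an installation
# prefix, but only if that prefix is an existing directory.
# NOTE: set_gc_netcdf_paths is an illustrative helper.
set_gc_netcdf_paths() {
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "netCDF prefix '$1' is not a directory" >&2
        return 1
    fi
    export NETCDF_HOME="$1"
    export GC_BIN="$NETCDF_HOME/bin"
    export GC_INCLUDE="$NETCDF_HOME/include"
    export GC_LIB="$NETCDF_HOME/lib"
}

# Only query Spack if it is available in this shell
if command -v spack >/dev/null 2>&1; then
    set_gc_netcdf_paths "$(spack location -i netcdf 2>/dev/null)" || true
fi
```

The same pattern can be repeated for <tt>NETCDF_FORTRAN_HOME</tt> and the <tt>GC_F_*</tt> variables.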
 
If you need to manually find the paths where Spack has installed the netCDF and netCDF-Fortran libraries, you can use this command:

 spack find --paths netcdf netcdf-fortran

which will display output similar to this:

 ==> 2 installed packages
 -- linux-ubuntu18.04-x86_64 / gcc@7.3.0 -------------------------
     netcdf@4.7.0          /home/ubuntu/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.3.0/netcdf-4.7.0-cryif4blmrhr2i43pif2scmlwv3yu3nq
     netcdf-fortran@4.4.4  /home/ubuntu/spack/opt/spack/linux-ubuntu18.04-x86_64/gcc-7.3.0/netcdf-fortran-4.4.4-5xmcthsjv7qqmlyby3szrbs3pcphh3tp
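If you would rather extract a single path from this listing in a script, a short <tt>awk</tt> filter is one option. The sketch below assumes the two-column layout shown above (package spec, then path); the function name <tt>spack_path_for</tt> is our own.

```shell
# Print the installation path of the first package named $1 found in the
# `spack find --paths` output supplied on standard input.
# NOTE: spack_path_for is an illustrative helper, not a Spack command.
spack_path_for() {
    awk -v pkg="$1" '{
        # Column 1 looks like name@version; compare only the name part
        split($1, parts, "@")
        if (parts[1] == pkg) { print $2; exit }
    }'
}

# Only query Spack if it is available in this shell
if command -v spack >/dev/null 2>&1; then
    spack find --paths netcdf netcdf-fortran | spack_path_for netcdf-fortran
fi
```

Because the package name is compared exactly, asking for <tt>netcdf</tt> does not return the <tt>netcdf-fortran</tt> path.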
 
Please see [[Setting Unix environment variables for GEOS-Chem|our ''Setting Unix environment variables for GEOS-Chem'' wiki page]] for more information about other environment variables that you may need to define.

=== For more information ===

For complete instructions on using Spack, please see the Spack manual:
* https://spack.readthedocs.io/en/latest

If you need to manage a lot of separate software libraries, consider creating Spack environments.  You can easily switch back and forth between different environments.  Please see this tutorial for more information:
* https://spack.readthedocs.io/en/latest/tutorial_environments.html

You can use Spack to create module files for use with the Lmod module manager, which is used on many HPC cluster systems.  For more information, please see this tutorial:
* https://spack.readthedocs.io/en/latest/tutorial_modules.html

Here is a useful tutorial about using Spack to install libraries for High-Performance Computing applications:
* https://www.youtube.com/watch?v=BxNOxHu6FAI

For more information about using Spack with Docker and Singularity software containers, please see this tutorial:
* https://spack.readthedocs.io/en/latest/workflows.html?highlight=singularity#using-spack-to-create-docker-images
 
=== Reporting errors or technical issues encountered while using Spack ===

While the [[GEOS-Chem Support Team]] recommends Spack as a useful third-party tool, it is not responsible for solving Spack-related issues.

We encourage you to report all errors and technical issues directly to the Spack Support Team by opening a ticket on the Spack issue tracker:

* https://github.com/spack/spack/issues

Please describe fully the commands that you used, as well as the type of system (OS, compilers, etc.) that you are using.  A member of the Spack Support Team should be able to answer your question more fully.

Spack is a cutting-edge tool for installing HPC packages.  That being said, there may still be compatibility issues with certain library/OS/compiler combinations.  The Spack Support Team is continually trying to improve Spack and to resolve any such issues.

--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 16:15, 11 June 2019 (UTC)

Revision as of 14:55, 13 June 2019