Difference between revisions of "Installing libraries for GEOS-Chem"

(Using Spack to install netCDF libraries)
(Redirected page to Guide to netCDF in GEOS-Chem)
 
(53 intermediate revisions by 3 users not shown)
Line 1: Line 1:
On this page we provide instructions on how to install netCDF-4 and related libraries for GEOS-Chem.
+
#REDIRECT [[Guide to netCDF in GEOS-Chem]]
 
+
== A brief introduction to netCDF ==
+
 
+
GEOS-Chem reads and writes data using the netCDF file format. NetCDF is a self-describing file format that can store data fields as well as the relevant "metadata", or information about the contents of the file. Types of metadata include descriptive names, units, horizontal and vertical coordinates, file creation date/time, file history, etc.
+
 
+
The [http://www.unidata.ucar.edu/software/netcdf/docs/faq.html netCDF frequently asked questions (FAQ)] guide gives this short overview of netCDF:
+
 
+
<blockquote>NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.
+
 
+
NetCDF data is:
+
 
+
*Self-Describing. A netCDF file includes information about the data it contains.
+
*Portable. A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.
+
*Scalable. A small subset of a large dataset may be accessed efficiently.
+
*Appendable. Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.
+
*Sharable. One writer and multiple readers may simultaneously access the same netCDF file.
+
*Archivable. Access to all earlier forms of netCDF data will be supported by current and future versions of the software.
+
</blockquote>
+
 
+
Two major versions of netCDF are in common use today:
+
 
+
#netCDF-3, aka "netCDF classic".
+
#netCDF-4
+
 
+
The major difference between the two versions is that netCDF-4 relies on the HDF5 library "under the hood" whereas netCDF-3 does not. For this reason, netCDF-4 can be used to store more data per file than netCDF-3.
+
 
+
A netCDF installation contains library files (ending in <tt>.a</tt>), which hold compiled utility routines meant to be called from programs written in C or Fortran. In netCDF-4.1 and prior versions, the C-language library file (<tt>libnetcdf.a</tt>) and the Fortran-language library file (<tt>libnetcdff.a</tt>) were always installed into the same folder by default. But starting with netCDF-4.2, the netCDF Fortran libraries must be built from a separate distribution package. Because of this, you might find that the <tt>libnetcdff.a</tt> (Fortran) and <tt>libnetcdf.a</tt> (C) library files are stored in separate folders on your system. Ask your IT staff for more information about how netCDF is installed on your system.  See [[#netCDF 4.2 and later versions require a separate netCDF-Fortran installation|this section below]] for more information.
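Because the C and Fortran library files may live in different folders, a quick search can show where each one is. The following is only a minimal sketch: the helper name <tt>find_lib_dir</tt> and the list of folders searched are assumptions for illustration, and your system may store the libraries elsewhere (or as shared <tt>.so</tt> files instead of <tt>.a</tt> files).

```shell
# Hypothetical helper (not part of netCDF or GEOS-Chem): print the first
# folder, among those listed after the library name, that contains the file.
find_lib_dir() {
    lib_name="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Search a few common locations (assumed; adjust for your system).
for lib in libnetcdf.a libnetcdff.a; do
    if where=$(find_lib_dir "$lib" /usr/lib /usr/lib64 /usr/local/lib); then
        echo "$lib found in $where"
    else
        echo "$lib not found in the folders searched"
    fi
done
```

If the two files are reported in different folders, you are most likely looking at separate netCDF C and netCDF-Fortran installations.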
+
 
+
== Check to see if netCDF is already installed on your system ==
+
 
+
If you are going to use GEOS-Chem on a shared computer system, chances are that your IT staff will have already installed one or more netCDF library versions that you can use.  Depending on your system's setup, there are several ways that you can tell your computational environment where to find the netCDF library files, as described below.
+
 
+
=== Using modules ===
+
 
+
Many high-performance computing (HPC) clusters use the '''Lmod module software'''.  Lmod allows you to load different compilers and libraries with  simple <tt>module load</tt> commands.  For example, on the Harvard Odyssey cluster, compiler and netCDF libraries are initialized with commands such as these:
+
 
+
module purge
+
module load gcc/8.2.0-fasrc01
+
module load openmpi/3.1.1-fasrc01
+
module load netcdf/4.1.3-fasrc02
+
 
+
The first line removes all pre-loaded modules.  The second line loads the GNU C and Fortran compilers (version 8.2.0).  The third and fourth lines load openmpi 3.1.1 (which netCDF depends on) and netCDF 4.1.3 itself.  You can add these <tt>module load</tt> statements to your system startup files (e.g. <tt>.bashrc</tt> or <tt>.bash_aliases</tt>).
+
 
+
As a convenience, Lmod will also export the relevant folder paths to your Unix environment.  For example, issuing the above module statements on the Harvard Odyssey cluster will export the following environment variables:
+
 
+
$GCC_HOME        # Home folder for gcc 8.2.0
+
$GCC_INCLUDE    # Folder where include files of gcc 8.2.0 are stored
+
$GCC_LIB        # Folder where library files of gcc 8.2.0 are stored
+
$MPI_HOME        # Home folder for openmpi 3.1.1
+
$MPI_INCLUDE    # Folder where include files (e.g. mpi.h) of openmpi 3.1.1 are stored
+
$MPI_LIB        # Folder where library files (e.g. libmpi*.a) of openmpi 3.1.1 are stored
+
$NETCDF_HOME    # Home folder for netcdf-4.1.3
+
$NETCDF_INCLUDE  # Folder where include files (e.g. netcdf.h, netcdf.inc) are stored
+
$NETCDF_LIB      # Folder where library files (e.g. libnetcdf.a, libnetcdff.a) for netCDF 4.1.3 are stored
+
 
+
You can then use these environment variables to tell GEOS-Chem where it can find the netCDF libraries on your system.  See [[Setting_Unix_environment_variables_for_GEOS-Chem|our ''Setting Unix environment variables for GEOS-Chem'' wiki page]] for more information.  NOTE: The names of these environment variables may be different on your system (ask your IT staff for more information).
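Before pointing GEOS-Chem at these variables, it can help to confirm that each one is set and refers to a folder that exists. The sketch below assumes the variable names from the Harvard Odyssey example above. The helper <tt>check_env_dir</tt> is a hypothetical name of our own and is not provided by Lmod.

```shell
# Hypothetical helper: succeed only if the named environment variable is
# set and points to an existing folder; print a one-line report either way.
check_env_dir() {
    var_name="$1"
    # Look up the value of the variable whose name is stored in var_name.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
        echo "$var_name is not set"
        return 1
    fi
    if [ ! -d "$var_value" ]; then
        echo "$var_name points to a missing folder: $var_value"
        return 1
    fi
    echo "$var_name = $var_value"
}

# Variable names follow the Odyssey example; yours may differ.
for name in NETCDF_HOME NETCDF_INCLUDE NETCDF_LIB; do
    check_env_dir "$name" || true
done
```

Running this right after your <tt>module load</tt> commands shows at a glance whether the module exported the paths that you expect.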
+
 
+
Lmod makes it very easy to switch between different compilers and libraries.  To load the netCDF libraries that were built with the Intel Fortran Compiler, all one has to do is to use a different set of <tt>module load</tt> statements, such as:
+
 
+
module purge
+
module load intel/17.0.4-fasrc01
+
module load openmpi/2.1.0-fasrc02
+
module load netcdf/4.3.2-fasrc05
+
module load netcdf-fortran/4.4.0-fasrc03
+
 
+
''NOTE: For an explanation of why netcdf-fortran is loaded as a separate module, [[#netCDF_4.2_and_later_versions_require_a_separate_netCDF-Fortran_installation|please see this section below]].''
+
 
+
One downside of using Lmod is that you are locked into using only those compiler and software versions that have already been installed on your system by your IT staff.  For example, an update to the computer model that you are using might also require a new compiler version that is not yet available on your system. In this case, you will need to request that your IT staff install the new compiler version for you (and wait for them to do it).  But in general, Lmod succeeds in ensuring that only well-tested compiler/software combinations are available to users.
+
 
+
--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:58, 9 January 2019 (UTC)
+
 
+
=== Manual library installation ===
+
 
+
If your computer system does not use [[#Using modules|Lmod]], then the netCDF libraries may have already been installed by your IT staff in one of the usual Unix folder locations (such as <tt>/usr/lib</tt> or <tt>/usr/local/lib</tt>).  If this is the case, ask your IT staff where these libraries reside.
+
 
+
Once you know the location of the compiler and netCDF libraries, you can [[Setting_Unix_environment_variables_for_GEOS-Chem|set the proper environment variables for GEOS-Chem]].
+
 
+
--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:08, 9 January 2019 (UTC)
+
 
+
=== Library installation on the cloud ===
+
 
+
If you are using GEOS-Chem on the Amazon Web Services cloud computing platform, then the netCDF libraries will already be installed for you, either as part of the Amazon Machine Image (AMI) or software container (e.g. Docker or Singularity) that you used to initialize your computational environment.  The proper [[Setting Unix environment variables for GEOS-Chem|Unix environment variables]] will also be defined.
+
 
+
For more information, please see our comprehensive cloud computing tutorial: [http://cloud.geos-chem.org '''cloud.geos-chem.org''']
+
 
+
--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:43, 9 January 2019 (UTC)
+
 
+
=== netCDF 4.2 and later versions require a separate netCDF-Fortran installation ===
+
 
+
In our section on [[#Using modules|the Lmod module system]] above, we used the following example commands to load libraries that are compatible with GNU Fortran 8.2.0:
+
 
+
module purge
+
module load gcc/8.2.0-fasrc01
+
module load openmpi/3.1.1-fasrc01
+
module load netcdf/4.1.3-fasrc02
+
 
+
But later on in that same section, we listed a different set of <tt>module load</tt> commands to load libraries that are compatible with Intel Fortran Compiler 17.0.4:
+
 
+
module purge
+
module load intel/17.0.4-fasrc01
+
module load openmpi/2.1.0-fasrc02
+
module load netcdf/4.3.2-fasrc05
+
module load netcdf-fortran/4.4.0-fasrc03
+
 
+
You might have noticed that we loaded netcdf-fortran as a separate module for Intel Fortran but not for GNU Fortran.  What is the reason for this?
+
 
+
As it turns out, in all netCDF versions up to 4.1.3, the library files for the C-language interface (<tt>libnetcdf.a</tt>) and the Fortran-language interface (<tt>libnetcdff.a</tt>) were always stored in the same folder.  But in netCDF 4.2.0 (circa 2012) and later versions, the Fortran-language interface to netCDF was moved to a completely separate distribution, with its own version numbering system.  Therefore, if you are using netCDF 4.2.0 or later, you have to install netCDF-Fortran as a completely separate library.
+
 
+
If your computer system uses the [[#Using modules|Lmod module software]], then loading the netcdf-fortran module will also export the following environment variables to your Unix environment:
+
 
+
NETCDF_FORTRAN_HOME    # Root folder of the netCDF Fortran-language interface
+
NETCDF_FORTRAN_INCLUDE  # Folder where netCDF-Fortran include files (e.g. netcdf.inc) are stored
+
NETCDF_FORTRAN_LIB      # Folder where netCDF-Fortran library files (e.g. libnetcdff.a) are stored.
+
 
+
These are analogous to <tt>NETCDF_HOME</tt>, <tt>NETCDF_INCLUDE</tt>, and <tt>NETCDF_LIB</tt> [[#Using modules|as mentioned above]].
+
 
+
Long story short:
+
 
+
# If you are using netCDF-4.2.0 or later (which covers the most recent versions of netCDF), look for a separate netCDF-Fortran installation
+
# If you are using netCDF-4.1.3 or earlier, then there is no separate netCDF-Fortran installation
+
 
+
GEOS-Chem is designed to work with any version of netCDF, regardless of whether the netCDF-Fortran installation is separate or not.
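The version rule above can be written as a small shell test. This is a sketch under stated assumptions: the function name <tt>needs_separate_fortran</tt> is hypothetical, and it expects a version string of the form <tt>MAJOR.MINOR</tt> or <tt>MAJOR.MINOR.PATCH</tt> for the netCDF C library (such as the number reported by <tt>nc-config --version</tt>, if that utility is available on your system).

```shell
# Hypothetical helper: succeed if the given netCDF C library version is
# 4.2 or later, i.e. if netCDF-Fortran is a separate installation.
needs_separate_fortran() {
    major=$(printf '%s\n' "$1" | cut -d. -f1)
    minor=$(printf '%s\n' "$1" | cut -d. -f2)
    if [ "$major" -gt 4 ]; then
        return 0
    fi
    [ "$major" -eq 4 ] && [ "$minor" -ge 2 ]
}

# The two versions used in the module examples above.
for version in 4.1.3 4.3.2; do
    if needs_separate_fortran "$version"; then
        echo "netCDF $version: look for a separate netCDF-Fortran installation"
    else
        echo "netCDF $version: no separate netCDF-Fortran installation"
    fi
done
```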
+
 
+
--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 19:39, 9 January 2019 (UTC)
+
 
+
== If you do not have netCDF on your system, use Spack to install it ==
+
 
+
The GEOS-Chem-Libraries installer contains library versions that might not be compatible with the most recent Linux/Ubuntu/Fedora operating systems and gcc/gfortran and icc/ifort compilers.  In particular, several users have noticed that the build fails for gcc/gfortran compiler versions 4.8 and higher.
+
 
+
If you do not already have a pre-built netCDF library on your system, we recommend using [https://spack.readthedocs.io/en/latest/ the Spack package manager] to install the required libraries.  Spack should be able to install netCDF and all required libraries for a variety of compiler/platform combinations (including Linux and MacOS).
+
 
+
=== Downloading Spack ===
+
 
+
Clone the Spack repository, which is hosted on GitHub, to your disk space:
+
 
+
git clone https://github.com/spack/spack.git
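Optionally, you can source the shell setup script that ships inside the Spack repository (<tt>share/spack/setup-env.sh</tt>), so that the <tt>spack</tt> command can be run from any folder. The sketch below wraps this in a hypothetical helper named <tt>load_spack</tt>. It assumes that the repository was cloned into a folder named <tt>spack</tt> in the current directory.

```shell
# Hypothetical helper: source Spack's shell setup script from a given clone.
load_spack() {
    spack_root="$1"
    setup_script="$spack_root/share/spack/setup-env.sh"
    if [ -f "$setup_script" ]; then
        . "$setup_script"
    else
        echo "No setup-env.sh found under $spack_root" >&2
        return 1
    fi
}

# Assumes the clone lives in ./spack (adjust the path if yours differs).
load_spack "$PWD/spack" || echo "Spack not loaded; check the clone location"
```

You could place the same lines in <tt>.bashrc</tt>, using the full path to your clone, so that Spack is available in every new shell.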
+
 
+
=== Using Spack to install netCDF libraries ===
+
 
+
Before you execute Spack, make sure that you specify the compiler that you wish to use.  From within the bin folder of the Spack repository that you just downloaded, you can type:
+
 
+
  ./spack compiler find
+
 
+
to make sure that Spack has found the compiler that you want to use.  For more information, please see the [https://spack.readthedocs.io/en/latest/getting_started.html#compiler-configuration ''Compiler configuration'' section of the Spack manual].
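To confirm that a particular compiler is registered, you can scan the listing printed by <tt>spack compilers</tt>. The sketch below is an illustration only: the helper <tt>compiler_listed</tt> is a hypothetical name, it assumes that the listing shows compilers as whitespace-separated entries such as <tt>gcc@8.2.0</tt>, and it assumes that <tt>spack</tt> is on your search path (otherwise use <tt>./spack</tt> from the bin folder).

```shell
# Hypothetical helper: read a compiler listing on standard input and succeed
# if the given compiler spec (e.g. gcc@8.2.0) appears as a whole entry.
compiler_listed() {
    # Put each whitespace-separated entry on its own line, then look for
    # an exact, whole-line match.
    tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}

# Only query Spack if the command is actually available.
if command -v spack >/dev/null 2>&1; then
    if spack compilers | compiler_listed "gcc@8.2.0"; then
        echo "gcc@8.2.0 is known to Spack"
    else
        echo "gcc@8.2.0 not found; check your compiler setup"
    fi
fi
```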
+
 
+
'''Installing netCDF: our recommended configurations'''
+
 
+
Once you have made sure that Spack has found the compiler, you can proceed to installing the libraries.  Here are the commands that you need:
+
 
+
{| border=1 cellspacing=0 cellpadding=5
+
|-bgcolor="#CCCCCC" valign="top"
+
!width="50px"|#
+
!width="400px"|Configuration
+
!width="525px"|Spack installation commands
+
 
+
|-valign="top"
+
|1
+
|Install netCDF for use with both GEOS-Chem "Classic" and [[GCHP]] (also includes the MPI library)<br>'''''THIS IS OUR RECOMMENDED CONFIGURATION'''''
+
|<tt>cd spack/bin<br>./spack install netcdf-fortran</tt>
+
 
+
|-valign="top"
+
|2
+
|Install netCDF for use with GEOS-Chem "Classic" only<br>(i.e. will NOT install the MPI library)
+
|<tt>cd spack/bin<br>./spack install netcdf-fortran <span style="color:green">^netcdf</span><span style="color:orange">~mpi</span> <span style="color:green">^hdf5</span><span style="color:orange">~mpi</span></tt>
+
 
+
|}
+
 
+
These commands tell Spack to download and install the '''netCDF Fortran-language library''' along with all of its dependent libraries (such as the '''netCDF C-language library''', the '''HDF-5 library''', an '''MPI library''', etc.).  By default, Spack picks the most recent library versions that are available, but you can modify this behavior with the commands described below.  The installation can take about 30-60 minutes, depending on the options that you specify.
+
 
+
If you think you will be using GEOS-Chem in both its "Classic" mode and its high-performance mode (aka GCHP), we recommend that you install netCDF with MPI (Configuration #1).  If you are only going to use GEOS-Chem in "Classic" mode, you can omit installing MPI (Configuration #2).
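If you script your installation, the choice between the two configurations in the table above can be reduced to a single switch. The following sketch uses a hypothetical helper named <tt>netcdf_spack_spec</tt>. The spec strings themselves are copied from the table, and the command is printed rather than executed so that you can review it first.

```shell
# Hypothetical helper: print the Spack spec for Configuration #1 ("mpi")
# or Configuration #2 ("nompi") from the table above.
netcdf_spack_spec() {
    case "$1" in
        mpi)   echo "netcdf-fortran" ;;
        nompi) echo "netcdf-fortran ^netcdf~mpi ^hdf5~mpi" ;;
        *)     echo "usage: netcdf_spack_spec mpi|nompi" >&2; return 1 ;;
    esac
}

# Print the install command for GEOS-Chem "Classic" only (no MPI).
echo "./spack install $(netcdf_spack_spec nompi)"
```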
+
 
+
''NOTE: For more information on why the netCDF C-language and Fortran-language libraries are installed separately, [[#netCDF_4.2_and_later_versions_require_a_separate_netCDF-Fortran_installation|please see this section]].''
+
 
+
'''Commands for customizing the installation process'''
+
 
+
Using Spack with the default options should be sufficient for most GEOS-Chem applications.  But Spack also lets you customize certain aspects of the installation process.  The table below gives some common examples:
+
 
+
{| border=1 cellspacing=0 cellpadding=5
+
|-bgcolor="#CCCCCC" valign="top"
+
!width="525px"|Action
+
!width="550px"|Spack commands
+
 
+
|-valign="top"
+
|Tell Spack to build libraries without depending on other libraries<br>(e.g. build <span style="color:green">netCDF</span> and its <span style="color:green">HDF5</span> dependency <span style="color:orange">without MPI</span>)
+
|<tt>cd spack/bin<br>./spack install netcdf-fortran <span style="color:green">^netcdf</span><span style="color:orange">~mpi</span> <span style="color:green">^hdf5</span><span style="color:orange">~mpi</span></tt>
+
 
+
|-valign="top"
+
|Tell Spack to install <span style="color:red">specific library versions</span> instead of the most recent versions:
+
|<tt>cd spack/bin<br>./spack install netcdf-fortran<span style="color:red">@4.4.0</span> netcdf<span style="color:red">@4.6</span></tt>
+
 
+
|-valign="top"
+
|Tell Spack to install libraries using a <span style="color:blue">specific compiler version</span>:
+
|<tt>cd spack/bin<br>./spack install netcdf-fortran <span style="color:blue">%gcc@8.2.0</span></tt>
+
 
+
|}
+
 
+
For more information about customization, please see the Spack beginner's tutorial:
+
 
+
* [https://spack.readthedocs.io/en/latest/tutorial_basics.html Spack Tutorial 101]
+
 
+
=== Pointing GEOS-Chem environment variables to the Spack library paths ===
+
 
+
To find the root paths where Spack has installed these libraries, you can use these commands:
+
 
+
spack find --paths netcdf
+
spack find --paths netcdf-fortran
+
 
+
But you can also use the <tt>spack location</tt> command as shown below to automatically insert the paths to the netCDF and netCDF-Fortran libraries into one of your Unix environment startup scripts (such as <tt>.bashrc</tt> or  <tt>.bash_aliases</tt>):
+
 
+
# Environment variables for the netCDF C-language interface
+
export NETCDF_HOME=<span style="color:red">$(spack location -i netcdf)</span>
+
export GC_BIN=$NETCDF_HOME/bin
+
export GC_INCLUDE=$NETCDF_HOME/include
+
export GC_LIB=$NETCDF_HOME/lib
+
+
# Environment variables for the netCDF Fortran-language interface
+
export NETCDF_FORTRAN_HOME=<span style="color:red">$(spack location -i netcdf-fortran)</span>
+
export GC_F_BIN=$NETCDF_FORTRAN_HOME/bin
+
export GC_F_INCLUDE=$NETCDF_FORTRAN_HOME/include
+
export GC_F_LIB=$NETCDF_FORTRAN_HOME/lib
+
 
+
Please see [[Setting Unix environment variables for GEOS-Chem|our ''Setting Unix environment variables for GEOS-Chem'' wiki page]] for more information about other environment variables that you may need to define.
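One caveat with startup scripts: if <tt>spack location</tt> fails (for example, because the package has not been installed yet), then the command substitution yields an empty string, and a variable such as <tt>GC_BIN</tt> would end up pointing to <tt>/bin</tt>. The sketch below guards against this. The helper <tt>export_gc_paths</tt> is a hypothetical name of our own, and the package names are the ones used in the commands above.

```shell
# Hypothetical helper: export <PREFIX>_BIN, <PREFIX>_INCLUDE and <PREFIX>_LIB
# (with PREFIX being GC or GC_F) only if the install folder really exists.
export_gc_paths() {
    var_prefix="$1"
    install_root="$2"
    if [ -z "$install_root" ] || [ ! -d "$install_root" ]; then
        echo "Skipping $var_prefix paths: install folder not found" >&2
        return 1
    fi
    export "${var_prefix}_BIN=$install_root/bin"
    export "${var_prefix}_INCLUDE=$install_root/include"
    export "${var_prefix}_LIB=$install_root/lib"
}

# Only query Spack if the command is actually available.
if command -v spack >/dev/null 2>&1; then
    export_gc_paths GC   "$(spack location -i netcdf 2>/dev/null)" || true
    export_gc_paths GC_F "$(spack location -i netcdf-fortran 2>/dev/null)" || true
fi
```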
+
 
+
'''For advanced users:''' You can also use Spack to create module files that can be used with the Lmod module manager, which is used on many HPC cluster systems.  For more information, please see the [https://spack.readthedocs.io/en/latest/tutorial_modules.html# ''Module Files'' tutorial of the Spack manual].
+
 
+
=== For more information ===
+
 
+
For complete instructions on using Spack, please see the Spack manual:
+
* https://spack.readthedocs.io/en/latest/
+
 
+
If you need to manage a lot of separate software environments, then you can use Spack to create environments, so that you can easily switch between them.  Please see this tutorial for more information:
+
*https://spack.readthedocs.io/en/latest/tutorial_environments.html
+
 
+
Here is a useful tutorial about using Spack to install libraries for High-Performance Computing applications:
+
* https://www.youtube.com/watch?v=BxNOxHu6FAI
+
 
+
For more information about using Spack with Docker and Singularity software containers, please see this tutorial:
+
*https://spack.readthedocs.io/en/latest/workflows.html?highlight=singularity#using-spack-to-create-docker-images
+
 
+
If you encounter an error while using Spack, it could be due to an incompatibility with your particular compiler/platform combination.  We encourage you to report all issues to the Spack developers by opening a ticket on the Spack issue tracker:
+
* https://github.com/spack/spack/issues
+
 
+
--[[User:Bmy|Bob Yantosca]] ([[User talk:Bmy|talk]]) 16:34, 9 January 2019 (UTC)
+

Latest revision as of 14:04, 17 June 2019