Difference between revisions of "Installing required software"


Revision as of 15:02, 23 November 2020



If you are using GEOS-Chem on the Amazon Web Services Cloud

All of the required software libraries for GEOS-Chem will be included in the Amazon Machine Image (AMI) that you use to initialize your Amazon Elastic Cloud Compute (EC2) instance. For more information, please see our GEOS-Chem cloud computing documentation (http://cloud.geos-chem.org).

If you are using GEOS-Chem on a computer cluster

If you are going to use GEOS-Chem on your institution's shared computer cluster, the required software libraries for GEOS-Chem may have already been installed by your IT staff or system administrator. Depending on your system's setup, there are a few different ways that you can load these libraries into your computational environment. These are described below.

First, check if required libraries are available as modules

Many high-performance computing (HPC) clusters use a module manager such as Lmod (https://lmod.readthedocs.io/en/latest/) or environment-modules (https://modules.readthedocs.io/en/latest/) to load software packages and libraries. A module manager allows you to load different compilers and libraries with simple module load commands. For example, on the Harvard Cannon cluster, software packages can be loaded with commands such as these:

module purge
module load gcc/8.2.0-fasrc01
module load openmpi/3.1.1-fasrc01
module load netcdf/4.1.3-fasrc02
  • NOTE: On your system, the module names and/or version numbers may differ. Ask your sysadmin or IT staff.

The module purge command removes all pre-loaded modules. The second line loads the GNU C and Fortran compilers (version 8.2.0). The third line loads OpenMPI 3.1.1 (which netCDF depends on), and the fourth line loads netCDF 4.1.3 itself. You can add these module load statements to your shell startup files (e.g. .bashrc or .bash_aliases).

  • NOTE: Dependencies of netCDF-4.1.3 (such as the HDF5 package) will be loaded automatically.
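
As a rough sketch, the module commands above could be wrapped in a small function inside a startup file such as .bashrc, guarded so that shells on machines without a module manager still start cleanly. The function name load_geoschem_modules is hypothetical, and the module names are the Harvard Cannon examples from above; yours will likely differ.

```shell
# Hypothetical helper for ~/.bashrc. The module names below are the Harvard
# Cannon examples from above and will likely differ on your system.
load_geoschem_modules() {
    # Report failure and do nothing if no module manager is available
    if ! command -v module >/dev/null 2>&1; then
        echo "No module command found; skipping module loads" >&2
        return 1
    fi
    module purge                          # start from a clean environment
    module load gcc/8.2.0-fasrc01         # GNU C and Fortran compilers
    module load openmpi/3.1.1-fasrc01     # MPI library
    module load netcdf/4.1.3-fasrc02      # netCDF library
}
```

Calling load_geoschem_modules at the end of the startup file (or by hand before compiling) would then load the whole set in one step.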

As a convenience, your module manager may also export the relevant folder paths to your Unix environment. For example, issuing the above module statements on the Harvard Cannon cluster will export the following environment variables:

$GCC_HOME        # Home folder for gcc 8.2.0
$GCC_INCLUDE     # Folder where include files of gcc 8.2.0 are stored
$GCC_LIB         # Folder where library files of gcc 8.2.0 are stored
$MPI_HOME        # Home folder for openmpi 3.1.1
$MPI_INCLUDE     # Folder where include files (e.g. mpi.h) of openmpi 3.1.1 are stored
$MPI_LIB         # Folder where library files (e.g. libmpi*.a) of openmpi 3.1.1 are stored
$NETCDF_HOME     # Home folder for netcdf-4.1.3
$NETCDF_INCLUDE  # Folder where include files (e.g. netcdf.h, netcdf.inc) are stored
$NETCDF_LIB      # Folder where library files (e.g. libnetcdf.a, libnetcdff.a) for netCDF 4.1.3 are stored

You can then use these environment variables to tell GEOS-Chem where it can find the netCDF libraries on your system. See our Setting Unix environment variables for GEOS-Chem wiki page for more information.

  • NOTE: The names of these environment variables may be different on your system (ask your IT staff for more information).
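
Before pointing GEOS-Chem at these variables, it may help to confirm that they are actually set and refer to existing folders. The sketch below is one way to do that; check_env_dirs is a hypothetical helper name, and the variable names passed to it are the Cannon examples listed above.

```shell
# Hypothetical helper: report whether each named variable is set and
# points to an existing folder. Returns non-zero if any check fails.
check_env_dirs() {
    rc=0
    for name in "$@"; do
        # Look up the value of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            rc=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value (folder not found)"
            rc=1
        else
            echo "$name=$value"
        fi
    done
    return $rc
}

# Variable names are the Cannon examples; substitute your system's names
check_env_dirs NETCDF_HOME NETCDF_INCLUDE NETCDF_LIB \
    || echo "Some netCDF folders could not be verified"
```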

Module managers make it very easy to switch between different compilers and libraries. For example, to load software libraries that were built with the Intel Fortran Compiler, all one has to do is to use a different set of module load statements, such as:

module purge
module load intel/17.0.4-fasrc01
module load openmpi/2.1.0-fasrc02
module load netcdf/4.3.2-fasrc05
module load netcdf-fortran/4.4.0-fasrc03

If netCDF-Fortran is installed as a separate module, then your module manager may also define additional environment variables for you. For example, on the Harvard Odyssey cluster, the following environment variables are defined when a netCDF-Fortran module is loaded:

$NETCDF_FORTRAN_HOME     # Home folder for netCDF-Fortran 4.4.0
$NETCDF_FORTRAN_INCLUDE  # Folder where include files (e.g. netcdf.inc, netcdf.mod) for netCDF-Fortran 4.4.0 are stored
$NETCDF_FORTRAN_LIB      # Folder where library files (e.g. libnetcdff.a) for netCDF-Fortran 4.4.0 are stored

One downside of using a module manager is that you are locked into using only those compiler and software versions that have already been installed on your system. For example, an update to the computer model that you are using might also require updating to a new compiler version that is not yet available on your system. In this case, you will need to request that your IT staff install the new compiler version for you (and wait for them to do it). But in general, module managers succeed in ensuring that only well-tested compiler/software combinations are available to users.

  • NOTE: Not all module managers will create relevant environment variables when loading a package, so you may need to figure out how to define those manually. Ask your IT staff or sysadmin for more information.
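
If the variables have to be defined by hand, one possible approach is to ask the nc-config utility that ships with netCDF for the installation paths. This is only a sketch: set_netcdf_vars is a hypothetical helper name, the variable names mirror the Cannon examples above, and it assumes that your nc-config supports the --includedir and --libdir options.

```shell
# Hypothetical helper: define netCDF variables from nc-config output.
# Assumes nc-config is on your PATH and supports --includedir/--libdir.
set_netcdf_vars() {
    if ! command -v nc-config >/dev/null 2>&1; then
        echo "nc-config not found; ask your sysadmin where netCDF is installed" >&2
        return 1
    fi
    NETCDF_HOME="$(nc-config --prefix)"          # installation root
    NETCDF_INCLUDE="$(nc-config --includedir)"   # folder with include files
    NETCDF_LIB="$(nc-config --libdir)"           # folder with library files
    export NETCDF_HOME NETCDF_INCLUDE NETCDF_LIB
}
```

If netCDF-Fortran is installed separately, the same pattern could be repeated with its nf-config utility, using whatever variable names your setup expects.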

Next, check if there is a Spack-built netCDF installation

If your system doesn't have a module manager installed, check to see if the netCDF libraries were built with the Spack package manager. You can type

spack find

to see if there are any Spack-built packages such as the GNU Fortran Compiler, netCDF, and/or netCDF-Fortran. If your system also has a module manager installed, then you can load libraries with the spack load command, e.g.

spack load netcdf-c
spack load netcdf-fortran
... etc ...

If not, then check to see if a Spack environment has been installed. A Spack environment will load several libraries at once (similar to how Conda loads several Python packages at once). You can usually use:

spack env activate ENVIRONMENT-NAME

and

despacktivate

to enter and exit the environment.
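
The checks described in this section can be collected into a short helper. This is a sketch under assumptions: show_spack_netcdf is a hypothetical function name, and the package names netcdf-c and netcdf-fortran are the ones used in the spack load example above.

```shell
# Hypothetical helper: list Spack-built netCDF packages and the names of
# any Spack environments, if Spack is available on this system.
show_spack_netcdf() {
    if ! command -v spack >/dev/null 2>&1; then
        echo "spack not found on this system" >&2
        return 1
    fi
    spack find netcdf-c netcdf-fortran   # installed netCDF packages, if any
    spack env list                       # names of existing Spack environments
}
```

Any environment name printed by the second command is a candidate for ENVIRONMENT-NAME in the activation command above.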

For more information about Spack, see the detailed instructions on our Use Spack to install netCDF on your system wiki page.

Next, check if there is a manual library installation

If your computer system does not use a module manager, then the netCDF libraries may have already been installed by your IT staff in one of the usual Unix folder locations (such as /usr/lib or /usr/local/lib). If this is the case, ask your sysadmin or IT staff where these libraries reside.
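
While waiting for an answer, a quick search of the usual folders may already turn up the libraries. The sketch below uses a hypothetical helper named find_netcdf_libs; the folder /usr/lib64 is an added assumption (some Linux distributions use it), and the search depth of 2 is an arbitrary choice.

```shell
# Hypothetical helper: search the given folders for netCDF library files.
# The pattern libnetcdf* also matches the Fortran library (libnetcdff*).
find_netcdf_libs() {
    for dir in "$@"; do
        # Skip folders that do not exist on this system
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 2 -name 'libnetcdf*' 2>/dev/null
    done
}

# /usr/lib64 is an assumption; adjust the folder list for your system
find_netcdf_libs /usr/lib /usr/lib64 /usr/local/lib
```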

Once you know the location of the compiler and netCDF libraries, you can set the proper environment variables for GEOS-Chem.

Finally, install libraries by using Spack

If your system has none of the required software packages that GEOS-Chem needs, you can use the Spack package manager to install them. For detailed instructions on using Spack, please see our Use Spack to install netCDF on your system wiki page. We also recommend that you view the following tutorial videos on our GEOS-Chem YouTube Channel (https://www.youtube.com/c/geos-chem), which will walk you through the installation process.

Spack tutorial video 1: https://www.youtube.com/watch?v=7eMLYKLK9wY
Spack tutorial video 2: https://www.youtube.com/watch?v=6tGCD3zQQW0

--Bob Yantosca (talk) 15:02, 23 November 2020 (UTC)


