Downloading GEOS-Chem source code (13.0.0 and later versions)

Revision as of 19:37, 11 January 2021 by Bmy (Talk | contribs) (Step 2: (Optional) Create a new branch in the GCClassic folder)


Previous | Next | Getting Started with GEOS-Chem

  1. Minimum system requirements
  2. Installing required software
  3. Configuring your computational environment
  4. Downloading source code
  5. Downloading data directories
  6. Creating run directories
  7. Configuring runs
  8. Compiling
  9. Running
  10. Output files
  11. Python tools for use with GEOS-Chem
  12. Coding and debugging
  13. Further reading

The GEOS-Chem code has been split into separate GitHub repositories

Starting with GEOS-Chem 13.0.0, the GEOS-Chem source code has been split into 3 GitHub repositories:

Name Stored at Description
GEOS-Chem This is the GEOS-Chem "science codebase" repository. It contains all of the GEOS-Chem science code, plus:
  • Scripts to create GEOS-Chem run directories
  • Scripts to create GEOS-Chem integration tests
  • Interfaces (i.e. the driver programs) for GEOS-Chem "Classic", GCHP, etc.
HEMCO This is the HEMCO (Harmonized Emission Component) repository. It contains code to read and regrid data such as emissions, met fields, chemistry data, etc.
GCClassic A lightweight wrapper that encompasses GEOS-Chem and HEMCO. We say that GCClassic is the superproject (i.e. top-level source code folder), and that GEOS-Chem (science codebase) and HEMCO are submodules.

You may be wondering why this was done. Recent structural updates to GCHP and HEMCO necessitated corresponding structural changes to GEOS-Chem "Classic". In particular, HEMCO is no longer being developed exclusively for GEOS-Chem. It is now also being developed for the NCAR models (CESM2 and next-generation models) as well as for models at NOAA. Because of this, it made sense to split off HEMCO from GEOS-Chem and to store HEMCO in its own GitHub repository. We hope that this will spur feedback and innovation from users outside of the GEOS-Chem community.

This new setup also has the advantage that the GEOS-Chem code itself is no longer described as a self-contained model, but rather as a science codebase that can be integrated into several modeling contexts: as GEOS-Chem "Classic", as GCHP, as GEOS-Chem within the NASA/GEOS ESM, as GEOS-Chem within CESM, etc. This better aligns with our GEOS-Chem Vision and Mission statements.

Step 1. Clone the GCClassic superproject repository to your local disk

Type this command to download the latest stable GEOS-Chem "Classic" version:

 git clone

This will clone a fresh copy of the GCClassic superproject from GitHub to your computer system. By default, the git clone command creates a local folder with the same name as the remote repository, so if you do not specify a name, the superproject will be placed in a folder named GCClassic.

SIDE NOTE: You can clone the GCClassic superproject into a differently-named folder by specifying a new name after the URL. For example, typing:
       git clone GCClassic.13.0.0
will clone the GCClassic superproject into a local folder named GCClassic.13.0.0. Here the 13.0.0 refers to the GEOS-Chem version number, which is always reported in X.Y.Z notation.

Once you type the git clone command, you will see output similar to this.

Cloning into 'GCClassic'...
remote: Enumerating objects: 34, done.
remote: Counting objects: 100% (34/34), done.
remote: Compressing objects: 100% (25/25), done.
remote: Total 737 (delta 12), reused 31 (delta 9), pack-reused 703
Receiving objects: 100% (737/737), 138.79 KiB | 1.46 MiB/s, done.
Resolving deltas: 100% (383/383), done.
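
The folder-naming behavior described above can be tried out without network access. The following is a minimal sketch that assumes a throwaway, empty bare repository as a hypothetical stand-in for the remote (it is not the real GCClassic URL); the variable name work is also just an illustration. Git may warn that the cloned repository is empty, which is expected here.

```shell
set -e
work=$(mktemp -d)                    # scratch area for the demo

# Hypothetical stand-in for the remote repository (NOT the real GCClassic URL)
git init -q --bare "$work/GCClassic.git"

cd "$work"
# No folder name given: the folder is named after the repository ("GCClassic")
git clone -q "$work/GCClassic.git"
# Folder name given after the URL: the clone goes into "GCClassic.13.0.0"
git clone -q "$work/GCClassic.git" GCClassic.13.0.0

ls -d GCClassic GCClassic.13.0.0     # both folders now exist
```

Both folders are independent clones of the same repository, so the version-stamped name is purely a local convenience.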

Step 2. (Optional) Create a new branch in the GCClassic folder

NOTE: Step 2 is recommended if you plan on adding your own source code changes to GEOS-Chem. But if you only wish to use the "out-of-the-box" code in src/GEOS-Chem without making any modifications, you may skip this step.

Now take a look at the Git history of the GCClassic folder. Type:

cd GCClassic
gitk --all &

and a GitK window will pop up. It should look similar to the image below (but not exactly, as these images were created before the official release of 13.0.0):

Gc13 1.png

Notice that it has placed you on the main branch. This is the branch corresponding to the latest stable version of GEOS-Chem. Other developments in the Git history may be more recent, but these correspond to items in development, and should be ignored (unless you are a GEOS-Chem developer who needs to work with the "cutting-edge" code).

Before doing anything else, you should create a new branch for your own GEOS-Chem work that is separate from the main branch. New code should never be added directly into main, but into a branch that can be merged into main later. Best practice is to use a descriptive name for the branch such as feature/UpdatedKppMechanism, bugfix/WetDepFixes, etc. For this tutorial, the branch name feature/myGeosChemWork will suffice.

The easiest way to create this new branch is to type these Git commands:

git branch feature/myGeosChemWork
git checkout feature/myGeosChemWork

If you go back to the GitK window (and hit F5 to refresh), you'll see that the branch feature/myGeosChemWork has now been checked out.

Gc13 2.png
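
The git branch and git checkout commands shown in this step can also be combined into a single git checkout -b command. The sketch below tries this in a throwaway repository that merely stands in for the GCClassic folder; the demo user name and email are placeholders needed only so that the test commit can be created.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository, stands in for GCClassic
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the initial branch "main"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"

# Create the branch and switch to it in one step
git checkout -q -b feature/myGeosChemWork

# Report the branch that is now checked out
current=$(git rev-parse --abbrev-ref HEAD)
echo "Now on branch: $current"
```

Either form leaves you on the new branch; the two-command form simply makes the "create" and "switch" steps explicit.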

Step 3. Examine the contents of the GCClassic folder

Now get a directory listing for the GCClassic superproject folder. Type:

ls -CF

You should see the following content:

CMakeLists.txt  LICENSE  run@  src/

Here CMakeLists.txt is a file needed by the CMake build system, run@ is a symbolic link and src/ is a folder.

You might surmise that the GEOS-Chem and HEMCO source codes are contained in the src/ folder. Type:

ls -CF src/*

and you will see this output:

src/CMakeLists.txt  src/gc_classic_version.H@  src/main.F90@

src/GEOS-Chem:

src/HEMCO:

This shows another CMake file, more symbolic links, and empty src/GEOS-Chem and src/HEMCO folders. Where are the GEOS-Chem and HEMCO codes?

Step 4. Fetch the GEOS-Chem and HEMCO source codes

The src/GEOS-Chem and src/HEMCO code folders are empty because the GEOS-Chem and HEMCO source codes have not yet been "fetched" into the GCClassic superproject folder. This is because GEOS-Chem and HEMCO are tracked as Git submodules by the GCClassic superproject.

Think of the GCClassic superproject as a "historian" for the GEOS-Chem and HEMCO submodules (which will be stored in the src/GEOS-Chem and src/HEMCO folders, respectively). For example, when a programmer checks in new commits in src/GEOS-Chem or in src/HEMCO, the programmer must also make a corresponding commit to the GCClassic superproject. This commit informs the GCClassic superproject about the updates that were made in the src/GEOS-Chem or src/HEMCO folders. In other words, the GCClassic superproject repository must keep track not only of its own Git history, but also of the Git histories of the GEOS-Chem and HEMCO repositories. That is why we say GCClassic is like a "historian" for the GEOS-Chem and HEMCO repositories.
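
To make the "historian" idea concrete, the sketch below builds two throwaway repositories as hypothetical stand-ins for a submodule and a superproject (named science and super here purely for illustration) and then shows that the superproject stores the submodule as a single commit hash. The g helper, the demo identity, and the protocol.file.allow setting are assumptions needed only because the demo uses local folders instead of GitHub URLs.

```shell
set -e
work=$(mktemp -d)
# Helper with demo-only settings: a placeholder identity, plus permission
# to use local folders as submodule sources
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# Hypothetical stand-in for a submodule repository such as GEOS-Chem
git init -q "$work/science"
g -C "$work/science" commit -q --allow-empty -m "Science code commit"

# Hypothetical stand-in for the GCClassic superproject
git init -q "$work/super"
cd "$work/super"
g submodule --quiet add "$work/science" src/GEOS-Chem
g commit -q -m "Add src/GEOS-Chem submodule"

# The superproject tree holds the submodule as a commit hash (mode 160000)
recorded=$(git ls-tree HEAD src/GEOS-Chem | awk '{print $3}')
actual=$(git -C "$work/science" rev-parse HEAD)
echo "recorded in superproject: $recorded"
echo "HEAD of submodule repo:   $actual"
```

The two hashes match because the superproject commit pins the submodule to that exact point in its history. When the submodule later moves to a new commit, a new superproject commit is needed to record the new hash.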

To check out the GEOS-Chem and HEMCO source code at the proper points in their version history, type:

git submodule update --init --recursive

You will see output similar to this:

  Submodule 'src/GEOS-Chem' ( registered for path 'src/GEOS-Chem'
  Submodule 'src/HEMCO' ( registered for path 'src/HEMCO'
  Cloning into 'GCClassic/src/GEOS-Chem'...
  Cloning into 'GCClassic/src/HEMCO'...
  Submodule path 'src/GEOS-Chem': checked out '22c503be96fa2dd848eb2fba142beb6d92a09889'
  Submodule path 'src/HEMCO': checked out 'edf987e03f23be2d7588324bd62a52eb9c646248'

The Submodule path statements indicate the commits at which the src/GEOS-Chem and src/HEMCO codes were checked out. More on this in a bit.

If we now get a directory listing:

 ls -CF src/*

we see that the src/GEOS-Chem and src/HEMCO folders contain directory structures full of source code:

 src/CMakeLists.txt  src/gc_classic_version.H@  src/main.F90@

 src/GEOS-Chem:
 APM/            CMakeScripts/  GeosUtil/  History/     lib/         ObsPack/   run/
 AUTHORS.txt     doc/           GTMM/      Interfaces/  LICENSE.txt  PKUCPL/
 bin/            GeosCore/      Headers/   ISORROPIA/   mod/
 CMakeLists.txt  GeosRad/       help/      KPP/         NcdfUtil/    REVISIONS

 src/HEMCO:
 AUTHORS.txt  CMakeLists.txt  CMakeScripts/  LICENSE.txt  run/  src/

and now you can see the various files and subdirectories that make up the GEOS-Chem and HEMCO source codes.
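
Besides listing the folders, you can ask Git directly whether a submodule has been fetched by using git submodule status. The sketch below reproduces the "empty folder, then fetched" sequence with throwaway local repositories; the names science, super, and clone, the g helper, and the protocol.file.allow setting are assumptions that exist only for this demo.

```shell
set -e
work=$(mktemp -d)
# Helper with demo-only settings: a placeholder identity, plus permission
# to use local folders as submodule sources
g() { git -c user.name=demo -c user.email=demo@example.com \
          -c protocol.file.allow=always "$@"; }

# Hypothetical stand-ins for a submodule repository and a superproject
git init -q "$work/science"
g -C "$work/science" commit -q --allow-empty -m "Science code commit"
git init -q "$work/super"
g -C "$work/super" submodule --quiet add "$work/science" src/GEOS-Chem
g -C "$work/super" commit -q -m "Add src/GEOS-Chem submodule"

# A fresh clone has the submodule registered but not yet fetched
g clone -q "$work/super" "$work/clone"
cd "$work/clone"
before=$(git submodule status)    # leading "-": submodule not initialized
g submodule --quiet update --init --recursive
after=$(git submodule status)     # leading " ": checked out at the recorded commit
printf 'before: %s\nafter:  %s\n' "$before" "$after"
```

A leading "+" in the status output would instead mean that the submodule is checked out at a commit other than the one recorded by the superproject.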

Pro tip: Define an alias for the git submodule update command

Because you will use the git submodule update command very often, we recommend that you define an alias for it. Simply add this text to your ~/.bashrc file:

alias gsu="git submodule update --init --recursive"

and then apply the changes with:

source ~/.bashrc

Now you can type gsu instead of git submodule update --init --recursive.
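
One caveat: bash does not expand aliases in non-interactive shells (such as scripts) unless the expand_aliases option is set. If you would like the same shortcut to work there too, a shell function is a possible alternative. This is only a sketch; the function name gsu simply mirrors the alias above.

```shell
# A function behaves like the alias but also works inside scripts.
# Any extra arguments are passed through to git submodule update.
gsu() {
    git submodule update --init --recursive "$@"
}

# Confirm that the shell now recognizes the name "gsu"
command -v gsu
```

Define either the alias or the function in your ~/.bashrc, not both, to avoid confusion about which one is being used.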

Step 5. (Optional) Create a new branch in src/GEOS-Chem

NOTE: Step 5 is recommended if you plan on adding your own source code changes to GEOS-Chem. But if you only wish to run the "out-of-the-box" code in src/GEOS-Chem without making any modifications, you may skip this step.

When you fetch the code in the GEOS-Chem and HEMCO submodules with the git submodule update --init --recursive command (as described above), the GEOS-Chem and HEMCO submodule codes will be in detached HEAD state. In other words, the code is checked out but a branch is not created. Adding new code in detached HEAD state is risky and should be avoided: commits made there do not belong to any branch and can easily be lost when you later check out a different commit or branch. You should instead make a branch at the same point as the detached HEAD, and then add your own modifications into that branch.

Navigate from the GCClassic superproject folder to the GEOS-Chem submodule:

cd src/GEOS-Chem

and then use the GitK browser to examine the code:

gitk --all &

You'll see output similar to (but perhaps not exactly like) this:

Gc13 3.png

The text highlighted in gray shows the point in the Git history at which the src/GEOS-Chem submodule currently is located. You'll want to make your new branch here. Type these Git commands:

git branch feature/myGeosChemWork
git checkout feature/myGeosChemWork

Although you can use any branch name that you'd like, best practice is to create a branch in src/GEOS-Chem with the same branch name that you created in GCClassic.

Now if you return to the GitK window (and hit F5 to refresh), you'll see that the feature/myGeosChemWork branch has been created and checked out at the same location in the Git history as the detached HEAD state.

Gc13 4.png

Now it is safe for you to add your own modifications into this branch.
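
If you prefer to confirm the detached HEAD state from the command line instead of GitK, git symbolic-ref can be used, because it fails when HEAD does not point at a branch. The sketch below uses a throwaway repository as a stand-in for src/GEOS-Chem; the demo identity and the variable names are assumptions made for illustration.

```shell
set -e
repo=$(mktemp -d)                         # throwaway repository, stands in for src/GEOS-Chem
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git checkout -q --detach                  # mimic the state left by git submodule update

# git symbolic-ref fails when HEAD does not point at a branch
if git symbolic-ref -q HEAD >/dev/null; then
    state_before="on a branch"
else
    state_before="detached"
fi

# Create a branch at the detached commit and switch to it
git checkout -q -b feature/myGeosChemWork
state_after=$(git symbolic-ref --short HEAD)
echo "before: $state_before, after: $state_after"
```

The new branch points at the same commit as the detached HEAD did, so no code changes; only the bookkeeping does.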

Step 6. (Optional) Check out the main branch in src/HEMCO

NOTE: Step 6 is recommended if you are a HEMCO developer and plan on adding your own source code changes to HEMCO. But if you only wish to use the "out-of-the-box" code in src/HEMCO, you may skip this step.

Now let's look at the state of the code in HEMCO. Type:

 cd ../HEMCO
 gitk --all &

This will pop open a new GitK window. As you can see, the HEMCO source code will be in detached HEAD state, as shown below:

Gc13 6.png

We can see that the HEMCO source code is at the most recent commit in its Git history. This is indicated by the commit that is highlighted in gray text. This commit also corresponds to the position of the remotes/origin/main branch (i.e. the main branch at the GitHub repository).

We can now check out a local branch named main at this most recent location in the HEMCO Git history. Type:

 git branch main
 git checkout main

Return to the GitK window (and press F5 to refresh). You will now see that the local main branch has been created.

Gc13 7.png

Unless you are a HEMCO developer, you will probably never need to make any modifications to the HEMCO source code. Therefore it is OK to leave the code in the src/HEMCO folder checked out on the main branch. However, if you anticipate that you will be modifying the code in src/HEMCO, you should create a feature branch in which to add your updates.

Code directory structure

You may now skip ahead to the GEOS-Chem directory structure chapter.

Further reading
