Using Git with GEOS-Chem

Revision as of 14:23, 10 May 2010 by Bmy (talk | contribs) (→‎Merging)

This page describes how to use the Git version control system to download and manage the GEOS-Chem source code package.

Obtaining and installing Git

You will need to make sure that Git is installed on your system. Please see this wiki post for more information.

Downloading a new GEOS-Chem version

Initial download

Code directory

The GEOS-Chem source code repository has been migrated from CVS to Git and is available for remote download. We recommend that you download each new version of GEOS-Chem into a separate source code directory.

Users outside the Jacob group should use the following syntax:

git clone git://git.as.harvard.edu/bmy/GEOS-Chem/ LOCAL_DIR_NAME

Users at the Jacob group should instead use (on Rhea):

git clone /as/pub/git/bmy/GEOS-Chem LOCAL_DIR_NAME

where LOCAL_DIR_NAME is the name of the local directory on your disk into which the GEOS-Chem source code files will be placed. It is up to you to pick LOCAL_DIR_NAME; some possible options are:

  • Code.vXX-YY-ZZ
  • GEOS-Chem.vXX-YY-ZZ
  • G-C.vXX-YY-ZZ

Let's assume in the following examples that you want to download a copy of the GEOS-Chem repository for v8-03-01. Typing this command:

git clone git://git.as.harvard.edu/bmy/GEOS-Chem/ Code.v8-03-01

will create a "clone" of the remote GEOS-Chem/ repository into your local Code.v8-03-01 directory. The various system files are all stored in the Code.v8-03-01/.git subdirectory. The files in Code.v8-03-01/ are an exact copy of what was on the Git repository server at the time you downloaded it.
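To confirm that a clone worked, you can inspect it from the command line. The sketch below is only an illustration: because the GEOS-Chem server may not be reachable from every machine, it clones a small stand-in repository (the made-up names seed and Code.demo, created in a scratch directory) instead of the real one. The same inspection commands should apply inside Code.v8-03-01.

```shell
set -e
work=$(mktemp -d)            # scratch area; nothing here touches real code
cd "$work"

# Build a tiny stand-in for the remote repository (hypothetical name "seed").
git -c init.defaultBranch=master init -q seed
cd seed
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "demo" > README
git add README
git commit -q -m "Initial commit"
cd ..

# Clone it the same way you would clone GEOS-Chem.
git clone -q seed Code.demo
cd Code.demo

ls -d .git                   # Git's system files live in this subdirectory
git remote -v                # where the clone came from (named "origin")
git log --oneline            # the history that was downloaded
```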

Run directories

Each of the GEOS-Chem run directories is saved as a separate Git repository, rather than all of them being placed into a single repository. This is because some of the files (e.g. restart files) can be very large.

Users outside the Jacob group should use the following syntax:

git clone git://git.as.harvard.edu/bmy/GEOS-Chem-rundirs/DIRECTORY-OPTION LOCAL_DIR_NAME

Users at the Jacob group should instead use (on Rhea):

git clone /as/pub/git/bmy/GEOS-Chem-rundirs/DIRECTORY-OPTION LOCAL_DIR_NAME

where DIRECTORY-OPTION may be one of the following:

DIRECTORY-OPTION           Description
2x2.5/geos4/SOA            2 x 2.5 G4 fullchem w/ SOA simulation
2x2.5/geos4/dicarbonyls    2 x 2.5 G4 fullchem w/ dicarbonyls option
2x2.5/geos4/isoprene       2 x 2.5 G4 fullchem w/ Caltech isoprene scheme
2x2.5/geos4/standard       2 x 2.5 G4 fullchem standard (43 tracers)

2x2.5/geos5/SOA            2 x 2.5 G5 fullchem w/ SOA simulation
2x2.5/geos5/dicarbonyls    2 x 2.5 G5 fullchem w/ dicarbonyls option
2x2.5/geos5/isoprene       2 x 2.5 G5 fullchem w/ Caltech isoprene scheme
2x2.5/geos5/standard       2 x 2.5 G5 fullchem standard (43 tracers)

4x5/geos4/SOA              4 x 5 G4 fullchem w/ SOA simulation
4x5/geos4/dicarbonyls      4 x 5 G4 fullchem w/ dicarbonyls option
4x5/geos4/isoprene         4 x 5 G4 fullchem w/ Caltech isoprene scheme
4x5/geos4/standard         4 x 5 G4 fullchem standard (43 tracers)

4x5/geos5/SOA              4 x 5 G5 fullchem w/ SOA simulation
4x5/geos5/dicarbonyls      4 x 5 G5 fullchem w/ dicarbonyls option
4x5/geos5/isoprene         4 x 5 G5 fullchem w/ Caltech isoprene scheme
4x5/geos5/standard         4 x 5 G5 fullchem standard (43 tracers)

benchmark                  4 x 5 G5 1-month benchmark simulation


LOCAL_DIR_NAME is the name of the local directory on your disk into which the run directory files will be placed.

Ignoring files

Git also allows you to ignore certain types of files that we don't need to track (e.g. anything that can be built from the source code). These typically include:

  • Object files (*.o)
  • Library files (*.a)
  • Module files (*.mod)
  • Autosave files (*~)
  • Executable files (geos, geostomas)

You can tell Git that you don't want these files to be tracked by editing the .git/info/exclude file in your source code directory. Open this file in your favorite text editor and edit it to look like this:

# git-ls-files --others --exclude-from=.git/info/exclude
# Lines that start with '#' are comments.
# For a project mostly in C, the following would be a good set of
# exclude patterns (uncomment them if you want to use them):
*.[oa]
*.mod
*~
geos
geostomas
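One way to check that the exclude patterns behave as intended is to try them in a throwaway repository. The sketch below assumes a reasonably recent Git that provides git check-ignore; the directory Code.demo and the file names are invented for the illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo

# Append the patterns listed above to the exclude file.
mkdir -p .git/info
cat >> .git/info/exclude <<'EOF'
*.[oa]
*.mod
*~
geos
geostomas
EOF

# Create some build products plus one real source file (made-up names).
touch main.o libdemo.a demo.mod main.f~ geos main.f

git status --short           # only main.f should be listed as untracked
git check-ignore -v main.o   # reports which pattern matched main.o
```

If a file that you expected to be ignored still appears in the git status output, the pattern for it is probably missing or misspelled.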

Viewing the revision history

The best way to examine the contents of your Git-backed GEOS-Chem source code is to use the gitk viewer. There are two ways to do this:

(1) Change into the Code.v8-03-01 directory and start gitk as follows:

cd Code.v8-03-01
gitk &

(2) Or if you are using the git gui browser (more on that below), you can invoke gitk from the Repository/Visualize master's History menu item.

At the top left of the gitk screen, you will see the graph of revisions. Each dot represents a commit, along with the log message that accompanied each commit.

Note that next to the most recent commit (i.e. the line at the very top), there are 2 green boxes, one named master and one named origin:

origin
This was the state of the repository on the remote server when you checked it out for the first time. Therefore, this is the "pristine", unchanged code that you got from the download.
master
This is the current state of the local repository now. Since we haven't done anything to the code yet, the master and origin point to the same commit.

If you click on any of the commits in the top left window, in the window below, you will see the log message and a list of changes to the source code. The old code is marked in RED and the new code is marked in GREEN. At right you will also see a list of files that were changed during the commit.

So it's really easy to see how the code has evolved with gitk.
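If gitk is not available (for example, when you are logged in without a graphical display), much of the same information can be read from the command line. The following is a minimal sketch in a scratch repository; Code.demo and notes.txt are placeholder names, not part of GEOS-Chem.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Make two commits so that there is some history to look at.
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second version" > notes.txt
git commit -q -a -m "Update notes"

git log --graph --oneline --decorate   # one line per commit, newest first
git show --stat HEAD                   # log message and files changed in the latest commit
git diff HEAD~1 HEAD                   # old lines start with "-", new lines with "+"
```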

Making revisions

Using the GUI browser

We recommend using the git gui for source code management. Start this in your Code.v8-03-01 subdirectory:

cd Code.v8-03-01
git gui &

On the left there are 2 windows:

Unstaged Changes
An unstaged change is a modification that has not yet been marked for inclusion in the next commit. If you modified any files since the last commit, then they should be displayed in this window. Also, right above this window you will find the name of the current checked-out branch.
Staged Changes
These are changes that Git will add to the repository the very next time you make a commit.

In general, anytime you need to modify the source code, you should NOT do it on the master branch. You should create a new branch for your modifications. Then you can test your modifications ad nauseam until you are sure that everything is functioning as it should. When your modifications are complete, you can merge them back into the master branch. Then the branch you created can be deleted.

The advantage of this approach is that if you ever need to start over from scratch, you can just go back to the master branch and you will get back the state of the code before your modifications were added.

Creating branches

To create a new branch, go to Branch/Create on the menu (or type CTRL-N). You will get a dialog box that prompts you for the new branch name. Type a unique name and then click OK.

You should pick names that have meaning to you. Some good branch names are:

  • Bug_fix_sulfate_mod
  • CO2_simulation
  • KPP_with_isoprene
  • Methane_simulation

etc. To switch between branches, go to Branch/Checkout on the menu and pick the name of the branch you would like to switch to. The current branch name will be displayed just below the menu at top left.

Once you have created your branch and have checked it out, then you may begin making modifications to the source code with your favorite text editor.
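For reference, the same branch operations can also be done from the command line. This is a sketch in a scratch repository (Code.demo and notes.txt are placeholders); the branch name is taken from the list above.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base version"

git branch Bug_fix_sulfate_mod        # create the branch (like Branch/Create)
git checkout -q Bug_fix_sulfate_mod   # switch to it (like Branch/Checkout)
git branch                            # the current branch is marked with "*"
```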

Committing

With Git, you should commit frequently, such as when you have completed making revisions to a file or group of files. Commits that are made on one branch will not affect the other branches.

Committing is best done with the git gui. Basically you follow these steps:

  1. To force the git gui to show the latest changes, you can pick Commit/Rescan from the menu (or type the F5 key).
  2. You should get a list of files in the Unstaged Changes window. Clicking on the icons on the left of the file names will send them to the Staged Changes window. Once they are in the Staged Changes window, Git will add them to the repository on the very next commit. Note: Clicking on the icon of a file in the Staged Changes window moves the file back to the Unstaged Changes window.
  3. Type a Commit message in the bottom right window. See this example of a good commit message. Some pointers are:
    1. The first line should be 50 characters or fewer and succinctly describe the commit
    2. Then leave a blank line
    3. Then add more in-depth text that describes the commit
    4. Then click on the Signed-off by button. This will add your name, email address, and a timestamp.
  4. There are two radio buttons above the Commit message window.
    1. New commit: This is the default. Assumes we are making a totally new commit.
    2. Amend last commit: If for whatever reason we need to update the last commit message, pick this button.
  5. Then when your commit message is done, you can click on the Commit button.

Then if you start the gitk viewer, your new commit should be visible.
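The staging and committing steps above have command-line counterparts, sketched here in a scratch repository. The file name and commit text are invented for the example; the -s option of git commit appends a Signed-off-by line, and each -m option supplies one paragraph of the message.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "demo change" > sulfate_demo.txt
git status --short                    # "??" means the file is not staged yet
git add sulfate_demo.txt              # stage it (like clicking the file's icon)
git status --short                    # "A " means it will go into the next commit

# First -m: short summary line. Second -m: longer description.
git commit -q -s -m "Add demo file for sulfate fix" \
    -m "Longer text describing what was changed and why."

git log -1 --format=%B                # print the full commit message
```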

Merging

Before you merge your changes back into the mainline master branch, you can consider making a patch file. This is a file containing the differences between your current branch and the master branch. Please see the section creating a patch file to share with others below.

When you are ready to merge your changes back into the mainline master branch, then you can follow this procedure.

  1. Switch back to the master branch by selecting Branch/Checkout from the menu (or type CTRL-O). You will be given a dialog box of available branches. Select master and press OK.
  2. From the menu, pick Merge/Local Merge (or CTRL-M).

This should merge your changes back into master. If you then use the gitk viewer, the merge you just made should be visible.
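A command-line version of the same merge might look like the sketch below. It runs in a scratch repository with placeholder files; the branch name CO2_simulation is borrowed from the examples earlier on this page.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base version"

# Do some work on a separate branch.
git checkout -q -b CO2_simulation
echo "branch work" >> notes.txt
git commit -q -a -m "Work on CO2 simulation"

git checkout -q master                # like Branch/Checkout, selecting master
git merge CO2_simulation              # like Merge/Local Merge
git log --graph --oneline             # the branch's commit is now part of master
```

In this small example master has no new commits of its own, so Git simply moves master forward to the branch's latest commit (a "fast-forward") instead of creating a separate merge commit.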

Tagging

Git also allows you to tag a particular commit with an alphanumeric string for easy reference. Other users can then refer to that commit by its tag name, for example when using git pull.

Tagging is best done outside of the git gui. You can just type:

git tag GEOS-Chem-v8-03-01
git tag GEOS-Chem-v8-03-01-patched

etc. at the Unix command line.

NOTE: Tagging is something that typically only the GEOS-Chem support team will do.
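To see what tagging does, you can experiment in a scratch repository, as in this sketch. The tag names demo-v1 and demo-v1-patched are hypothetical and are not real GEOS-Chem tags.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base version"

git tag demo-v1                                      # simple tag on the current commit
git tag -a demo-v1-patched -m "Demo annotated tag"   # tag with its own message
git tag -l                                           # list all tags
git log --oneline --decorate                         # tags appear next to the commit
```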

Deleting branches

Once you have merged your changes back into the master branch, you may delete the branch you just created. In the git gui, go to the Branch/Delete menu item. You will be given a dialog box where you can select the name of the branch you wish to delete.
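On the command line, a branch can be deleted with git branch -d. As the sketch below shows (scratch repository, placeholder file names), the -d option refuses to delete a branch whose commits have not been merged yet, which protects against losing work by accident.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q Code.demo
cd Code.demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base version"

git checkout -q -b Methane_simulation
echo "branch work" >> notes.txt
git commit -q -a -m "Work on methane simulation"
git checkout -q master

# Not merged yet: Git reports an error and keeps the branch.
git branch -d Methane_simulation || echo "Branch not merged yet, so it was kept."

git merge -q Methane_simulation       # merge first ...
git branch -d Methane_simulation      # ... then the deletion succeeds
git branch                            # only master remains
```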

Sharing your revisions with others (and vice versa)

One of the really nice features of Git is that it can create patch files, or files which contain a list of changes that can be imported into someone else's local Git repository. Using patch files totally obviates the need of having to merge differences between codes manually.

Creating a patch file to share with others

To create a patch file containing the commits on your current branch that are not yet in the master branch, check out your branch and type the following text:

git format-patch master --stdout > my-patch-file.diff

This will redirect the output from the git format-patch command to a file named by you. You can then include the patch file as an email attachment and send it to other GEOS-Chem users, or the GEOS-Chem Support Team.

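NOTE: The branch named on the command line is the baseline for the comparison: the patch file contains the commits on your currently checked-out branch that are not in that branch. To create a patch relative to a branch other than master, replace the text master in the above example w/ the name of that branch.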

Reading a patch file into your local repository

Other users can also send you their source code revisions as patch files. To ingest their changes into your local Git repository you should first make a new branch. Follow this procedure:

  1. Pick Branch/Create from the menu (or type CTRL-N). Give your branch a descriptive name like "Updates_from_xxxx" that will serve as a mnemonic.
  2. Pick Branch/Checkout from the menu (or type CTRL-O) and switch to the branch you just created.
  3. To ingest the other person's source code changes, type:
     git am < their-patch-file.diff

You can then test the other person's revisions in the separate branch until you are sure they are OK. Then you can merge them back into the master branch as described above.
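The whole patch exchange can be tried out locally before you use it with real code. The sketch below sets up two scratch repositories, sender and receiver (both names are made up), creates a patch file in one and applies it in the other. The name and email address are placeholders.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Placeholder identity used for all commits in this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Both users start from the same code.
git -c init.defaultBranch=master init -q sender
cd sender
echo "base" > notes.txt
git add notes.txt
git commit -q -m "Base version"
cd ..
git clone -q sender receiver

# Sender: make a change on a branch and write it to a patch file.
cd sender
git checkout -q -b KPP_with_isoprene
echo "new scheme" >> notes.txt
git commit -q -a -m "Add isoprene notes"
git format-patch master --stdout > ../my-patch-file.diff
cd ..

# Receiver: apply the patch on a fresh branch.
cd receiver
git checkout -q -b Updates_from_sender
git am < ../my-patch-file.diff
git log --oneline                     # the sender's commit is now in this branch
```

Because git am recreates the sender's commits, the log message and author information travel with the patch.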

Getting updates from the remote repository

When a new GEOS-Chem version is released, we recommend that you download it into a new local directory with the git clone command.

However, there may be times when "patches" (i.e. minor updates to fix bugs or other issues) need to be applied to an existing GEOS-Chem version. The easiest way to obtain patches is to use the git pull command, as follows:

  1. Change to your local code directory (e.g. Code.v8-03-01)
  2. Make a new branch named patch (or something similar).
  3. Check out the patch branch. Now we are ready to obtain the updates from the remote server.
  4. Use the git pull command to download the updated files, as follows:
    1. Jacob group users type: git pull /as/pub/git/bmy/GEOS-Chem master
    2. All other users type: git pull git://git.as.harvard.edu/bmy/GEOS-Chem master
  5. Test the updated code: compile it and run a few time steps to make sure everything works.
  6. Check out the master branch.
  7. Merge the patch branch into your master branch.
  8. Delete the patch branch.

This will merge the changes from the master branch of the remote repository into your master branch.
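The steps above can be sketched as a sequence of commands. To keep the example runnable anywhere, it pulls from a local stand-in repository (the made-up remote-demo) rather than from the GEOS-Chem server; with the real code you would use one of the git pull commands listed in step 4.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Placeholder identity used for all commits in this demo.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the remote repository, plus your clone of it.
git -c init.defaultBranch=master init -q remote-demo
cd remote-demo
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Release"
cd ..
git clone -q remote-demo Code.demo

# Later, a bug fix appears in the remote repository.
cd remote-demo
echo "bug fix" >> notes.txt
git commit -q -a -m "Fix bug"
cd ..

cd Code.demo                            # step 1
git checkout -q -b patch                # steps 2 and 3
git pull -q "$work/remote-demo" master  # step 4
# step 5: compile and test the code here
git checkout -q master                  # step 6
git merge -q patch                      # step 7
git branch -d patch                     # step 8
git log --oneline                       # master now contains the bug fix
```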

--Bob Y. 13:04, 16 March 2010 (EDT)

In summary

We recommend using the git gui because of its user-friendly interface. The following operations are best done from the GUI:

  1. Creating and checking out branches
  2. Committing code
  3. Merging code
  4. Deleting branches
  5. Examining revision history (you may also use gitk as standalone)

The following operations are best done from the command line:

  1. git clone: Initial download of repository
  2. git push: Send changes to a remote repository
  3. git pull: Get changes from a remote repository
  4. git tag: Attach a label to a particular commit
  5. git format-patch: Create a patch

--Bob Y. 10:14, 19 March 2010 (EDT)