Difference between revisions of "Running GCHP: Basics"
== Overview ==
Revision as of 22:47, 6 March 2019
- Hardware and Software Requirements
- Downloading Source Code and Data Directories
- Obtaining a Run Directory
- Setting Up the GCHP Environment
- Basic Example Run
- Configuring a Run
- Output Data
- Developing GCHP
- Run Configuration Files
This page presents the basic information needed to run GCHP. The default GCHP run directories are configured for a 1-hr simulation at c24 resolution using native-resolution meteorology, six cores, and one node. This simple configuration is a good test case to check that GCHP runs on your system. Typically the TransportTracer simulation requires about 50G and the standard and benchmark simulations require about 110G. More advanced instructions for configuring your GCHP run with different settings are in the next chapter.
Prior to running GCHP, always run through the following checklist to ensure everything is set up properly.
- Your run directory contains the executable geos.
- All symbolic links are present in your run directory and point to a valid path. These include TileFiles, MetDir, MainDataDir, ChemDataDir, CodeDir, and an initial restart file at the grid resolution you will run at.
- The input meteorology resolution and source are as you intend (inspect with "grep MetDir ExtData.rc" and "file MetDir"). Note: for versions 12.1.0 and later, create a new run directory if you wish to change Met source.
- You have looked through and set all configurable settings in runConfig.sh (discussed in the next chapter).
- You have a run script (see below for information about run scripts).
- The resource allocations in runConfig.sh and in your run script are consistent (number of nodes and cores).
- The run script sources your environment file that you used for compiling GCHP (gchp.env for version 12.1.0 and later).
- If reusing a run directory, you have archived your last run or discarded it with 'make cleanup_output' (optional but recommended; discussed in the next chapter).
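Part of this checklist can be automated. The following is a minimal sketch, not a script shipped with GCHP: the function name check_rundir is hypothetical, while the file and link names are taken from the checklist above. It does not check the restart file, because that file's name depends on your grid resolution.

```shell
#!/bin/bash
# Sketch of a pre-run check for a GCHP run directory.
# check_rundir is a hypothetical helper; it prints one line per problem
# and returns non-zero if any problem was found.

check_rundir() {
    local rundir="${1:-.}"
    local status=0
    local link

    # The executable geos must exist and be executable.
    if [ ! -x "$rundir/geos" ]; then
        echo "MISSING: executable geos"
        status=1
    fi

    # Each symbolic link must be present and must point to a valid path.
    for link in TileFiles MetDir MainDataDir ChemDataDir CodeDir; do
        if [ ! -L "$rundir/$link" ]; then
            echo "MISSING: symbolic link $link"
            status=1
        elif [ ! -e "$rundir/$link" ]; then
            echo "BROKEN: symbolic link $link"
            status=1
        fi
    done

    # runConfig.sh holds the configurable run settings.
    if [ ! -f "$rundir/runConfig.sh" ]; then
        echo "MISSING: runConfig.sh"
        status=1
    fi

    return "$status"
}
```

Calling check_rundir with no argument checks the current directory. A silent return with status 0 means the items covered by the sketch passed; the remaining checklist items still need a manual look.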
How to Run GCHP
You can run GCHP locally from within your run directory (interactively) or by submitting your run to your cluster's job scheduler. To make running GCHP simpler, the GCHP run directory contains a folder called runScriptSamples with example run scripts. Each script includes additional steps that make the run process easier: sourcing your environment file so all libraries are loaded, deleting the file cap_restart from any previous run, sourcing the config file runConfig.sh (more on this in the next chapter), and sending standard output to a log file.
cap_restart is a text file written by GCHP that contains the simulation end date, and in some instances GCHP will attempt to start new runs at that date. It is therefore good practice to delete the file when rerunning within the same run directory.
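The steps that the sample scripts perform can be sketched as follows. This is an illustration of the sequence described above, not a copy of any shipped script. The function name run_gchp_steps and the log name runConfig.log are assumptions, and the launch command is supplied by the caller because it depends on your MPI installation.

```shell
#!/bin/bash
# Sketch of the housekeeping steps described above; run it from inside
# the run directory. run_gchp_steps is a hypothetical helper.

run_gchp_steps() {
    local log="gchp.log"

    # A launch command must be given as arguments.
    if [ "$#" -eq 0 ]; then
        echo "usage: run_gchp_steps <launch command...>" >&2
        return 1
    fi

    # Both files must be present before anything is sourced.
    if [ ! -f ./gchp.env ] || [ ! -f ./runConfig.sh ]; then
        echo "gchp.env or runConfig.sh is missing from $(pwd)" >&2
        return 1
    fi

    # 1. Load the environment used when GCHP was compiled.
    . ./gchp.env

    # 2. Delete cap_restart left over from any previous run.
    rm -f cap_restart

    # 3. Apply the settings in runConfig.sh. Its messages go to a separate
    #    file (the name runConfig.log is an assumption).
    . ./runConfig.sh > runConfig.log

    # 4. Launch the model, sending standard output to the log file.
    "$@" > "$log"
}
```

A call might look like `run_gchp_steps mpirun -np 6 ./geos`, where the mpirun invocation is only an example of a launcher for the six-core default configuration; use whatever launch command your system requires.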
Use the example run script gchp.local.run to run GCHP locally on your machine. Before running, check that you have at least six cores available. Then copy gchp.local.run to the main level of your run directory and execute it from the command prompt.
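As a hedged sketch of an interactive launch, assuming gchp.local.run has already been copied to the top level of the run directory: the helper names have_cores and run_local are hypothetical, and the core count reported by the system is the number of online logical processors, which may differ from what your scheduler has granted you.

```shell
#!/bin/bash
# Sketch: launch GCHP interactively from a run directory.
# have_cores and run_local are hypothetical helpers.

MIN_CORES=6   # the default run directory is configured for six cores

# Succeed if this machine reports at least $1 online processors.
have_cores() {
    local needed="$1"
    local available
    available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    [ "$available" -ge "$needed" ]
}

# Execute gchp.local.run from inside the given run directory
# (default: the current directory).
run_local() {
    local rundir="${1:-.}"

    if [ ! -f "$rundir/gchp.local.run" ]; then
        echo "gchp.local.run not found in $rundir" >&2
        return 1
    fi

    if ! have_cores "$MIN_CORES"; then
        echo "Warning: fewer than $MIN_CORES cores available" >&2
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$rundir" && chmod +x gchp.local.run && ./gchp.local.run )
}
```

From inside the run directory, the same thing done by hand is simply `./gchp.local.run`.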
If your run crashes during transport then you need additional memory. Either request an interactive session on your cluster with additional memory or consider running GCHP as a batch job by submitting your run to a job scheduler.
Running as a Batch Job
The recommended job script example is gchp.run, which is customized for SLURM on the Harvard University Odyssey cluster but may be adapted for other systems. You may also adapt the interactive run script gchp.local.run for your system. The "multirun" scripts submit multiple consecutive jobs and are more advanced. Read more about that option in the chapter on configuring a run later in this manual.
The example run scripts send standard output to the file gchp.log by default and require you to manually configure your job-specific resources, such as the number of cores and nodes. If using a version prior to 12.1.0, you must also manually add your environment filename to the script. Later versions simply source the local symbolic link gchp.env, which you set to point to your environment file during run directory setup.
If using SLURM, submit your batch job with the sbatch command. Job submission is different for other systems. For example, a Grid Engine batch file is submitted with qsub.
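The submission commands can be sketched as follows, assuming the job script is named gchp.run and sits in the run directory. sbatch and qsub are the standard submission commands for SLURM and Grid Engine respectively; the helper name submit_command and the scheduler labels it accepts are hypothetical.

```shell
#!/bin/bash
# Sketch: print the submission command for a given scheduler.
# submit_command is a hypothetical helper; it only prints the command
# so that you can inspect it before running it.

submit_command() {
    local scheduler="$1"
    local script="${2:-gchp.run}"

    case "$scheduler" in
        slurm)      echo "sbatch $script" ;;
        gridengine) echo "qsub $script" ;;
        *)
            echo "unknown scheduler: $scheduler" >&2
            return 1
            ;;
    esac
}
```

For example, `submit_command slurm gchp.run` prints `sbatch gchp.run`, which you would then type in the run directory to queue the job.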
If your computational cluster uses a different job scheduler (e.g. LSF or PBS), then check with your IT staff about how to submit batch jobs. Please also consider submitting your working run script for inclusion in the run script examples folder in future versions.
Verifying a Successful Run
There are several ways to verify that your run was successful.
- NetCDF files are present in the OutputDir subdirectory.
- gchp.log ends with timing information for the run.
- Your scheduler log (e.g. output from SLURM) does not contain any obvious errors.
- gchp.log contains text with format "AGCM Date: YYYY/MM/DD Time: HH:mm:ss" for each timestep (e.g. 00:10, 00:20, 00:30, 00:40, 00:50, and 01:00 for a 1-hr run).
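Two of these checks lend themselves to a quick script. The sketch below is illustrative only: the function name verify_run is hypothetical, the `*.nc*` pattern assumes the output files carry a NetCDF extension such as .nc or .nc4, and the default of six timestep lines matches the 1-hr example above. The timing information at the end of gchp.log and the scheduler log are better inspected by eye, for instance with `tail gchp.log`.

```shell
#!/bin/bash
# Sketch: check for output files and for timestep lines in gchp.log.
# verify_run is a hypothetical helper; it prints one line per failed
# check and returns non-zero if any check failed.

verify_run() {
    local rundir="${1:-.}"
    local expected_steps="${2:-6}"   # six timesteps in the default 1-hr run
    local status=0
    local steps

    # NetCDF files should be present in OutputDir (extension is an assumption).
    if ! ls "$rundir"/OutputDir/*.nc* >/dev/null 2>&1; then
        echo "FAIL: no NetCDF files in OutputDir"
        status=1
    fi

    # Count the lines that report a timestep.
    steps=$(grep -c "AGCM Date:" "$rundir/gchp.log" 2>/dev/null || true)
    steps="${steps:-0}"
    if [ "$steps" -lt "$expected_steps" ]; then
        echo "FAIL: found $steps of $expected_steps expected timestep lines in gchp.log"
        status=1
    fi

    return "$status"
}
```

Pass the run directory as the first argument and, for runs longer than the default, the number of timesteps you expect as the second.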
If it looks like something went wrong, check all log files (type "ls *.log" in the run directory to list them) as well as your scheduler output file (if one exists) to determine where the error occurred. Beware that a problem in one of your configuration files will likely produce a MAPL error with a traceback to the GCHP/Shared directory. Review all of your configuration files to ensure they are set up properly. The component named in the error often points to the file at fault:
- Errors in "CAP" typically indicate a problem with the start time, end time, and/or duration set in runConfig.sh (more on this file in the next chapter).
- Errors in "ExtData" often indicate a problem with the input files specified in either HEMCO_Config.rc or ExtData.rc.
- Errors in "HISTORY" are related to your configured output in HISTORY.rc.
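A rough first pass over the logs can be scripted. The sketch below is a heuristic, not a parser for MAPL output: it assumes that an error line contains the word "error" and the component name on the same line, which may not hold for every message. The helper names suggest_config and triage_logs are hypothetical.

```shell
#!/bin/bash
# Sketch: map error components found in the logs to the configuration
# files to review. Both helper names are hypothetical.

# Name the configuration file(s) associated with a component.
suggest_config() {
    case "$1" in
        CAP)     echo "runConfig.sh (start time, end time, duration)" ;;
        ExtData) echo "HEMCO_Config.rc or ExtData.rc (input files)" ;;
        HISTORY) echo "HISTORY.rc (configured output)" ;;
        *)       echo "all configuration files" ;;
    esac
}

# Scan every *.log file in the run directory and report each component
# that appears on a line mentioning an error.
triage_logs() {
    local rundir="${1:-.}"
    local component

    for component in CAP ExtData HISTORY; do
        # -h drops file names so a log's own name cannot cause a match.
        if grep -ih "error" "$rundir"/*.log 2>/dev/null | grep -q "$component"; then
            echo "$component: review $(suggest_config "$component")"
        fi
    done
}
```

No output from triage_logs does not prove the run was clean; it only means the heuristic found nothing, so the logs should still be read directly.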
GCHP errors can be cryptic. If you find yourself debugging within MAPL, you may be on the wrong track, as most issues can be resolved by updating the run settings. If you cannot figure out where you are going wrong, please create an issue on the GCHP GitHub issue tracker located at https://github.com/geoschem/gchp/issues.