Difference between revisions of "Running GCHP: Basics"
(→Running Interactively on SLURM)
|Line 78:||Line 78:|
== Verifying a Successful Run ==
== Verifying a Successful Run ==
Revision as of 19:30, 8 August 2018
- Hardware and Software Requirements
- Downloading Source Code
- Obtaining a Run Directory
- Setting Up the GCHP Environment
- Basic Example Run
- Run Configuration Files
- Advanced Run Examples
- Output Data
- Developing GCHP
The default GCHP run directories are configured for a 1-hr simulation at c24 resolution using 0.25x0.325 GEOS-FP meteorology, six cores, and one node. This simple configuration is a good test case to check that GCHP runs on your system. This page presents the basic information needed to run GCHP for this test case.
Prior to running GCHP, always run through the following checklist to ensure everything is set up properly:
- Your run directory contains the executable geos.
- All symbolic links are present in your run directory and point to a valid path. These include TileFiles, MetDir, MainDataDir, ChemDataDir, CodeDir, and an initial restart file.
- The input meteorology resolution in ExtData.rc (inspect with "grep MetDir ExtData.rc") and MetDir (inspect with "file MetDir") are as you intend.
- File runConfig.sh has all run settings that you intend to use.
- The restart file grid resolution matches the cubed sphere resolution set in runConfig.sh
- You have a run script. See runScriptSamples/ for examples. gchp.run is the most basic example.
- The resource allocation in runConfig.sh and your run script are consistent.
- The run script sources the bashrc file that you used for compiling GCHP.
- File cap_restart is not present in the run directory. If it is present, you can manually delete it or do "make cleanup_output" to remove files from your previous run. If you want to save files from your previous run, you can use the archiveRun.sh script to save them prior to cleaning up the run directory (e.g. ./archive_run.sh my_saved_run)
You can run GCHP by executing the appropriate run command directly on the command line from within your run directory or by submitting your run as a batch job.
Running as a Batch Job
Sample run scripts are included in the runScriptSamples/ subdirectory for submitting your run as a scheduled job. All example run scripts send standard output to file GCHP.log by default and require manually configuring your bashrc filename and job-specific resources such as number of cores and nodes. Unless otherwise noted in the run script filename, all sample run scripts assume use of SLURM (simple linux utility for resource management). If your system is not SLURM, you can adapt the sample run scripts to work on your system.
To submit your SLURM batch file, simply type:
Job submission is different for other system. For example, to submit a Grid Engine batch file, type:
If your computational cluster uses a different job scheduler (e.g. LSF or PBS), then check with your IT staff about how to submit batch jobs.
Running Interactively on SLURM
Before running GCHP interactively, check that your environment is set up properly and you have at least 6 cores available with 6G memory per core. Then execute the following command from within your run directory:
srun -n 6 --mpi=pmi2 ./geos 2>&1 | tee GCHP.log
This command can be broken down as follows:
|Command||What it does|
|srun ... ./geos||Runs executable geos as a parallel job|
|-n 6||Specifies how many individual CPU cores are requested for the run. The number given here should always be the total number of cores, regardless of how many nodes they are spread over. The number of CPUs that you request must be a multiple of 6 (at least one core for each of the cubed-sphere faces, and the same number of cores for each face).|
|--mpi-pmi2||Specifies usage of the MVAPICH2 implementation of MPI. Do not include this if not using MVAPICH2.|
|2>&1 | tee GCHP.log||Specifies that all MAPL output, both standard and error, be written to both the screen and to file GCHP.log.|
You can also adapt a GCHP run script for interactive use. Simply comment out the scheduler-specific code such as #SBATCH headers for SLURM.
Verifying a Successful Run
There are several ways to verify that your run was successful.
- NetCDF files are present in the OutputDir subdirectory.
- GCHP.log ends with timing information for the run.
- Your scheduler log (e.g. output from SLURM) does not contain any obvious errors.
- GCHP.log contains text with format "AGCM Date: YYYY/MM/DD Time: HH:mm:ss" for each timestep (e.g. 00:10, 00:20, 00:30, 00:40, 00:50, and 01:00 for a 1-hr run).
If it looks like something went wrong, check all log files (type "ls *.log" in run directory to list them) as well as your scheduler output file (if one exists) to determine where there may have been an error. Beware that if you have a problem in one of your configuration files then you will likely see a MAPL error with traceback to the GCHP/Shared directory. Review all of your configuration files to ensure you have proper setup.
GCHP errors can be cryptic. If you find yourself debugging within MAPL then you may be on the wrong track as most issues can be resolved by updating the run settings. Please send an email to the GEOS-Chem Support Team if you hit a wall deciphering the problem. You can also reach out the GCHP community in the GCHP Slack workspace.