Timing tests with GEOS-Chem 12.0.0
Overview
The GEOS-Chem Support Team has created a timing test package that you can use to determine the performance of GEOS-Chem on your system. The timing test runs the GEOS-Chem 12.0.0 release code (in GEOS-Chem "Classic" mode) for 7 model days with the "Standard" chemistry mechanism. Our experience has shown that a 7-day simulation gives a more accurate timing result than a 1-day simulation. This is because much of the file I/O (e.g. HEMCO reading annual or monthly-mean emissions fields) occurs on the first day of a run, so a 1-day simulation overstates the relative cost of I/O.
Installation
If you haven't already, download the GEOS-Chem 12.0.0 source code and unit tester using:
# Get the GEOS-Chem 12.0.0 source code
git clone https://github.com/geoschem/geos-chem Code.12.0.0
cd Code.12.0.0
git checkout -b 12.0.0
cd ..

# Get the GEOS-Chem 12.0.0 unit tester
git clone https://github.com/geoschem/geos-chem-unittest UT
cd UT
git checkout -b 12.0.0+PrintTimeFix
NOTE: Checking out the tag 12.0.0+PrintTimeFix will make sure that you have an updated version of the printTime script, which is used to print out the results from a time test simulation log file.
Create your run directory using the GEOS-Chem UnitTest by following these instructions. To copy the gc_timing run directory, make sure you have the following lines in your CopyRunDirs.input file:
#--------|-----------|------|------------|------------|------------|---------|
# MET    | GRID      | NEST | SIMULATION | START DATE | END DATE   | EXTRA?  |
#--------|-----------|------|------------|------------|------------|---------|
  geosfp   4x5         -      gc_timing    2016070100   2016070800   -
Also make sure to
Compilation (versions 12.0.0 thru 12.5.0)
To build the code for the timing tests, you must first clean the source code and run directory of any files left over from a previous run:
cd geosfp_4x5_gc_timing
make superclean
If you wish to perform a timing test using the binary punch (aka bpch) diagnostics, use this command to build GEOS-Chem:
make -j4 TIMERS=1 mpbuild > log.build 2>&1
Or if you wish to perform a timing test using the netCDF diagnostics, use this command:
make -j4 NC_DIAG=y TIMERS=1 mpbuild > log.build 2>&1
Information about the options used for the compilation (as well as the compiler version) will be printed to the file lastbuild.mp.
Compilation (versions 12.6.0 and later)
NOTE: In GEOS-Chem 12.6.0 the compilation commands were changed. The mpbuild target was replaced with build.
To build the code for the timing tests, you must first clean the source code and run directory of any files left over from a previous run:
cd geosfp_4x5_gc_timing
make superclean
If you wish to perform a timing test using the binary punch (aka bpch) diagnostics, use this command to build GEOS-Chem:
make -j4 TIMERS=1 build
Information about the options used for the compilation (as well as the compiler version) will be printed to the lastbuild log file. Also, the output of the compilation will be sent to compile.log.
Performing the timing test
To run the code, follow the instructions in the
geosfp_4x5_gc_timing/README
file. We have provided sample run scripts that you can use to submit jobs:
geosfp_4x5_gc_timing/doTimeTest        # Submit job directly
geosfp_4x5_gc_timing/doTimeTest.slurm  # Submit job using the SLURM scheduler
geosfp_4x5_gc_timing/doTimeTest.pbs    # Submit job using the PBS scheduler (in GEOS-Chem 12.7.0 and later)
Special instructions for using SLURM
As described in the README file, if you are going to submit a time test via the SLURM scheduler, please take a moment to edit the #SBATCH tags at the top of the doTimeTest.slurm file:
#SBATCH -c NUMBER_OF_CORES
#SBATCH -N 1
#SBATCH -t 06:00:00
#SBATCH -p PARTITION_NAME
#SBATCH --mem=10000
#SBATCH --exclusive
#SBATCH --mail-type=ALL
#SBATCH --mail-user=MY_EMAIL_ADDRESS
#SBATCH -o ./doTimeTest.log.%j
#SBATCH -e ./doTimeTest.log.%j
Replace
- NUMBER_OF_CORES with the number of cores that your time test will use,
- PARTITION_NAME with the name of the queue that your simulation will run in, and
- MY_EMAIL_ADDRESS with the email address where you will receive SLURM notifications.
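These substitutions can also be scripted. The following is a minimal sketch and is not part of the timing test package: fill_sbatch_tags is a hypothetical helper, and my_partition and user@example.org are placeholder values. It assumes the placeholders appear in the file as listed above (the partition placeholder is matched with either a space or an underscore).

```shell
# Hypothetical helper: reads a SLURM script on stdin and writes a copy
# with the three placeholders filled in to stdout.
#   $1 = number of cores, $2 = partition name, $3 = email address
fill_sbatch_tags() {
    local cores="$1" partition="$2" email="$3"
    sed -e "s|NUMBER_OF_CORES|${cores}|" \
        -e "s|PARTITION[ _]NAME|${partition}|" \
        -e "s|MY_EMAIL_ADDRESS|${email}|"
}

# Demonstration on two sample lines (placeholder values, not real settings)
printf '%s\n' '#SBATCH -c NUMBER_OF_CORES' '#SBATCH -p PARTITION_NAME' \
    | fill_sbatch_tags 24 my_partition user@example.org
```

To apply it to the real script, pass the file through the function (for example, fill_sbatch_tags 24 my_partition user@example.org < doTimeTest.slurm > doTimeTest.slurm.new) and inspect the result before submitting.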
Also note: We recommend leaving the #SBATCH --exclusive tag in place. This reserves a whole node for the run, even if you do not use all of the cores on the node, and yields more accurate timing results because it prevents other jobs from "backfilling" onto the node.
To submit the job, type:
sbatch doTimeTest.slurm
The regular GEOS-Chem output as well as timing information will be sent to a log file named:
doTimeTest.log.ID
where ID is either the SLURM job ID # or the process ID.
Displaying the test results
You can print out the timing results with the printTime script:
cd geosfp_4x5_gc_timing
./printTime doTimeTest.log.ID
which will display results similar to this:
GEOS-Chem 7-Model-Day Time Test Results
===============================================================================

Machine information
-------------------------------------------------------------------------------
Machine or node name: : holy7c19114.rc.fas.harvard.edu
CPU vendor : GenuineIntel
CPU model name : Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
CPU speed [MHz] : 2557.160

GEOS-Chem information
-------------------------------------------------------------------------------
GEOS-Chem Version : 12.0.0
Last commit : Remove obsolete, commented-out code from several modules
Commit date : Thu Aug 9 16:59:22 2018 -0400
Compiler version : gfortran 7.1.0
Compilation options : geosfp 4x5 standard traceback bpch_diag no_reduced timers

Simulation information
-------------------------------------------------------------------------------
Simulation start date : 20160701 000000
Simulation end date : 20160708 000000
Number of CPUs used : 12
Total CPU time [s] : 82657.95
Wall clock time [s] : 9090.28
CPU / Wall ratio : 9.093
% of ideal performace : 75.78
You can then use these results to fill in the table below.
--Bob Yantosca (talk) 20:18, 13 August 2018 (UTC)
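The last two lines of the sample output can be reproduced from the CPU time, the wall clock time, and the number of CPUs. The sketch below assumes that the % of ideal performance is the CPU / Wall ratio divided by the number of CPUs. This is consistent with the sample output but is not taken from the printTime source. The names cpu_wall_ratio and pct_ideal are hypothetical helpers.

```shell
# Hypothetical helpers (not part of the timing test package).

# CPU / Wall ratio = total CPU time [s] / wall clock time [s]
cpu_wall_ratio() {
    awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.3f\n", cpu / wall }'
}

# % of ideal = 100 * (CPU / Wall ratio) / number of CPUs
# (assumed definition; it matches the sample output above)
pct_ideal() {
    awk -v cpu="$1" -v wall="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", 100 * cpu / wall / n }'
}

# Values from the sample output: 82657.95 s CPU, 9090.28 s wall, 12 CPUs
cpu_wall_ratio 82657.95 9090.28      # prints 9.093
pct_ideal      82657.95 9090.28 12   # prints 75.78
```

A value of 100% would mean that every core was busy for the entire wall clock time.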
Table of 7-model-day run times
The following timing test results were obtained with the "out-of-the-box" GEOS-Chem 12.0.0 release code configuration.
- All jobs used GEOS-FP meteorology at 4° x 5° resolution.
- Jobs started on model date 2016/07/01 00:00 GMT and finished on 2016/07/08 00:00 GMT.
- The code was compiled from the run directory (geosfp_4x5_gc_timing) with one of the standard options:
  - make -j4 TIMERS=1 mpbuild
    - This sets: MET=geosfp GRID=4x5 CHEM=Standard UCX=y NO_REDUCED=n TRACEBACK=n BOUNDS=n FPE=n DEBUG=n NO_ISO=n NEST=n BPCH_DIAG=y
  - make -j4 NC_DIAG=y TIMERS=1 mpbuild
    - This sets: MET=geosfp GRID=4x5 CHEM=Standard UCX=y NO_REDUCED=n TRACEBACK=n BOUNDS=n FPE=n DEBUG=n NO_ISO=n NEST=n NC_DIAG=y
- All timing test jobs are assumed to use binary punch diagnostics (BPCH_DIAG=y) by default except where otherwise noted.
  - make -j4 TIMERS=1 mpbuild
- Wall clock times are listed from fastest to slowest, for the same number of CPUs.
- It's OK to round CPU and wall clock times to the nearest second, for clarity.
- Feel free to add your own results below!
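The tables below list each CPU and wall clock time both in seconds and as hh:mm:ss. A minimal sketch of that conversion, using a hypothetical helper named secs_to_hms that first rounds to the nearest second:

```shell
# Hypothetical helper: convert a time in seconds to hh:mm:ss,
# rounding to the nearest whole second first.
secs_to_hms() {
    awk -v s="$1" 'BEGIN {
        s = int(s + 0.5)
        printf "%02d:%02d:%02d\n", int(s / 3600), int((s % 3600) / 60), s % 60
    }'
}

secs_to_hms 9090.28   # prints 02:31:30
secs_to_hms 14155     # prints 03:55:55
```

The hours field is not capped at 24, so CPU times longer than one day print as, for example, 33:57:32.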
Timing tests using 32 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c09216.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1672.289 | 136933 38:02:13 | 4654 01:17:35 | 29.425 | 91.95% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c13205.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1227.925 | 113147 31:25:48 | 5338 01:28:59 | 21.198 | 66.24% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c13209.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1252.453 | 151201 42:00:00 | 5539 01:32:20 | 27.296 | 85.30% | NC_DIAG=1 TIMERS=1 |
Bob Yantosca (Harvard) | holy7c15105.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1235.964 | 122252 33:57:32 | 5721 01:35:20 | 21.370 | 66.78% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c15106.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1262.871 | 123799 34:23:20 | 6207 01:43:26 | 19.946 | 62.33% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c07106.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1200.117 | 122860 34:07:41 | 6307 01:45:07 | 20.196 | 63.11% | NC_DIAG=y TIMERS=1 -ffast-math |
Timing tests using 24 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c19112.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2679.878 | 114550 31:49:09 | 5181 01:26:20 | 22.108 | 92.12% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c15104.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1298.472 | 93157 25:52:37 | 5697 01:34:55 | 16.353 | 68.14% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c19106.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2704.898 | 122925 34:08:46 | 6004 01:40:05 | 20.473 | 85.31% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c11214.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2781.843 | 101403 28:10:01 | 6139 01:42:18 | 16.518 | 68.82% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c17304.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2999.964 | 99394.14 27:36:32 | 6353.72 01:45:54 | 15.529 | 64.70% | NC_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c11215.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2757.48 | 103622 28:47:02 | 6680 01:51:18 | 15.513 | 64.64% | NC_DIAG=y TIMERS=1 |
Timing tests using 16 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c05205.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2729.835 | 88891 24:41:31 | 5887 01:38:06 | 15.100 | 94.38% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c17316.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1490.015 | 78153 21:42:32 | 6726 01:52:05 | 11.443 | 71.52% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c03315.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2536.242 | 102716 28:31:53 | 6933 01:55:34 | 14.815 | 92.59% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c09215.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 100401 27:53:20 | 7086 01:58:05 | 14.169 | 88.56% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c15302.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1295.929 | 87989 24:26:28 | 7455 02:04:16 | 11.802 | 73.76% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c13209.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2600.554 | 97272 27:01:12 | 8871 02:27:50 | 10.965 | 68.53% | NC_DIAG=y TIMERS=1 -ffast-math |
Timing tests using 12 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c05210.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 76898 21:21:36 | 6666 01:51:11 | 11.536 | 96.13% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c13313.rc.fas.harvard.edu gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1294.699 | 63631 17:38:42 | 6945 01:55:44 | 9.163 | 76.35% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c13310.rc.fas.harvard.edu gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2599.98 | 69639 19:20:39 | 7899 02:11:39 | 8.816 | 73.47% | NC_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c03303.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1994.097 | 84611 23:30:10 | 7943 02:12:21 | 10.653 | 88.77% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2599.980 | 76182 21:09:43 | 8577 02:25:55 | 8.882 | 70.93% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c19114.rc.fas.harvard.edu gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2557.16 | 82658 22:57:40 | 9090 02:31:30 | 9.093 | 75.78% | BPCH_DIAG=y TIMERS=1 |
Timing tests using 8 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c03103.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2727.375 | 67163 18:39:21 | 8738 02:25:37 | 7.687 | 96.08% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c01105.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1900.253 | 68464 19:01:05 | 9266 02:34:26 | 7.389 | 92.36% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c01114.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2944.265 | 60457 16:47:44 | 9584 02:39:43 | 6.308 | 78.85% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c07108.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1258.769 | 60566 16:49:26 | 9734 02:42:14 | 6.222 | 77.77% | NC_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1293.222 | 65767 18:16:27 | 10084 02:48:04 | 6.522 | 81.53% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c17102.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1500.433 | 66796 18:33:16 | 10642 02:57:22 | 6.277 | 78.46% | NC_DIAG=y TIMERS=1 |
Timing tests using 4 cores
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s, hh:mm:ss] | Wall time [s, hh:mm:ss] | CPU / Wall ratio | % ideal | Notes |
---|---|---|---|---|---|---|---|---|---|
Bob Yantosca (Harvard) | holy7c01106.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1200.117 | 48795 13:33:14 | 14155 03:55:55 | 3.447 | 86.18% | BPCH_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c11216.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 49472 13:44:31 | 14335 03:58:55 | 3.451 | 86.28% | NC_DIAG=y TIMERS=1 -ffast-math |
Bob Yantosca (Harvard) | holy7c01113.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2915.554 | 56866 15:47:46 | 14810 04:06:50 | 3.840 | 95.99% | NC_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c17111.rc.fas.harvard.edu ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1631.191 | 55835 15:30:35 | 15553 04:19:13 | 3.590 | 89.75% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1293.222 | 56953 15:49:13 | 16097 04:28:17 | 3.54 | 88.45% | BPCH_DIAG=y TIMERS=1 |
Bob Yantosca (Harvard) | holy7c17102.rc.fas.harvard.edu gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1500.433 | 59050 16:24:10 | 17037 04:43:57 | 3.466 | 86.65% | NC_DIAG=y TIMERS=1 |
Graphs of 7-model-day run times
Scalability across cores
The above plot shows the run times for various GEOS-Chem "Classic" 12.0.0 configurations. Bob Yantosca generated the above plot using the timing results listed in the table above. For a more in-depth analysis of these timing results, please see this presentation.
--Bob Yantosca (talk) 20:32, 16 August 2018 (UTC)
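Scalability can also be summarized numerically from the wall clock times in the tables. The sketch below computes the speedup and parallel efficiency of each run relative to the run with the fewest cores. This is a common strong-scaling measure, but it is an assumption here rather than the method behind the plot, and it differs from the "% ideal" column in the tables. The helper name scaling_table is hypothetical. The input values are the wall clock times of the ifort 17.0.4 BPCH_DIAG=y runs listed in the tables.

```shell
# Hypothetical helper: reads "cores wall_seconds" pairs on stdin (fewest
# cores first) and prints speedup and efficiency relative to the first line.
scaling_table() {
    awk 'NR == 1 { n0 = $1; t0 = $2 }
         {
             speedup = t0 / $2
             printf "%2d cores: speedup %.2f, efficiency %.1f%%\n", $1, speedup, 100 * speedup / ($1 / n0)
         }'
}

# Wall clock times [s] of the ifort 17.0.4 BPCH_DIAG=y runs from the tables
printf '%s\n' "4 15553" "8 8738" "12 6666" "16 6933" "24 5181" "32 4654" \
    | scaling_table
```

An efficiency of 100% would mean that doubling the number of cores halves the wall clock time.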
Time spent in each GEOS-Chem operation
The above plots show how much time is spent in each GEOS-Chem operation for several different GEOS-Chem "Classic" 12.0.0 configurations. Bob Yantosca generated the above plots using the timing results listed in the table above. For a more in-depth analysis of these timing results, please see this presentation.
--Bob Yantosca (talk) 20:43, 16 August 2018 (UTC)
GCHP timing tests
Help us pool performance information across systems by contributing your GCHP run information on our GCHP Timing Tests page.
--Bob Yantosca (talk) 17:58, 17 August 2018 (UTC)