Timing tests with GEOS-Chem 12.0.0



GEOS-Chem v11-02-final will also carry the designation GEOS-Chem 12.0.0. We are migrating to a purely numeric versioning system in order to adhere more closely to software development best practices. For a complete description of the new versioning system, please see our GEOS-Chem version numbering system wiki page.



Overview

The GEOS-Chem Support Team has created a timing test package that you can use to determine the performance of GEOS-Chem on your system. The timing test runs the GEOS-Chem 12.0.0 release code (in GEOS-Chem "Classic" mode) for 7 model days with the "Standard" chemistry mechanism. Our experience has shown that a 7-day simulation gives a more accurate timing result than a 1-day simulation. This is because much of the file I/O (i.e. HEMCO reading annual or monthly-mean emissions fields) occurs on the first day of a run, so a 1-day simulation gives too much weight to that one-time start-up cost.

Installation

If you haven't already, download the GEOS-Chem 12.0.0 source code and unit tester using:

# Get the GEOS-Chem 12.0.0 source code
git clone https://github.com/geoschem/geos-chem Code.12.0.0
cd Code.12.0.0
git checkout 12.0.0
cd ..

# Get the GEOS-Chem 12.0.0 unit tester
git clone https://github.com/geoschem/geos-chem-unittest UT
cd UT
git checkout 12.0.0+PrintTimeFix

NOTE: Checking out the tag 12.0.0+PrintTimeFix will make sure that you have an updated version of the printTime script, which is used to print out the results from a time test simulation log file.

Create your run directory using the GEOS-Chem UnitTest by following these instructions. To copy the gc_timing run directory, make sure you have the following lines in your CopyRunDirs.input file:

#--------|-----------|------|------------|------------|------------|---------|
# MET    | GRID      | NEST | SIMULATION | START DATE | END DATE   | EXTRA?  |
#--------|-----------|------|------------|------------|------------|---------|
 geosfp   4x5         -      gc_timing    2016070100   2016070800   -

Also make sure to

Compilation

To build the code for the timing tests, you must first clean the source code and run directory of any files left over from a previous run:

 cd geosfp_4x5_gc_timing
 make superclean

If you wish to perform a timing test using the binary punch (aka bpch) diagnostics, use this command to build GEOS-Chem:

 make -j4 TIMERS=1 mpbuild > log.build 2>&1

Or if you wish to perform a timing test using the netCDF diagnostics, use this command:

 make -j4 NC_DIAG=y TIMERS=1 mpbuild > log.build 2>&1

Information about the options used for the compilation (as well as the compiler version) will be printed to the file lastbuild.mp.
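
When build output is captured in a log file such as log.build, the order of the shell redirections matters: the file redirect has to come before 2>&1 for both stdout and stderr to land in the file. The sketch below illustrates this with a stand-in function (fake_build, a hypothetical name used here in place of the real make command) and a temporary file in place of log.build.

```shell
#!/bin/bash
# Stand-in for the real build command (hypothetical; not part of GEOS-Chem).
fake_build() {
    echo "compiling module"        # written to stdout
    echo "warning: example" >&2    # written to stderr
}

# Temporary file standing in for log.build
log=$(mktemp)

# Send stdout to the log first, then point stderr at the same destination.
fake_build > "$log" 2>&1

# Both lines should now be in the log file.
n_lines=$(wc -l < "$log" | tr -d ' ')
echo "Lines captured: $n_lines"
```

If you would rather watch the output on screen while it is being saved, piping through tee (command 2>&1 | tee logfile) is a common alternative.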

Performing the timing test

To run the code, follow the instructions in the

 geosfp_4x5_gc_timing/README 

file. We have provided sample run scripts that you can use to submit jobs:

 geosfp_4x5_gc_timing/doTimeTest          # Submit job directly
 geosfp_4x5_gc_timing/doTimeTest.slurm    # Submit job using the SLURM scheduler  

Special instructions for using SLURM

As described in the README file, if you are going to submit a time test via the SLURM scheduler, please take a moment to edit the #SBATCH tags at the top of the doTimeTest.slurm file:
        #SBATCH -c NUMBER_OF_CORES
        #SBATCH -N 1
        #SBATCH -t 06:00:00
        #SBATCH -p PARTITION_NAME
        #SBATCH --mem=10000
        #SBATCH --exclusive
        #SBATCH --mail-type=ALL
        #SBATCH --mail-user=MY_EMAIL_ADDRESS
        #SBATCH -o ./doTimeTest.log.%j
        #SBATCH -e ./doTimeTest.log.%j
Replace
  • NUMBER_OF_CORES with the number of cores that your time test will use,
  • PARTITION_NAME with the name of the queue that your simulation will run in, and
  • MY_EMAIL_ADDRESS with the email address where you will receive SLURM notifications.
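
The placeholder substitution described above can also be scripted. The following is a minimal sketch using sed; the values for cores, partition, and email are made-up examples, and the sketch works on a small sample header written to a temporary file rather than on the real doTimeTest.slurm.

```shell
#!/bin/bash
# Example values (assumptions; substitute your own).
cores=12
partition="my_partition"
email="user@example.com"

# Write a small sample header so that the sketch is self-contained.
script=$(mktemp)
cat > "$script" <<'EOF'
#SBATCH -c NUMBER_OF_CORES
#SBATCH -p PARTITION_NAME
#SBATCH --mail-user=MY_EMAIL_ADDRESS
EOF

# Replace each placeholder. The result goes to a new file,
# so the original header is left untouched.
filled=$(mktemp)
sed -e "s/NUMBER_OF_CORES/${cores}/" \
    -e "s/PARTITION_NAME/${partition}/" \
    -e "s/MY_EMAIL_ADDRESS/${email}/" \
    "$script" > "$filled"

cat "$filled"
```

Writing to a separate file first makes it easy to inspect the result before overwriting the run script.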

Also note: We recommend leaving the #SBATCH --exclusive tag in place. This reserves a whole node for the run, even if you do not use all of the cores on the node, and yields more accurate timing results because it prevents other jobs from "backfilling" onto the node.

To submit the job, type:
       sbatch doTimeTest.slurm

The regular GEOS-Chem output as well as timing information will be sent to a log file named:

 doTimeTest.log.ID

where ID is either the SLURM job ID # or the process ID.

Displaying the test results

You can print out the timing results with the printTime script:

 cd geosfp_4x5_gc_timing
 ./printTime doTimeTest.log.ID

which will display results similar to this:

GEOS-Chem 7-Model-Day Time Test Results
===============================================================================

Machine information
-------------------------------------------------------------------------------
Machine or node name: : holy7c19114.rc.fas.harvard.edu
CPU vendor            : GenuineIntel
CPU model name        : Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
CPU speed [MHz]       : 2557.160

GEOS-Chem information
-------------------------------------------------------------------------------
GEOS-Chem Version     : 12.0.0
Last commit           : Remove obsolete, commented-out code from several modules 
Commit date           : Thu Aug 9 16:59:22 2018 -0400 
Compiler version      : gfortran 7.1.0
Compilation options   : geosfp 4x5 standard traceback bpch_diag no_reduced timers

Simulation information
-------------------------------------------------------------------------------
Simulation start date : 20160701 000000
Simulation end date   : 20160708 000000
Number of CPUs used   : 12
Total CPU time  [s]   : 82657.95
Wall clock time [s]   : 9090.28
CPU / Wall ratio      : 9.093
% of ideal performace : 75.78

You can then use these results to fill in the table below.
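
The last two lines of the sample output follow from the lines above them. The sketch below recomputes them with awk; the formula for the percentage (100 x CPU/Wall ratio / number of CPUs) is inferred from the sample numbers and is not taken from the printTime script itself.

```shell
#!/bin/bash
# Values copied from the sample printTime output above.
cpu_time=82657.95    # Total CPU time [s]
wall_time=9090.28    # Wall clock time [s]
n_cpus=12            # Number of CPUs used

# CPU / Wall ratio
ratio=$(awk -v c="$cpu_time" -v w="$wall_time" \
        'BEGIN { printf "%.3f", c / w }')

# Percent of ideal performance = 100 * (CPU / Wall) / (number of CPUs)
pct=$(awk -v c="$cpu_time" -v w="$wall_time" -v n="$n_cpus" \
      'BEGIN { printf "%.2f", 100 * c / w / n }')

echo "CPU / Wall ratio       : $ratio"
echo "% of ideal performance : $pct"
```

With the sample values this gives 9.093 and 75.78, matching the output shown above.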

--Bob Yantosca (talk) 20:18, 13 August 2018 (UTC)

Table of 7-model-day run times

The following timing test results were obtained with the "out-of-the-box" GEOS-Chem 12.0.0 release code configuration.

  • All jobs used GEOS-FP meteorology at 4° x 5° resolution.
  • Jobs started on model date 2016/07/01 00:00 GMT and finished on 2016/07/08 00:00 GMT.
  • The code was compiled from the run directory (geosfp_4x5_gc_timing) with one of the standard options:
    • make -j4 TIMERS=1 mpbuild
      • This sets: MET=geosfp GRID=4x5 CHEM=Standard UCX=y NO_REDUCED=n TRACEBACK=n BOUNDS=n FPE=n DEBUG=n NO_ISO=n NEST=n BPCH_DIAG=y
    • make -j4 NC_DIAG=y TIMERS=1 mpbuild
      • This sets: MET=geosfp GRID=4x5 CHEM=Standard UCX=y NO_REDUCED=n TRACEBACK=n BOUNDS=n FPE=n DEBUG=n NO_ISO=n NEST=n NC_DIAG=y
    • All timing test jobs are assumed to use binary punch diagnostics (BPCH_DIAG=y) by default except where otherwise noted.
  • Wall clock times are listed from fastest to slowest, for the same number of CPUs.
  • It's OK to round CPU and wall clock times to the nearest second, for clarity.
  • Feel free to add your own results below!
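
The tables below list each time both in seconds and as hh:mm:ss. A minimal sketch of that rounding and conversion follows; the helper names round_s and to_hms are hypothetical and not part of the timing test package.

```shell
#!/bin/bash
# Round a fractional time in seconds to the nearest whole second.
round_s() {
    printf '%.0f\n' "$1"
}

# Convert whole seconds to hh:mm:ss.
to_hms() {
    local total=$1
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( (total % 3600) / 60 )) $(( total % 60 ))
}

# Wall clock time from the sample printTime output
wall=$(round_s 9090.28)
echo "$wall $(to_hms "$wall")"
```

For the sample wall clock time of 9090.28 s this prints 9090 and 02:31:30.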

Timing tests using 32 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c09216.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1672.289 | 136933 (38:02:13) | 4654 (01:17:35) | 29.425 | 91.95% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c13205.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1227.925 | 113147 (31:25:48) | 5338 (01:28:59) | 21.198 | 66.24% | BPCH_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c13209.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1252.453 | 151201 (42:00:00) | 5539 (01:32:20) | 27.296 | 85.30% | NC_DIAG=1 TIMERS=1
Bob Yantosca (Harvard) | holy7c15105.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1235.964 | 122252 (33:57:32) | 5721 (01:35:20) | 21.370 | 66.78% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c15106.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1262.871 | 123799 (34:23:20) | 6207 (01:43:26) | 19.946 | 62.33% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c07106.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1200.117 | 122860 (34:07:41) | 6307 (01:45:07) | 20.196 | 63.11% | NC_DIAG=y TIMERS=1 -ffast-math

Timing tests using 24 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c19112.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2679.878 | 114550 (31:49:09) | 5181 (01:26:20) | 22.108 | 92.12% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c15104.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1298.472 | 93157 (25:52:37) | 5697 (01:34:55) | 16.353 | 68.14% | BPCH_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c19106.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2704.898 | 122925 (34:08:46) | 6004 (01:40:05) | 20.473 | 85.31% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c11214.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2781.843 | 101403 (28:10:01) | 6139 (01:42:18) | 16.518 | 68.82% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c17304.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2999.964 | 99394.14 (27:36:32) | 6353.72 (01:45:54) | 15.529 | 64.70% | NC_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c11215.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2757.48 | 103622 (28:47:02) | 6680 (01:51:18) | 15.513 | 64.64% | NC_DIAG=y TIMERS=1

Timing tests using 16 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c05205.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2729.835 | 88891 (24:41:31) | 5887 (01:38:06) | 15.100 | 94.38% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c17316.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1490.015 | 78153 (21:42:32) | 6726 (01:52:05) | 11.443 | 71.52% | BPCH_DIAG=y TIMERS=1 -ffast-math; NOTE: Might have been affected by /n/regal disk issues
Bob Yantosca (Harvard) | holy7c03315.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2536.242 | 102716 (28:31:53) | 6933 (01:55:34) | 14.815 | 92.59% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c09215.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 100401 (27:53:20) | 7086 (01:58:05) | 14.169 | 88.56% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c15302.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1295.929 | 87989 (24:26:28) | 7455 (02:04:16) | 11.802 | 73.76% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c13209.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2600.554 | 97272 (27:01:12) | 8871 (02:27:50) | 10.965 | 68.53% | NC_DIAG=y TIMERS=1 -ffast-math; NOTE: Might have been affected by /n/regal disk issues

Timing tests using 12 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c05210.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 76898 (21:21:36) | 6666 (01:51:11) | 11.536 | 96.13% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c13313.rc.fas.harvard.edu, gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1294.699 | 63631 (17:38:42) | 6945 (01:55:44) | 9.163 | 76.35% | BPCH_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c13310.rc.fas.harvard.edu, gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2599.98 | 69639 (19:20:39) | 7899 (02:11:39) | 8.816 | 73.47% | NC_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c03303.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1994.097 | 84611 (23:30:10) | 7943 (02:12:21) | 10.653 | 88.77% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu, gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2599.980 | 76182 (21:09:43) | 8577 (02:25:55) | 8.882 | 70.93% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c19114.rc.fas.harvard.edu, gfortran 7.1 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2557.16 | 82658 (22:57:40) | 9090 (02:31:30) | 9.093 | 75.78% | BPCH_DIAG=y TIMERS=1; NOTE: Might have been affected by issues on /n/regal disk

Timing tests using 8 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c03103.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2727.375 | 67163 (18:39:21) | 8738 (02:25:37) | 7.687 | 96.08% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c01105.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1900.253 | 68464 (19:01:05) | 9266 (02:34:26) | 7.389 | 92.36% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c01114.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2944.265 | 60457 (16:47:44) | 9584 (02:39:43) | 6.308 | 78.85% | BPCH_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c07108.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1258.769 | 60566 (16:49:26) | 9734 (02:42:14) | 6.222 | 77.77% | NC_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1293.222 | 65767 (18:16:27) | 10084 (02:48:04) | 6.522 | 81.53% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c17102.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1500.433 | 66796 (18:33:16) | 10642 (02:57:22) | 6.277 | 78.46% | NC_DIAG=y TIMERS=1

Timing tests using 4 cores

Submitter | Machine or node, compiler | CPU vendor | CPU model | Speed [MHz] | CPU time [s] (hh:mm:ss) | Wall time [s] (hh:mm:ss) | CPU / Wall ratio | % ideal | Notes
Bob Yantosca (Harvard) | holy7c01106.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1200.117 | 48795 (13:33:14) | 14155 (03:55:55) | 3.447 | 86.18% | BPCH_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c11216.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2499.984 | 49472 (13:44:31) | 14335 (03:58:55) | 3.451 | 86.28% | NC_DIAG=y TIMERS=1 -ffast-math
Bob Yantosca (Harvard) | holy7c01113.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 2915.554 | 56866 (15:47:46) | 14810 (04:06:50) | 3.840 | 95.99% | NC_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c17111.rc.fas.harvard.edu, ifort 17.0.4 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1631.191 | 55835 (15:30:35) | 15553 (04:19:13) | 3.590 | 89.75% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c05310.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1293.222 | 56953 (15:49:13) | 16097 (04:28:17) | 3.54 | 88.45% | BPCH_DIAG=y TIMERS=1
Bob Yantosca (Harvard) | holy7c17102.rc.fas.harvard.edu, gfortran 7.1.0 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz | 1500.433 | 59050 (16:24:10) | 17037 (04:43:57) | 3.466 | 86.65% | NC_DIAG=y TIMERS=1

Graphs of 7-model-day run times

Scalability across cores

[Figure: Scalability 7day 12.0.0.png]

The above plot shows the run times for various GEOS-Chem "Classic" 12.0.0 configurations. Bob Yantosca generated the above plot using the timing results listed in the table above. For a more in-depth analysis of these timing results, please see this presentation.
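
As a rough illustration of how scalability can be read from the tables, the sketch below computes the speedup relative to the 4-core run for the ifort 17.0.4, BPCH_DIAG=y wall times listed above. This is only an example calculation, not the analysis behind the plot or the presentation.

```shell
#!/bin/bash
# Wall times [s] for the ifort 17.0.4, BPCH_DIAG=y runs in the tables above,
# written as "cores:wall_seconds".
runs="4:15553 8:8738 12:6666 16:6933 24:5181 32:4654"

speedups=$(echo "$runs" | tr ' ' '\n' | awk -F: '
    NR == 1 { base_cores = $1; base_wall = $2 }   # first entry is the baseline
    {
        speedup = base_wall / $2      # how much faster than the baseline run
        ideal   = $1 / base_cores     # what perfect scaling would give
        printf "%2d cores: speedup %.2f (ideal %.2f)\n", $1, speedup, ideal
    }')

echo "$speedups"
```

Comparing each speedup with its ideal value shows how far a configuration falls from perfect scaling as cores are added.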

--Bob Yantosca (talk) 20:32, 16 August 2018 (UTC)

Time spent in each GEOS-Chem operation

[Figure: Ops gf bpch 12.0.0.png]

[Figure: Ops gf nc 12.0.0.png]

The above plots show how much time is spent in each GEOS-Chem operation for several different GEOS-Chem "Classic" 12.0.0 configurations. Bob Yantosca generated the above plots using the timing results listed in the table above. For a more in-depth analysis of these timing results, please see this presentation.

--Bob Yantosca (talk) 20:43, 16 August 2018 (UTC)

GCHP timing tests

Help us pool performance information across systems by contributing your GCHP run information on our GCHP Timing Tests page.

--Bob Yantosca (talk) 17:58, 17 August 2018 (UTC)