Timing tests with GEOS-Chem v10-01
On this page we show the results of timing tests done with GEOS-Chem v10-01.
Overview
The GEOS-Chem Support Team has created a timing test package that you can use to determine the performance of GEOS-Chem on your system. The timing test runs the GEOS-Chem v10-01 public release code for 7 model days with the "benchmark" chemistry mechanism. Our experience has shown that a 7-day simulation gives a more accurate timing result than a 1-day simulation. This is because much of the file I/O (i.e. HEMCO reading annual or monthly-mean emissions fields) occurs on the first day of a run, so this one-time cost weighs disproportionately on a 1-day run.
Installation
To install the timing test package on your system, use:
wget "ftp://ftp.as.harvard.edu/gcgrid/geos-chem/7day_tests/gc_timing_v10.tar.gz"
tar xvzf gc_timing_v10.tar.gz
Compilation
To build the code, follow these steps:
cd gc_timing/run.v10-01
make realclean
make -j4 mpbuild > log.build
Performing the timing test
To run the code, follow the instructions in the gc_timing/run.v10-01/README file. We have provided sample run scripts that you can use to submit jobs:
gc_timing/run.v10-01/doTimeTest # Submit job directly
gc_timing/run.v10-01/doTimeTest.slurm # Submit job using the SLURM scheduler
The regular GEOS-Chem output as well as timing information will be sent to a log file named:
doTimeTest.log.ID
where ID is either the SLURM job ID # or the process ID.
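The naming rule above can be sketched in shell. This is only an illustration: it assumes the job ID is read from the standard SLURM_JOB_ID environment variable, the function name gc_log_name is hypothetical, and the actual doTimeTest scripts may build the name differently.

```shell
# Hypothetical helper (not part of the timing test package).
# Builds the log file name from the SLURM job ID when running under SLURM,
# otherwise from the process ID of the current shell.
gc_log_name() {
  # SLURM sets SLURM_JOB_ID inside a job; $$ is the shell's process ID.
  echo "doTimeTest.log.${SLURM_JOB_ID:-$$}"
}
```

For example, with SLURM_JOB_ID set to 12345 the function prints doTimeTest.log.12345.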
Displaying the test results
You can print out the timing results with the printTime script:
cd gc_timing/run.v10-01
./printTime doTimeTest.log.ID
which will display results similar to this:
GEOS-Chem Time Test output
====================================================================
Machine or node name: : holyseas04.rc.fas.harvard.edu
CPU vendor : AuthenticAMD
CPU model name : AMD Opteron(tm) Processor 6376
CPU speed [MHz] : 2300.078
Number of CPUs used : 8
Simulation start date : 20130701 000000
Simulation end date : 20130708 000000
Total CPU time [s] : 55287.61
Wall clock time [s] : 7999.61
CPU / Wall ratio : 6.9113
% of ideal performace : 86.39
You can then use these results to fill in the table below.
--Bob Yantosca (talk) 19:06, 30 November 2015 (UTC)
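The last two lines of the sample printTime output in this section can be reproduced from the numbers above them. The sketch below assumes that the CPU / Wall ratio is the total CPU time divided by the wall clock time, and that the percentage of ideal performance is that ratio divided by the number of CPUs, times 100. These formulas are inferred from the sample output rather than taken from the printTime source, and the function names are hypothetical.

```shell
# Hypothetical helpers; the formulas are inferred from the sample output.

# CPU / Wall ratio = total CPU time [s] / wall clock time [s]
gc_cpu_wall_ratio() {
  awk -v cpu="$1" -v wall="$2" 'BEGIN { printf "%.4f\n", cpu / wall }'
}

# % of ideal = 100 * (CPU / Wall ratio) / number of CPUs
gc_pct_ideal() {
  awk -v cpu="$1" -v wall="$2" -v n="$3" \
    'BEGIN { printf "%.2f\n", 100 * cpu / (wall * n) }'
}

gc_cpu_wall_ratio 55287.61 7999.61     # prints 6.9113
gc_pct_ideal      55287.61 7999.61 8   # prints 86.39
```

A value of 100% would mean that all 8 CPUs were kept busy for the entire wall clock time of the run.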
Table of 7-model-day run times
The following timing test results were done with the "out-of-the-box" GEOS-Chem v10-01 public release code configuration.
- All jobs used GEOS-FP meteorology at 4° x 5° resolution.
- Jobs started on model date 2013/07/01 00:00 GMT and finished on 2013/07/08 00:00 GMT.
- The code was compiled from the run directory (run.v10-01) with the standard option make -j4 mpbuild. This sets the following compilation variables:
- MET=geosfp GRID=4x5 CHEM=benchmark UCX=y NO_REDUCED=n TRACEBACK=n BOUNDS=n FPE=n DEBUG=n NO_ISO=n NEST=n
- Wall clock times are listed from fastest to slowest, for the same number of CPUs. (Bands of white and cyan in the table indicate different number of CPUs.)
- It's OK to round CPU and wall clock times to the nearest second, for clarity.
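The table lists each time both in seconds and as hh:mm:ss. A minimal sketch of rounding to the nearest second and converting, assuming awk is available; gc_seconds_to_hms is a hypothetical helper name, not part of the timing test package:

```shell
# Hypothetical helper: round a time in seconds to the nearest second
# and print it as hh:mm:ss.
gc_seconds_to_hms() {
  awk -v s="$1" 'BEGIN {
    t = int(s + 0.5)    # round to the nearest second
    printf "%02d:%02d:%02d\n", int(t / 3600), int((t % 3600) / 60), t % 60
  }'
}

gc_seconds_to_hms 55287.61   # prints 15:21:28
gc_seconds_to_hms 7999.61    # prints 02:13:20
```

Hours are not wrapped at 24, so CPU times longer than one day remain readable.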
Submitter | Machine or Node and Compiler | CPU vendor | CPU model | Speed [MHz] | # of CPUs | CPU time | Wall time | CPU / Wall ratio | % of ideal |
---|---|---|---|---|---|---|---|---|---|
Mat Evans (York/NCAS) | earth0.york.ac.uk ifort Version 13.0.1.117 | GenuineIntel / SGU UV-2000 | Intel(R) Xeon(R) CPU E5-4650L 0 @ 2.60GHz | 2600.153 | 64 | 98821.79 s 27:27:01 | 1841.46 s 00:30:41 | 53.6649 | 83.85 |
Luke Schiferl (MIT) | hopper.louvre.mit.edu ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz | 2497.656 | 48 | 47530 s 13:12:10 | 1350 s 00:22:30 | 35.2186 | 73.37 |
Dylan Jones (U. Toronto) | fujita02.atmosp.physics.utoronto.ca ifort 13.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz | 2599.951 | 32 | 44498.61 s 12:21:39 | 1618.98 s 00:26:59 | 27.4856 | 85.89 |
Yanko Davila (CU Boulder) | node39 ifort 11.1.069 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz | 2400.00 | 32 | 40000.39 s 11:06:40 | 1641.58 s 00:27:22 | 24.367 | 76.15 |
Mat Evans (York/NCAS) | earth0.york.ac.uk ifort Version 13.0.1.117 | GenuineIntel / SGU UV-2000 | Intel(R) Xeon(R) CPU E5-4650L 0 @ 2.60GHz | 2600.153 | 32 | 49170.2 s 13:39:30 | 1775.27 s 00:29:34 | 27.6973 | 86.55 |
Dylan Jones (U. Toronto) | gpc-f148n065-ib0 ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz | 1200.000 | 32 | 47187.63 s 13:06:28 | 2003.3 s 00:33:23 | 23.5546 | 73.61 |
Jenny Fisher (U. Wollongong) | hpcn11.local ifort 2015 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2300.055 | 32 | 84236.18 s 23:23:56 | 3217.73 s 00:53:38 | 26.1788 | 81.81 |
Luke Schiferl (MIT) | hopper.louvre.mit.edu ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz | 2500.000 | 24 | 29985 s 08:19:45 | 1519 s 00:25:19 | 19.7377 | 82.24 |
Luke Schiferl (MIT) | turner.louvre.mit.edu ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU X5675 @ 3.07GHz | 3068.000 | 24 | 38914 s 10:48:34 | 1965 s 00:32:45 | 19.8021 | 82.51 |
Yanko Davila (CU Boulder) | node30 ifort 11.1.069 | GenuineIntel | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz | 2670.00 | 24 | 42881.39 s 11:54:41 | 2262.19 s 00:37:42 | 18.9557 | 78.98 |
Huang Shan (Tsinghua) | yxw.tsinghua.edu.cn | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz | 2799.978 | 20 | 24062 s 06:41:02 | 1422 s 00:23:42 | 16.9264 | 84.63 |
Junwei Xu (Dalhousie) | newnode7 ifort 11.1 | GenuineIntel | Intel(R) Xeon(R) CPU X5660 @ 2.80GHz | 2801.000 | 20 | 37485 s 10:24:45 | 2481 s 00:41:21 | 15.1095 | 75.55 |
Huang Shan (Tsinghua) | yxw.tsinghua.edu.cn | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz | 2799.978 | 16 | 20523 s 05:42:03 | 1479 s 00:24:39 | 13.8802 | 86.75 |
Melissa Sulprizio (GCST) | regal18.rc.fas.harvard.edu ifort 11.1.069 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20 GHz | 2199.822 | 16 | 24559 s 06:49:19 | 1866 s 00:31:06 | 13.1594 | 82.25 |
Mat Evans (York/NCAS) | earth0.york.ac.uk ifort Version 13.0.1.117 | GenuineIntel / SGU UV-2000 | Intel(R) Xeon(R) CPU E5-4650L 0 @ 2.60GHz | 2600.153 | 16 | 29962.15 s 08:19:22 | 2088.55 s 00:34:48 | 14.3459 | 89.66 |
Mat Evans (York/NCAS) | earth0.york.ac.uk ifort Version 16.0.1 | GenuineIntel / SGU UV-2000 | Intel(R) Xeon(R) CPU E5-4650L 0 @ 2.60GHz | 2600.153 | 16 | 29551 s 08:12:31 | 2064 s 00:34:25 | 14.3459 | 89.45 |
Jenny Fisher (U. Wollongong / NCI) | r3199 (Raijin @ NCI) ifort 12.1.9.293 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz | 2601.00 | 16 | 22150.52 s 06:09:11 | 2660.78 s 00:44:20 | 12.4368 | 77.73 |
Melissa Sulprizio (GCST) | fry-02.as.harvard.edu ifort 11.1.069 | GenuineIntel | Westmere E56xx/L56xx/X56xx (Nehalem-C) | 2925.998 | 16 | 35221 s 9:47:01 | 2734 s 00:45:34 | 12.8978 | 80.61 |
Dylan Jones (U. Toronto) | gpc-f145n006-ib0 ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5540 @ 2.53GHz | 2533.553 | 16 | 41140.44 s 11:25:40 | 3132.50 s 00:52:12 | 13.1334 | 82.08 |
Junwei Xu (Dalhousie) | cl081.dal.acenet.ca ifort 11.1 | AuthenticAMD | Quad-Core AMD Opteron(tm) Processor 8384 | 2700.212 | 16 | 43277.22 s 12:01:17 | 3471.67 s 00:57:51 | 12.4658 | 77.90 |
Jenny Fisher (U. Wollongong) | hpcn11.local ifort 2015 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2299.992 | 16 | 50992.52 s 14:09:53 | 3725.06 s 01:02:05 | 13.689 | 85.56 |
Dylan Jones (U. Toronto) | fujita07.atmosp.physics.utoronto.ca ifort 11.1.080 | GenuineIntel | Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 2394.067 | 16 | 53232.25 s 14:47:12 | 3844.04 s 01:04:04 | 13.848 | 86.55 |
Luke Schiferl (MIT) | hopper.louvre.mit.edu ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz | 2197.558 | 12 | 16879 s 04:39:49 | 1639 s 00:27:19 | 10.2994 | 85.83 |
Huang Shan (Tsinghua) | yxw.tsinghua.edu.cn | GenuineIntel | Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz | 2799.978 | 12 | 18410 s 05:06:50 | 1724 s 00:28:44 | 10.6758 | 88.97 |
Melissa Sulprizio (GCST) | regal18.rc.fas.harvard.edu ifort 11.1.069 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20 GHz | 2199.822 | 12 | 21718 s 06:01:58 | 2127 s 00:35:27 | 10.2086 | 85.07 |
Melissa Sulprizio (GCST) | fry-02.as.harvard.edu ifort 11.1.069 | GenuineIntel | Westmere E56xx/L56xx/X56xx (Nehalem-C) | 2925.998 | 12 | 25443 s 7:04:03 | 2575 s 00:42:55 | 9.9881 | 82.34 |
Luke Schiferl (MIT) | turner.louvre.mit.edu ifort 12.1.3 | GenuineIntel | Intel(R) Xeon(R) CPU X5675 @ 3.07GHz | 3068.000 | 12 | 32342 s 08:59:02 | 2989 s 00:49:49 | 10.8211 | 90.18 |
Karl Seltzer/Barron Henderson (Duke/UF) | c6a-s12.ufhpc ifort 12.1.5 | AuthenticAMD | AMD Opteron(tm) Processor 6378 | 2400.038 | 12 | 41023 s 11:23:44 | 4268 s 01:11:08 | 9.6108 | 80.09 |
Zahra Hosseini (RWDI) | private PGI 14.7 (optimization -O1) | GenuineIntel | Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz | 2600.157 | 12 | 68523 s 19:02:03 | 6319 s 01:45:19 | 10.8440 | 90.37 |
Jenny Fisher (U. Wollongong / NCI) | r105 (Raijin @ NCI) ifort 12.1.9.293 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz | 2601.00 | 8 | 18535.98 s 05:08:56 | 2660.78 s 00:44:21 | 6.9664 | 87.08 |
Mat Evans (York/NCAS) | earth0.york.ac.uk ifort Version 13.0.1.117 | GenuineIntel / SGU UV-2000 | Intel(R) Xeon(R) CPU E5-4650L 0 @ 2.60GHz | 2600.153 | 8 | 20082 s 05:34:42 | 2681 s 00:44:40 | 7.4884 | 93.61 |
Melissa Sulprizio (GCST) | regal17.rc.fas.harvard.edu ifort 11.1.069 | GenuineIntel | Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20 GHz | 2199.849 | 8 | 20398 s 05:39:58 | 2837 s 00:47:17 | 7.2045 | 90.06 |
Melissa Sulprizio (GCST) | fry-01.as.harvard.edu ifort 11.1.069 | GenuineIntel | Westmere E56xx/L56xx/X56xx (Nehalem-C) | 2925.998 | 8 | 23048 s 06:24:08 | 3312 s 00:55:12 | 6.9611 | 87.01 |
Bob Yantosca (GCST) | fry-01.as.harvard.edu ifort 11.1.069 | GenuineIntel | Westmere E56xx/L56xx/X56xx (Nehalem-C) | 2925.998 | 8 | 24234 s 06:43:54 | 3456 s 00:57:36 | 7.0114 | 87.64 |
Bob Yantosca (GCST) | fry-02.as.harvard.edu ifort 11.1.069 | GenuineIntel | Westmere E56xx/L56xx/X56xx (Nehalem-C) | 2925.998 | 8 | 25222 s 07:00:22 | 3583 s 00:59:43 | 7.0397 | 88.0 |
Prasad Kasibhatla (Duke University) | fire.nicholas.duke.edu ifort 12.1.4 | GenuineIntel | Intel(R) Xeon(R) CPU X5460 @ 3.16 GHz | 3158.993 | 8 | 29337 s 08:08:57 | 4317 s 01:11:57 | 6.7961 | 84.95 |
Bob Yantosca (GCST) | holyseas03.rc.fas.harvard.edu ifort 11.1.069 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2300.024 | 8 | 32972 s 09:09:32 | 5054 s 01:24:14 | 6.5241 | 81.55 |
Jenny Fisher (U. Wollongong) | hpcn01.local ifort 2015 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2299.983 | 8 | 37536.33 s 10:25:36 | 5146.54 s 01:25:47 | 7.2935 | 91.17 |
Bob Yantosca (GCST) | holyseas02.rc.fas.harvard.edu ifort 11.1.069 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2300.054 | 8 | 33722 s 09:22:02 | 5281 s 01:28:01 | 6.385 | 79.81 |
Melissa Sulprizio (GCST) | holyseas01.rc.fas.harvard.edu ifort 11.1.069 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2299.936 | 8 | 37379 s 10:22:59 | 5477 s 01:31:17 | 6.8353 | 85.44 |
Dylan Jones (U. Toronto) | fujita04.atmosp.physics.utoronto.ca ifort 11.1.080 | GenuineIntel | Intel(R) Xeon(R) CPU E5410 @ 2.33GHz | 2327.499 | 8 | 43270.24 s 12:01:10 | 6033.45 s 01:40:33 | 7.1717 | 89.65 |
Karl Seltzer/Barron Henderson (Duke/UF) | c6a-s12.ufhpc ifort 12.1.5 | AuthenticAMD | AMD Opteron(tm) Processor 6378 | 2399.936 | 8 | 35988 s 9:59:48 | 6137 s 01:42:17 | 5.8641 | 73.3 |
Bob Yantosca (GCST) | holyseas04.rc.fas.harvard.edu ifort 11.1.069 | AuthenticAMD | AMD Opteron(tm) Processor 6376 | 2300.078 | 8 | 55288 s 15:21:28 | 8000 s 02:13:20 | 6.9113 | 86.39 |
Junwei Xu (Dalhousie) | dal.acenet.ca ifort 11.1 | AuthenticAMD | Quad-Core AMD Opteron(tm) Processor 8384 | 2700.000 | 1 | 23443 s 06:30:43 | 23645 s 06:34:05 | 0.9915 | 99.15 |
A quick glance at the table shows that timing tests that used the Intel Fortran Compiler to compile GEOS-Chem generally ran more slowly on machines with AMD CPUs than on machines with Intel CPUs at the same CPU count. This is a long-standing issue: the Intel Fortran Compiler is known to optimize best on GenuineIntel CPUs.
--Bob Yantosca (talk) 16:46, 11 December 2015 (UTC)
Graph of 7-model-day run times
The plot below is a graphical representation of the table from the above section.
As you can see, the timing tests that were done using AMD CPUs were generally slower than those done using Intel CPUs. This long-known issue is caused by the Intel Fortran Compiler not optimizing well on non-Intel chips.
Therefore, if you are contemplating purchasing a new machine for GEOS-Chem simulations, we recommend that you purchase one with Intel CPUs. This will give you the best performance.
--Bob Yantosca (talk) 19:21, 17 December 2015 (UTC)
For more information
For more information, please see our Guide to GEOS-Chem performance.
--Bob Yantosca (talk) 21:17, 15 December 2016 (UTC)