Profiling GEOS-Chem: Difference between revisions

From Geos-chem
Jump to navigation Jump to search
Line 29: Line 29:
|[[PGI Fortran compiler|PGI Fortran compiler (<tt>pgfortran</tt>)]]
|[[PGI Fortran compiler|PGI Fortran compiler (<tt>pgfortran</tt>)]]
|<tt>-pg</tt>
|<tt>-pg</tt>
|}
Once you have compiled GEOS-Chem with <tt>GPROF=y</tt>, you can run GEOS-Chem as you normally would.  GEOS-Chem will create a file called <tt>gmon.out</tt> containing the profiling information.  To convert this file to text table, you need to run the <tt>gprof</tt> program.  The syntax for calling <tt>gprof</tt> is:
gprof [optional commands] [name of GEOS-Chem executable file] gmon.out > [name of text file with results]
To create a simple table of time spent in each GEOS-Chem subroutine (also called a [https://sourceware.org/binutils/docs/gprof/Flat-Profile.html#Flat-Profile flat profile]), you can call <tt>gprof</tt> as follows:
gprof geos.mp gmon.out > GEOS-Chem_Profile_Results.txt
where <tt>geos.mp</tt> is the GEOS-Chem executable file. (It may have a different name than <tt>geos.mp</tt> depending on how you compiled GEOS-Chem, in which run directory.) The <tt>GEOS-Chem_Profile_results.txt</tt> file will contain output similar to this:
Flat profile:
Each sample counts as 0.01 seconds.
  %  cumulative  self              self    total         
  time  seconds  seconds    calls  Ks/call  Ks/call  name   
  16.50    801.03  801.03 11317370    0.00    0.00  gckpp_linearalgebra_mp_kppdecomp_
  9.63  1268.29  467.26 43751785    0.00    0.00  gckpp_linearalgebra_mp_kppsolve_
  6.29  1573.53  305.24 26835672    0.00    0.00  gckpp_function_mp_fun_
  4.09  1772.06  198.53    17402    0.00    0.00  tpcore_fvdas_mod_mp_fzppm_
  4.04  1967.96  195.90  6834149    0.00    0.00  gckpp_jacobian_mp_jac_sp_
  2.86  2106.99  139.03      72    0.00    0.04  flexchem_mod_mp_do_flexchem_
  2.23  2215.23  108.25 182368327    0.00    0.00  gckpp_linearalgebra_mp_waxpy_
  2.20  2322.10  106.87    45915    0.00    0.00  convection_mod_mp_do_merra_convection_
  2.05  2421.82    99.72      648    0.00    0.00  dao_mod_mp_airqnt_
  2.05  2521.43    99.61    3051    0.00    0.00  fast_jx_mod_mp_blkslv_
  2.05  2620.81    99.38    3487    0.00    0.00  fast_jx_mod_mp_opmie_
  1.80  2708.34    87.53      144    0.00    0.00  convection_mod_mp_do_convection_
  1.78  2794.64    86.30    6533    0.00    0.00  vdiff_mod_mp_vdiff_
  ... etc ...
The columns
{| border=1 cellspacing=0 cellpadding=5
|-valign="top" bgcolor="#CCCCCC"
!width="200px"|Column
!width="800px"|Description
|-valign="top"
|<tt>% time</tt>
|This is the percentage of the total execution time your program spent in this function. These should all add up to 100%.
|-valign="top"
|<tt>cumulative seconds</tt>
|This is the cumulative total number of seconds the computer spent executing this functions, plus the time spent in all the functions above this one in this table.
|-valign="top"
|<tt>self seconds</tt>
|This is the number of seconds accounted for by this function alone. The flat profile listing is sorted first by this number.
|-valign="top"
|<tt>calls</tt>
|This is the total number of times the function was called. If the function was never called, or the number of times it was called cannot be determined (probably because the function was not compiled with profiling enabled), the calls field is blank.
|-valign="top"
|<tt>self ms/call</tt>
|This represents the average number of milliseconds spent in this function per call, if this function is profiled. Otherwise, this field is blank for this
function.
|-valign="top"
|<tt>total ms/call</tt>
|This represents the average number of milliseconds spent in this function and its descendants per call, if this function is profiled. Otherwise, this field is blank for this function. This is the only field in the flat profile that uses call graph analysis.
|-valign="top"
|<tt>name</tt>
|This is the name of the function. The flat profile is sorted by this field alphabetically after the self seconds and calls fields are sorted.


|}
|}

Revision as of 17:59, 14 December 2016

File:Page is under construction.jpg

Overview

Profiling GEOS-Chem with gprof

The GNU profiler (gprof) is a free, open source software package that you can use to determine which subroutines of GEOS-Chem are taking the most time. If your computer system has the GNU Compiler Collection (GCC)] already installed (as do most modern Unix/Linux clusters), then gprof should also be available for you to use.

GEOS-Chem v11-01 and higher versions have support for gprof. To profile GEOS-Chem, you must compile with the GPROF=y Makefile option, e.g.:

make -j4 GPROF=y ...etc. other makefile options ...

This will add the following compiler switches to the compilation sequence:

Compiler Switch to invoke gprof
Intel Fortran compiler (ifort) -p
GNU Fortran compiler (gfortran) -pg
PGI Fortran compiler (pgfortran) -pg

Once you have compiled GEOS-Chem with GPROF=y, you can run GEOS-Chem as you normally would. GEOS-Chem will create a file called gmon.out containing the profiling information. To convert this file to text table, you need to run the gprof program. The syntax for calling gprof is:

gprof [optional commands] [name of GEOS-Chem executable file] gmon.out > [name of text file with results]

To create a simple table of time spent in each GEOS-Chem subroutine (also called a flat profile), you can call gprof as follows:

gprof geos.mp gmon.out > GEOS-Chem_Profile_Results.txt

where geos.mp is the GEOS-Chem executable file. (It may have a different name than geos.mp depending on how you compiled GEOS-Chem, in which run directory.) The GEOS-Chem_Profile_results.txt file will contain output similar to this:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  Ks/call  Ks/call  name    
 16.50    801.03   801.03 11317370     0.00     0.00  gckpp_linearalgebra_mp_kppdecomp_
  9.63   1268.29   467.26 43751785     0.00     0.00  gckpp_linearalgebra_mp_kppsolve_
  6.29   1573.53   305.24 26835672     0.00     0.00  gckpp_function_mp_fun_
  4.09   1772.06   198.53    17402     0.00     0.00  tpcore_fvdas_mod_mp_fzppm_
  4.04   1967.96   195.90  6834149     0.00     0.00  gckpp_jacobian_mp_jac_sp_
  2.86   2106.99   139.03       72     0.00     0.04  flexchem_mod_mp_do_flexchem_
  2.23   2215.23   108.25 182368327     0.00     0.00  gckpp_linearalgebra_mp_waxpy_
  2.20   2322.10   106.87    45915     0.00     0.00  convection_mod_mp_do_merra_convection_
  2.05   2421.82    99.72      648     0.00     0.00  dao_mod_mp_airqnt_
  2.05   2521.43    99.61     3051     0.00     0.00  fast_jx_mod_mp_blkslv_
  2.05   2620.81    99.38     3487     0.00     0.00  fast_jx_mod_mp_opmie_
  1.80   2708.34    87.53      144     0.00     0.00  convection_mod_mp_do_convection_
  1.78   2794.64    86.30     6533     0.00     0.00  vdiff_mod_mp_vdiff_
  ... etc ...

The columns

Column Description
% time This is the percentage of the total execution time your program spent in this function. These should all add up to 100%.
cumulative seconds This is the cumulative total number of seconds the computer spent executing this functions, plus the time spent in all the functions above this one in this table.
self seconds This is the number of seconds accounted for by this function alone. The flat profile listing is sorted first by this number.
calls This is the total number of times the function was called. If the function was never called, or the number of times it was called cannot be determined (probably because the function was not compiled with profiling enabled), the calls field is blank.
self ms/call This represents the average number of milliseconds spent in this function per call, if this function is profiled. Otherwise, this field is blank for this

function.

total ms/call This represents the average number of milliseconds spent in this function and its descendants per call, if this function is profiled. Otherwise, this field is blank for this function. This is the only field in the flat profile that uses call graph analysis.
name This is the name of the function. The flat profile is sorted by this field alphabetically after the self seconds and calls fields are sorted.

Profiling GEOS-Chem with TAU

Information to be added