Difference between revisions of "User talk:Salvatore Farina"

From Geos-chem
(Overview)
(moved the page to a new home.)
 
This page describes how to acquire the latest source code, data, and libraries required to build and run GEOS-Chem with TOMAS microphysics on the ace-net glooscap cluster.
+
If you're here, you're probably looking for the [[TOMAS setup guide]]!
 
+
== Overview ==
+
 
+
The latest public release of GEOS-Chem with TOMAS does not include many of the recent developments in aerosol science.  It also cannot take advantage of parallel computing technologies.
+
However, the 'bleeding edge' code includes many recent GEOS-Chem/TOMAS developments that are missing from the public release, including support for parallel computing.
+
 
+
== Getting Set Up ==
+
 
+
=== Code ===
+
You can grab the absolute latest code from my source directory on glooscap:
+
cp -r /home/sfarina/source/GC_Bleeding_Edge/ ~
+
 
+
Or (safer), you can grab my latest "snapshot":
+
cp /home/sfarina/source/GC_BE_snapshot-latest.tgz .
+
 
+
=== Libraries ===
+
'''geos-chem-libraries-intel11''' is a bundle of software required to build and run the latest version of GEOS-Chem.
+
Included in this package:
+
* Intel Ifort Fortran compiler - v11.1 - required to build GEOS-Chem
+
* NetCDF - Network Common Data Format libraries - required to read and write certain datasets
+
* HDF5 - Hierarchical Data Format - required to read and write certain datasets
+
* other dependencies - required for NetCDF and HDF5
+
 
+
You can copy this folder as a tarball from /home/sfarina/gclibs.tgz or simply extract it directly to your home directory:
+
cd ~
+
tar zxvf /home/sfarina/gclibs.tgz
+
 
+
This will extract the libraries folder to your home directory.
+
 
+
=== Environment ===
+
In order to get the compiler to run and recognize the libraries described above, some environment variables must be set.  Below is an excerpt from my ''.bashrc''.
+
 
+
ROOT_LIBRARY_DIR="/home/sfarina/geos-chem-libraries-intel11"
+
GC_BIN=$ROOT_LIBRARY_DIR/bin
+
GC_INCLUDE=$ROOT_LIBRARY_DIR/include
+
GC_LIB=$ROOT_LIBRARY_DIR/lib
+
export GC_BIN
+
export GC_INCLUDE
+
export GC_LIB
+
+
export FC="ifort"
+
+
export LD_LIBRARY_PATH="/home/sfarina/geos-chem-libraries-intel11/lib"
+
export PATH="/home/sfarina/geos-chem-libraries-intel11/Compiler/11.1/080/bin/intel64:/home/sfarina/opt/bin:$PATH"
+
export LD_LIBRARY_PATH="/usr/local/gnu/lib64:/usr/local/gnu/lib:/home/sfarina/geos-chem-libraries-intel11/lib:/home/sfarina/geos-chem-libraries-intel11/Compiler/11.1/080/lib/intel64/:/home/sfarina/geos-chem-libraries-intel11/Compiler/11.1/080/idb/lib/intel64"
+
export INTEL_LICENSE_FILE="/home/sfarina/geos-chem-libraries-intel11/software/intel/Compiler/11.1/080/Licenses"
+
source /home/sfarina/geos-chem-libraries-intel11/Compiler/11.1/080/bin/ifortvars.sh intel64
+
+
ulimit -S -s unlimited
+
 
+
If you are using bash, you can copy and paste this into your ''.bashrc''. Once the compiler and libraries are installed in ''~/geos-chem-libraries-intel11'', change instances of ''sfarina'' to your username. Then reload your ''.bashrc'' and check the compiler version:
+
source ~/.bashrc
+
ifort --version
+
 
+
If ifort returns
+
ifort (IFORT) 11.1 20101201
+
you should be all set to start compiling
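
The version check above can also be scripted. The following is a minimal sketch, assuming the banner format shown above; the function names ''is_ifort_11_1'' and ''check_compiler'' are hypothetical and are not part of GEOS-Chem or the library bundle.

```shell
# Succeed only if the given version banner reports an 11.1 ifort.
# The expected prefix is taken from the example output shown above.
is_ifort_11_1() {
    case "$1" in
        "ifort (IFORT) 11.1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: query whichever ifort is first on PATH, if any.
check_compiler() {
    if ! command -v ifort >/dev/null 2>&1; then
        echo "ifort not found on PATH"
        return 1
    fi
    banner=$(ifort --version 2>/dev/null | head -n 1)
    if is_ifort_11_1 "$banner"; then
        echo "OK: $banner"
    else
        echo "unexpected compiler: $banner"
        return 1
    fi
}
```

Calling ''check_compiler'' after sourcing your ''.bashrc'' reports whether the compiler on your PATH is the 11.1 release.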
+
 
+
=== Data ===
+
To set up the necessary data for GEOS-Chem, simply
+
cd ~
+
ln -s /home/sfarina/data .
+
 
+
This will allow you to link to my data directory, which is mostly a collection of links to the data at ''/home/rmartin/group/ctm/'' with some changes due to recent GC development.
+
'''DO NOT''' copy this directory, as it is many many many gigabytes, and is probably beyond your disk quota on glooscap.
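
To confirm that ''~/data'' is a link and not an accidental copy, a small check like the one below may help. This is only a sketch; the function name ''describe_data_dir'' is made up for illustration.

```shell
# Report whether a path is a symbolic link (what `ln -s` creates),
# a real directory (a copy), or missing altogether.
describe_data_dir() {
    if [ -L "$1" ]; then
        echo "link"
    elif [ -d "$1" ]; then
        echo "copy"
    else
        echo "missing"
    fi
}
```

For example, ''describe_data_dir ~/data'' should print ''link'' after the ''ln -s'' step above.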
+
 
+
== Building GEOS-Chem/TOMAS ==
+
 
+
=== Compiler ===
+
Please note that the '''ONLY VERSION''' of the Intel compiler that reliably compiles a working executable of GEOS-Chem with TOMAS is version 11.1.
+
Installation is described above in the libraries section.
+
 
+
=== Make ===
+
Glooscap allows you to use multicore interactive shells to do heavy processing. I invoke a 16-core shell to build GEOS-Chem. Put this in your ''.bashrc'':
+
alias pshell16="qrsh -V -cwd -l h_rt=08:00:00 -l h_vmem=2.0G -l h_stack=12.5G -N IA_16 -pe openmp 16 bash"
+
alias pshell8="qrsh -V -cwd -l h_rt=08:00:00 -l h_vmem=2.0G -l h_stack=12.5G -N IA_8 -pe openmp 8 bash"
+
 
+
Then you can do
+
cd YOUR_CODE_DIR/GC_Bleeding_Edge/GeosCore
+
pshell16
+
make -j16 tomas40
+
 
+
This will build GEOS-Chem with 40-bin TOMAS using 16 processors at a time. As an added bonus, this will not choke up the rest of the users on glooscap.
+
 
+
The available target names are:
+
tomas                <--TOMAS 30
+
tomas12
+
tomas15
+
tomas40
+
 
+
==== Important! ====
+
When changing TOMAS versions, always always always do
+
make realclean
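
One way to make the clean-then-build habit harder to forget is to wrap both steps in a shell function. The sketch below maps a bin count to the target names listed above; ''tomas_target'' and ''rebuild_tomas'' are hypothetical names, and the default of 16 jobs simply mirrors the example above.

```shell
# Map a TOMAS bin count to the make target listed above
# (the plain "tomas" target is the 30-bin version).
tomas_target() {
    case "$1" in
        30) echo "tomas" ;;
        12|15|40) echo "tomas$1" ;;
        *) echo "unknown bin count: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: always clean before building the requested version.
# Usage: rebuild_tomas BINS [JOBS]
rebuild_tomas() {
    target=$(tomas_target "$1") || return 1
    make realclean && make -j"${2:-16}" "$target"
}
```

Run from the ''GeosCore'' directory inside a parallel shell, ''rebuild_tomas 15 8'' would clean and then build the 15-bin version with 8 jobs.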
+
 
+
== Running GEOS-Chem with TOMAS ==
+
 
+
=== Run Directories ===
+
There are run directories for each of the tomas versions at:
+
/net/samqfs/pierce/sfarina/standard_run_directories/
+
 
+
Copy the tarballs (named 40.tgz, 30.tgz, etc.) to a standard location. You can then do
+
tar zxvf YOUR_STANDARD_LOCATION/40.tgz
+
to extract the appropriate run directory to your current working directory. The folder will be named ''run.TOMASXX'', where ''XX'' is 12, 15, 30, or 40 depending on the version you would like to run.
+
 
+
Once you have the appropriate version of geostomas compiled and your run directory extracted, copy the executable to your run directory.
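
The extract-and-copy steps can be combined into one helper. This is a sketch under stated assumptions: ''stage_run_dir'' is a hypothetical name, and the path to the compiled executable is passed in as an argument because it depends on where your build puts it.

```shell
# Hypothetical helper: extract a run-directory tarball into DEST and copy
# the compiled executable into the resulting run.TOMASXX folder.
# Usage: stage_run_dir TARBALL VERSION EXECUTABLE [DEST]
stage_run_dir() {
    tarball=$1
    version=$2
    exe=$3
    dest=${4:-.}
    tar -xzf "$tarball" -C "$dest" || return 1
    cp "$exe" "$dest/run.TOMAS$version/"
}
```

For example, ''stage_run_dir YOUR_STANDARD_LOCATION/40.tgz 40 PATH_TO/geostomas'' leaves a ready-to-run ''run.TOMAS40'' in the current directory.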
+
 
+
=== input.geos ===
+
The input.geos file is where most of the runtime options for geoschem are configured.
+
There are currently no TOMAS-specific entries in the input.geos file, save for diagnostic output quantities.
+
Please see the [http://acmg.seas.harvard.edu/geos/doc/man/chapter_5.html#5.2.1 Users' Guide] for more information.
+
 
+
=== Submitting Jobs to the Parallel Queue ===
+
In each folder is a file called ''parallel.sh''. Below is a description of some of the parameters:
+
#!/bin/bash
+
#$ -S /bin/bash
+
. /etc/profile
+
#$ -o job_output
+
#$ -l h_rt=100:00:00                            #wall clock time requested from grid engine. Lower request times will have higher priority in the queue
+
#$ -l h_vmem=2.0G                              #vmem requested from grid engine. 2.0 is sufficient for all versions at 4x5 and TOMAS15 at 2x2.5 on 16 cores
+
#$ -l h_stack=12.5G                            #stack memory requested from grid engine
+
#$ -N RUN_NAM                                  #a name for your run
+
#$ -pe openmp 16                                #number of cores you are requesting from grid engine
+
#$ -cwd                                        #run the job from the current working directory
+
export OMP_NUM_THREADS=16                      #number of openMP threads
+
export KMP_STACKSIZE=500000000                  #stacksize memory limit for each thread
+
+
ulimit -t unlimited              # cputime
+
ulimit -f unlimited              # filesize
+
ulimit -c unlimited              # coredumpsize
+
ulimit -m unlimited              # memoryuse
+
ulimit -l unlimited              # memorylocked
+
+
cd YOUR_RUN_DIRECTORY
+
./geostomas > log
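
The script above states the core count twice, once in ''-pe openmp'' and once in ''OMP_NUM_THREADS''. A possible way to keep the two in step is to read the slot count that Grid Engine exports to the job as ''NSLOTS''. The sketch below assumes that variable is set inside the job; ''omp_threads'' is a hypothetical helper, and the fallback of 16 matches the request above.

```shell
# Print the slot count granted by Grid Engine if NSLOTS is set,
# otherwise fall back to the default given as the first argument (or 16).
omp_threads() {
    echo "${NSLOTS:-${1:-16}}"
}

OMP_NUM_THREADS=$(omp_threads 16)
export OMP_NUM_THREADS
```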
+
 
+
You'll need to edit it slightly (run name and working directory), then run:
+
qsub parallel.sh
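
If you set up many runs, the two edits (run name and working directory) can be applied with ''sed'' instead of by hand. This is a rough sketch that assumes the script keeps the ''#$ -N'' and ''cd'' lines in the form shown above; ''customize_job'' is a hypothetical name, and the values you pass must not contain ''|'' or ''&''.

```shell
# Hypothetical helper: rewrite the run name and working directory of a job
# script. Reads the script on stdin and writes the edited copy to stdout.
# Usage: customize_job RUN_NAME RUN_DIRECTORY < parallel.sh > my_job.sh
customize_job() {
    sed -e "s|^#[$] -N [^ ]*|#\$ -N $1|" \
        -e "s|^cd .*|cd $2|"
}
```

For example, ''customize_job t40_test "$PWD" < parallel.sh > my_job.sh'' writes an edited copy that can then be submitted with ''qsub my_job.sh''.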
+
 
+
You can check on the status in the queue with
+
qstat
+
 
+
You can watch the logfile output of your simulation with
+
tail -f log
+
 
+
With some minimal editing, you can find some summary information from your runs using the script here
+
/net/samqfs/pierce/sfarina/testruns/informed/hourstat.sh
+
 
+
== Developing ==
+
Writing for GEOS-Chem is pretty straightforward.  Please try to follow the [http://acmg.seas.harvard.edu/geos/doc/man/appendix_7.html style guide] as much as possible.  Most of TOMAS is contained within tomas_mod.F90, and you should be able to find what you need with a little work and a few invocations of ''grep''. If you can't find what you need, '''ask'''.
+
 
+
=== Version Control ===
+
Git! You should definitely use [http://git-scm.com/ git] to track your changes. I have a copy built & installed at ''/home/sfarina/opt/bin/git'' that you can probably either copy or just use.
+
==== Setup ====
+
I have a copy of git installed at
+
/home/sfarina/opt/bin
+
You can either use this executable or build it yourself from source. To use this executable, add the following to your .bashrc
+
export PATH="/home/sfarina/opt/bin:$PATH"
+
 
+
==== Branching and Commits ====
+
 
+
Once you have ''git'' installed, make a separate branch for yourself as soon as you make a copy of the code. This way we can easily trade and track updates, advances, and bugfixes.
+
git checkout -b MY_NEW_BRANCH
+
vi fictional_example_mod.F90
+
git status
+
git add fictional_example_mod.F90
+
git commit
+
 
+
==== Patching ====
+
If I make some new changes to my branch of code, you will need to do a patch and merge. My current branch in git is called '''tomasmerge'''.  If I provide you with '''update.patch''', this should do the trick:
+
git checkout tomasmerge
+
git apply update.patch
+
git checkout MY_BRANCH
+
git merge tomasmerge
+
 
+
==== Reference ====
+
There are many useful resources for git on the web. Here are some I found useful:
+
* [http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging Branching and Merging]
+
* [http://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/ Creating and Applying Patches]
+
* [http://lostechies.com/joshuaflanagan/2010/09/03/use-gitk-to-understand-git/ Understanding git through gitk]
+
 
+
=== Debugging ===
+
There are two major ways of debugging: inserting massive amounts of print statements, or using a debugger. Both are useful.
+
 
+
ifort comes with a debugger similar to gdb: iidb.
+
geos-chem-libraries-intel11/Compiler/11.1/080/bin/intel64/iidb
+
In order to use it, you must compile geostomas as follows
+
make realclean
+
make DEBUG=yes tomas
+
 
+
Apart from the debugger and normal print statements, TOMAS has a very useful builtin called ''DEBUGPRINT'' that prints the values of the TOMAS size bins in a big table.
+
 
+
== Post Processing ==
+
Now that you've successfully run the model, there are a few more hurdles to clear before you can inspect your data.
+
 
+
=== Installing IDL ===
+
Copy the IDL / gamap scripts from my home directory.
+
cp -r ~sfarina/IDL ~
+
 
+
Edit the following as needed, and add it to your .bashrc
+
IDL_STARTUP="/home/sfarina/IDL/idl_startup/idl_startup.pro"
+
IDL_DIR="/usr/local/itt/idl/idl80/"
+
IDL_PATH="$IDL_DIR:/home/sfarina/IDL"
+
module load idl/8.0
+
 
+
=== Processing ===
+
GEOS-Chem currently outputs all data in the form of a binary punch file (.bpch).  These files must be handled using IDL.  The process is outlined below:
+
 
+
==== Copy ====
+
Copy the relevant files to your postprocessing directory for a given run
+
ctm.bpch
+
diaginfo.dat
+
tracerinfo.dat
+
proc_one.pro
+
averageCNCCN_XX.py            <-- XX is TOMAS version
+
plotCNCCN.py
+
 
+
==== Split ====
+
Use the script Bpch_Sep_Sal interactively from within the IDL environment to split ctm.bpch into separate months.
+
For example, to extract August 2005 from ctm.bpch:
+
idl
+
> Bpch_Sep_Sal,'ctm.bpch','ctm.08.bpch',Tau0=nymd2tau(20050801)
+
> exit
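
The ''Tau0'' argument is a time stamp in hours. The sketch below reproduces that number outside IDL, assuming the GEOS-Chem convention that tau counts hours since 00:00 GMT on 1 January 1985. It is not the GAMAP implementation of ''nymd2tau''; the function names are hypothetical, and it only handles dates at 00:00 with a positive year.

```shell
# Days since 1970-01-01 for a Gregorian date, using integer arithmetic only.
# Usage: days_from_civil YEAR MONTH DAY
days_from_civil() {
    y=$1
    m=$2
    d=$3
    if [ "$m" -le 2 ]; then
        y=$((y - 1))                 # count January and February with the previous year
    fi
    era=$((y / 400))
    yoe=$((y - era * 400))
    mp=$(( (m + 9) % 12 ))           # March = 0 ... February = 11
    doy=$(( (153 * mp + 2) / 5 + d - 1 ))
    doe=$(( yoe * 365 + yoe / 4 - yoe / 100 + doy ))
    echo $(( era * 146097 + doe - 719468 ))
}

# Hours between 1985-01-01 00:00 and the given YYYYMMDD date at 00:00.
nymd_to_tau() {
    ymd=$1
    days=$(days_from_civil $((ymd / 10000)) $((ymd / 100 % 100)) $((ymd % 100)))
    epoch=$(days_from_civil 1985 1 1)
    echo $(( (days - epoch) * 24 ))
}
```

Under that assumption, ''nymd_to_tau 20050801'' prints 180408, the number of hours from the start of 1985 to the start of August 2005.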
+
 
+
==== Create netcdf output ====
+
Using the IDL script proc_one.pro, we extract information from the monthly .bpch files and save it in standard netCDF format.
+
Edit proc_one.pro to use the correct infile/outfiles
+
Execute proc_one from your shell:
+
idl proc_one.pro
+
 
+
==== Counting CN and CCN ====
+
Run averageCNCCN_XX.py, where XX is the model version
+
For example, to bin and average the August results from TOMAS15:
+
./averageCNCCN_15.py 08
+
 
+
==== Plotting the Results ====
+
Edit your directory name to be of the format YYY_run.TOMASXX, where YYY is a run number, and XX is the TOMAS version.
+
plotCNCCN.py will automatically detect the model version and customize map names.
+
To plot the surface and zonal average concentrations of CN3, CN10, CN40, and CN80 for August:
+
./plotCNCCN.py 08
+
 
+
Once you have completed this process, you will have a zonal and surface level map of CN3, CN10, CN40 and CN80 predicted by the model.
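
Because the plotting step relies on the directory name, it can be worth checking the name before running it. The sketch below is not the logic used inside ''plotCNCCN.py''; it only illustrates the ''YYY_run.TOMASXX'' pattern described above, and ''tomas_version_from_dir'' is a hypothetical name.

```shell
# Extract the TOMAS version (12, 15, 30 or 40) from a directory name of the
# form YYY_run.TOMASXX. Prints nothing and fails if the name does not match.
tomas_version_from_dir() {
    name=$(basename "$1")
    case "$name" in
        *_run.TOMAS12|*_run.TOMAS15|*_run.TOMAS30|*_run.TOMAS40)
            echo "${name##*TOMAS}" ;;
        *)
            return 1 ;;
    esac
}
```

For example, ''tomas_version_from_dir "$PWD"'' prints the version when the current directory follows the naming scheme and fails otherwise.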
+
 
+
==== NCview ====
+
You can also use ncview on the file ctm.nc to view individual species concentrations or nucleation rates.
+
ncview ctm.nc
+
ncview ctm_nuc.nc
+
 
+
== Other Advice / Issues ==
+
* If you have followed these instructions and GEOS-Chem crashes without any output, try (un)commenting the ''"welcome to geoschem"'' line and the ''call flush'' line that follows it in main.F
+
* I use the GNU Bourne Again SHell (bash).  I suggest you do the same.  The csh is fine, but I have written all of my scripts using bash. Your life will probably be easier if you use bash.
+
* It is a good idea to TAKE NOTES on the details of your simulations.
+
* Making a backup of your code and any important files is a good idea. Making two backups is a better idea.
+
* If you have any questions or run into trouble, ''please ask'' me, Sajeev, or Jeff for help.  I can usually respond to emails within a day, and am willing to use gchat or Skype if need be.
+
 
+
--[[User:Salvatore Farina|Salvatore Farina]] 17:28, 25 July 2013 (EDT)
+

Latest revision as of 15:57, 26 July 2013

If you're here, you're probably looking for the TOMAS setup guide!