GAMAP tips and tricks


On this page we list some very helpful tips on how to accomplish various tasks with the GAMAP routines. Please also see our GAMAP bugs and fixes page for a list of outstanding and resolved GAMAP bugs.

General GAMAP usage

GAMAP and IDL 7

Version 7 of IDL comes with a new development environment (called Workbench) based on Eclipse. If you do not use the IDL Workbench, you will not notice any change. If you do use it, you will see a major upgrade, but you may soon run into a problem with the !PATH definition. For GAMAP to work properly, we provide an idl_startup.pro template file for you to modify for your system. In this file, !PATH used to be defined like this:

IdlPath = Expand_Path( '+' + !PATH,  /All_Dirs ) 

!PATH       = EXPAND_PATH('+~/IDL/gamap2/', /ALL_DIRS) + ':' + $
              EXPAND_PATH('+~/IDL/idlpro/')            + ':' + $
              EXPAND_PATH('+~/IDL/jhuapl/')            + ':' + $
              IdlPath

Now, for GAMAP to work with the new IDL Workbench, you must use the following syntax instead:

; Call the IDL routine PATH_SEP to get the path separator token 
; (i.e. the character that you need to separate directories in !PATH). 
; This token is different depending on which OS you are using.
Sep     = Path_Sep( /Search_Path )

; Use the PREF_SET command to append your directories into the default
; IDL search path.  This approach is necessary if you want to use the 
; IDL Workbench in IDL 7.0 and later versions. (phs, bmy, 4/15/08)
PREF_SET, 'IDL_PATH',                                        $
           EXPAND_PATH('+~/IDL/gamap2/', /ALL_DIRS ) + Sep + $
           EXPAND_PATH('+~/IDL/idlpro/'            ) + Sep + $
           EXPAND_PATH('+~/IDL/jhuapl/'            ) + Sep + $
           '<IDL_DEFAULT>', /COMMIT

NOTES:

  1. The PREF_SET command was introduced in IDL 6.2.
  2. This syntax can be added to your idl_startup.pro file even if you do not use the IDL Workbench. More about that issue can be found here.
  3. You can add/delete directories to the IDL_PATH corresponding to the particular directory structure in your space. In this particular example we are adding the gamap2, idlpro, and jhuapl directories to IDL_PATH. If, for example, you don't have an idlpro directory, you can of course omit that.
  4. The EXPAND_PATH function should expand a relative file path (i.e. with a ~) to a full file path. However, there may be some flavors of Linux for which ~ might be problematic.
  5. Adding a + before the directory name in the call to EXPAND_PATH will make IDL search the directory and all of its subdirectories for files of the appropriate type (*.pro, *.sav) for the given path.
  6. Using the /ALL_DIRS keyword in the call to EXPAND_PATH, such as EXPAND_PATH( '+~/IDL/gamap2/', /ALL_DIRS ), will cause IDL to return the full path names of all of the subdirectories under ~/IDL/gamap2, regardless of whether or not they contain *.pro or *.sav files. (IDL's default behavior is to only return path names for those directories containing *.pro or *.sav files.)
  7. If the EXPAND_PATH function for some reason does not expand the filepaths, then you can specify them manually. Windows users might need to start the string with the drive letter (e.g. C:\).
  8. It is best to create the <IDL_DEFAULT> string with PREF_SET. Do not attempt to add the file paths directly to the IDLDE or IDL Workbench GUI.
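
To confirm that the path settings took effect, a quick check along the following lines may help. This is only a sketch: it assumes that gamap.pro sits somewhere under the directories added above, and it uses the standard IDL routines PREF_GET, STRSPLIT, and FILE_WHICH.

```idl
; Show the value of the IDL_PATH preference that PREF_SET committed
PRINT, PREF_GET( 'IDL_PATH' )

; Split !PATH into individual directories and count them
Dirs = STRSPLIT( !PATH, PATH_SEP( /SEARCH_PATH ), /EXTRACT )
PRINT, 'Number of directories in !PATH: ', N_ELEMENTS( Dirs )

; FILE_WHICH returns the full path of the first match found in !PATH,
; or an empty string if the file is not found (assumes gamap.pro is installed)
PRINT, FILE_WHICH( 'gamap.pro' )
```

If the last command prints an empty string, the GAMAP directories are probably not in the search path yet.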

--phs 16:42, 14 April 2008 (EDT)

--Bob Y. 13:49, 7 July 2008 (EDT)

Usage of TRACERINFO.DAT and DIAGINFO.DAT

User A said he keeps getting this message:

% UPDATE_TRACER_DATAINFO: WARNING: Use_DataInfo appears independent from global array!

User B said:

My ctm.bpch files have duplicate data blocks for some parameters. Something's wrong!

Answer:

These are symptoms of diaginfo.dat and/or tracerinfo.dat files that do not match the binary punch file (User A), or of a bpch file that was created with inadequate diaginfo.dat/tracerinfo.dat files and so ended up with, for example, duplicate 999 tracers without tracer names (User B). The solution is to use the correct diaginfo.dat and tracerinfo.dat files when reading and creating bpch files. GEOS-Chem now writes those files to your run directory for each run. If you use them when reading run output, you will not have a problem.

The first time a GAMAP routine is used, it looks for and reads these files, first in the current directory and then in the directories listed in the !PATH variable (which means it will fall back on the default ones in ../gamap2/input_files/). So, to read the files that correspond to your run, you can:

  1. start your IDL session in your run directory
  2. start your IDL anywhere, and then: cd, 'your_run_directory', before using GAMAP
  3. copy your diag/tracerinfo.dat from your run dir to your ../gamap2/input_files/ (i.e., overwrite default ones), before using GAMAP
  4. Advanced users can also, at any time in their session, force GAMAP to read specific diag/tracerinfo.dat files:
ctm_diaginfo, /all, /force_reading, filename='your_diaginfo.dat'
ctm_tracerinfo, /all, /force_reading, filename='your_tracerinfo.dat'

If you are using a default location, double check that it is really used with the following query:

print, file_which('tracerinfo.dat')

--phs 15:52, 4 April 2008 (EDT)

Date and time

This section has been split off into its own wiki page, Date and Time Computations with GAMAP.

Text manipulation with GAMAP

This section has been split off into its own wiki page, Text manipulation with GAMAP.

Memory management and CTM_MAKE_DATAINFO

When you call CTM_MAKE_DATAINFO, you create a few pointers and allocate memory for the data they point to. There are three possible scenarios for freeing that memory.

(1) By default, GAMAP keeps track of the pointers in a global structure, and you can clean up the memory by calling CTM_CLEANUP (with or without the keyword /NO_GC):

    Success = CTM_Make_DataInfo( Data, DataInfo, FileInfo, ... )
    CTM_WriteBpch, DataInfo, FileInfo, FileName=file
    ctm_cleanup

(2) If you use the keyword /NO_GLOBAL when calling CTM_MAKE_DATAINFO, the story is a little more subtle. Now GAMAP has no record of the created pointers. If you use CTM_CLEANUP without the keyword /NO_GC, everything is cleaned up because there is a call to heap_gc. But you also lose **all other** pointers and objects.

    Success = CTM_Make_DataInfo( Data, DataInfo, FileInfo, ..., /No_Global )
    CTM_WriteBpch, DataInfo, FileInfo, FileName=file
    ctm_cleanup

Note that if you do not call ctm_cleanup, or if you call it with /No_GC, the memory allocated by CTM_MAKE_DATAINFO remains allocated and the pointers that refer to it stay alive until you exit the routine (unless they are passed back). Once you are out of the routine, this memory remains allocated but is useless because it is inaccessible. In other words, you have a memory leak.

(3) So, if you want to keep some objects and/or pointers alive in your code (i.e., you do not want to call heap_gc or ctm_cleanup,/no_gc), you need to free only the created pointers as follows:

    Success = CTM_Make_DataInfo( Data, DataInfo, FileInfo, ..., /No_Global )
    CTM_WriteBpch, DataInfo, FileInfo, FileName=file
    ptr_free, DataInfo.data
    ptr_free, fileinfo.gridinfo

To free only the heap memory created by multiple calls to CTM_MAKE_DATAINFO, the procedure is:

  for D=0L, NTracers-1L do begin
     
     Success = CTM_Make_DataInfo( Data[*,*,D], DataInfo, FileInfo, ..., /No_Global )
     
     ArrDataInfo = D eq 0l ? [ DataInfo ] : [ ArrDataInfo, DataInfo ]
     
     if D ne Ntracers-1l then ptr_free, Fileinfo.gridinfo
     
  endfor
     
  CTM_WriteBpch, ArrDataInfo, FileInfo, FileName=OutFileName
     
  for d=0, n_elements(ArrDataInfo)-1l do ptr_free, ArrDataInfo[d].data
  ptr_free, fileinfo.gridinfo
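
To check whether pointers have really been freed, the standard IDL routines PTR_VALID and HELP can be used. The following is a minimal sketch that does not call GAMAP at all: the procedure name demo_ptr_free is hypothetical, and the pointer P merely stands in for one created by CTM_MAKE_DATAINFO.

```idl
pro demo_ptr_free

   ; Hypothetical stand-in for a pointer created by CTM_MAKE_DATAINFO
   P = PTR_NEW( FINDGEN( 10 ) )
   PRINT, 'Valid before PTR_FREE: ', PTR_VALID( P )

   ; Release the heap memory that P refers to
   PTR_FREE, P
   PRINT, 'Valid after PTR_FREE:  ', PTR_VALID( P )

   ; List any heap variables that are still allocated in this session
   HELP, /HEAP_VARIABLES

end
```

Calling HELP, /HEAP_VARIABLES before and after your own code is one way to spot heap variables that were left behind.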

--phs 10:28, 13 February 2008

GAMAP routines for statistical analysis

GAMAP ships with several routines for computing various statistical quantities:

CUM_TOTAL
Computes the cumulative total of an array.
MEAN
Computes the mean value of an array of any # of dimensions.
ORG_CORR
Calculates the reduced major-axis.
PERCENTILES
Computes percentiles of an array.
QQNORM
Sorts a data array, assigns an actual "probability", and calculates the expected deviation from the mean.
RUN_AV
Computes the running average or running total of a data vector.

Also, the IDL routine MOMENT computes the first four statistical moments of a data set: mean, variance, skewness, and kurtosis.
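
As a short illustration of MOMENT, the sketch below uses made-up sample data (the values 0 through 99). MOMENT returns a 4-element vector, and its SDEV keyword returns the standard deviation.

```idl
; Sample data (hypothetical): the values 0, 1, ..., 99
X = FINDGEN( 100 )

; MOMENT returns [ mean, variance, skewness, kurtosis ]
M = MOMENT( X )
PRINT, 'Mean     : ', M[0]
PRINT, 'Variance : ', M[1]
PRINT, 'Skewness : ', M[2]
PRINT, 'Kurtosis : ', M[3]

; The standard deviation is available through the SDEV keyword
M = MOMENT( X, SDEV=StdDev )
PRINT, 'Std. dev.: ', StdDev
```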

--Bob Y. 09:42, 8 July 2008 (EDT)

Binary punch file output

Reading binary punch files directly from IDL

The best way to read a GEOS-Chem binary punch file is by using the GAMAP main program gamap.pro. More experienced users may also use one of the lower-level routines (e.g. CTM_GET_DATA or CTM_GET_DATABLOCK) to read the file. However, if for any reason you need to read a binary punch file directly from IDL (i.e. without using GAMAP, CTM_GET_DATA, or CTM_GET_DATABLOCK), then you have a couple of options:

  1. The GAMAP routine BPCH_TEST (which is located in the file_io subdirectory of the main GAMAP directory) can be used to quickly parse a binary punch file. BPCH_TEST prints out the header information from the file as well as the min and max data values. This is very useful for debugging. You can look at the structure of the file gamap2/file_io/bpch_test.pro to see how the binary punch file is read.
  2. If you are more accustomed to working with R, GrADS, Matlab, or any other visualization package that takes netCDF files as input, then you might want to use the GAMAP routines BPCH2NC or BPCH2COARDS to first convert the binary punch file into netCDF format.

--Bob Y. 13:28, 23 May 2008 (EDT)

Splitting and joining binary punch files

GAMAP comes with two routines for working with GEOS-Chem binary punch file output directly. Routine BPCH_SEP will extract a data block from a big binary punch file and save it to a separate file:

Colette Heald (heald@atmos.colostate.edu) wrote:

I did a year-long run starting from 20011201 and all the output appears to have saved in one file labeled ctm.bpch.v7-04-13-tra54.2001120100
So first off I'm wondering, is there a way to save each monthly mean as a separate ctm.bpch file? Secondly, the output file won't read into gamap - I thought the file might be corrupt, but actually it seems to read in fine with bpch_test. When I do gamap, fi='<filename>', I get this error:
   Error (-261) reading file
   ctm.bpch.v7-04-13-tra54.2001120100
    in block 3912
   POINT_LUN: Negative position argument not allowed. Position: -2147377716,
   Unit: 21
   File: ctm.bpch.v7-04-13-tra54.2001120100
I get the same error when I try to use ctm_get_datablock to pull out one tracer. Do you think the file is salvageable? Have you seen this problem before? I'm wondering if the file is just too big...

Bob Yantosca (yantosca@seas.harvard.edu) replied:

Right now the code can't save to separate bpch files. However, you can split them up with bpch_sep.pro into individual files in post-processing. For example:
   pro split

      ; Splits into monthly bpch files (change as necessary)
  
      ; The big input file
      InFile = 'ctm.bpch.v8-01-01-geos5-Run0.2005010100'
      ctm_cleanup

      ; Split the data in the big file for 20050101 into a separate file
      bpch_sep, InFile, 'ctm.bpch.v8-01-01-geos5-Run0.20050101', $
         tau=nymd2tau( 20050101 )
   end

GAMAP routine BPCH_LINK will concatenate data blocks from several smaller bpch files and save them into a large bpch file.

   ; Consolidates data from the 'ctm.bpch.*' files
   ; into a single file named 'new.ctm.bpch'
   BPCH_LINK, 'ctm.bpch.*', 'new.ctm.bpch'

--Bmy 16:53, 22 April 2008 (EDT)

Renaming and Regridding APROD/GPROD restart files

When using secondary aerosols, you need a restart_gprod_aprod.YYYYMMDDhh file. Unlike the general restart file used for GEOS-Chem runs, it cannot simply be renamed and reused for another simulation date.

Renaming APROD/GPROD restart files

You must rewrite your restart_gprod_aprod.YYYYMMDDhh file so that the date in the filename is the same as the one in the data block headers. A new routine in GAMAP v2.12 will do it for you: ../gamap2/date_time/rewrite_agprod.pro

Regridding APROD/GPROD restart files

You can regrid a secondary aerosol restart file with the same routines used to regrid the standard restart file. However, you need to tell the routines to regrid all tracers with the keyword diagn=0 (which means "all tracers"):

regridh_restart, ..., diagn=0
regridv_restart, ..., diagn=0

--phs 14:51, 4 April 2008 (EDT)

Combining output from timeseries files

Ray Nassar (ray@io.as.harvard.edu) wrote:

I just have a quick question, is there any GAMAP routine that can average all hourly data blocks in a timeseries file to make a single daily average bpch file?
It appears like I could use the average option of ctm_sum.pro but I do not quite understand where the averaged data goes and afterwards would still have to write to bpch.

Philippe Le Sager (plesager@seas.harvard.edu) replied:

Check the
   /gamap2/timeseries/gc_combine_nd49.pro
   /gamap2/timeseries/gc_combine_nd48.pro 
routines, which combine daily bpch files (nd48 & nd49 output, but also met field input files) into 4D data blocks. They have many options. You can extract a subset of data according to time and/or location, or process the data (moving average, daily max, shift to local time). You can either save the data into a new bpch file or just get an array of data as output.
I wrote a tutorial that gives some pointers here.
-Philippe

--Bob Y. 09:43, 1 April 2008 (EDT)

Under certain circumstances, a minor bug can occur in routine gc_combine_nd49.pro. See this description on our GAMAP bugs & fixes page for more information. A fix for this bug is slated for the next GAMAP release (v2-13).

--Bob Y. 13:48, 8 September 2008 (EDT)

Adding tracers to a restart file

Duncan Fairlie (t.d.fairlie@nasa.gov) wrote:

I need to construct a new restart file with 55 tracers from an existing full chem (43 tracer) restart file. The extra 12 tracers will include dust-nitrate, dust-sulfate, and dust-alkalinity (4 tracers each). I just want to start them uniform 1.0e-15 or so.
I can probably figure out how to do this, but wondered if you have something off the shelf that can be used to add several blank tracers to an existing restart file .

Bob Yantosca (yantosca@seas.harvard.edu) replied:

Maybe the easiest way to do this is to use GAMAP and BPCH_RENUMBER. You can load 2 files

   gamap, file=file1
   gamap, file=file2
   gamap, /nofile

then type s1-55 (for all 55 data blocks) and this will save out all data blocks to a new file (you'll be prompted for the name).

Then you can hack into IDL code bpch_renumber.pro to renumber some of the tracer numbers. Or even better yet, renumber the tracer numbers on the file2 before loading it into GAMAP and saving to the 55-tracer file.

Error -262 encountered when reading a bpch file

The following error was encountered when using GAMAP to read a binary punch file:

Error (-262) reading file ctm.bpch in block 2510
POINT_LUN: Negative position argument not allowed.
Position: -2147068380, Unit:21  File: ctm.bpch

The cause of this error is that the file being read had a size of 2.430 GB. The internal GAMAP routine CTM_READ3DB_HEADER uses a long integer variable to compute the offset in bytes from the start of the file to each individual data block, so that it can tell GAMAP to skip to each data block directly. However, the long integer type is a signed 32-bit integer and cannot represent numbers larger than 2,147,483,647 (about 2e9). If we try to store a larger value in a long integer variable, the value can "wrap around" to a negative number, which is what caused the error above.

Here is a demonstration of the maximum value of a long integer type in IDL:

IDL> a = 2430111016L

a = 2430111016
     ^
% Long integer constant must be less than 2147483648.

Therefore, if we try to use GAMAP to read a file larger than 2,147,483,647 bytes (about 2 GB), we will encounter this type of error.
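
One way to find out in advance whether a file is too large is to query its size with the standard IDL routine FILE_INFO. The function below is only a sketch: the name file_exceeds_long is hypothetical and is not part of GAMAP.

```idl
function file_exceeds_long, FileName

   ; Largest value that a signed 32-bit long integer can hold,
   ; written as a 64-bit constant (LL suffix)
   MaxLong = 2147483647LL

   ; FILE_INFO returns a structure whose SIZE tag is the file length in bytes
   Info = FILE_INFO( FileName )

   ; Returns 1 if the file is larger than the limit, 0 otherwise
   RETURN, ( Info.Size gt MaxLong )

end
```

For example, if file_exceeds_long( 'ctm.bpch' ) returns 1, the file should be split before it is read with GAMAP.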

The solution: you should use GAMAP routine BPCH_SEP to split larger bpch files into smaller ones (perhaps one file for each day). Basically you would call BPCH_SEP repeatedly such as:

bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060101 )
bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060102 )
bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060103 )
...

The %date% token in the output filename will be replaced with the YYYYMMDD date. The above commands will create output files:

ctm.bpch.20060101
ctm.bpch.20060102
ctm.bpch.20060103
...

which will contain all of the data blocks for dates 20060101, 20060102, 20060103, ...
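
The repeated calls can also be written as a loop. The sketch below reuses the same BPCH_SEP call and %date% token shown above; the procedure name split_by_day and the list of dates are hypothetical and should be adjusted to match your run.

```idl
pro split_by_day

   ; Dates to extract (hypothetical; adjust to match your run)
   Dates = [ 20060101L, 20060102L, 20060103L ]

   for D = 0L, N_Elements( Dates ) - 1L do begin

      ; Same call as above, issued once per date
      bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( Dates[D] )

   endfor

end
```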

Then you can use GAMAP to read the smaller bpch files. Philippe Le Sager has pointed out that some of the newer IDL versions support a 64-bit integer type (LONG64). Perhaps in a future version of GAMAP we will make the file pointer variable of this data type. However, we must be careful not to make a modification that would "break" the existing functionality of GAMAP for those users who may only have the 32-bit version of IDL.

--Bob Y. 09:23, 25 November 2008 (EST)

Endian issues when reading/writing binary punch files

Rynda Hudman (hudman@berkeley.edu) wrote:

Recently, something seemed to change w/r/t reading binary files and writing with ctm_writebpch.pro (on prometheus.as.harvard.edu). I [finally] realized it was something to do with swap_endian from little to big. What happened?

Bob Yantosca (yantosca@seas.harvard.edu) wrote:

Are you using the latest GAMAP? In GAMAP v2-12 we updated all routines that read from binary files to add the
  Open_File, ..., Swap_Endian=Little_Endian(), ...
keyword. This will do the byte swapping if you are reading/writing bpch files on a little-endian machine. All of the machines at Harvard (including prometheus.as.harvard.edu) are now little-endian.
If you were using a version prior to v2-12, then some routines may not have had this fix. So that might have been the problem. Updating your GAMAP should fix it.

--Bob Y. 09:33, 19 November 2008 (EST)

Graphics output

Color

In GAMAP v2-12, the color table handling routine MYCT has been rewritten to allow you to do the following:

  • Use all of the standard IDL color tables
  • Use all of the ColorBrewer color tables
  • Use several customized color tables from previous GAMAP versions (e.g. DIAL, WhGrYlRd, etc.)

For more information, please see this document: GAMAP color tables and their basic usage.

--Bob Y. 10:38, 18 July 2008 (EDT)

Also, if you are using a monitor with a 16-bit color display, you may encounter problems when trying to create images via screengrab. Please see this post on our GAMAP bugs & fixes page for information about how to work around this.

--Bob Y. 13:27, 8 September 2008 (EDT)

Customizing a GAMAP color table

The best way to customize a color table is to

  1. Load a GAMAP color table with MYCT
  2. Obtain the R, G, B vectors from the color table
  3. Manipulate those as you like

Helen Wang (hwang@cfa.harvard.edu) wrote:

[I need to] change the color table in one of the figures to a discrete color bar. I looked at the Gamap color table. It seems that color table 117 could be the most useful. However, I would need to move the top two purple colors to the bottom before the two blues and exchange the positions of the two oranges and two reds. Basically, I'd like to follow the visible light spectra. Could you please tell me how to do this?

Philippe Le Sager (plesager@seas.harvard.edu) replied:

The fastest way to do that is to get the RGB vectors and then apply a shift to them. You could do it on the fly, but this is prone to mistakes (the shift has to be applied to a limited set of indices). So I would use the latest MyCt (you can get mine in ~phs/IDL/gamap2/colors/); it lets you define your own color table.
To define yours, follow those steps:
   myct, 117, /no_std
   tvlct, r,g,b,/get
   print, ( transpose([[r],[g],[b]]) ) [*,0:9]
that will give you the numbers you need in three columns. Shift and put into the user-defined RGB vectors.

--Bob Y. 10:23, 6 November 2008 (EST)

Creating PDF files

The current version of IDL (v7.x) cannot save directly to Adobe PDF format. The best way to create PDF files from IDL is to first create a PostScript file, and then use the utility ps2pdf to create a PDF file from the PostScript file. Most Unix or Linux distributions should come with a version of ps2pdf already installed.

For example:

IDL> open_device, /ps, bits=8, /color, file='myplot.ps'
IDL> plot, findgen(100), color=!myct.black
IDL> close_device
IDL> spawn, 'ps2pdf myplot.ps'

This will create a file named myplot.pdf. The advantage of using PDF files is that they may be displayed from within a web page. Also, PDF files are generally smaller in size than the equivalent PostScript file.

Making movies from GAMAP output

The best way to make movies with GAMAP is to save out individual frames as GIF images, and then use a 3rd-party GIF utility to concatenate those into an animated GIF.
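
The individual frames can be written with the standard IDL routines TVRD and WRITE_GIF. The sketch below does not use GAMAP: the procedure name make_frames, the sine-wave plot, and the frame00.gif naming pattern are all hypothetical placeholders for your own plotting code. It draws into the Z-buffer device so that no window is needed.

```idl
pro make_frames

   ; Remember the current graphics device so that it can be restored
   OldDevice = !D.NAME

   ; Draw into the Z-buffer (no window needed) with 8-bit indexed color
   SET_PLOT, 'Z'
   DEVICE, SET_RESOLUTION=[ 400, 300 ]
   LOADCT, 0, /SILENT
   TVLCT, R, G, B, /GET

   for I = 0, 2 do begin

      ; Hypothetical frame content: a sine wave that shifts with each frame
      PLOT, SIN( FINDGEN( 100 ) / 10.0 + I )

      ; Grab the frame and write it as frame00.gif, frame01.gif, ...
      Frame    = TVRD()
      FileName = STRING( I, FORMAT='("frame", I2.2, ".gif")' )
      WRITE_GIF, FileName, Frame, R, G, B

   endfor

   ; Restore the original graphics device
   SET_PLOT, OldDevice

end
```

The resulting frame*.gif files can then be passed to one of the utilities described below.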

GIFsicle

You can obtain the GIFsicle distribution from http://www.lcdf.org/gifsicle/. When you build gifsicle, the following executables will be created:

gifsicle
Utility to concatenate individual GIFs into an animated GIF. Can also be used to extract individual frames from an animated GIF image.
gifview
A lightweight GIF viewer for X. It can display animated GIFs as slideshows, one frame at a time, or as animations.
gifdiff
Compares two GIF images for identical visual appearance.

Using GIFsicle to create an animated GIF from individual GIFs:

gifsicle --delay=10 --loop *.gif > anim.gif 

WhirlGIF

You can obtain WhirlGIF from http://hpux.cs.utah.edu/hppd/hpux/Networking/WWW/whirlgif-3.04/.

Using WhirlGIF to create an animated GIF from individual GIFs:

whirlgif -loop -time 10 -o anim.gif *.gif

ImageMagick

Philippe Le Sager (plesager@seas.harvard.edu) wrote:

You can also use "convert" at the command line to get animated GIF. This is a powerful command line, but the basic for animated gif is:
     convert -delay 20 image.* image.gif
You need to be in the proper directory. The input images are numbered like image.01, image.02, and so on. If the image file names differ too much, you will have to type them explicitly (before the output file name, which comes last). The delay is the time interval between frames; ImageMagick interprets it in hundredths of a second by default.
You can get more info and tips at:

NOTE: GIFsicle, WhirlGIF, and ImageMagick have already been installed on the Harvard Linux machines.

Regridding

Regridding T42 to GEOS-Chem grids

Ben Miller (bmiller@fas.harvard.edu) wrote:

Justin and I have been working on translating some emission fluxes and deposition velocity data from TOMCAT's T42 grid (128x64, with some variance in latitude step-size) into Geos 4x5 gridboxes. Do you know of any existing routines in IDL or some other language that we could adapt to make the two grids talk to each other?

Philippe Le Sager (plesager@seas.harvard.edu) replied:

There is nothing in IDL to deal with any kind of grid. However (you may have already found it) the RETRO tools (M. Schultz) can regrid data from the T42 to regular grids. It deals with NetCDF files apparently, but ascii/nc conversion is not a problem.
Have a look at the example in the regrid routine of the following package: MGS_Regrid (V2).

--Bob Y. 09:27, 8 August 2008 (EDT)

Regridding from a Lambert conformal grid to a lat-lon grid

Rynda Hudman (hudman@berkeley.edu) wrote:

I am trying to plot some North American Regional Reanalysis (NARR) data and compare it to the GEOS precipitation fields. NARR is on a Lambert Conformal grid, so it is irregular and each lat and lon have 2-dimensional arrays defining their 4 corners.
I would like to interpolate/regrid this onto a uniform grid for comparison with GEOS. Could you guys offer some suggestions?

Philippe Le Sager (plesager@seas.harvard.edu) wrote:

You need to look at the "gridding and interpolation" chapter of the IDL guide/help. You can directly go to the MIN_CURVE_SURF function though. This is the one I have mostly used. Works like a wonder.
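
As a rough illustration of MIN_CURVE_SURF, the sketch below interpolates scattered points onto a regular grid. The data are made up (50 random points in a 10 x 10 domain), not NARR output, and the grid spacing and bounds are arbitrary choices.

```idl
; Hypothetical scattered data: 50 random points in a 10 x 10 domain
Seed = 1L
X = RANDOMU( Seed, 50 ) * 10.0
Y = RANDOMU( Seed, 50 ) * 10.0
Z = SIN( X ) + COS( Y )

; Interpolate onto a regular grid with 1-unit spacing.
; GS is [xspacing, yspacing]; BOUNDS is [xmin, ymin, xmax, ymax].
Grid = MIN_CURVE_SURF( Z, X, Y, GS=[ 1.0, 1.0 ], $
                       BOUNDS=[ 0.0, 0.0, 10.0, 10.0 ] )

; Inspect the dimensions of the regridded array
HELP, Grid
```

For real data, X and Y would be the longitudes and latitudes of the irregular grid points, flattened into vectors together with the field values.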

--Bob Y. 10:32, 6 November 2008 (EST)