File I/O with GAMAP

From Geos-chem
Revision as of 21:07, 26 November 2008 by Bmy (talk | contribs)
Jump to navigation Jump to search

On this page we list some tips & tricks for reading and writing files with GAMAP. Please also see the following pages:

--Bob Y. 16:06, 26 November 2008 (EST)

Binary punch (bpch) file I/O

Reading binary punch files directly from IDL

The best way to read a GEOS-Chem binary punch file is by using the GAMAP main program gamap.pro. More experienced users may also use one of the lower-level routines (e.g. CTM_GET_DATA or CTM_GET_DATABLOCK) to read the file. However, if for any reason you need to read a binary punch file directly from IDL (i.e. without using GAMAP, CTM_GET_DATA, or CTM_GET_DATABLOCK), then you have a couple of options:

  1. The GAMAP routine BPCH_TEST (which is located in the file_io subdirectory of the main GAMAP directory) can be used to quickly parse a binary punch file. BPCH_TEST prints out the header information from the file as well as the min and max data values. This is very useful for debugging. You can look at the structure of the file gamap2/file_io/bpch_test.pro to see how the binary punch file is read.
  2. If you are more accustomed to working with R, GrADS, Matlab, or any other visualization package that takes netCDF files as input, then you might want to use the GAMAP routines BPCH2NC or BPCH2COARDS to first convert the binary punch file into netCDF format.

--Bob Y. 13:28, 23 May 2008 (EDT)

Splitting and joining binary punch files

GAMAP comes with two routines for working with GEOS-Chem binary punch file output directly. Routine BPCH_SEP will extract a data block from a big binary punch file and save it to a separate file:

Colette Heald (heald@atmos.colostate.edu) wrote:

I did a year-long run starting from 20011201 and all the output appears to have saved in one file labeled ctm.bpch.v7-04-13-tra54.2001120100
So first off I'm wondering, is there a way to save each monthly mean as a separate ctm.bpch file? Secondly, the output file won't read into gamap - I thought the file might be corrupt, but actually it seems to read in fine with bpch_test. When I do gamap, fi='<filename>', I get this error:
   Error (-261) reading file
   ctm.bpch.v7-04-13-tra54.2001120100
    in block 3912
   POINT_LUN: Negative postition argument not allowed. Position: -2147377716,
   Unit: 21
   File: ctm.bpch.v7-04-13-tra54.2001120100
I get the same error when I try to use ctm_get_datablock to pull out one tracer. Do you think the file is salvageable? Have you seen this problem before? I'm wondering if the file is just too big...

Bob Yantosca (yantosca@seas.harvard.edu) replied:

Right now the code can't save to separate bpch files. However, you can split them up wtih bpch_sep.pro into individual files in post-processing. For example:
   pro split

      ; Splits into monthly bpch files (change as necesary)
  
      ; The big input file
      InFile = 'ctm.bpch.v8-01-01-geos5-Run0.2005010100'
      ctm_cleanup

      ; Split the data in the big file for 20050101 into a separate file
      bpch_sep, InFile, 'ctm.bpch.v8-01-01-geos5-Run0.20050101', $
         tau=nymd2tau( 20050101 )
   end

GAMAP routine BPCH_LINK will concatenate data blocks from several smaller bpch files and save them into a large bpch file.

   ; Consolidates data from the 'ctm.bpch.*' files
   ; into a single file named 'new.ctm.bpch'
   BPCH_LINK, 'ctm.bpch.*', 'new.ctm.bpch'

--Bmy 16:53, 22 April 2008 (EDT)

Renaming and Regridding APROD/GPROD restart files

When using secondary aerosols, you need a restart_aprod_gprod.YYYYMMDDhh file. Unlike the general restart file used for GEOS-Chem runs, it cannot be simply renamed and ready to use for another simulation date.

Renaming APROD/GPROD restart files

You must rewrite you restart_gprod_aprod.YYYYMMDDhh so that the date in the filename is the same as the one in the datablock headers. A new routine in GAMAP v2.12 will do it for you: ../gamap2/date_time/rewrite_agprod.pro

Regridding APROD/GPROD restart files

You can regrid a 2ndary aerosol restart file with the same routines used to regrid the standard restart file. However you need to tell the routines to regrid all tracers with the keyword diagn=0 (which means "all tracers"):

regridh_restart, ..., diagn=0
regridv_restart, ..., diagn=0

--phs 14:51, 4 April 2008 (EDT)

Combining output from timeseries files

Ray Nassar (ray@io.as.harvard.edu) wrote:

I just have a quick question, is there any GAMAP routine that can average all hourly data blocks in a timeseries file to make a single daily average bpch file?
It appears like I could use the average option of ctm_sum.pro but I do not quite understand where the averaged data goes and afterwards would still have to write to bpch.

Philippe Le Sager (plesager@seas.harvard.edu) replied:

Check the
   /gamap2/timeseries/gc_combine_nd49.pro
   /gamap2/timeseries/gc_combine_nd48.pro 
routines, which combines daily bpch files (nd48 & nd49 output, but also met field input files) into 4D data blocks. It has many options. You can extract a subset of data according to time and/or location, or process the data (moving average, daily max, shift to local time). You can either save the data into a new bpch file or just get an array of data in output.
I wrote a tutorial that gives some pointers here.
-Philippe

--Bob Y. 09:43, 1 April 2008 (EDT)

Under certain circumstances, a minor bug can occur in routine gc_combine_nd49.pro. See this description on our GAMAP bugs & fixes page for more information. This bug is slated for release in the next GAMAP release (v2-13).

--Bob Y. 13:48, 8 September 2008 (EDT)

Adding tracers to a restart file

Duncan Fairlie (t.d.fairlie@nasa.gov) wrote:

I need to construct a new restart file with 55 tracers from an existing full chem (43 tracer) restart file. The extra 12 tracers will include dust-nitrate, dust-sulfate, and dust-alkalinity (4 tracers each). I just want to start them uniform 1.0e-15 or so.
I can probably figure out how to do this, but wondered if you have something off the shelf that can be used to add several blank tracers to an existing restart file .

Bob Yantosca (yantosca@seas.harvard.edu) replied:

Maybe the easiest way to do this is to use GAMAP and BPCH_RENUMBER. You can load 2 files

   gamap, file=file1
   gamap, file=file2
   gamap, /nofile

then type s1-55 (for all 55 data blocks) and this will save out all data blocks to a new file (you'll be prompted for the name).

Then you can hack into IDL code bpch_renumber.pro to renumber some of the tracer numbers. Or even better yet, renumber the tracer numbers on the file2 before loading it into GAMAP and saving to the 55-tracer file.

Error -262 encountered when reading a bpch file

The following error was encountered when using GAMAP to read a binary punch file:

Error (-262) reading file ctm.bpch in block 2510
POINT_LUN: Negative position argument not allowed.
Position: -2147068380, Unit:21  File: ctm.bpch

The cause of this error is that the file being read had a size of 2.430 GB. Internal GAMAP routine CTM_READ3DB_HEADER uses a long integer variable to compute the offset in bytes from the start of the file to each individual data block -- that way it can tell GAMAP "skip" to each data block directly. However, the long integer type cannot represent numbers much larger than 2e9. If we try to represent a value larger than this in a long integer variable, the value can "wrap around" to a negative number, which is what caused the error above.

Here is a demonstration of the maximum value of a long integer type in IDL:

IDL> a = 2430111016L

a = 2430111016
     ^
% Long integer constant must be less than 2147483648.

Therefore, if we try to use GAMAP to read any files larger than 2,147,483,648 bytes, we will encounter this type of error.

The solution: you should use GAMAP routine BPCH_SEP to split larger bpch files into smaller ones (perhaps one file for each day). Basically you would call BPCH_SEP repeatedly such as:

bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060101 )
bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060102 )
bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060103 )
...

The %date% token in the output filename will be replaced with the YYYYMMDD date. The above commands will create output files:

ctm.bpch.20060101
ctm.bpch.20060102
ctm.bpch.20060103
...

which will contain all of the data blocks for dates 20060101, 20060102, 20060103, ...

Then you can use GAMAP to read the smaller bpch files. Philippe Le Sager has pointed out that some of the newer IDL versions support 64-bit integer (LON64). Perhaps in a future version of GAMAP we will make the file pointer variable of this data type. However we must be careful not to make a modification that would "break" the existing functionality of GAMAP for those users who may only have the 32-bit version of IDL.

--Bob Y. 09:23, 25 November 2008 (EST)

Endian issues when reading/writing binary punch files

Rynda Hudman (hudman@berkeley.edu) wrote:

Recently, something seemed to change w/r/t reading binary files and writing with ctm_writebpch.pro (on prometheus.as.harvard.edu). I [finally] realized it was something to do with swap_endian from little to big. What happened?

Bob Yantosca (yantosca@seas.harvard.edu) wrote:

Are you using the latest GAMAP? In GAMAP v2-12 we updated all routines that read from binary files to add the
  Open_File, ..., Swap_Endian=Little_Endian(), ...
keyword. This will do the byteswapping if you are reading/writing bpch on a little-endian machine. All of the machines now at Harvard (including prometheus.as.harvard.edu) are little-endian now.
If you were using a version prior to v2-12, then some routines may not have had this fix. So that might have been the problem. Updating your GAMAP should fix it.

--Bob Y. 09:33, 19 November 2008 (EST)

netCDF file I/O