File I/O with GAMAP
On this page we list some tips & tricks for reading and writing files with GAMAP. Please also see the following pages:
- General GAMAP usage
- Color and graphics with GAMAP
- Regridding with GAMAP
- Date and time computations with GAMAP
- Text manipulation with GAMAP
--Bob Y. 16:06, 26 November 2008 (EST)
CTM_GET_DATA and CTM_GET_DATABLOCK
Overview
Routine CTM_GET_DATA is the main GAMAP data reader. It is called by the top-level GAMAP routine every time you read a file from disk. However, you may also call CTM_GET_DATA in a standalone format in your own data processing routines.
CTM_GET_DATA can be used to read data from several different file formats, including:
- Binary punch file v2.0
- GEOS-Chem binary met field data format
- netCDF data format
- Created by routine BPCH2NC
- COARDS-compliant netCDF created by routine BPCH2COARDS
- Output files from the NASA GMI model
- GEOS-FP met data files for input to GEOS-Chem
- NOTE: IDL 7.1 and lower versions can only read netCDF-3 format. IDL 8 and higher versions should be able to read netCDF-4 format.
- HDF4-EOS data format
- The old GISS-II CTM ASCII punch file format
You can use the CTM_GET_DATA routine to read multiple data blocks from a file. It will save output into the GAMAP DATAINFO structure.
GAMAP routine CTM_GET_DATABLOCK is a convenience wrapper for routines CTM_GET_DATA and CTM_EXTRACT. You can use CTM_GET_DATABLOCK when you want to read only one data block from a file, and/or if you want to cut the data block to less than global size. You can use the LON, LAT, LEV, ALT, and/or PRESS keywords to specify the longitude, latitude, and altitude ranges of the returned data.
For more information, please see the GAMAP Users' Guide.
--Bob Y. 11:12, 20 January 2012 (EST)
Using CTM_GET_DATA to read more than one category
Dylan Millet wrote:
- I've run into what seems like a bug in CTM_GET_DATA. Basically if you call for multiple diagnostics you can run into unexpected results. For instance, try:
ctm_get_data, datainfo, filename=basefile, $ ['ANTHSRCE', 'BIOGSRCE'], $ tracer=[21001, 1005], $ /quiet, tau0=tauvals
- Aiming to get biogenic emissions of ISOP (tracer 21001 in BIOGSRCE) and anthropogenic emissions of ALK4 (tracer 1005 in ANTHSRCE).
- But in addition to those, this also returns anthropogenic emissions of NOx (tracer 1 in ANTHSRCE) and biogenic emissions of MBO (tracer 5 in BIOGSRCE). In other words it seems the offset information is lost. So one needs to use separate calls to ctm_get_data if using multiple diagnostics, otherwise you may not get back the tracers I expect.
- This may not be a bug per se, but it seems like a really easy way to get the wrong answer without realizing it. Especially if you're looping through datablocks and adding fluxes together, for example.
Bob Yantosca wrote:
- I think this may be a design limitation of CTM_GET_DATA. I have always used this workaround, though:
Ctm_get_data, datainfo1, 'anthsrce', tracer=1, ... Ctm_get_data, datainfo2, 'biogsrce', tracer=5, ... Datainfo = [ DataInfo1, DataInfo2 ] UnDefine, DataInfo1 UnDefine, Datainfo2
- And then refer to each data block via the DATAINFO structure in your do loop. Then DataInfo is a combined array of structures, so you can reach each data block with code such as:
Data = *( DataInfo[N].Data )
- Also, you don't need the long tracer number (i.e. 21001), you can just do "1". I think it does a modulo operation to drop the spacing internally.
--Bob Y. 10:15, 20 December 2011 (EST)
Using CTM_GET_DATA with the STATUS keyword
Here I document an advanced usage of CTM_GET_DATA. Sometimes CTM_GET_DATA (or a routine that calls it) crashes with the following message:
% CTM_RETRIEVE_DATA: *** SERIOUS: More than 1 item found!
This message indicates more than one datablock with the same metadata has been found, when only one was expected. If you call CTM_GET_DATA with the (non documented...) STATUS keyword, then the message disappears. Only the metadata are read, and the data pointers in the datainfo structures are null pointers. You cannot dereference them. To access the data, you must call CTM_READ_DATA instead:
; read metadata only ctm_get_data, di, ....., status=2 ; then find the index you want in the array of structure di ... ; then get the data ( wanted_array = *di[ wanted_index ] is not valid since pointer is null) ctm_read_data, wanted_array, di[ wanted_index ], result=r
This method has been used for example in gc_combine_nd48.pro of GAMAP v2.13.
--phs 17:48, 8 October 2009 (EDT)
Calling CTM_CLEANUP after reading data
Every time you call CTM_GET_DATA or CTM_GET_DATABLOCK, GAMAP stores a list of the data blocks that were read from disk in globally-accessible FILEINFO and DATAINFO structures. These structures contain information about each data block (i.e. category name, tracer name, units, molecular weight, etc) and each file (logical unit number, type of file, etc.).
The GAMAP helper routine CTM_CLEANUP can be used to free the global FILEINFO and DATAINFO structures that are stored in memory. CTM_CLEANUP will also close all open files. This will revert GAMAP to its initial state.
Because IDL has a limited amount of memory (and can only have a certain number of files open at any given time), we recommend that you call CTM_CLEANUP frequently. In particular, if you are trying to read in a long timeseries of data blocks, then you may need to call CTM_CLEANUP before each call to CTM_GET_DATA or CTM_GET_DATABLOCK.
--Bob Y. 11:12, 20 January 2012 (EST)
Using UNDEFINE to free variable memory
Because each IDL session has a limited amount of memory, we recommend that you undefine large data arrays once you are done using them. The GAMAP helper routine UNDEFINE will remove a variable from memory. For example:
; Read NOx data from disk Success = CTM_GET_DATABLOCK( Data, 'IJ-AVG-$', Tracer=1, File='myfile.bpch', ... ) ; Error message if not successful if ( ~Success ) then MESSAGE, 'Could not read data!' ; Save data into column arrays for further processing Column1 = Data[ 23, 34, * ] Column2 = Data[ 100, 27, * ] Column3 = Data[ 56, 85, * ] ; Free the large DATA array UNDEFINE, Data
In this example, once we save the data values into the smaller column arrays, we no longer need to keep the large DATA array. We can then free the memory assigned to the DATA array by using UNDEFINE.
--Bob Y. 11:29, 20 January 2012 (EST)
Binary punch (bpch) file I/O
Reading binary punch files directly from IDL
The best way to read a GEOS-Chem binary punch file is by using the GAMAP main program gamap.pro. More experienced users may also use one of the lower-level routines (e.g. CTM_GET_DATA or CTM_GET_DATABLOCK) to read the file. However, if for any reason you need to read a binary punch file directly from IDL (i.e. without using GAMAP, CTM_GET_DATA, or CTM_GET_DATABLOCK), then you have a couple of options:
- The GAMAP routine BPCH_TEST (which is located in the file_io subdirectory of the main GAMAP directory) can be used to quickly parse a binary punch file. BPCH_TEST prints out the header information from the file as well as the min and max data values. This is very useful for debugging. You can look at the structure of the file gamap2/file_io/bpch_test.pro to see how the binary punch file is read.
- If you are more accustomed to working with R, GrADS, Matlab, or any other visualization package that takes netCDF files as input, then you might want to use the GAMAP routines BPCH2NC or BPCH2COARDS to first convert the binary punch file into netCDF format.
--Bob Y. 13:28, 23 May 2008 (EDT)
Splitting and joining binary punch files
GAMAP comes with two routines for working with GEOS-Chem binary punch file output directly. Routine BPCH_SEP will extract a data block from a big binary punch file and save it to a separate file:
Colette Heald wrote:
- I did a year-long run starting from 20011201 and all the output appears to have saved in one file labeled ctm.bpch.v7-04-13-tra54.2001120100
- So first off I'm wondering, is there a way to save each monthly mean as a separate ctm.bpch file? Secondly, the output file won't read into gamap - I thought the file might be corrupt, but actually it seems to read in fine with bpch_test. When I do gamap, fi='<filename>', I get this error:
Error (-261) reading file ctm.bpch.v7-04-13-tra54.2001120100 in block 3912 POINT_LUN: Negative postition argument not allowed. Position: -2147377716, Unit: 21 File: ctm.bpch.v7-04-13-tra54.2001120100
- I get the same error when I try to use ctm_get_datablock to pull out one tracer. Do you think the file is salvageable? Have you seen this problem before? I'm wondering if the file is just too big...
Bob Yantosca replied:
- Right now the code can't save to separate bpch files. However, you can split them up wtih bpch_sep.pro into individual files in post-processing. For example:
pro split ; Splits into monthly bpch files (change as necesary) ; The big input file InFile = 'ctm.bpch.v8-01-01-geos5-Run0.2005010100' ctm_cleanup ; Split the data in the big file for 20050101 into a separate file bpch_sep, InFile, 'ctm.bpch.v8-01-01-geos5-Run0.20050101', $ tau=nymd2tau( 20050101 ) end
GAMAP routine BPCH_LINK will concatenate data blocks from several smaller bpch files and save them into a large bpch file.
; Consolidates data from the 'ctm.bpch.*' files ; into a single file named 'new.ctm.bpch' BPCH_LINK, 'ctm.bpch.*', 'new.ctm.bpch'
--Bmy 16:53, 22 April 2008 (EDT)
Creating, Renaming and Regridding APROD/GPROD restart files
When using secondary aerosols, you need a APROD/GPROD restart file:
- restart_aprod_gprod.YYYYMMDDhh for GEOS-Chem versions v7-04-11 to v8-02-04
- soaprod.YYYYMMDDhh for GEOS-Chem versions v8-03-01 and after
Unlike the general restart file used for GEOS-Chem runs, it cannot be simply renamed and ready to use for another simulation date.
Requisite for versions v7-04-11 to v8-02-04
Before processing the restart_aprod_gprod.YYYYMMDDhh files with the Gamap package, be sure that your diaginfo.dat and your tracerinfo.dat have the following:
in diaginfo.dat: 6000 IJ-GPROD GPROD for SOA partitioning 6000 IJ-APROD APROD for SOA partitioning in tracerinfo.dat: #============================================================================== # OFFSET=9000: SOA PARTITION fractions (IJ-GPROD and IJ-APROD) #============================================================================== # NAME ]x[FULLNAME ][ MOLWT][C][ TRACER][ SCALE]x[UNIT PROD01 PROD01 1.000E+00 1 6001 1.000E+00 kg/kg PROD02 PROD02 1.000E+00 1 6002 1.000E+00 kg/kg PROD03 PROD03 1.000E+00 1 6003 1.000E+00 kg/kg PROD04 PROD04 1.000E+00 1 6004 1.000E+00 kg/kg PROD05 PROD05 1.000E+00 1 6005 1.000E+00 kg/kg PROD06 PROD06 1.000E+00 1 6006 1.000E+00 kg/kg PROD07 PROD07 1.000E+00 1 6007 1.000E+00 kg/kg PROD08 PROD08 1.000E+00 1 6008 1.000E+00 kg/kg PROD09 PROD09 1.000E+00 1 6009 1.000E+00 kg/kg PROD10 PROD10 1.000E+00 1 6010 1.000E+00 kg/kg PROD11 PROD11 1.000E+00 1 6011 1.000E+00 kg/kg PROD12 PROD12 1.000E+00 1 6012 1.000E+00 kg/kg PROD13 PROD13 1.000E+00 1 6013 1.000E+00 kg/kg PROD14 PROD14 1.000E+00 1 6014 1.000E+00 kg/kg PROD15 PROD15 1.000E+00 1 6015 1.000E+00 kg/kg PROD16 PROD16 1.000E+00 1 6016 1.000E+00 kg/kg PROD17 PROD17 1.000E+00 1 6017 1.000E+00 kg/kg PROD18 PROD18 1.000E+00 1 6018 1.000E+00 kg/kg
Requisite for versions v8-03-01 and after
Before processing the soaprod.YYYYMMDDhh files with the Gamap package, be sure that your diaginfo.dat and your tracerinfo.dat have the following:
in diaginfo.dat: 6000 SOAGPROD GPROD for SOA partitioning 6000 SOAAPROD APROD for SOA partitioning in tracerinfo.dat: #============================================================================== # SOA GPROD & APROD [kg/kg] #============================================================================== PROD01 PROD01 1.000E+00 1 6001 1.000E+00 kg/kg PROD02 PROD02 1.000E+00 1 6002 1.000E+00 kg/kg PROD03 PROD03 1.000E+00 1 6003 1.000E+00 kg/kg PROD04 PROD04 1.000E+00 1 6004 1.000E+00 kg/kg PROD05 PROD05 1.000E+00 1 6005 1.000E+00 kg/kg PROD06 PROD06 1.000E+00 1 6006 1.000E+00 kg/kg PROD07 PROD07 1.000E+00 1 6007 1.000E+00 kg/kg PROD08 PROD08 1.000E+00 1 6008 1.000E+00 kg/kg PROD09 PROD09 1.000E+00 1 6009 1.000E+00 kg/kg PROD10 PROD10 1.000E+00 1 6010 1.000E+00 kg/kg PROD11 PROD11 1.000E+00 1 6011 1.000E+00 kg/kg PROD12 PROD12 1.000E+00 1 6012 1.000E+00 kg/kg PROD13 PROD13 1.000E+00 1 6013 1.000E+00 kg/kg PROD14 PROD14 1.000E+00 1 6014 1.000E+00 kg/kg PROD15 PROD15 1.000E+00 1 6015 1.000E+00 kg/kg PROD16 PROD16 1.000E+00 1 6016 1.000E+00 kg/kg PROD17 PROD17 1.000E+00 1 6017 1.000E+00 kg/kg PROD18 PROD18 1.000E+00 1 6018 1.000E+00 kg/kg PROD19 PROD19 1.000E+00 1 6019 1.000E+00 kg/kg PROD20 PROD20 1.000E+00 1 6020 1.000E+00 kg/kg PROD21 PROD21 1.000E+00 1 6021 1.000E+00 kg/kg PROD22 PROD22 1.000E+00 1 6022 1.000E+00 kg/kg PROD23 PROD23 1.000E+00 1 6023 1.000E+00 kg/kg PROD24 PROD24 1.000E+00 1 6024 1.000E+00 kg/kg PROD25 PROD25 1.000E+00 1 6025 1.000E+00 kg/kg PROD26 PROD26 1.000E+00 1 6026 1.000E+00 kg/kg PROD27 PROD27 1.000E+00 1 6027 1.000E+00 kg/kg PROD28 PROD28 1.000E+00 1 6028 1.000E+00 kg/kg
Renaming APROD/GPROD restart files
You must rewrite your restart_gprod_aprod.YYYYMMDDhh or soaprod.YYYYMMDDhh so that the date in the filename is the same as the one in the datablock headers. A new routine in GAMAP v2.12 will do it for you: ../gamap2/date_time/rewrite_agprod.pro
Regridding APROD/GPROD restart files
You can regrid a 2ndary aerosol restart file with the same routines used to regrid the standard restart file. However you need to tell the routines to regrid all tracers with the keyword diagn=0 (which means "all tracers"):
regridh_restart, ..., diagn=0 regridv_restart, ..., diagn=0
Creating soaprod.YYYYMMDDhh restart files
NOTE: This method is only available for GEOS-Chem v8-03-01 and after.
To create a soaprod.YYYYMMDDhh file from scratch you need to follow these steps:
- In GeosCore/main.f, uncomment the lines:
!! use this to make initial soaprod files !CALL SET_SOAPROD !CALL FIRST_APRODGPROD() !CALL MAKE_SOAPROD_FILE( GET_NYMDb(), GET_NHMSb() ) !goto 9999 !!
- Compile GEOS-Chem as usual.
- Run GEOS-Chem with the setup you want to use.
The GEOS-Chem run will end very quickly and should create the file soaprod.YYYYMMDDhh in your run directory. The tag YYYYNMMDDhh is replaced by the beginning date of your run.
Then, you need to comment again the lines in GeosCore/main.f and re-compile GEOS-Chem before to run for production.
--Ccarouge 10:35, 6 October 2010 (EDT)
Combining output from timeseries files
Ray Nassar wrote:
- I just have a quick question, is there any GAMAP routine that can average all hourly data blocks in a timeseries file to make a single daily average bpch file?
- It appears like I could use the average option of ctm_sum.pro but I do not quite understand where the averaged data goes and afterwards would still have to write to bpch.
Philippe Le Sager replied:
- Check the
/gamap2/timeseries/gc_combine_nd49.pro /gamap2/timeseries/gc_combine_nd48.pro
- routines, which combines daily bpch files (nd48 & nd49 output, but also met field input files) into 4D data blocks. It has many options. You can extract a subset of data according to time and/or location, or process the data (moving average, daily max, shift to local time). You can either save the data into a new bpch file or just get an array of data in output.
- I wrote a tutorial that gives some pointers here.
- -Philippe
--Bob Y. 09:43, 1 April 2008 (EDT)
Under certain circumstances, a minor bug can occur in routine gc_combine_nd49.pro. See this description on our GAMAP bugs & fixes page for more information. This bug is slated for release in the next GAMAP release (v2-13).
--Bob Y. 13:48, 8 September 2008 (EDT)
Adding tracers to a restart file
Duncan Fairlie (t.d.fairlie@nasa.gov) wrote:
- I need to construct a new restart file with 55 tracers from an existing full chem (43 tracer) restart file. The extra 12 tracers will include dust-nitrate, dust-sulfate, and dust-alkalinity (4 tracers each). I just want to start them uniform 1.0e-15 or so.
- I can probably figure out how to do this, but wondered if you have something off the shelf that can be used to add several blank tracers to an existing restart file .
Bob Yantosca (yantosca@seas.harvard.edu) replied:
Maybe the easiest way to do this is to use GAMAP and BPCH_RENUMBER. You can load 2 files
gamap, file=file1 gamap, file=file2 gamap, /nofile
then type s1-55 (for all 55 data blocks) and this will save out all data blocks to a new file (you'll be prompted for the name).
Then you can hack into IDL code bpch_renumber.pro to renumber some of the tracer numbers. Or even better yet, renumber the tracer numbers on the file2 before loading it into GAMAP and saving to the 55-tracer file.
Error -262 encountered when reading a bpch file
The following error was encountered when using GAMAP to read a binary punch file:
Error (-262) reading file ctm.bpch in block 2510 POINT_LUN: Negative position argument not allowed. Position: -2147068380, Unit:21 File: ctm.bpch
The cause of this error is that the file being read had a size of 2.430 GB. Internal GAMAP routine CTM_READ3DB_HEADER uses a long integer variable to compute the offset in bytes from the start of the file to each individual data block -- that way it can tell GAMAP "skip" to each data block directly. However, the long integer type cannot represent numbers much larger than 2e9. If we try to represent a value larger than this in a long integer variable, the value can "wrap around" to a negative number, which is what caused the error above.
Here is a demonstration of the maximum value of a long integer type in IDL:
IDL> a = 2430111016L a = 2430111016 ^ % Long integer constant must be less than 2147483648.
Therefore, if we try to use GAMAP to read any files larger than 2,147,483,648 bytes, we will encounter this type of error.
The solution: you should use GAMAP routine BPCH_SEP to split larger bpch files into smaller ones (perhaps one file for each day). Basically you would call BPCH_SEP repeatedly such as:
bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060101 ) bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060102 ) bpch_sep, 'ctm.bpch', 'ctm.bpch.%date%', tau0=nymd2tau( 20060103 ) ...
The %date% token in the output filename will be replaced with the YYYYMMDD date. The above commands will create output files:
ctm.bpch.20060101 ctm.bpch.20060102 ctm.bpch.20060103 ...
which will contain all of the data blocks for dates 20060101, 20060102, 20060103, ...
Then you can use GAMAP to read the smaller bpch files. Philippe Le Sager has pointed out that some of the newer IDL versions support 64-bit integer (LON64). Perhaps in a future version of GAMAP we will make the file pointer variable of this data type. However we must be careful not to make a modification that would "break" the existing functionality of GAMAP for those users who may only have the 32-bit version of IDL.
--Bob Y. 09:23, 25 November 2008 (EST)
Solution
Prasad Kasibhatla has submitted a fix for this situation. We have to change the type of the file pointer variables from LONG (aka INTEGER*4) to LONG64 (aka INTEGER*8). As described on our Floating point math issues wiki page, a LONG variable can only reach values of about +/-2E9. This means that the variables that point to the file cannot access locations beyond about 2.4 GB of a single file, thus causing the error.
The file modifications are as follows:
(1) In routine gamap2/internals/create3dhstru.pro, change this line:
FilePos : -1L, $ ; position of data in file ilun
to
FilePos : -1LL, $ ; position of data in file ilun
(2) In routine gamap2/internals/ctm_read3db_header.pro, add the following two lines near the top of the program:
; Define filepos and newpos variables as LONG64 so that we can ; read BPCH files larger than 2.4 GB (psk, bmy, 2/23/12) filepos = 0LL newpos = 0LL
EDIT (D A Ridley): I found that I also had to perform the following step for it to work! (3) In gamap2/internals/ctm_read3db_header.pro, cast the variable "skip" as LONG64 on line 283:
newpos = filepos + long64(skip)
--David R. 11:12, 17 May 2014 (EST)
The "LL" is shorthand for declaring a LONG64 number.
We will add this fix into the next GAMAP release (v2-16).
--Bob Y. 12:12, 23 February 2012 (EST)
Endian issues when reading/writing binary punch files
Rynda Hudman wrote:
- Recently, something seemed to change w/r/t reading binary files and writing with ctm_writebpch.pro (on prometheus.as.harvard.edu). I [finally] realized it was something to do with swap_endian from little to big. What happened?
Bob Yantosca wrote:
- Are you using the latest GAMAP? In GAMAP v2-12 we updated all routines that read from binary files to add the
Open_File, ..., Swap_Endian=Little_Endian(), ...
- keyword. This will do the byteswapping if you are reading/writing bpch on a little-endian machine. All of the machines now at Harvard (including prometheus.as.harvard.edu) are little-endian now.
- If you were using a version prior to v2-12, then some routines may not have had this fix. So that might have been the problem. Updating your GAMAP should fix it.
--Bob Y. 09:33, 19 November 2008 (EST)
GEOS-Chem met field I/O
Using test_met.pro to check met field files
Moeko Yoshitomi wrote:
- I am afraid of that the GCAP met files have some weird values at GMT date & time 2047/07/26 00:00. Is there a way to directly read the GCAP meteorology files without using GAMAP?
Bob Yantosca replied:
- You can read any met field file (GCAP or GEOS) with the GAMAP routine:
test_met, filename='20470726.a6.4x5'
- This will print out the min & max of the data (see example below), which is very useful to see if there are corrupted values, as are listed below. You can also get a plot of the met fields for a given level by using the PLOTLEVEL keyword.
20470726 030000 T 173.094833 314.135101 20470726 030000 U -125.676346 170.471573 20470726 030000 V -119.282471 103.119499 20470726 030000 OPTDEPTH 0.000000 33.720524 20470726 030000 CLDF 0.000000 0.969475 20470726 030000 MOISTQ -46.055706*************** 20470726 030000 Q 0.000000*************** 20470726 030000 UPDE 0.000000 0.578976 20470726 030000 DNDE -0.252207 -0.000000 20470726 030000 ENTRAIN 0.000000 0.177645 20470726 030000 DETRAINE 0.000000 0.557239 20470726 030000 UPDN 0.000000 0.225111 20470726 030000 DNDN -0.056653 -0.000000 20470726 030000 DETRAINN 0.000000 0.140580
- It looks like there is definitely a problem in that day in the met data, in the MOISTQ and Q fields. If you use the /EXPFORMAT keyword to print out data in exponential rather than floating point format you get:
test_met, filename='20470726.a6.4x5', /expformat 20470726 030000 MOISTQ -4.605571e+01 1.869630e+34 20470726 030000 Q 8.480171e-33 7.736141e+34
so it looks like you have ~1e34 values for MOISTQ and Q. This can be a floating-point problem, and is near to the maximum representable value for a REAL*4 variable. My guess is that there was a problem in the original regridding of the met data.
--Bob Y. 10:02, 29 May 2009 (EDT)
HDF4 and HDF4-EOS file I/O
CTM_GET_DATA (and by extension, CTM_GET_DATABLOCK) can read a file in HDF4-EOS, provided that you have properly set up your diaginfo.dat and tracerinfo.dat files.
GAMAP also provides some helper routines for reading/writing HDF4 and HDF4-EOS data files. Please see the following sections of the GAMAP User's Guide for more information:
--Bob Y. 11:45, 20 January 2012 (EST)
HDF5 and HDF5-EOS file I/O
H5_BROWSER
IDL 6 and 7 has a function called H5_BROWSER which lets you look at the contents of an HDF5 file interactively:
A = H5_BROWSER( FILE )
This will pop open a dialog box which allows you to navigate through the file's contents. If you click on a dataset in the file, H5_BROWSER will show a quick plot, so that you can see at a glance if the data inthe file matches your expectations.
It is recommended to use H5_BROWSER to examine an HDF file before reading it. You can quickly look up the names of all the variables that you want to extract from the file ahead of time.
Reading data from an HDF5 or HDF5-EOS file
The basic way you read data from an HDF5 file is as follows:
; Specify the file name (it's a GOME file w/ a long name) ; Technically this is an HDF-EOS5 file but we can read it as if it were an HDF5 file! FILE = 'GOME_xxx_1B_M02_20070501062959Z_20070501081159Z_R_O_20080615193546Z.337p4_356p1.brs.hcho.he5' ; First make sure the file is a HDF5 or HDF5-EOS file IF ( H5F_IS_HDF5( FILE ) eq 0 ) then MESSAGE, 'Not an HDF-5 file!' ; Open the HDF file and get the file ID # (FID) fId = H5F_OPEN( FILE ) IF ( fId lt 0 ) then MESSAGE, 'Error opening file!' ;---------------------------------------------------------------------------------- ; These next 3 commands are contained in convenience function ; "read_h5dataset.pro", to be released in GAMAP v2-13!!! ; Get the dataset ID (dId) for the variable we want to read ; NOTE: "Column" is the HDF5-EOS swath name & "CloudFraction" is the variable name! dId = H5D_OPEN( fid, "/Column/CloudFraction" ) ; Read the data DATA = H5D_READ( dId ) ; Free the dataset Id H5D_CLOSE, dId ;---------------------------------------------------------------------------------- ; Close the file and free the file Id H5_CLOSE, fId
However, if you are reading a lot of data from disk, or are reading many HDF5 or HDF5-EOS files, it may be more efficient to read them with Fortran.
--Bob Y. 10:30, 29 May 2009 (EDT)
netCDF file I/O
NOTE: IDL 7 and lower versions is compatible with netCDF-3 format but not netCDF-4. IDL 8 and higher versions should be compatible with both netCDF-3 and netCDF-4.
Requirements as of GAMAP v2.13
For Gamap to recognize a netCDF file, the file must have a suffix nc or NC or ncdf or NCDF. To see the datablocks, categories (ie diagnostics) and tracer names must be in diaginfo.dat and tracerinfo.dat resource files, respectively.
To convert from GEOS-Chem binary punch (aka bpch) file format to netCDF, we suggest to use BPCH2COARDS. It is COARDS compliant. To convert several files at once, MAKE_MULTI_NC is a convenient wrapper available in GAMAP v2.13 and higher versions It can travel directories recursively, and automatically recreates the input directory tree in the output directory if needed.
Note that, when writing netCDF, some characters require special handling (like $ for example): BPCH2COARDS replaces them with an underscore. Then, to read the converted files back into GAMAP, underscores must be converted back to their original character.
The later operation is done in CRC_Get_Tracer subroutine of CTM_READ_COARDS. In GAMAP v2.13, few tracer names (air density, tropopause levels) have been added to the list of names to process. The list is not exhaustive. Several warnings in CTM_READ_COARDS and CTM_TRACERINFO have been added in GAMAP v2.13 to help you to quickly find the name(s) that need to be added to the to-be-processed list in CRC_Get_Tracer, were you unable to read netCDF files you created with GAMAP.
--phs 11:25, 9 October 2009 (EDT)
Reading from netCDF files
GAMAP contains the very useful NCDF_READ routine (by Martin Schultz). NCDF_READ can read all of the various variables and attributes from a netCDF file into an IDL structure.
As stated above, the main gamap.pro program can also read from netCDF file format, providing that you have set up your diaginfo.dat and tracerinfo.dat files correctly.
--Bob Y. 15:54, 25 July 2011 (EDT)
Converting bpch to netCDF files
GAMAP contains some routines that can create netCDF files from GEOS-Chem binary punch output:
- bpch2gmi.pro: Creates GMI-style netCDF file
- bpch2nc.pro: Creates general netCDF file
- bpch2coards.pro: (RECOMMENDED) Creates COARDS-compliant netCDF file (see above)
--Bob Y. 15:18, 25 July 2011 (EDT)
Converting from netCDF to bpch format
There is no single GAMAP routine that will convert directly from netCDF format to bpch format. However, you can use the following multi-step procedure to do this:
- Use NCDF_READ to read the data (and attributes) from the netCDF file into an IDL structure.
- Use CTM_MAKE_DATAINFO to create a new DATAINFO structure for each data block in the structure returned by NCDF_READ
- Use CTM_WRITEBPCH to save each data block to a bpch file.
For more information on using CTM_MAKE_DATAINFO and CTM_WRITEBPCH to create a bpch file, please see: Chapter 8.11 of the GAMAP Online Users' Guide.
--Bob Y. 15:51, 25 July 2011 (EDT)
posix.c error when writing netCDF files
Qiaoqiao Wang wrote:
- I was using bpch2nc routine to do the file format conversion. But here comes the error:
idl: posixio.c:248: px_pgin: Assertion `*posp == ((off_t)(-1)) || *posp == lseek(nciop->fd, 0, 1)' failed. Abort (core dumped)
- And it automatically exits from IDL. Although it does generate the nc files, I am not sure what's wrong there. Do you have any idea about this error?
Michael Long wrote:
- This error has cropped up with NetCDF before when using libraries not fit for generating large files (>2Gb). It could also be that the machine you're on doesn't like the data size.
--Bob Y. 09:57, 19 December 2011 (EST)