Matlab software tools for use with GEOS-Chem

From Geos-chem

Latest revision as of 14:53, 13 April 2021

GEOS-Chem diagnostics are now written to netCDF format, which is best read with the Python xarray package. This information is now obsolete.


The codes available on this page are basic at best. Please use them with caution and be aware that they are not designed for speed, efficiency or robustness, but simply to offer an alternative route for processing GEOS-Chem output using MATLAB. They are designed to work only with GEOS-Chem output in the form of restart files, primary output files or timeseries such as those generated by the ND49 diagnostic. The functions were written by Sebastian D. Eastham; please feel free to contact him with any questions or bug reports.


Processing BPCH files through IDL to produce MATLAB-readable NetCDF files can be time-consuming and requires an IDL license which might not otherwise be necessary. These functions allow BPCH files to be read directly into MATLAB without any external programs or toolboxes.


These codes take any GEOS-Chem output binary punch (BPCH) file and extract the diagnostic fields contained therein. Two functions are intended for direct use:

  • readBPCHSingle: Takes a category name and tracer ID, returning all relevant data in the file as a single matrix.
  • readAllBPCHData: Returns all data in a BPCH file as a MATLAB structure.

Both functions require that the tracerinfo.dat and diaginfo.dat files are located in the same folder as the target file, or that their locations are specified in the function call. The remaining files contain routines which are called by these functions but are not intended for direct access.
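
The default lookup for the two information files can be sketched in a few lines. This only illustrates resolving them relative to the BPCH file and is not the code used inside the functions; the folder name output and all variable names are assumptions made for the example.

```matlab
% Illustrative sketch: resolve tracerinfo.dat and diaginfo.dat relative to
% the BPCH file. The 'output' folder and variable names are hypothetical.
targetFile = fullfile('output', 'testfile.bpch');

% fileparts returns the folder portion of the path as its first output
targetDir = fileparts(targetFile);

% fullfile inserts the correct separator for the current platform
tracerInfoFile = fullfile(targetDir, 'tracerinfo.dat');
diagInfoFile   = fullfile(targetDir, 'diaginfo.dat');

% exist(...,'file') returns 2 when the path names an existing file
haveInfoFiles = (exist(tracerInfoFile, 'file') == 2) && ...
                (exist(diagInfoFile, 'file') == 2);
```

Building paths with fullfile avoids hard-coding a backslash separator, so the same call should work on Windows, Linux and macOS.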

Retrieving data on a single tracer

The following function call will retrieve data on a single tracer in a specific category. The input file is testfile.bpch, while the tracer and diagnostic category information files are info\tracerinfo.dat and info\diaginfo.dat respectively. The tracer and category names are changed according to the rules below. Output will have the dimensions [longitude x latitude x altitude x time] except where there is only one result (no time dimension) or data is 2D (no altitude dimension - data will be [longitude x latitude x time]).

dataBlock = readBPCHSingle('testfile.bpch','C_IJ_AVG','T_NOx','info\tracerinfo.dat','info\diaginfo.dat',true,false)

The final two arguments are verbose and bruteForce; setting verbose to false will suppress warnings, while setting bruteForce to true will override certain errors. For example, if bruteForce is active then the code will attempt to match tracer and category names to the original strings if it cannot find a match in the 'sanitized' names. The final four arguments are all optional; if the locations of the tracer or diagnostic information files are not given (or are given as empty strings), it is assumed that they can be found in the same directory as the target file.
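
As a sketch of working with a returned block, the snippet below reduces a 4D array along its time and horizontal dimensions. It uses a synthetic array in place of real readBPCHSingle output; the grid size (72 x 46 x 47 with 12 time slices) and the variable names are arbitrary assumptions chosen only for illustration.

```matlab
% Synthetic stand-in for a block returned by readBPCHSingle, laid out as
% [longitude x latitude x altitude x time]. Sizes are arbitrary assumptions.
dataBlock = rand(72, 46, 47, 12);

% Select the first altitude index and average over the time dimension (dim 4)
firstLevel = dataBlock(:, :, 1, :);      % size [72 46 1 12]
timeMean   = mean(firstLevel, 4);        % size [72 46]

% Average over longitude and latitude to get one value per level and time.
% This is a plain arithmetic mean; it is NOT weighted by grid-cell area.
columnMean = squeeze(mean(mean(dataBlock, 1), 2));   % size [47 12]
```

MATLAB drops trailing singleton dimensions, so a file with a single time slice yields a 3D array. Check ndims and size before indexing with four subscripts.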

Retrieving all data from a file

Use readAllBPCHData to read all data from a file into a single MATLAB structure.

dataStruct = readAllBPCHData('testfile.bpch','info\tracerinfo.dat','info\diaginfo.dat',true,false)

Again, the final two arguments control verbosity and 'brute force' options, and all but the first input are optional arguments. The resultant structure will take the form of dataStruct.(category names).(tracer names).(tracer properties, data etc). There will also be a top-level field called modelData which contains some information regarding the model and the way in which the results are to be interpreted.
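
The nested layout can be walked with fieldnames and dynamic field references. The sketch below builds a small mock structure rather than calling readAllBPCHData; the inner field name data, the tracer T_O3 and the array sizes are placeholders for illustration and may not match the fields the function actually returns.

```matlab
% Mock structure standing in for readAllBPCHData output. The inner field
% name 'data' and the array sizes are placeholders, not the real layout.
dataStruct = struct();
dataStruct.modelData = struct();
dataStruct.C_IJ_AVG.T_NOx.data = zeros(72, 46, 47);
dataStruct.C_IJ_AVG.T_O3.data  = ones(72, 46, 47);

found = {};
catNames = fieldnames(dataStruct);
for iCat = 1:numel(catNames)
    catName = catNames{iCat};
    if strcmp(catName, 'modelData')
        continue;   % model information, not a diagnostic category
    end
    tracerNames = fieldnames(dataStruct.(catName));
    for iTracer = 1:numel(tracerNames)
        tracerName = tracerNames{iTracer};
        % Dynamic field names allow access without hard-coding the names
        entry = dataStruct.(catName).(tracerName);
        fprintf('%s.%s: %d values\n', catName, tracerName, numel(entry.data));
        found{end+1} = sprintf('%s.%s', catName, tracerName); %#ok<AGROW>
    end
end
```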

Tracer and Category Names

The tracer and category names are sanitized in the course of reading the BPCH file, as some category and tracer names would cause an error if they were used in a MATLAB variable or structure name. The following rules are applied in order:

  • The characters '-$' or '=$' are removed from the end of category names
  • Any occurrence of the following characters is removed: $ ( )
  • Both hyphens and spaces are converted to underscores
  • 'C_' or 'T_' (for categories and tracers respectively) is added to the start of the name

Therefore the IJ-AVG-$ category should be identified as C_IJ_AVG, while the tracer NOx will become T_NOx.
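
The rules above can be expressed as a short chain of regexprep calls. This is a sketch of the stated rules, not the code used inside the functions, and the variable names are illustrative.

```matlab
% Sketch of the sanitizing rules; variable names are hypothetical.
catName    = 'IJ-AVG-$';
tracerName = 'NOx';

% Category names: strip a trailing '-$' or '=$' first
safeCat = regexprep(catName, '[-=]\$$', '');
% Remove any remaining $ ( ) characters
safeCat = regexprep(safeCat, '[$()]', '');
% Convert hyphens and spaces to underscores
safeCat = regexprep(safeCat, '[- ]', '_');
% Prefix so the result is a valid structure field name
safeCat = ['C_' safeCat];

% Tracer names: same steps, minus the trailing-suffix rule
safeTracer = regexprep(tracerName, '[$()]', '');
safeTracer = regexprep(safeTracer, '[- ]', '_');
safeTracer = ['T_' safeTracer];
```

With these inputs the result is C_IJ_AVG for the category and T_NOx for the tracer, matching the names used in the function calls above.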


These functions allow BPCH files greater than 2 GB to be read directly. However, use readAllBPCHData with caution for large files, as there must be enough memory available locally to read in all data at once. An alternative set of routines has been developed which uses classes to read BPCH files into a structure without needing to hold all data in memory at once, but is still in testing. If interested, please e-mail Sebastian D. Eastham to request a copy.


The functions are currently available here. This file was last updated 2014-07-02 11:11 to fix a bug whereby identically-named tracers in different categories were not resolved by the readBPCHSingle function.


If you need to be able to handle tracer names that include a "/" (e.g., "soil/air" as in the POPs simulation), add the following line to this code block:

   tIDSafe = tID;
   for iTracer = 1:numTracers
       safeStr = char(tID{iTracer});
       safeStr(regexp(safeStr,'\$')) = [];
       safeStr(regexp(safeStr,'/')) = '_';       % <----- add this line
       safeStr(regexp(safeStr,')')) = [];
       safeStr(regexp(safeStr,'(')) = [];
       safeStr(regexp(safeStr,'-')) = '_';
       safeStr(regexp(safeStr,' ')) = '_';

--Helen Amos 12:29, 26 May 2015 (EDT)