Python code for GEOS-Chem

On this page, we list information about Python software packages that are being created for use with GEOS-Chem.

Overview

Traditionally, GEOS-Chem users have relied on IDL-based software, such as GAMAP, for data analysis and visualization. But for many GEOS-Chem user groups, the cost of IDL has become prohibitive. Several members of the GEOS-Chem community have therefore begun to develop software in the Python language for reading, visualizing, and processing GEOS-Chem data. Python is a free and open-source programming language with a large set of libraries for numerical computation and visualization.

Using Python software for visualizing/regridding/analyzing GEOS-Chem output also facilitates using GEOS-Chem in cloud-computing environments, such as the Amazon EC2 platform.

On this page we provide a list of Python software that is being developed for use with GEOS-Chem. For more information, please contact the individual authors listed below.

--Bob Yantosca (talk) 16:14, 24 April 2017 (UTC)

Development Teams

Here is a list of Python visualization, file-management, and regridding software packages for GEOS-Chem. Some are still under development and are not yet ready for widespread use.

Developer | Packages | Status
GEOS-Chem Support Team | GCPy: Visualization and regridding software for GEOS-Chem | In development; not quite ready for widespread use (as of Apr 2017)
Daniel Rothenberg | xbpch: Backend for reading bpch output into xarray/dask | Available on github
Barron Henderson | Several software packages, including: (1) PseudoNetCDF: a NetCDF-like system including visualization (maps, profiles, timeseries, etc.); (2) Process analysis diagnostics: a tool for examining the change in each species due to each process and reaction | Available on github
Gerrit Kuhlmann | Gchem: a reader/writer for bpch files | Available on github
Emanuel Mahieu, Benoit Bovy | PyGChem: a Python interface to GEOS-Chem (currently allows dataset handling using different backends - IO bpch/netCDF4, and will soon provide an interface to HEMCO) | Available on github
Andre Perkins | Python Ensemble Manager for ensembles of GEOS-Chem adjoint simulations using MPI | Available on github
Ben Newsome | Python script to split up GEOS-Chem simulations into smaller runs | Available on github

--Bob Yantosca (talk) 15:52, 24 April 2017 (UTC)

Information about existing Python packages

GCPy

Summary:

Developers Harvard: GEOS-Chem Support Team, Sebastian Eastham, Jiawei Zhuang
MIT: Daniel Rothenberg
York: Killian Murphy, Tomás Sherwen
EPA: Barron Henderson
Status Still in development, not quite ready for public consumption

Description:

The GEOS-Chem Support Team is developing the GCPy Python package for visualization and regridding of GEOS-Chem output in netCDF format. GCPy is being used to produce plots for the GCHP benchmark simulations. GCPy development is still in the very early stages, and as such, it is not quite ready for public consumption. But we will be making improvements to GCPy in the near future and hope to have a more user-friendly version by the next GEOS-Chem public release. Stay tuned!!

--Bob Yantosca (talk) 16:46, 24 April 2017 (UTC)

xbpch

Summary:

Developer Daniel Rothenberg (MIT)
Status Available for download
Latest version https://github.com/darothen/xbpch
Documentation xbpch on readthedocs


Description:

xbpch brings the power of a modern data analysis toolkit to bear on legacy GEOS-Chem output. Fundamentally, xbpch provides a backend to ingest bpch files via the xarray package, an implementation of the NetCDF/Common Data Model for manipulating multi-dimensional array data in Python. By itself, xarray is a powerful tool for analyzing and manipulating the types of labeled datasets common in the geosciences. Additionally, xbpch wraps dask, a flexible library for parallel computing. This combination of toolkits enables users to rapidly ingest, process, and visualize the largest model output datasets, even when they do not fit into memory on their laptop or cluster node.

Because bpch files are nearing their end-of-life, xbpch will not be expanded much further beyond:

  • Enforcing CF-compliant metadata on ingested data
  • Incorporating command line utilities for high-performance conversion of bpch to NetCDF output

However, the advantage of having xbpch even in its current state is that users will be able to seamlessly shift between analyzing legacy GEOS-Chem output and new NetCDF output (in the near future) without changing more than a single line in their analysis pipeline or scripts.
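
The single-line switch described above can be sketched as follows. This is a minimal illustration, not code taken from the xbpch documentation: the helper names is_netcdf and open_output are hypothetical, and choosing the reader from the file extension is an assumption made for this sketch (bpch files often have no distinctive extension). It relies on xbpch.open_bpchdataset and xarray.open_dataset, both of which return an xarray Dataset.

```python
import os

# Extensions treated as netCDF; anything else is assumed to be bpch.
# (Assumption for this sketch: bpch files often have no fixed extension.)
NETCDF_EXTENSIONS = (".nc", ".nc4")


def is_netcdf(path):
    """Return True if the file name looks like a netCDF file."""
    return os.path.splitext(path)[1].lower() in NETCDF_EXTENSIONS


def open_output(path, **kwargs):
    """Open GEOS-Chem output as an xarray.Dataset, whatever its format.

    Hypothetical helper: extra keyword arguments are passed through to
    the underlying reader (for example, tracerinfo_file and
    diaginfo_file for bpch input).
    """
    if is_netcdf(path):
        import xarray as xr  # imported lazily so the sketch loads without it
        return xr.open_dataset(path, **kwargs)
    from xbpch import open_bpchdataset
    return open_bpchdataset(path, **kwargs)
```

With a helper of this kind, the rest of an analysis script works on the returned Dataset, and only the path passed to open_output changes when moving from bpch to NetCDF output.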

Further documentation and examples will be added to the xbpch readthedocs page.

--Darothen (talk) 04:04, 25 April 2017 (UTC)

PseudoNetCDF

Summary:

Developer Barron Henderson (US EPA, formerly U. Florida)
Status Available for download
Latest version https://github.com/barronh/pseudonetcdf
Documentation https://github.com/barronh/pseudonetcdf/wiki/GC-Tutorials

Description:

PseudoNetCDF supersedes the Python bpch library. It offers all of the same functionality and considerably more. There is a core library that can be used for all kinds of programming, and a series of utility scripts that make easy things easy. Evaluating GEOS-Chem against AQS observations, for example, takes three commands.

  • Convert bpch to CF-compliant NetCDF: pncgen -f bpch inpath outpath
  • View a lon/lat point: pncdump -f bpch --extract=-74,25 inpath
  • Make maps: pncmap.py -f bpch -r time,mean -s layer47,0 inpath outpath
  • Make profiles: pncvertprofile.py -f bpch -r time,mean -r latitude,mean -r longitude,mean inpath outpath
  • Evaluate against AQS: pncaqsraw4pnceval.py makes a NetCDF file of AQS data for comparison; pncgen -f bpch --extract-file=aqs.nc
  • Create boundary conditions for CMAQ from GEOS-Chem output (i.e. supersedes the former pygeos2cmaq code)
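
When many files need converting, the first command in the list can be driven from Python. The sketch below only assembles the argument lists for pncgen -f bpch inpath outpath; the helper names are hypothetical, and naming each output by appending ".nc" to the input name is an assumption made for illustration.

```python
import shlex


def pncgen_command(inpath, outpath):
    """Build the argument list for: pncgen -f bpch inpath outpath."""
    return ["pncgen", "-f", "bpch", inpath, outpath]


def conversion_commands(inpaths):
    """Pair each bpch file with an output name ending in '.nc'."""
    return [pncgen_command(p, p + ".nc") for p in inpaths]


if __name__ == "__main__":
    for cmd in conversion_commands(["ctm.bpch"]):
        # Print the command; to execute it instead, pass the list to
        # subprocess.run(cmd, check=True) with PseudoNetCDF installed.
        print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems with file names such as IJ-AVG-$ outputs.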

The examples below use the benchmark simulation that was run for the GEOS-Chem timing exercise.

pncmap.py makes maps simply. Note that you can add states and counties or custom shapefiles.

   $ pncmap.py -f "bpch,vertgrid='GEOS-5-NATIVE'" --norm="Normalize()" -v IJ-AVG-\$_O3 \
         -r time,mean -s layer72,0 trac_avg.geosfp_4x5_benchmark.201307010000 \
         pnc_trac_avg.geosfp_4x5_benchmark.20130701000_

pnc_trac_avg.geosfp_4x5_benchmark.20130701000_IJ-AVG_O3.png
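
As a rough illustration of what the -r time,mean and -s layer72,0 options in the command above request (reduce the time dimension with a mean, and keep index 0 of the layer72 dimension), here is a pure-Python sketch on a tiny nested list. The array shape, the values, and the helper names are invented for illustration; PseudoNetCDF itself works on NumPy arrays. Because the two options act on different dimensions, the order in which they are applied does not change the result.

```python
def time_mean(data):
    """Average a [time][layer][cell] nested list over its time axis."""
    ntime = len(data)
    nlayer = len(data[0])
    ncell = len(data[0][0])
    return [
        [sum(data[t][k][i] for t in range(ntime)) / ntime for i in range(ncell)]
        for k in range(nlayer)
    ]


def take_layer(data, index):
    """Keep a single layer of a [layer][cell] nested list."""
    return data[index]


# Made-up values: 2 times x 2 layers x 3 cells.
field = [
    [[10.0, 20.0, 30.0], [1.0, 2.0, 3.0]],
    [[30.0, 40.0, 50.0], [3.0, 4.0, 5.0]],
]

layer0_mean = take_layer(time_mean(field), 0)
print(layer0_mean)  # [20.0, 30.0, 40.0]
```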

pncvertprofile.py makes vertical profile plots and, optionally, can plot TES and/or OMI (not shown) for comparison. TES and/or OMI will be spatially subset automatically (good for ND51).

   $ pncvertprofile.py -f "bpch,vertgrid='GEOS-5-NATIVE'" -v IJ-AVG-\$_O3 \
         -r time,mean trac_avg.geosfp_4x5_benchmark.201307010000 \
         --tes-paths=/path/to/tes/*.nc pnc_trac_avg.geosfp_4x5_benchmark.20130701000_pncprofile 

pnc_trac_avg.geosfp_4x5_benchmark.20130701000_pncprofile_IJ-AVG_O3.png

GEOS-Chem can be evaluated against AQS observations (or other networks) using three commands:

   $ pncaqsraw4pnceval.py --param=44201 -s 2013-07-01 -e 2013-07-31 -r "1985-01-01 00:00:00" \
         --wktpolygon="POLYGON((-182.5 -90, 177.5 -90, 177.5 90, -182.5 90, -182.5 -90))"
   $ pncgen -O --extract-file="AQS_DATA_20130701-20130731.nc" -f "bpch,vertgrid='GEOS-5-NATIVE'" \
         -s layer72,0 trac_avg.geosfp_4x5_benchmark.201307010000 GC_AT_AQS.nc
   $ pnceval.py --funcs=MB,ME,NMB,NME -v Ozone -- -r time,mean AQS_DATA_20130701-20130731.nc \
         --sep --expr="Ozone=O3/1000." --rename=v,IJ-AVG-\$_O3,O3 GC_AT_AQS.nc

Note that pncaqsraw4pnceval.py will either use a pre-downloaded file or download it for you. You can substitute any other network that is in a text file using the pncgen -f csv,... command in place of pncaqsraw4pnceval.py. There are also many more statistical functions that can be used.
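
For orientation, the four statistics requested with --funcs=MB,ME,NMB,NME can be written out in a few lines. The sketch below uses the common air-quality definitions (mean bias, mean error, normalized mean bias, normalized mean error). Whether pnceval.py reports the normalized values as percentages or as fractions is an assumption here, and the sample numbers are invented. The division by 1000 mirrors the O3/1000. expression in the command, on the assumption that it converts model ppbv to the ppm used for AQS ozone.

```python
def evaluation_stats(obs, mod):
    """Return MB, ME, NMB and NME for paired observation/model values."""
    if len(obs) != len(mod) or not obs:
        raise ValueError("obs and mod must be non-empty and equal length")
    diffs = [m - o for o, m in zip(obs, mod)]
    n = len(obs)
    obs_sum = sum(obs)
    return {
        "MB": sum(diffs) / n,                                 # mean bias
        "ME": sum(abs(d) for d in diffs) / n,                 # mean error
        "NMB": 100.0 * sum(diffs) / obs_sum,                  # percent
        "NME": 100.0 * sum(abs(d) for d in diffs) / obs_sum,  # percent
    }


# Invented sample values: observations in ppm, model output in ppbv.
obs_ppm = [0.040, 0.050, 0.060]
model_ppbv = [44.0, 48.0, 66.0]
model_ppm = [v / 1000.0 for v in model_ppbv]  # like --expr="Ozone=O3/1000."

stats = evaluation_stats(obs_ppm, model_ppm)
for name in ("MB", "ME", "NMB", "NME"):
    print(name, round(stats[name], 4))
```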

-- Barron H. Henderson May 16 2016
--Bob Yantosca (talk) 16:30, 24 April 2017 (UTC)

Process analysis diagnostics

Summary:

Developer Barron Henderson (US EPA, formerly U. Florida)
Status Available for download
Latest version https://github.com/barronh/pypa
Documentation See our Process Analysis Diagnostics wiki page

Description:

Process Analysis examines the change in each species due to each process and reaction. Models predict atmospheric state, and a time series of that state gives the net change of each species. What it cannot tell us is which processes led to that change. To supplement state (or concentration), GEOS-Chem has long archived emissions and employed advanced diagnostics to predict gross chemical production or loss. Process Analysis goes a step further by archiving grid-cell budgets for each species and decomposing gross production/loss into individual reaction contributions. Process Analysis extensions are currently available in CAMx, WRF-Chem, CMAQ, and now GEOS-Chem, which allows direct comparison of models at a fundamental, process level.
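
The budget bookkeeping described above can be illustrated with a toy example. This is only a sketch of the idea, not the pypa interface: the process names, reaction labels, and numbers are all invented. The net change is the sum over processes, and the chemistry term is split further into per-reaction contributions.

```python
# Invented grid-cell budget for one species over one time step
# (arbitrary units). Positive values are gains, negative are losses.
process_budget = {
    "emissions": 4.0,
    "transport": -1.5,
    "deposition": -0.5,
    "chemistry": 2.0,
}

# Invented decomposition of the chemistry term into reactions.
reaction_contributions = {
    "reaction_A (production)": 5.0,
    "reaction_B (loss)": -2.0,
    "reaction_C (loss)": -1.0,
}

# What a concentration time series alone reveals: the net change.
net_change = sum(process_budget.values())

# What process analysis adds: gross production and loss by reaction.
gross_production = sum(v for v in reaction_contributions.values() if v > 0)
gross_loss = sum(v for v in reaction_contributions.values() if v < 0)

# The per-reaction terms must close the chemistry budget.
assert abs(gross_production + gross_loss - process_budget["chemistry"]) < 1e-12

print("net change:", net_change)
print("gross chemical production:", gross_production)
print("gross chemical loss:", gross_loss)
```

Two models could show the same net change while differing in every individual term, which is why comparing at the process level is more informative than comparing concentrations alone.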

For more information, or to download the code, please see our Process Analysis Diagnostics page on the GEOS-Chem wiki.

--Bob Yantosca (talk) 16:50, 24 April 2017 (UTC)

Gchem

Summary:

Developer Gerrit Kuhlmann
Status Available for download
Latest version https://github.com/gkuhl/gchem

Description:

Gchem is mainly a reader/writer for bpch files. To visualize data, you can use available libraries (Matplotlib and basemap).

--Bob Yantosca (talk) 16:52, 24 April 2017 (UTC)

PyGChem

Summary:

Developers Benoit Bovy and Emanuel Mahieu (U. Liege)
Status Available for download
Latest version https://github.com/benbovy/PyGChem/
Documentation http://nbviewer.ipython.org/github/benbovy/PyGChem_examples/blob/master/Index.ipynb

Description:

The PyGChem Python package aims to connect GEOS-Chem to the scientific Python (SciPy) stack, which consists of several core packages and many specialized packages for scientific computing. PyGChem is under active development. Currently, it can read GEOS-Chem datasets in both bpch and netCDF formats and handle them as Iris cubes. Other dataset-handling backends, such as xray (since renamed xarray), will be added soon. An API for HEMCO is also under development; it will allow users to dynamically read/write, create, edit, test, and visualize HEMCO configurations.

Preliminary documentation is available as IPython notebooks.

--Benoit B. 10:45, 7 May 2015 (EDT)
--Bob Yantosca (talk) 16:56, 24 April 2017 (UTC)

Python Ensemble Manager

Summary:

Developers Andre Perkins (U. Wisconsin)
Status Available for download
Latest version https://github.com/frodre/pyEnsemble

Description:

This code is useful for running ensembles of GEOS-Chem adjoint model simulations within an MPI environment. For more information about our Python Ensemble Manager, please see the source code on github.

--Bob Yantosca (talk) 17:00, 24 April 2017 (UTC)

Queueing script

Summary:

Developers Ben Newsome (U. York)
Status Available for download
Latest version https://github.com/wacl-york/monthly_run

Description:

I set up a Python script for my group that splits long runs into smaller ones (allowing you to run a one-year simulation in a 4-hour queue by running one month at a time) and automatically submits the next run. The bottom of the script should be easy to adapt to other queue systems. If you need more information, email me or raise an issue on github.
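
The splitting step can be sketched as below. This is not the code from the monthly_run repository, only a minimal illustration of cutting a long simulation window into month-long segments; the function name and the half-open [start, end) convention are assumptions, and job submission is left out because it depends on the queue system.

```python
from datetime import date


def monthly_segments(start, end):
    """Split [start, end) into consecutive runs of at most one month.

    Each segment ends on the first day of the following month (or at
    `end`, whichever comes first), so segments line up back to back.
    """
    segments = []
    current = start
    while current < end:
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        stop = min(next_month, end)
        segments.append((current, stop))
        current = stop
    return segments


# One simulation year becomes twelve one-month jobs.
runs = monthly_segments(date(2013, 1, 1), date(2014, 1, 1))
print(len(runs))
print(runs[0][0].isoformat(), runs[0][1].isoformat())
```

Because each segment starts where the previous one stopped, a job can submit its successor as its last action, with the restart file from one month feeding the next.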

--Melissa Sulprizio (talk) 17:33, 18 November 2016 (UTC)

Python resources

In this section we preserve information about Python software packages that have been made obsolete.

Obsolete software packages

In this section we list Python software packages for GEOS-Chem that have been rendered obsolete by newer developments.

bpchdump

This package no longer appears to be archived online.

Author: Liang Feng (U. Edinburgh, UK)

Here are some screenshots of our bpchdump.py package, which allows you to browse the contents of a bpch file and plot them:

bpchdump.py GUI interface:

Bpchdump gui.png

Sample plot generated with bpchdump.py:

Bpchdump plot.png

--Bob Y. 11:40, 28 May 2013 (EDT)

bpch to netCDF converter

This functionality has been incorporated into Barron Henderson's PseudoNetCDF package. We shall leave this information here for reference.

Author: Barron Henderson (U. Florida)

Barron Henderson wrote:

I wanted to clarify a little bit about my own project, which I have named simply bpch. bpch is registered with pypi (and installable with the command `pip install bpch`).
bpch is a netCDF-like reader for bpch files that returns a netCDF-like object with CF-1.6 compliant metadata. The interface for bpch mimics the netcdf4-python interface (the successor to scipy.io.netcdf.netcdf_file).
If the bpch module is called from the command line (see below), it creates a netcdf file. If the module is used in a python program, then no netcdf file is created. At the command line, it also supports nco-like dimension and variable subsetting. Optionally, it can output figures and animations as well.
Example command line:
   $ python -m bpch ctm.bpch -o ctm.bpch.nc
would create ctm.bpch.nc that is readable by Panoply. The output is not currently readable by IDV because of some dimension naming assumptions for vertical dimensions.
In script form, the bpch module is simply a reader. The script below prints some statistics twice: the first time demonstrates the netcdf3-classic style and the second demonstrates the netcdf4 style. The module uses numpy array structures, so it is fully compatible with matplotlib plotting functions.
   #!/usr/bin/env python
   #--Begin Script--
   from __future__ import print_function  # same output under Python 2 and 3
   from bpch import bpch
   import numpy as np

   f = bpch('ctm.bpch')

   # netcdf3-classic style: flat variable names of the form group_tracer
   key = 'IJ-AVG-$_Ox'
   v = f.variables[key]
   print(key)
   print(v.units, v.dimensions)
   print('Min, Median, Max, Mean, Std:', v[:].min(), np.median(v[:]), v[:].max(), v[:].mean(), v[:].std())

   # netcdf4 style: the same variable reached through its group
   group = 'IJ-AVG-$'
   key = 'Ox'
   v = f.groups[group].variables[key]
   print(key)
   print(v.units, v.dimensions)
   print('Min, Median, Max, Mean, Std:', v[:].min(), np.median(v[:]), v[:].max(), v[:].mean(), v[:].std())

   #--End Script--

--Bob Yantosca (talk) 15:59, 24 April 2017 (UTC)