Automatic documentation with protex
Protex is a very useful Perl script that can strip information from a standard Fortran document header and save that to a LaTeX file. Protex was developed at the Goddard Space Flight Center by Arlindo Da Silva, Will Sawyer, and others.
We shall use protex to auto-document all ESMF-compatible code created at Harvard. Also, GEOS-Chem v8-01-03 and higher will utilize protex documentation (at least for newly-added code). We therefore recommend GEOS-Chem users to become familiar with protex.
NOTE: In order to use protex, you must have the various LaTeX utilities (e.g. latex, pdflatex, dvipdf, dvips, etc.) installed on your system. These should come standard with most builds of Unix or Linux.
Downloading protex
Click here to download a tarball file with the protex script, template headers, and sample output files.
To install protex, do the following:
gunzip protex.tar.gz tar xvf protex.tar
This will install the following files into the protex subdirectory:
- README
- A file that describes the contents of the tar file.
- protex
- Perl script (by Arlindo da Silva et al) from GSFC that converts special F90 header comment tags into LaTeX format.
- f90pdf
- Perl script (by Bob Yantosca) which is a wrapper for protex. Calls protex and pdflatex to convert the output of protex to PDF format.
- f90ps
- Perl script (by Bob Yantsoca) which is a wrapper for protex. Calls protex, latex, and dvips to convert the output of protex to PostScript format.
- template_introduction.txt
- Front page template file for use with the protex script. Use this template to specify the title of the document, authors, affiliation, and date. This should always be the first file passed to protex.
- template_includefile.h
- Template header file with sample F90 declarations
- template_module.F90
- Template F90 module with module header and internal routines.
- template_routine.F90
- Template F90 file for a standalone program or routine (i.e. not contained in a module).
- sample.pdf
- Sample PDF output file, created from the template_includefile.h, template_routine.F90 and template_module.F90 files.
- sample.ps
- Sample PostScript output file, created from the template_includefile.h, template_routine.F90 and template_module.F90 files.
If you have root privilege on your system, you can then copy the protex file to /usr/local/bin or whichever directory contains system-wide executables. Otherwise, you can install it to a directory in your own space.
Generating PDF documentation
To generate a PDF file, type:
f90pdf [list of files]
This will create the files:
- output.tex
- Documentation file in LaTeX format file
- output.pdf
- Documentation file in PDF format
You may choose to rename these as you see fit. f90pdf is a convenience wrapper script. It executes the following commands:
protex -s -f [list of files] > output.tex pdflatex output.tex pdflatex output.tex pdflatex output.tex rm *.dvi *.aux *.log *.toc
The pdflatex command is a useful shorthand. Using pdflatex replaces separate calls to latex and dvipdf. For example, the above commands are equivalent to:
protex -s -f [list of files] > output.tex latex output.tex latex output.tex latex output.tex dvipdf output.tex output.pdf rm *.dvi *.aux *.log *.toc
NOTES:
- The -s option of protex will cause only the module, subroutine, and function headers to be included in the output.
- The -f option of protex suppresses printing source code file information next to each module or routine name.
- Calling pdflatex 3 times is necessary to ensure that the table of contents information will be compiled correctly into the final output file.
- If the f90pdf script hangs for a while then it has probably encountered an error. To continue you may type "r" or exit with "x".
Generating PostScript documentation
To generate a PostScript file, type:
f90ps [list of files]
This will create the files:
- output.tex
- Documentation file in LaTeX format file
- output.ps
- Documentation file in PDF format
You may choose to rename these as you see fit. Like f90pdf, f90ps is a convenience wrapper script. It executes the following commands:
protex -s -f [list of files] > output.tex latex output.tex latex output.tex latex output.tex dvips output.dvi -o output.ps rm *.dvi *.aux *.log *.toc
NOTES:
- The -s option of protex will cause only the module, subroutine, and function headers to be included in the output.
- The -f option of protex suppresses printing source code file information next to each module or routine name.
- Note: calling latex 3 times is necessary to ensure that the table of contents information will be compiled correctly into the final output file.
- Also note: since there is no equivalent "pslatex" command, we must do the intermediate step of creating the dvi file and then calling dvips to create the PostScript file.
- If the f90ps script hangs for a while then it has probably encountered an error. To continue you may type "r".
Generating HTML documentation
To generate HTML documentation from F90 files, type:
protex -s -f [list of files] > output.tex latex2html output.tex
This will call the latex2html utility (which ships standard with Linux) to parse the LaTeX file "output.tex" and create navigatable HTML pages.
NOTES:
- The -s option of protex will cause only the module, subroutine, and function headers to be included in the output.
- The -f option of protex suppresses printing source code file information next to each module name.
Sample output files
We have created sample output files for you (sample.pdf, sample.ps) from the F90 template files located in this directory. These were created with the following commands:
f90pdf template_introduction.txt *.h *.F90 mv output.pdf sample.pdf f90ps template_introduction.txt *.h *.F90 mv output.ps sample.ps
Note that by placing the template_introduction.txt file first before any Fortran files (*.h, *.F90), this will cause a title page to be added to the output files. You may omit this if you wish.
Protex header files in more depth
template_introduction.txt
You can use the template_introduction.txt to define a title page for your document, such as:
!------------------------------------------------------------------------------ ! GEOS-Chem Global Chemical Transport Model ! !------------------------------------------------------------------------------ !BOI ! !TITLE: Front page template ! !AUTHORS: Bob Yantosca and Philippe Le Sager ! !AFFILIATION: School of Engineering and Applied Sciences, Harvard University ! !DATE: May 23, 2008 !EOI !------------------------------------------------------------------------------
NOTES:
- The BOI and EOI are used to define extent of the header. Protex will only look at the text between BOI and EOI.
- Text following !TITLE, !AUTHORS, !AFFILIATION, and !DATE tags must be placed on the same line.
- You may also specify an optional !INTRODUCTION tag following !DATE, which will become topic #1 in the table of contents. This allows you to go back later and manually insert LaTeX markup later into the output.tex file.
template_includefile.h
You can use the template_includefile.h to document an include file with either common blocks or defined parameters, such as:
!------------------------------------------------------------------------------ ! GEOS-Chem Global Chemical Transport Model ! !------------------------------------------------------------------------------ !BOP ! ! !INCLUDE: GC_SomethingIncludeFile.h ! ! !DESCRIPTION: This include file contains the various parameters that will ! allow the module and routine to do stuff to various things in various ! routines in various places. !\\ !\\ ! !PUBLIC TYPES: ! TYPE t_GeosChemSomething !%%% declare stuff here %%% END TYPE t_GeosChemSomething ! ! !DEFINED PARAMETERS: ! INTEGER(ESMF_KIND_I8), PUBLIC, PARAMETER :: myIntParam ! INTEGER value REAL(ESMF_KIND_I8), PUBLIC, PARAMETER :: myRealParam ! REAL*8 value ! ! !REVISION HISTORY: ! 21 May 2008 - R. Yantosca - Initial Version ! ! !REMARKS: !EOP !------------------------------------------------------------------------------
NOTES:
- The BOP and EOP tags denote the beginning and end of the protex prologue. Protex will search for text between these two tags.
- You must place the name of the include file on the same line as the !INCLUDE: tag.
- The text following the !DESCRIPTION tag must also start on the same line. This text will be rendered as formatted text instead of in fixed-space courier font. This means that:
- Special LaTeX characters (e.g. _ and #) must be prefaced with a backslash (e.g. \_ and \#) or else this will cause LaTeX to choke when parsing the file.
- Two manual line breaks !\\ must be placed at the end of the text to prevent the next tag from starting on the same line.
- Except for !MODULE and !DESCRIPTION, for the rest of the protex tags it is OK to start text on a new line below the tag. You can place extra Fortran comments ! after protex tags to make the code more readable.
- You should list the actual program unit definition (in this case, the MODULE GC_SomethingMod statement) on a new line following the !INTERFACE tag.
- References to journal articles or other notes should be placed after the !REMARKS tag.
template_module.F90
This file contains headers for declaring a Fortran 90 module, as well for the internal module routines.
Header for top-of-module
Here is the protex header that goes at the very top of a Fortran 90 module:
!------------------------------------------------------------------------------ ! GEOS-Chem Global Chemical Transport Model ! !------------------------------------------------------------------------------ !BOP ! ! !MODULE: GC_SomethingMod.F90 ! ! !DESCRIPTION: This module contains the data type to declare a Something ! object and the methods to work with the Something object. !\\ !\\ ! !INTERFACE: ! MODULE GC_SomethingMod ! ! !USES: ! USE ESMF_Mod IMPLICIT NONE ! ! !PUBLIC TYPES: ! TYPE t_GeosChemSomething !... declare stuff here END TYPE t_GeosChemSomething ! ! !PUBLIC MEMBER FUNCTIONS: ! PUBLIC :: GC_SomethingRoutine1 PUBLIC :: GC_SomethingFunction1 ! ! !PUBLIC DATA MEMBERS: ! INTEGER(ESMF_KIND_I4), PUBLIC :: myPublicVariable ! public data variable ! ! !REVISION HISTORY: ! 21 May 2008 - R. Yantosca - Initial Version ! ! !REMARKS: ! Protex is great! ! !EOP !------------------------------------------------------------------------------
The tags are mostly self-explanatory. However, you must be aware of a few things:
- The BOP and EOP tags denote the beginning and end of the protex prologue. Protex will search for text between these two tags.
- You must place the name of the module on the same line as the !MODULE tag.
- The text following the !DESCRIPTION tag must also start on the same line. This text will be rendered as formatted text instead of in fixed-space courier font. This means that:
- Special LaTeX characters (e.g. _ and #) must be prefaced with a backslash (e.g. \_ and \#) or else this will cause LaTeX to choke when parsing the file.
- Two manual line breaks !\\ must be placed at the end of the text to prevent the next tag from starting on the same line.
- Except for !MODULE and !DESCRIPTION, for the rest of the protex tags it is OK to start text on a new line below the tag. You can place extra Fortran comments ! after protex tags to make the code more readable.
- You should list the actual program unit definition (in this case, the MODULE GC_SomethingMod statement) on a new line following the !INTERFACE tag.
- References to journal articles or other notes should be placed after the !REMARKS tag.
Header for module procedures
Here is the protex header that should be placed at the top of each subroutine that is contained in the module:
!------------------------------------------------------------------------------ ! GEOS-Chem Global Chemical Transport Model ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: GC_SomethingRoutine1 ! ! !DESCRIPTION: This routine does something to the input variable and returns ! the result in the output variable. !\\ !\\ ! !INTERFACE: ! SUBROUTINE GC_SomethingRoutine1( input, inpout, output, status ) ! ! !INPUT PARAMETERS: ! INTEGER(ESMF_KIND_I4), INTENT(IN) :: input ! Input variable ! ! !INPUT/OUTPUT PARAMETERS: ! INTEGER(ESMF_KIND_I4), INTENT(IN) :: inpout ! In/out variable ! ! !OUTPUT PARAMETERS: ! INTEGER(ESMF_KIND_I4), INTENT(IN) :: output ! Output variable ! ! !REVISION HISTORY: ! 21 May 2008 - R. Yantosca - Initial Version ! ! !REMARKS: ! Protex is great! ! !EOP !------------------------------------------------------------------------------ !BOC !%%% Your code goes here %%% END SUBROUTINE GC_SomethingRoutine1 !EOC
and here is the corresponding header for a function that is contained in the module:
!------------------------------------------------------------------------------ ! GEOS-Chem Global Chemical Transport Model ! !------------------------------------------------------------------------------ !BOP ! ! !IROUTINE: GC_SomethingFunction1 ! ! !DESCRIPTION: This function does something to the input variable and returns ! the result in the value variable. !\\ !\\ ! !INTERFACE: ! FUNCTION GC_SomethingFunction1( input ) RESULT( value ) ! ! !INPUT PARAMETERS: ! INTEGER(ESMF_KIND_I4), INTENT(IN) :: input ! Input variable ! ! !OUTPUT PARAMETERS: ! INTEGER(ESMF_KIND_I4) :: value ! Function value ! ! !REVISION HISTORY: ! 21 May 2008 - R. Yantosca - Initial Version ! ! !REMARKS: ! Protex is great! ! !EOP !------------------------------------------------------------------------------ !BOC !%%% Your code goes here! %%% END FUNCTION GC_SomethingFunction1 !EOC
NOTES:
- Here again, the tags BOP and EOP denote the extent of the protex prologue.
- The tags BOC and EOC denote the beginning and end of the source code. If you call protex with the -s option (or if you use the f90pdf and f90ps scripts), then protex will ignore the source code and only render the text from the prologues at the top of each routine.
- For routines that are contained by a module, we use the !IROUTINE tag. The name of the routine must be placed on the same line as !IROUTINE.
template_routine.F90
Header for standalone routines
The protex header for standalone routines (i.e. main programs, or subroutines and functions not contained within a module) is slightly different than the header for module procedures:
!------------------------------------------------------------------------------ !BOP ! ! !ROUTINE: GC_Routine.F90 ! ! !DESCRIPTION: This routine takes in an input variable, does something to it, ! and then sends out an output variable. !\\ !\\ ! !INTERFACE: ! SUBROUTINE GC_Routine( input, output ) ! ! !USES: ! USE GC_SomethingMod ! ! !INPUT PARAMETERS: ! REAL(ESMF_KIND_R8), INTENT(IN) :: input ! input variable ! ! !OUTPUT PARAMETERS: ! REAL(ESMF_KIND_R8), INTENT(IN) :: output ! output variable ! ! !BUGS: ! None known at this time ! ! !SEE ALSO: ! GC_SomethingMod.F90 ! ! !SYSTEM ROUTINES: ! None ! ! !FILES USED: ! GC_SomethingMod.F90 ! ! !REVISION HISTORY: ! 21 May 2008 - R. Yantosca - Initial version ! ! !REMARKS: ! Protex is great! ! !EOP !------------------------------------------------------------------------------ !BOC !%%% Your code goes here! %%% END SUBROUTINE GC_Routine !EOC
NOTES:
- For standalone routines, we use the !ROUTINE tag as opposed to the !IROUTINE tag (which is reserved for routines within modules).
--Bob Y. 10:36, 28 May 2008 (EDT)
Useful LaTeX formatting commands
Here we include the LaTeX markup text commands for various useful constructs (lists, sections, etc). NOTE: These are only applicable for text that you place under the !DESCRIPTION section in the Protex header. For more information, please see Jeff Clark's online LaTeX tutorial.
Itemized bullet list
\begin{itemize} \item This is Item 1 \item This is Item 2 \item This is Item 3 \end{itemize}
Numbered bullet list
\begin{enumerate} \item This is Item 1 \item This is Item 2 \item This is Item 3 \end{enumerate}
Description list
\begin{description} \item[Item Name 1] Description 1 \item[Item name 2] Description 2 \item[Item name 3] Description 3 \end{description}
Text formatting
\textbf{This text will be rendered as Boldface.} \emph{This text will be rendered as Italics.} \underline{This text will be rendered as Underline.} \texttt{This text will be rendered as "teletype" text (i.e. fixed-space Courier font)}
Math mode
This text contains $i-1$ the term i-1 rendered into math mode. This text contains $A_{i-1}$ A with the "i-1" subscripted. This text contains $A^{i-1)$ A with the "i-1" superscripted $$A_{i-1} = A_{i} - B_{i}$$ (This text describes an equation on its own line, with subscripts)
--Bob Y. 11:50, 15 April 2009 (EDT)