%% LyX 1.6.9 created this file. For more info, see http://www.lyx.org/. %% Do not edit unless you really know what you are doing. \documentclass[english]{article} \usepackage{ae,aecompl} \usepackage[T1]{fontenc} \usepackage[latin9]{inputenc} \usepackage[letterpaper]{geometry} \geometry{verbose,tmargin=2.5cm,bmargin=2.5cm,lmargin=2.5cm,rmargin=2.5cm} \usepackage{babel} \usepackage{textcomp} \usepackage{amsbsy} \usepackage{graphicx} \usepackage[numbers]{natbib} \usepackage[unicode=true, bookmarks=true,bookmarksnumbered=false,bookmarksopen=false, breaklinks=false,pdfborder={0 0 1},backref=false,colorlinks=false] {hyperref} \hypersetup{pdftitle={An Introduction to CHNOSZ}, pdfauthor={Jeffrey M. Dick}} \makeatletter %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands. \usepackage{Sweave} \newenvironment{wrapsweave}{\par}{\par} \newenvironment{lyxcode} {\par\begin{list}{}{ \setlength{\rightmargin}{\leftmargin} \setlength{\listparindent}{0pt}% needed for AMS classes \raggedright \setlength{\itemsep}{0pt} \setlength{\parsep}{0pt} \normalfont\ttfamily}% \item[]} {\end{list}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands. %\VignetteIndexEntry{An introduction to CHNOSZ} % so DOIs in bibliography show up as hyperlinks \newcommand*{\doi}[1]{\href{http://dx.doi.org/#1}{doi: #1}} \makeatother \begin{document} \title{An introduction to CHNOSZ} \author{Jeffrey M. Dick} \maketitle \begin{wrapsweave} <>= options(width=90) @ \end{wrapsweave} \section{About} This document will orient you to the basic functionality of CHNOSZ, a package for the R software environment. R is a powerful language and also very fun to use. Don't worry if you're new to it; just plow through the examples below and you'll start to get the hang of it. If you want a more structured approach to learning the language, there are some excellent guides in the Manuals section of the \href{http://www.r-project.org}{R Project page}. There is also a publication on CHNOSZ available \citep{Dic08}, and \href{http://www.chnosz.net}{a website}. The package was developed since 2006 to support a research project on the thermodynamic properties of proteins. Since that time, the functions in the package have expanded to include calculation of the thermodynamic properties of reactions, and especially the construction of equilibrium chemical activity diagrams for both inorganic and organic systems. The development of the package since 2009 has focused on the calculation of the equilibrium chemical activities of large numbers of proteins with applications in interpretation of metagenomic data and protein abundances in a variety of settings. The database and functions are flexible in their use, allowing one to model the relative stabilities of proteins, minerals or aqueous species using very similar commands. Examples below are intended to demonstrate basic usage to new users. \section{Outline of workflow} CHNOSZ is made up of a set of functions and supporting datasets. The major components of the package are shown in the figure below, which is a modified version of the flowchart shown in Ref. \citep{Dic08} (boxes -- functions; ellipses -- datasets). \includegraphics[width=0.8\columnwidth]{chnosz_new} Some common usage scenarios are: \begin{itemize} \item using \texttt{info()} to search for species in the thermodynamic database \item using \texttt{subcrt()} to calculate the thermodynamic properties of species and reactions \item using the sequence \texttt{basis()}, \texttt{species()}, \texttt{affinity()}, \texttt{diagram()} to assign the basis species that define the dimensions of chemical composition in a system, define the species of interest for relative stability calculations, calculate the affinities of formation reactions of the species of interest under reference (non-equilibrium) conditions, and to transform the non-equilibrium affinities to equilibrium chemical activities and plot the results. \item using \texttt{revisit()} to calculate/plot statistics of the chemical activities of the species of interest and \texttt{findit()} to search for combinations of activities of basis species, temperature and/or pressure that optimize those statistics. (These features, first appearing in version 0.9-3 of the package, are not covered in this document.) \end{itemize} The functions are designed with an interactive setting in mind; you can use CHNOSZ without having to write your own scripts. The examples in this vignette are meant to portray a simple interactive session. However, as you become more familiar with CHNOSZ and R, you will probably find it helpful to save sequences of function calls that produce interesting results. The results can then be reproduced on demand by yourself or others with whom you might share your scripts. \section{Installing and loading CHNOSZ} If you have just installed R, and you are online, installing the CHNOSZ package should be as simple as selecting {}``Install packages from CRAN'' or similar menu item in the R GUI or using the following command to start the package installation process. (If you are not online, you instead have to tell R to install the package from a local package file.) \begin{wrapsweave} <>= install.packages("CHNOSZ") @ \end{wrapsweave} Then load the CHNOSZ package to make its functions and data available in your working session. \begin{wrapsweave} <>= library(CHNOSZ) @ <>= thermo$opt$online <- TRUE @ \end{wrapsweave} The rest of this document assumes that the CHNOSZ package is loaded. \section{Thermodynamic database} \subsection{\texttt{info()} part I} So you want to know what are the standard molal thermodynamic properties and equations of state parameters of aqueous ethylene? Look no further than the \texttt{info()} function, which provides a convenient interface to retrieve entries from the thermodynamic database packaged with CHNOSZ. \begin{wrapsweave} <>= info("ethylene") @ \end{wrapsweave} There are two species named {}``ethylene'' in the database. Normally, \texttt{info()} gives preference to aqueous species if they exist, so in this case we find that aqueous ethylene is species number 88 in the database. Let's display this entry, now by giving the species index to the function. \begin{wrapsweave} <>= info(88) @ \end{wrapsweave} If you were instead interested in the properties of the gas, you could run: \begin{wrapsweave} <>= info("ethylene","gas") @ \end{wrapsweave} \texttt{info()} itself is used by other functions in the package. It prints output to the screen, but also returns a numeric value if it finds a species matching the search term. So, we can retrieve the properties of aqueous acetic acid without having to type in the species ID number. \begin{wrapsweave} <>= aadata <- info(info("acetic acid")) print(aadata) @ \end{wrapsweave} \subsection{\texttt{thermo\$refs}} The thermodynamic data and other parameters used by the functions, as well as system definitions provided by the user in an interactive session, are stored in a list object called \texttt{thermo}. \begin{wrapsweave} <>= summary(thermo) @ \end{wrapsweave} Within this list, the thermodynamic database is contained in a data frame (an R object that is like a matrix with named columns), \texttt{thermo\$obigt}, and the references to the original sources of thermodynamic data in the literature are listed in \texttt{thermo\$refs}. Many of the authors who are responsible for these data would be grateful if you cite them whenever these data are used in publications! Use the \texttt{browse.refs()} function without any arguments to show citation information for all of the references in a browser window. You can include a species index number to open the URL(s) associated with that entry in the database (this requires an Internet connection). \begin{wrapsweave} <>= browse.refs(88) @ \end{wrapsweave} \texttt{browse.refs: opening URL for SH90 (E. L. Shock and H. C. Helgeson, 1990)} \subsection{\texttt{info()} part II} Want to know what acids are in the database? \begin{wrapsweave} <>= info("acid") @ \end{wrapsweave} Here, \texttt{info()} couldn't find an exact match to a name, so it performed a fuzzy search. That's why {}``uracil'' and {}``metacinnabar'' show up above. If you really just want species whose names include the term {}``acid'', you can add a placeholder character to narrow the search. (Note: don't use an underscore ({}``\_'') here because that character is reserved for names of proteins. Any other character will do; here we use a space.) \begin{wrapsweave} <>= info(" acid") @ \end{wrapsweave} The names of species other than proteins use (almost) exclusively lowercase letters. \texttt{info()} can also be used to search the text of the chemical formulas as they are entered in the database; the symbols for the elements always start with a capital letter. The example below lists the formulas of aqueous species, then minerals, that contain the symbol commonly used to represent the hydroxide group. \begin{wrapsweave} <>= info("(OH)") @ \end{wrapsweave} \section{Proteins} \subsection{\texttt{protein()}} There are few things more fun than calculating the standard molal Gibbs energy of formation from the elements at 25 $^{\circ}$C and 1 bar of a protein using group additivity. And there are few proteins whose thermodynamic properties are more well studied than lysozyme from the egg of the chicken. \begin{wrapsweave} <>= protein("LYSC_CHICK") protein(6) @ \end{wrapsweave} What happened there? Well, the first line extracted the row (rownumber 6) of \texttt{thermo\$protein} that contains the amino acid composition of LYSC\_CHICK. The second line used group additivity \citep{DLH06} to calculate the standard molal thermodynamic properties and equations of state parameters of the aqueous protein species. \subsection{\texttt{info()}} Most of the time you probably won't be using the \texttt{protein()} function. That's because \texttt{info()} recognizes the underscore character as being an essential part of the name of a protein. The names of proteins in CHNOSZ are mostly consistent with those used in \href{http://www.uniprot.org}{Swiss-Prot/UniProtKB}. \begin{wrapsweave} <>= si <- info("LYSC_CHICK") info(si) @ \end{wrapsweave} When CHNOSZ is first loaded, the thermodynamic properties and parameters of the proteins are not present in \texttt{thermo\$obigt}. Therefore, the first call to\texttt{ info()} just above had a side effect of adding the computed properties and parameters to \texttt{thermo\$obigt}. \section{Reaction properties} \subsection{A single species} A major feature of CHNOSZ is the ability to calculate standard molal properties of species and reactions as a function of temperature and pressure. The function used is called \texttt{subcrt()}, which takes its name (with modification) from the well known SUPCRT package \citep{JOH92}. \texttt{subcrt()}, like \texttt{info()}, has the name of a species (including proteins) as its first argument (it also works if you give it the numeric index of the species in the database). If no reaction coefficients are given, the function calculates the standard molal properties of the indicated species on a default temperature-pressure grid. \begin{wrapsweave} <>= subcrt("water") @ \end{wrapsweave} The columns in the output are temperature ($^{\circ}$C), pressure (bar), density of water (g cm$^{-3}$), logarithm of the equilibrium constant (only meaningful for reactions; see below), and standard molal Gibbs energy and enthalpy of formation from the elements (cal mol$^{-1}$), and standard molal entropy (cal K$^{-1}$ mol$^{-1}$), volume (cm$^{3}$ mol$^{-1}$) and heat capacity (cal K$^{-1}$ mol$^{-1}$). Compared to other species available in CHNOSZ, the equations for calculating the properties of liquid $\mathrm{H_{2}O}$ are quite complex. The package uses a Fortran subroutine taken from SUPCRT for these calculations. See \texttt{help(water)} for more information. \subsection{A reaction} To calculate the properties of a reaction, enter the stoichiometric reaction coefficients as a second argument to \texttt{subcrt()}. Reactants have negative coefficients, and products have positive coefficients. The function call below also shows the specification of temperature. \begin{wrapsweave} <>= subcrt(c("C2H5OH","O2","CO2","H2O"),c(-1,-3,2,3),T=37) @ \end{wrapsweave} For historical reasons (i.e., the prevalence of the use of oxygen fugacity in geochemical modeling \citep{And05}), $\mathrm{O_{2}}$ breaks the general rule in CHNOSZ that species whose states are not specified are given the aqueous designation if it is available in the thermodynamic database. If you want to specify the physical states of the species in the reaction, that's possible too. For example, we can ensure that dissolved $\mathrm{O_{2}}$ instead of the gaseous form is used in the calculation. \begin{wrapsweave} <>= subcrt(c("C2H5OH","O2","CO2","H2O"),c(-1,-3,2,3),c("aq","aq","aq","liq"),T=37) @ \end{wrapsweave} A useful feature of \texttt{subcrt()} is that it emits a warning if the reaction is not balanced. Let's say you forgot to account for oxygen on the left-hand side of the reaction% \footnote{This example is motivated by the unbalanced reaction found at the \href{http://en.wikipedia.org/wiki/Ethanol_metabolism}{Wikipedia entry on ethanol metabolism} on 2010-09-23 and still present as of 2011-08-15: {}``Complete Reaction: C$_{2}$H$_{6}$O(Ethanol)\textrightarrow{}C$_{2}$H$_{4}$O(Acetaldehyde)\textrightarrow{}C$_{2}$H$_{4}$O$_{2}$(acetic Acid) \textrightarrow{}Acetyl-CoA\textrightarrow{}3H$_{2}$O+2CO$_{2}$''.% }. \begin{wrapsweave} <>= subcrt(c("C2H5OH","CO2","H2O"),c(-1,2,3),T=37) @ \end{wrapsweave} The function still reports the results of the calculations, but use them very cautiously (only if you have a specific reason for writing an unbalanced reaction). In the next section we'll see how to use another feature of CHNOSZ to automatically balance reactions. \section{Basis species} \subsection{What are basis species?} \textbf{Basis species} are a minimal number of chemical species that represent the compositional variation in a system. Operationally, a \textbf{system} is the combination of basis species and species of interest which is set up by the user to investigate a real-life system. The basis species are akin to thermodynamic components, but can include charged species. There are at least two reasons to define the basis species when using CHNOSZ. First, you might want to use them to automatically balance reactions. Second, they are required for making chemical activity diagrams. Let's start with an example that \emph{doesn't} work. \begin{wrapsweave} <>= basis(c("CO2","H2O","NH3","H2S","H+")) @ \end{wrapsweave} ~ \begin{lyxcode} Error~in~put.basis(basis,~mystates)~:~ ~~the~stoichiometric~matrix~must~be~square~and~invertible In~addition:~Warning~messages: 1:~basis:~5~compounds~(~CO2~H2O~NH3~H2S~H+~)~ 2:~basis:~6~elements~(~C~H~N~O~S~Z~)~ \end{lyxcode} A limitation of CHNOSZ is that the number of basis species must be equal to the number of elements, plus one if charge is present. This way, any possible species of interest made up of these elements can be compositionally represented by a linear combination of the basis species. Now let's write a working basis definition. \begin{wrapsweave} <>= basis(c("CO2","H2O","NH3","O2","H2S","H+")) @ \end{wrapsweave} First basis definition! Note the column names, which give CHNOSZ its name. These represent the elements in the commonly-occurring amino acids, together with charge, denoted by {}``Z''. \subsection{Auto-balancing a reaction} Now that the basis species are defined, try the unbalanced reaction again. \begin{wrapsweave} <>= subcrt(c("C2H5OH","CO2","H2O"),c(-1,2,3),T=37) @ \end{wrapsweave} Here, \texttt{subcrt()} detected an unbalanced reaction, but since the missing element was among the elements of the basis species, it added the appropriate amount of $\mathrm{O_{2}}_{\left(gas\right)}$ to the reaction before running the calculations. You can go even further and eliminate $\mathrm{CO_{2}}$ and $\mathrm{H_{2}O}$ from the function call, but still get the same results. \begin{wrapsweave} <>= subcrt(c("C2H5OH"),c(-1),T=37) @ \end{wrapsweave} What if you were interested in the thermodynamic properties of the reaction of ethanol to acetaldehyde, but didn't want to balance the reaction yourself (and you also didn't know how the formulas of the species are written in the database)? \begin{wrapsweave} <>= subcrt(c("ethanol","acetaldehyde"),c(-1,1),T=37) @ \end{wrapsweave} Notice how 2 H's needed to be added to the right-hand side of the reaction; in our definition of basis species this comes out to $\mathrm{H_{2}O}-0.5\mathrm{O_{2}}$. With a different choice of basis species, but the same elements, the reaction might look quite different. As an example, suppose you had amino acids in mind. The first line below, \texttt{data(thermo)}, is a quick way to reset the \texttt{thermo} object to its original state, in order to clear the current system definition. \begin{wrapsweave} <>= data(thermo) basis(c("glutamic acid","methionine","isoleucine","lysine","tyrosine","H+")) subcrt(c("ethanol","acetaldehyde"),c(-1,1),T=37) @ \end{wrapsweave} In this case, the function finds that 2 H's are the compositional equivalent of $0.5\mathrm{C_{6}H_{13}NO_{2}}-0.125\mathrm{C_{6}H_{14}N_{2}O_{2}}-0.250\mathrm{C_{9}H_{11}NO_{3}}$. It's pretty easy for the computer to figure that out using matrix operations, but probably isn't something you'd want to do by hand. You might complain that this reaction is not likely to represent an actual metabolic process ... as always, the challenge (and fun) of coming up with a useful basis definition is in relating the species to observable quantities. \subsection{It works for proteins too!} Let's set the basis definition again, this time using a keyword that refers to a preset combination of basis species commonly encountered in the documentation for CHNOSZ. Then we will use \texttt{subcrt()} to calculate the thermodynamic properties of a reaction to form a protein from the basis species. \begin{wrapsweave} <>= data(thermo) basis("CHNOS+") subcrt("LYSC_CHICK",1,T=25) @ \end{wrapsweave} Note that using the keyword argument in \texttt{basis()} also set the logarithms of activities (or fugacity in the case of $\mathrm{O_{2}}_{\left(g\right)}$) to nominal values. While these settings do not affect the results of the \texttt{subcrt()} calculation (which normally returns only the standard molal properties of the reaction), they are essential for calculating the relative stabilities of the species of interest. If the protein is not found in CHNOSZ's own database, the amino acid composition of the protein can be retrieved from the UniProt Knowledge Base using the Swiss-Prot name (if the computer is connected to the Internet). This is the only time a function in CHNOSZ asks for confirmation from a user, in order to give fair warning that an online activity is about to be performed. \begin{wrapsweave} <>= subcrt("ALAT1_HUMAN",1,T=25) @ \end{wrapsweave} ~ \begin{lyxcode} Shall~I~try~an~online~search~for~~ALAT1\_HUMAN~\_~SWISS~?~y protein:~trying~http://www.uniprot.org/uniprot/ALAT1\_HUMAN~...~got~it! protein:~found~P24298~...~~Alanine~aminotransferase~1~~(length~496). protein:~found~ALAT1\_HUMAN~(C2429H3866N684O705S22,~496~residues) subcrt:~1~species~at~298.15~K~and~1~bar~(wet)~ subcrt:~reaction~is~not~balanced;~it~is~missing~this~composition: ~~~~~C~~~~~H~~~~N~~~~O~~~S ~-2429~-3866~-684~-705~-22 subcrt:~adding~missing~composition~from~basis~definition~and~restarting... subcrt:~6~species~at~298.15~K~and~1~bar~(wet)~ \$reaction ~~~~~coeff~~~~~~~~name~~~~~~~~~~~~~~~formula~state~ispecies 2926~~~~~1~ALAT1\_HUMAN~C2429H3866N684O705S22~~~~aq~~~~~2926 69~~~-2429~~~~~~~~~CO2~~~~~~~~~~~~~~~~~~~CO2~~~~aq~~~~~~~69 1~~~~~-885~~~~~~~water~~~~~~~~~~~~~~~~~~~H2O~~~liq~~~~~~~~1 68~~~~-684~~~~~~~~~NH3~~~~~~~~~~~~~~~~~~~NH3~~~~aq~~~~~~~68 70~~~~~-22~~~~~~~~~H2S~~~~~~~~~~~~~~~~~~~H2S~~~~aq~~~~~~~70 2691~~2519~~~~~~oxygen~~~~~~~~~~~~~~~~~~~~O2~~~gas~~~~~2691 \$out ~~~T~P~~~~~~logK~~~~~~~~~G~~~~~~~~~H~~~~~~~~S~~~~~~V~~~~~~~~Cp 1~25~1~-191972.3~261897066~273248830~38245.59~-73411~-107650.2 \end{lyxcode} \section{Activity diagrams} \subsection{Quick example: stability diagram for proteins} Suppose that we are asked to calculate the relative stabilities of some proteins from different organisms. We will use part of a case study presented in Ref. \citep{Dic08}. \emph{Methanocaldococcus jannaschii} is a hyperthermophilic methanogen known to live at higher temperatures than \emph{Methanococcus voltae} (also a methanogen) and \emph{Haloarcula japonica }(a halophile). These archaeal organisms produce cell-surface glycoproteins (a.k.a. surface-layer proteins). After defining the basis species we can define the \textbf{species of interest}, i.e. those proteins whose relative stabilities we wish to calculate. \begin{wrapsweave} <>= species(c("CSG_METJA","CSG_METVO","CSG_HALJP")) @ \end{wrapsweave} Note the output: the matrix denotes the coefficients of each of the basis species in the formation reaction for one mole of each of the species of interest. The \textbf{formation reaction} is the chemical reaction to form one mole of a species of interest (as a product) from a combination of basis species (as reactants and/or products, depending on the stoichiometric constraints). The formation reactions generally are \emph{not} statements about the mechanisms of reactions. The species definition also includes reference values for the chemical activities of the species of interest. Now we are all set up to calculate the chemical affinities of the formation reactions. The chemical affinity is the negative of the Gibbs energy change of a reaction per unit of reaction progress; it is calculated in CHNOSZ using $\boldsymbol{A}=2.303RT\log(K/Q)$ ($R$ -- gas constant, $T$ -- temperature, $K$ -- equilibrium constant, $Q$ -- activity product). \texttt{affinity()} can accept arguments describing the range of chemical conditions we're interested in. The names of the arguments can refer to the basis species. Here, we vary the logarithm of the fugacity of oxygen. The chemical activities of the other basis species are taken to be constants equal to the values shown above. \begin{wrapsweave} <>= a <- affinity(O2=c(-80,-65)) @ \end{wrapsweave} Now we can use \texttt{diagram()} to plot the relative stabilities of the proteins in the system. We'll also specify where the legend should be placed on the plot. \begin{wrapsweave} \setkeys{Gin}{width=0.6\textwidth} <>= diagram(a,legend.x="bottomleft") @ \setkeys{Gin}{width=1.0\textwidth} \end{wrapsweave} Notably, the protein from the organism found at the highest temperatures is relatively stable at more reduced conditions. \subsection{How does this work?} Here is a partial explanation: You use \texttt{affinity()} to calculate the chemical affinities of the formation reactions of the proteins, taking into account chemical activities of the proteins that are set to reference, non-equilibrium values. Then, the \texttt{diagram()} function transforms these non-equilibrium affinities into chemical activities of the proteins at metastable equilibrium (this is actually achieved using the Boltzmann distribution). These activities satisfy the conditions that 1) the total activity of an immobile component (for proteins, this defaults to the protein backbone group) is constant and 2) the chemical affinities of the formation reactions are all equal (but generally not zero). More details can be found in another vignette ({}``\texttt{protactiv}''). \subsection{More proteins, more dimensions} Now let's add some bacterial surface-layer proteins. They are in some way functional analogs (but not \href{http://en.wikipedia.org/wiki/Homology_(biology)}{homologs}) of the archaeal cell-surface glycoproteins. \begin{wrapsweave} \setkeys{Gin}{width=0.6\textwidth} <>= species(c("SLAP_ACEKI","SLAP_BACST","SLAP_BACLI","SLAP_AERSA")) basis(c("NH3","H2S"),c(-1,-10)) a <- affinity(O2=c(-90,-70),H2O=c(-15,0)) diagram(a) @ \setkeys{Gin}{width=1.0\textwidth} \end{wrapsweave} Equilibrium predominances for proteins as a function of two chemical activities! If you don't like the colors in the plot, don't worry... the colors can be changed by using the \texttt{col} argument of \texttt{diagram()}. This example hints at the multidimensional nature of the stability problem. Note how the order of predominance fields at $\log a_{\mathrm{H_{2}O}}=0$ matches the order of proteins with highest equilibrium activities in the previous diagram. Interpreting the meaning of low activities of $\mathrm{H_{2}O}$ in these calculations remains a challenge. Why did we increase the activity of $\mathrm{NH_{3}}$ and decrease that of $\mathrm{H_{2}S}$? It was done here in order to increase the size of the equilibrium predominance fields of the bacterial proteins. This behavior is a result of the elemental makeup of the proteins: the bacterial proteins under consideration are, per residue, more nitrogen-rich and sulfur-poor than their archaeal counterparts (except for CSG\_HALJP, which has no sulfur). CHNOSZ has a function to display the compositional makeup of the proteins, per residue, from the basis species. \begin{wrapsweave} <>= residue.info() @ \end{wrapsweave} \subsection{A mineral example} This example is modeled after a figure on p. 246 of Bowers et al., 1984 \citep{BJH84} for the system HCl-H2O-CaO-CO2-MgO-(SiO2) at 300 $^{\circ}$C and 1000 bar. \begin{wrapsweave} \setkeys{Gin}{width=0.6\textwidth} <>= basis(c("HCl","H2O","Ca+2","CO2","Mg+2","SiO2","O2","H+"), c(999,0,999,999,999,999,999,-7)) species(c("quartz","talc","forsterite","tremolite","diopside", "wollastonite","monticellite","merwinite")) a <- affinity("Mg+2"=c(-12,-4),"Ca+2"=c(-8,0),T=300,P=1000) diagram(a) @ \setkeys{Gin}{width=1.0\textwidth} \end{wrapsweave} The 999's in the assignment of logarithms of activities of basis species could be any number -- these settings do not affect the outcome of the calculation. This is so because 1) $\mathrm{HCl}$, $\mathrm{CO_{2}}$ and $\mathrm{O_{2}}$ have zero stoichiometric coefficients in the species, 2) the activities of $\mathrm{Ca}^{+2}$ and $\mathrm{Mg}^{+2}$ correspond to the axes of the diagram, and their ranges are taken from the call to \texttt{affinity()}, and 3) $\mathrm{SiO_{2}}$ is the immobile (conserved) component. Also note that {}``Mg+2'' and {}``Ca+2'' are not valid names of objects in R, but we can use them as names of arguments by putting them in quotation marks in the call to \texttt{affinity()}. Here, the scales of the axes here depend on the pH setting. This calculation is therefore logically different from the formulation used in Ref. \citep{BJH84}, where the axes are $\log\left(a_{\mathrm{Mg^{+2}}}/\sigma_{\mathrm{Mg}^{+2}}a_{\mathrm{H^{+}}}^{2}\right)$ and $\log\left(a_{\mathrm{Ca^{+2}}}/\sigma_{\mathrm{Ca}^{+2}}a_{\mathrm{H^{+}}}^{2}\right)$ (where $\sigma$ is a function of the solvation of the subscripted species). However, the geometry of the stability fields in the diagram produced here is consistent with the previous work. In just a few lines it's possible to make a wide variety of activity diagrams for organic and inorganic species. Try it for your favorite system! \section{Where to go from here} You can explore the package documentation through R's help system; just type \texttt{help.start()} at the command line and select CHNOSZ in the browser window that comes up. If you want to get an idea of the types of calculations available in CHNOSZ, run the examples in the help files for individual functions. A good one to try out might be \texttt{diagram()}; you can run all of the examples there with a single command: \begin{wrapsweave} <>= example(diagram) @ \end{wrapsweave} Or you can use the following to run \emph{all} of the examples provided in the documentation for the package. You will see a lot of text fly by on the screen, as well as a variety of plots. The examples will take about 5--10 minutes to run, depending on your machine. \begin{wrapsweave} <>= examples() @ \end{wrapsweave} If you want to add to or modify the thermodynamic database, read the instructions at the top of the help page for \texttt{thermo}: \begin{wrapsweave} <>= help(thermo) @ \end{wrapsweave} Have fun! \section{Document information} Revision history: \begin{itemize} \item 2010-09-30 Initial version \item 2011-08-15 Add \texttt{browse.refs()}; modifying database hint changed to \texttt{help(thermo)} \end{itemize} R session information: \begin{wrapsweave} <>= sessionInfo() @ \end{wrapsweave} \bibliographystyle{unsrtnat} \bibliography{vig} \end{document}