The transfer function, introduced in CHNOSZ-0.8, is meant to model the reaction paths of minerals and proteins. A reaction path portrays the progressive formation and destruction of minerals or proteins as the aqueous solution chemistry changes in response to incremental dissolution of the starting material. This function can be used to investigate the sequence and abundance of minerals formed in weathering processes and possibly of proteins formed in the cell cycle or evolution.
The PDF version of this document can be found here .
Reaction paths are often visualized on chemical activity diagrams. First, a stability diagram is constructed where the axes correspond to activities of aqueous species that can be altered by the dissolution of a reactant mineral or protein. The dissolution of the reactant might initially lead to the formation of other products because the starting composition of the fluid lies outside the stability field of the reactant mineral or protein.
The function takes a stepwise approach to the determination of a reaction path. It is based on the hypothesis of partial equilibrium of the products with the aqueous solution (Helgeson et al., 1969) . The most stable product mineral or protein with respect to the current solution chemistry is identified. A small amount of the primary reactant is provisionally introduced to the system. A conservation constraint is used to calculate the amount of the product that could be formed from this addition of reactant. ( and PBB (protein backbone) are the conserved components in the mineral and protein examples below.) If the change in solution chemistry as a consequence of the transformation of reactant to product is within a limit defined by the user, the step is allowed to proceed. Otherwise, the step fails and in the next step a smaller amount of primary reactant is provisionally introduced.
What happens when the solution chemistry changes so that a different product from that originally formed becomes stable? In an open system, the previously formed product is no longer available to react, so the formation of the new product simply proceeds according to whatever conservation constraint is in place. In a closed system, the previously formed product is now a reactant (referred to here as the secondary reactant). The primary and secondary reactants now both transform into the product, and the stoichiometry of this overall transformation is governed by a coupling constraint in addition to the primary conservation constraint.
The current calculations do not take account of the speciation of the aqueous components. For example, if is among the basis species, that particular stoichiometry will appear in the dissolution reactions even if the pH is below a value where is actually more stable.
A major motivation for writing this code was to generate reaction path diagrams for proteins. Certain terminology borrowed from the geochemical world may not be entirely consistent with biochemical reality. An example is the mention of the dissolution of reactants. In cells, do proteins really dissolve to form , , and so on? I suspect that the mechanism of protein formation and destruction in cells is more likely to be bound to amino acids than these chemical components. However, the process of protein transformation (the coupled dynamic changes in cellular chemistry and the population of proteins) rather than its mechanism might be amenable to our geochemically derived model.
In preparation for the reaction path simulation, first define the basis species and the species of interest (minerals or proteins) using the basis and species functions of CHNOSZ. Use affinity and diagram to construct a predominance diagram; the axes correspond to basis species that are candidates for the secondary conservants. Use species again to define the reactant, by giving one or more reacting species a non-zero activity (e.g. logarithm of activity equal to zero) and the remaining species an activity close to zero (e.g. logarithm of activity equal to -999). Call transfer with desired arguments. The function performs some setup steps, then initiates the major loop. At each step the function prints diagnostic messages to the screen; at each successful step it adds a point to the diagram indicating the current chemical conditions. When the end is reached (by hitting the maximum number of steps defined by the user) the function returns the results, including the numbers of moles of each species formed at each step.
A flowchart representing the preparatory steps and the function call and return is shown below.
Here the function initializes counters and constants, then processes the argument list. If ibalance (primary conservant) is PBB, the function stops with an error if all species are not in fact proteins, otherwise adds PBB rows to the basis and species stoichiometric matrices. If ibalance is NULL, an outside function (which.balance) is called to find the first basis species that is present in all species; this becomes the primary conservant. Everything is ready for the major loop.
At the beginning of each loop, the current value of alpha (destruction exponent) is calculated. If the previous step worked, the value of alpha is increased by one (we try to react more of the reactant in this step) but if the previous step failed, the value of alpha is decreased by one (we react less of the reactant) in an attempt to stay within the deviation limits of solution chemistry set by the user.
Then the function calculates the current numbers of moles of each basis species, and the changes that accompany dissolution of the primary reactant and dissolution of the secondary reactant (if any). The most stable product (or products if fmode is many) is also identified. Then, for an open system (dmode == none), the transformation of the reactant to product is calculated by observing the primary conservation constraint. For a closed system (dmode == coupled), the secondary conservation constraint is also invoked. For coupling to work, the secondary conservant must be present in the transformation reactions of the primary reactant to the product and of the secondary reactant to the product.
The calculation of reaction progress for either the open or closed system case takes account of the destruction exponent (alpha) which determines the amount of primary reactant that is available at that step. If the transformation of primary and secondary reactants to products causes the number of moles of any basis species to become negative, or the change in logarithm of activity of any basis species to exceed the deviation limit (devmax) the step fails and the next loop is initiated.
The figure below shows the major computations for each step. Failed steps lead to the ``next'' exits. If successful, the ``end'' of the step is reached and the reaction has progressed! The ensuing changes in the amounts of basis species and of reactant and product species are logged, a point is made on the activity diagram, and the following step is initiated.
The following examples examine the dissolution of K-feldspar in closed and open systems. The results are similar to the spreadsheet calculations for these systems described in Steinmann et al., 1994 . Each example below consists of a brief description of the strategy involved, the source code, and output in the form of a movie and a figure.
This example shows the dissolution of K-feldspar in a closed system (previously formed minerals are allowed to react). Here is the code to produce the diagrams, followed by a description of what each line does. The example may also be run in CHNOSZ with feldspar('closed').
basis(c('Al+3','H4SiO4','K+','H2O','H+','O2'),c(0,-6,-6,0,0,0)) species(c('k-feldspar','muscovite','pyrophyllite','kaolinite','gibbsite')) a <- affinity(H4SiO4=c(-6,-2),'K+'=c(-3,8)) diagram(a) basis('pH',4) species(1:5,c(-4,rep(-999,4))) t <- transfer(550,dmode='coupled',plot=c(2,3),devmax=0.2) plot.transfer(t)
Below is the plot showing the change of solution chemistry and minerals produced with reaction progress. At 100% reaction progress, moles of K-feldspar have reacted (this is right around step 403, as the system enters the muscovite stability field).
The sequence of commands below starts after the first six shown above. The three calls to transfer represent different starting constraints. For the first we react potassium feldspar, as above but instead in an open system; the result is the top curve in the diagram. Then we ask what happens when kaolinite is reacted instead of K-feldspar; this takes us horizontally from the starting point to the gibbsite-kaolinite boundary. Finally, we use that value of as a starting point for again reacting K-feldspar, which gives us the lower curved line that starts in the kaolinite field and stops at the muscovite - K-feldspar boundary. These segmented reaction paths are discussed in more detail by in Ref. . This simulation can also be run in CHNOSZ using feldspar('open').
t <- transfer(450,dmode='none',plot=c(2,3)) species(c('k-feldspar','kaolinite'),c(-999,-4)) t <- transfer(150,dmode='none',plot=c(2,3)) species(c('k-feldspar','kaolinite'),c(-4,-999)) basis('H4SiO4',t$basis[rownames(t$basis)=='H4SiO4',ncol(t$basis)]) t <- transfer(420,dmode='none',plot=c(2,3))
The movie is here . The reaction path is also shown in the figure below.
The examples below show computational results for reaction paths among some proteins that make up the anaphase promoting complex in yeast. We ask what proteins might form if we add to the system the composition of one of the proteins (APC2) in the complex.
Notice how in the feldspar diagram above the reaction path only had to cross boundaries that were horizontal and vertical, so the secondary conservant in these cases was just either one of the basis species that are represented by the axes. In the protein diagram the reaction boundaries have slopes between zero and infinity so the secondary conservation constraint is a linear combination of the basis species shown on the axes.
This reaction path calculation can also be performed in CHNOSZ using apc('closed').
basis(c("CO2","H2O","NH3","H2","H2S"),c(-10,0,-4,-10,-7)) basis("H2","aq") species(c("APC1","APC2","APC5","CDC16","CDC23","APC10","SWM1"),"YEAST") a <- affinity(CO2=c(-10,0),H2=c(-10,0)) diagram(a,residue=TRUE,as.residue=TRUE) species(1:nrow(species()),-999) species("APC2_YEAST",0) t <- transfer(510,ibalance="PBB",plot=c(1,4),dmode="coupled",devmax=0.15) plot.transfer(t,ylim=c(-22,-6),logprogress=TRUE)
The solution chemistry never gets to the APC2 metastability field even though we are adding its composition to the system. In the figure below it can be discerned that this is a consequence of a significant drop in the availability of sulfur (as ). Because the transformation of APC2 (moles sulfur per residue = 0.035) to APC5 (moles sulfur per residue = 0.044) is a sulfur-consuming reaction, it can not proceed significantly once the concentration of the sulfur-containing basis species, in this case hydrogen sulfide, drops too low. The situation would be different if we started with a greater amount of or used the ``buffer'' argument to transfer to bleed sulfur into the system during the computer experiment (try out apc('buffer') in CHNOSZ).
The values of percent reaction progress are shown in logarithmic units here. The entire percent reaction progress scale would increase by a factor of 10 for each unit decrease in the logarithm of activity of the starting protein (note that we started with log activity of zero for the reactant APC2). Here you can see the places where momentarily drops as a reaction boundary is crossed.
The reaction boundaries above all have negative slopes because we used to stand for the carbon-containing basis species. It is so highly oxidized (nominal oxidation state of carbon equal to +4) that any increase in in a system that maintains its metastable population of biomacromolecules must be offset by a decrease in . (To understand why, consider for example the idealized reaction 2H2 + CO2 = C + 2H2O where C stands for carbon in organic molecules like proteins). Let us try substituting a neutral (in terms of oxidation state) species for . One candidate is acetic acid: one can see by its formula that the nominal oxidation state of carbon in this species is zero.
The relevant commands are printed below. We also throw another twist in the equation by changing the formation mode with the fmode=many argument. This tells the function to form many proteins in amounts, consistent with equilibrium among the products, that are calculated using the diagram function in CHNOSZ (Dick, 2008 ). This simulation can also be performed in CHNOSZ using apc('many',basis='acetic').
basis(c("acetic acid","H2O","NH3","H2","H2S"),c(-5.5,0,-4,-10,-7)) basis("H2","aq") species(c("APC1","APC2","APC5","CDC16","APC10","SWM1"),"YEAST") a <- affinity(C2H4O2=c(-10,-2),H2=c(-10,-4)) diagram(a,residue=TRUE,as.residue=TRUE) species(1:nrow(species()),-999) species("APC2_YEAST",0) t <- transfer(250,ibalance="PBB",plot=c(1,4),dmode="none", devmax=0.15,fmode="many")
The movie is here . The reaction path diagram is shown below.
The logarithms of activities of basis species and the amounts of proteins formed are shown in the figure below.
Do the relative abundances of proteins predicted using a metastable equilibrium reaction path model have any relationship to the relative abundances of proteins in cells? The table below lists the abundances of the proteins (in relative units) found by Ghaemmaghami et al., 2003  (the YeastGFP protein expression study). The two most abundant proteins in the table show up as the two most abundant proteins in the figure above at log percent reaction progress between ca. -9 and -8.5. So, the data give some sign of an energy minimizing process, but equilibrium is far from being attained: Our reactant protein always shows up with a lower relative abundance in the model that what is observed in the cell.
The mineral reaction paths shown above are obviously meaningful to geochemistry students and researchers; such things have been published and commented on many times in the literature. But do the protein reaction paths mean anything? Perhaps the anaphase-promoting complex is not the greatest example since it is hard to imagine why the cell would form each of the proteins in the complex under differing conditions when they are all needed together for the complex to assemble. But maybe the proteins are formed at different times and kept around for the final assembly of the complex; that situation might permit more control by the cell and allow it to respond to unexpected environmental changes.
Another reservation I have about this model is that I have chosen to titrate in a particular protein, because (by analogy with K-feldspar) it shows up on the high-activity corner of a predominance diagram. Perhaps there is a particular stoichiometry, that of a proto-protein, that never actually forms but is the source of many different proteins in the cell by virtue of modulation of redox state, carbon, nitrogen and sulfur availability, and pH, hydration state and ionic strength, in the vicinity of protein synthesis pathways.
A good place to go with this might be to consider proteins that form in a well-understood sequence in the cell, including cell-division control proteins other than the anaphase-promoting complex. The relationship between subcellular chemistry and the sequential formation of organelles (Dick, 2009)  is another potential target.
There are many software tools that can be used to simulate reaction paths. Just a few are listed below. There is no reason (in theory) that any of these could not also be used to study protein reaction paths.
Updated 2009-04-22 by Jeffrey M. Dick