revisit {CHNOSZ}    R Documentation

Diversity Calculations for Chemical Species

Description

Calculate the species richness, standard deviation, coefficient of variation, or Shannon diversity index of the activities (or logarithms of activities) of chemical species, and plot the results.

Usage

  revisit(d, target = "cv", loga.ref = NULL,
    do.plot = NULL, col = par("fg"), yline = 2, ylim = NULL, 
    ispecies = NULL, add = FALSE, cex = par("cex"), lwd = par("lwd"), 
    mar = NULL, side = 1:4, xlim = NULL, labcex = 0.6, pch = 1, 
    legend = "", legend.x = NULL, lpch = NULL, main = NULL, 
    lograt.ref = NULL, plot.ext = TRUE)
  extremes(z, target)
  where.extreme(z, target, do.sat = FALSE)

Arguments

d list, output from diagram, or list of logarithms of activities of species.
target character, what statistic to calculate.
loga.ref numeric, logarithms of activities for comparison statistics.
do.plot logical, make a plot?
col character, color to use for points or lines.
yline numeric, margin line for y-axis label.
ylim numeric, limits of y axis.
ispecies numeric, which species to consider.
add logical, add to an existing plot?
cex numeric, character expansion factor.
lwd numeric, line width.
mar numeric, plot margin specifications.
side numeric, which sides of plot to draw axes.
xlim numeric, limits of x axis.
labcex numeric, character expansion factor for species labels.
pch numeric, plotting symbol(s) to use for points.
legend character, text to use for legend.
legend.x character, placement of legend.
lpch numeric, plotting symbol(s) to use in legend.
main character, main title for plot.
lograt.ref numeric, log10 of reference abundance ratios.
plot.ext logical, show the location of the extreme value(s)?
z numeric, matrix of values.
do.sat logical, identify all instances of a repeated extreme value?

Details

The purpose of revisit is to calculate and visualize summary statistics for logarithms of activities of chemical species. For most uses, supply the output of diagram as the value for d. Alternatively, d can be a list of logarithms of activities; the list elements each correspond to a different species and can be vectors, matrices, or higher-dimensional arrays, but they must all have the same dimensions. (This is always the case for d$logact if d is the output from diagram; the dimensionality is determined by the number of variables used in the calculations of affinity.) The type of statistic to be calculated is indicated by target, as summarized in the following table.

target     description                                              extremum   additional arguments
sd         standard deviation                                       min        none
cv         coefficient of variation                                 min        none
shannon    Shannon diversity index                                  max        none
qqr        correlation coefficient on q-q plot (normal distribution) max       none
richness   species richness                                         max        loga.ref
cvrmsd     coefficient of variation of RMSD                         min        loga.ref
spearman   Spearman correlation coefficient                         max        loga.ref
pearson    Pearson correlation coefficient                          max        loga.ref

sd, cv, shannon and qqr all operate on just the sample values. richness counts the number of species whose logarithms of activities are above the value given in loga.ref. cvrmsd, spearman and pearson are comparison statistics in which loga.ref represents the observed values. ratio determines the correlation coefficient of a predicted change in loga ratios (d$logact vs. loga.ref) plotted against observed changes in loga ratios (e.g., from changes in protein expression deduced from microarray experiments; given in lograt.ref).
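
As an illustration of the sample statistics and of richness, the following base-R sketch computes them by hand for a small list of logarithms of activities. It is not the code used inside revisit: the input values, the threshold loga_min and all stat_* names are hypothetical, and it assumes that sd, cv and shannon are taken over the activities (10^loga) of the species at each condition.

```r
# Hypothetical input: log10 activities of three species at four conditions
loga <- list(
  sp1 = c(-3.0, -3.5, -4.0, -5.0),
  sp2 = c(-4.0, -3.5, -3.0, -3.0),
  sp3 = c(-5.0, -3.5, -3.2, -2.5)
)

# Arrange as a matrix: rows are conditions, columns are species
m <- sapply(loga, as.vector)
# Assumption: the statistics are computed on activities, not on logarithms
a <- 10^m

# Standard deviation and coefficient of variation across species
stat_sd <- apply(a, 1, sd)
stat_cv <- stat_sd / rowMeans(a)

# Shannon index from relative activities p = a / sum(a)
p <- a / rowSums(a)
stat_shannon <- -rowSums(p * log(p))

# Richness: number of species above a (hypothetical) threshold
loga_min <- -4.5
stat_richness <- rowSums(m > loga_min)

# Optimal conditions: minimum for cv, maximum for shannon
which.min(stat_cv)
which.max(stat_shannon)
```

In this made-up data set all three species have equal activities at the second condition, so the coefficient of variation is at its minimum and the Shannon index is at its maximum there, in line with the extremum column of the table.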

If do.plot is TRUE, d is the output from diagram, and the number of variables is 1 or 2, the results are plotted – a line diagram in 1 dimension or a contour plot in 2 dimensions.

The value of extremum in the table shows whether the extreme value that optimizes the system is the minimum (sd, cv, cvrmsd) or the maximum (all the others). On plots the location of the extreme value is indicated (by a dashed vertical line on a 1-D plot or a point marked by an asterisk on a 2-D plot). On 2-D plots the valleys (or ridges) leading to the location of the extremum are plotted. The ridges or valleys are plotted as dashed lines and are colored green for the x values returned by extremes and blue for the y values returned by extremes.

The location of the extreme value in a matrix or vector z is calculated using where.extreme. Whether the extreme is the minimum or the maximum value depends on the value of target. For matrices, if do.sat is TRUE and the extreme value is repeated, the row and column numbers for all instances are returned. Given a matrix of numeric values in z, extremes locates the maximum or minimum values in both dimensions. That is, the x values that are returned are the column numbers where the extreme is found for each row, and the y values that are returned are the row numbers where the extreme is found for each column.
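
The search described above can be approximated in base R as follows. This is only a sketch of the idea for a target whose extremum is the minimum, not the code of where.extreme or extremes; the matrix z and the names all_min, first_min, x and y are made up for illustration.

```r
# Hypothetical matrix of a statistic whose optimum is the minimum (e.g. cv)
z <- matrix(c(4, 1, 6,
              3, 5, 1,
              7, 2, 9), nrow = 3, byrow = TRUE)

# Row and column numbers of every instance of the overall minimum
# (similar in spirit to do.sat = TRUE)
all_min <- which(z == min(z), arr.ind = TRUE)

# Only the first instance (assumed to correspond to do.sat = FALSE)
first_min <- all_min[1, ]

# Column number of the minimum in each row (the "x" values)
x <- apply(z, 1, which.min)
# Row number of the minimum in each column (the "y" values)
y <- apply(z, 2, which.min)
```

For a target whose extremum is the maximum, max and which.max would take the place of min and which.min.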

If lograt.ref is provided, its values are used as the reference values for the logarithms of abundance ratios.

The function name was changed from diversity to revisit in CHNOSZ-0.9 because there is a function named diversity in the vegan package. Note that while diversity takes a matrix with species in the columns, revisit takes a list with species as the elements of the list.
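
For data that are already arranged as a matrix with species in the columns, a conversion to the list layout could look like the following base-R sketch. The matrix m and its values are hypothetical; only the resulting structure (one list element per species, all of the same length) reflects the description above.

```r
# Hypothetical matrix: rows are conditions, columns are species
m <- matrix(c(-3, -4, -5,
              -4, -3, -3), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("sp1", "sp2", "sp3")))

# One list element per species, named after the matrix columns
loga <- lapply(seq_len(ncol(m)), function(j) m[, j])
names(loga) <- colnames(m)

str(loga)
```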

Value

revisit returns a list containing at least an element named H giving the calculated values for the target statistic. This has the same dimensions as a single element of d (or d$logact, if d was the output from diagram). For calculations as a function of one or two variables, the output also contains the elements ix (location of the extremum in the first direction), x (x-value at the extremum), and extval (extreme value). For calculations as a function of two variables, the output also contains the elements iy (location of the extremum in the second direction) and y (y-value at the extremum).

Examples

  
  
    ### using grep.file, read.fasta, add.protein
    # calculations for Pelagibacter ubique
    f <- system.file("extdata/fasta/HTCC1062.faa.xz",package="CHNOSZ")
    # what proteins to select (set to "" for all proteins)
    w <- "ribosomal"
    # locate entries whose names contain w
    j <- grep.file(f,w)
    # get the amino acid compositions of these proteins
    p <- read.fasta(f,j)
    # add these proteins to CHNOSZ's inventory
    i <- add.protein(p)
    # set up the chemical system
    basis("CHNOS+")
    # calculate affinities of formation in logfO2 space
    a <- affinity(O2=c(-90,-60),iprotein=i)
    # show the equilibrium activities
    d <- diagram(a,cex=1.5,logact=0)
    # make a title
    expr <- as.expression(substitute(x~y~"proteins in"~
      italic("P. ubique"),list(x=length(j),y=w)))
    mtitle(c("Equilibrium activities of",expr),cex=1.5)
    # show the coefficient of variation
    revisit(d,"CV",cex=1.5)
    mtitle(c("CV of equilibrium activities of",expr),cex=1.5)
    # calculate affinities in logfO2-logaH2O space
    a <- affinity(O2=c(-90,-60),H2O=c(-20,0),iprotein=i)
    # calculate the equilibrium activities
    d <- diagram(a,do.plot=FALSE,mam=FALSE,logact=0)
    # show the coefficient of variation
    revisit(d,"CV",cex=1.5)
    mtitle(c("CV of equilibrium activities of",expr),cex=1.5)
  

[Package CHNOSZ version 0.9-7 Index]