| util.seq {CHNOSZ} | R Documentation |
Count amino acids in protein sequences, return one- or three-letter abbreviations of amino acids; count nucleotides in nucleic acid sequences, calculate DNA and RNA complements of nucleic acid sequences.
aminoacids(seq, nchar=1)
nucleicacids(seq, type = "DNA", comp = NULL, comp2 = NULL)
seq |
character, amino acid sequence of a protein (aminoacids) or base sequence of a nucleic acid (nucleicacids). |
nchar |
numeric, 1 to return one-letter abbreviations, 3 to return three-letter abbreviations, or NA to return the names of the amino acids (aminoacids). |
type |
character, type of nucleic acid sequence (DNA or RNA) (nucleicacids). |
comp |
character, type of complement sequence (DNA or RNA) (nucleicacids). |
comp2 |
character, type of second complement sequence (DNA or RNA) (nucleicacids). |
aminoacids takes a character argument containing a protein sequence and counts the occurrences of each type of amino acid. The output is a data frame with 20 columns, one per amino acid, ordered in the same way as thermo$protein. If the first argument is NULL, the function instead returns, for the twenty amino acids in the order used in thermo$protein, their one-letter abbreviations (if nchar is 1), their three-letter abbreviations (if nchar is 3), or their names (if nchar is NA).
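To make the counting step concrete, here is a minimal base-R sketch of the same idea. It is not the CHNOSZ implementation: count_aa is a hypothetical helper, the alphabetical column order is an assumption made for this sketch (thermo$protein may order the columns differently), and characters that are not one of the 20 standard one-letter codes are simply ignored here.

```r
# Hypothetical helper, not part of CHNOSZ: count the 20 standard amino acids
# in a one-letter protein sequence using base R only.
count_aa <- function(seq) {
  # One-letter codes in alphabetical order (an assumption for this sketch;
  # thermo$protein may use a different column order).
  aa <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
          "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")
  # Split the sequence into upper-case single characters.
  residues <- strsplit(toupper(seq), "")[[1]]
  # Count each code; codes absent from the sequence get a count of zero.
  counts <- vapply(aa, function(a) sum(residues == a), integer(1))
  # Return a one-row data frame with one column per amino acid.
  as.data.frame(as.list(counts))
}

res <- count_aa("GGSGG")
res$G  # 4
res$S  # 1
```

Counting against a fixed vector of codes, rather than tabulating whatever letters appear, is what keeps the result at 20 columns regardless of the input.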
nucleicacids takes a DNA or RNA sequence and counts the number of bases of each type. Whether the sequence is DNA or RNA is specified by type. Setting comp to DNA or RNA tells the function to compute the base composition of that type of complement of the sequence. If comp2 is specified, a second complement is taken from the first. The two rounds of complementing can be combined in a single function call, e.g. to go from a sequence on the DNA minus strand (given in seq) to the plus strand (with comp="DNA") and then from the DNA plus strand to RNA (with comp2="RNA"). The value returned by the function is a data frame of base composition, which can be passed back to the function to obtain the overall chemical formula for the bases.
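The complement step can be illustrated with a short base-R sketch. It is not the CHNOSZ implementation: complement_counts is a hypothetical helper that accepts only a DNA input, pairs bases by the usual Watson-Crick rules (A with T, or with U for an RNA complement, and C with G), and ignores any character that is not A, C, G or T.

```r
# Hypothetical helper, not part of CHNOSZ: base composition of the DNA or RNA
# complement of a DNA sequence, using base R only.
complement_counts <- function(seq, to = c("DNA", "RNA")) {
  to <- match.arg(to)
  # Watson-Crick partner of each DNA base; A pairs with U in an RNA complement.
  pair <- c(A = if (to == "RNA") "U" else "T", C = "G", G = "C", T = "A")
  # Split the sequence into upper-case single characters.
  bases <- strsplit(toupper(seq), "")[[1]]
  # Look up each partner; unrecognized characters become NA.
  comp <- unname(pair[bases])
  # Bases to report for the requested type of complement.
  lev <- if (to == "RNA") c("A", "C", "G", "U") else c("A", "C", "G", "T")
  # Count each base in the complement, skipping NA entries.
  counts <- vapply(lev, function(b) sum(comp == b, na.rm = TRUE), integer(1))
  # Return a one-row data frame with one column per base.
  as.data.frame(as.list(counts))
}

complement_counts("ACCGGGTTT")              # A = 3, C = 3, G = 2, T = 1
complement_counts("ACCGGGTTT", to = "RNA")  # A = 3, C = 3, G = 2, U = 1
```

Because only the composition is reported, the sketch does not reverse the complemented strand; the base counts are the same in either reading direction.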
An object of type character or a data frame.
## count amino acids in a sequence
aminoacids("GGSGG")
aminoacids("WhatAmIMadeOf?")
## count nucleobases in a sequence
nucleicacids("ACCGGGTTT")
# the DNA complement of that sequence
nucleicacids("ACCGGGTTT",comp="DNA")
# the RNA complement of the DNA complement
n <- nucleicacids("ACCGGGTTT",comp="DNA",comp2="RNA")
# the formula of the RNA complement
nucleicacids(n,type="RNA")