bpp-seq3  3.0.0
bpp::CodonSiteTools Class Reference

Utility functions for codon sites.

#include <Bpp/Seq/CodonSiteTools.h>

Public Member Functions

 CodonSiteTools ()
 
virtual ~CodonSiteTools ()
 

Static Public Member Functions

static bool hasGapOrStop (const Site &site, const GeneticCode &gCode)
 Method to know if a codon site contains gap(s) or stop codons. More...
 
static bool hasStop (const Site &site, const GeneticCode &gCode)
 Method to know if a codon site contains stop codon or not. More...
 
static bool isMonoSitePolymorphic (const Site &site)
 Method to know if a polymorphic codon site is polymorphic at only one site. More...
 
static bool isSynonymousPolymorphic (const Site &site, const GeneticCode &gCode)
 Method to know if polymorphism at a codon site is synonymous. More...
 
static std::unique_ptr< Site > generateCodonSiteWithoutRareVariant (const Site &site, const GeneticCode &gCode, double freqmin)
 generate a codon site without rare variants More...
 
static size_t numberOfDifferences (int i, int j, const CodonAlphabet &ca)
 Compute the number of differences between two codons. More...
 
static double numberOfSynonymousDifferences (int i, int j, const GeneticCode &gCode, bool minchange=false)
 Compute the number of synonymous differences between two codons. More...
 
static double piSynonymous (const Site &site, const GeneticCode &gCode, bool minchange=false)
 Compute the synonymous pi per codon site. More...
 
static double piNonSynonymous (const Site &site, const GeneticCode &gCode, bool minchange=false)
 Compute the non-synonymous pi per codon site. More...
 
static double numberOfSynonymousPositions (int i, const GeneticCode &gCode, double ratio=1.0)
 Return the number of synonymous positions of a codon. More...
 
static double meanNumberOfSynonymousPositions (const Site &site, const GeneticCode &gCode, double ratio=1)
 Return the mean number of synonymous positions per codon site. More...
 
static size_t numberOfSubstitutions (const Site &site, const GeneticCode &gCode, double freqmin=0.)
 Return the number of substitutions per codon site. More...
 
static size_t numberOfNonSynonymousSubstitutions (const Site &site, const GeneticCode &gCode, double freqmin=0.)
 Return the number of Non Synonymous substitutions per codon site. More...
 
static std::vector< size_t > fixedDifferences (const Site &siteIn, const Site &siteOut, int i, int j, const GeneticCode &gCode)
 Return a vector with the number of fixed synonymous and non-synonymous differences per codon site. More...
 
static bool isFourFoldDegenerated (const Site &site, const GeneticCode &gCode)
 
static bool hasGap (const IntSymbolListInterface &site)
 
static bool hasGap (const ProbabilisticSymbolListInterface &site)
 
static bool hasGap (const CruxSymbolListInterface &site)
 
static bool hasUnresolved (const IntSymbolListInterface &site)
 
static bool isGapOnly (const IntSymbolListInterface &site)
 
static bool isGapOnly (const ProbabilisticSymbolListInterface &site)
 
static bool isGapOnly (const CruxSymbolListInterface &site)
 
static size_t numberOfGaps (const IntSymbolListInterface &site)
 
static size_t numberOfGaps (const ProbabilisticSymbolListInterface &site)
 
static size_t numberOfGaps (const CruxSymbolListInterface &site)
 
static bool isGapOrUnresolvedOnly (const IntSymbolListInterface &site)
 
static bool isGapOrUnresolvedOnly (const ProbabilisticSymbolListInterface &site)
 
static bool isGapOrUnresolvedOnly (const CruxSymbolListInterface &site)
 
static size_t numberOfUnresolved (const IntSymbolListInterface &site)
 
static size_t numberOfUnresolved (const ProbabilisticSymbolListInterface &site)
 
static size_t numberOfUnresolved (const CruxSymbolListInterface &site)
 
static bool hasUnknown (const IntSymbolListInterface &site)
 
static bool hasUnknown (const ProbabilisticSymbolListInterface &site)
 
static bool hasUnknown (const CruxSymbolListInterface &site)
 
static bool isComplete (const IntSymbolListInterface &site)
 
static bool isComplete (const ProbabilisticSymbolListInterface &site)
 
static bool isComplete (const CruxSymbolListInterface &site)
 
static bool isConstant (const IntSymbolListInterface &site, bool ignoreUnknown=false, bool unresolvedRaisesException=true)
 Tell if a site is constant, that is displaying the same state in all sequences that do not present a gap. More...
 
static bool isConstant (const ProbabilisticSymbolListInterface &site, bool unresolvedRaisesException=true)
 
static bool isConstant (const CruxSymbolListInterface &site, bool ignoreUnknown=false, bool unresolvedRaisesException=true)
 
static bool areSymbolListsIdentical (const IntSymbolListInterface &list1, const IntSymbolListInterface &list2)
 
static bool areSymbolListsIdentical (const ProbabilisticSymbolListInterface &list1, const ProbabilisticSymbolListInterface &list2)
 
static bool areSymbolListsIdentical (const CruxSymbolListInterface &l1, const CruxSymbolListInterface &l2)
 
template<class count_type >
static void getCounts (const IntSymbolListInterface &list, std::map< int, count_type > &counts)
 Count all states in the list. More...
 
static void getCounts (const ProbabilisticSymbolListInterface &list, std::map< int, double_t > &counts)
 Sum all states in the list. More...
 
static void getCounts (const CruxSymbolListInterface &list, std::map< int, double > &counts, bool resolveUnknowns=false)
 Count all states in the list, optionally resolving unknown characters. More...
 
template<class count_type >
static void getCounts (const IntSymbolListInterface &list1, const IntSymbolListInterface &list2, std::map< int, std::map< int, count_type >> &counts)
 Count all pair of states for two lists of the same size. More...
 
static void getCounts (const ProbabilisticSymbolListInterface &list1, const ProbabilisticSymbolListInterface &list2, std::map< int, std::map< int, double >> &counts)
 Sum along the lists the joined probabilities for all pair of states for two lists of the same size. More...
 
static void getCounts (const CruxSymbolListInterface &list1, const CruxSymbolListInterface &list2, std::map< int, std::map< int, double >> &counts, bool resolveUnknowns)
 Count all pairs of states for two lists of the same size, optionally resolving unknown characters. More...
 
static void getCountsResolveUnknowns (const IntSymbolListInterface &list, std::map< int, double > &counts)
 Count all states in the list normalizing unknown characters. More...
 
static void getCountsResolveUnknowns (const ProbabilisticSymbolListInterface &list, std::map< int, double > &counts)
 Count all states in the list normalizing unknown characters. More...
 
static void getCountsResolveUnknowns (const IntSymbolListInterface &list1, const IntSymbolListInterface &list2, std::map< int, std::map< int, double >> &counts)
 Count all pairs of states for two lists of the same size resolving unknown characters. More...
 
static void getCountsResolveUnknowns (const ProbabilisticSymbolListInterface &list1, const ProbabilisticSymbolListInterface &list2, std::map< int, std::map< int, double >> &counts)
 Count all pairs of states for two lists of the same size resolving unknown characters. More...
 
static void getFrequencies (const CruxSymbolListInterface &list, std::map< int, double > &frequencies, bool resolveUnknowns=false)
 Get all states frequencies in the list. More...
 
static void getFrequencies (const CruxSymbolListInterface &list1, const CruxSymbolListInterface &list2, std::map< int, std::map< int, double >> &frequencies, bool resolveUnknowns=false)
 Get all state pairs frequencies for two lists of the same size. More...
 
static double getGCContent (const IntSymbolListInterface &list, bool ignoreUnresolved=true, bool ignoreGap=true)
 Get the GC content of a symbol list. More...
 
static double getGCContent (const ProbabilisticSymbolListInterface &list, bool ignoreUnresolved=true, bool ignoreGap=true)
 
static double getGCContent (const CruxSymbolListInterface &list, bool ignoreUnresolved=true, bool ignoreGap=true)
 
static size_t getNumberOfDistinctPositions (const IntSymbolListInterface &l1, const IntSymbolListInterface &l2)
 Get the number of distinct positions. More...
 
static size_t getNumberOfDistinctPositions (const ProbabilisticSymbolListInterface &l1, const ProbabilisticSymbolListInterface &l2)
 
static size_t getNumberOfDistinctPositions (const CruxSymbolListInterface &l1, const CruxSymbolListInterface &l2)
 
static size_t getNumberOfPositionsWithoutGap (const IntSymbolListInterface &l1, const IntSymbolListInterface &l2)
 Get the number of positions without gap (or without null column). More...
 
static size_t getNumberOfPositionsWithoutGap (const ProbabilisticSymbolListInterface &l1, const ProbabilisticSymbolListInterface &l2)
 
static size_t getNumberOfPositionsWithoutGap (const CruxSymbolListInterface &l1, const CruxSymbolListInterface &l2)
 
static void changeGapsToUnknownCharacters (IntSymbolListInterface &l)
 Change all gap elements to unknown characters (or columns of 1). More...
 
static void changeGapsToUnknownCharacters (ProbabilisticSymbolListInterface &l)
 
static void changeGapsToUnknownCharacters (CruxSymbolListInterface &l)
 
static void changeUnresolvedCharactersToGaps (IntSymbolListInterface &l)
 Change all unknown characters to gap elements (or columns of 0). More...
 
static void changeUnresolvedCharactersToGaps (ProbabilisticSymbolListInterface &l)
 
static void changeUnresolvedCharactersToGaps (CruxSymbolListInterface &l)
 
static double variabilityShannon (const CruxSymbolListInterface &list, bool resolveUnknowns)
 Compute the Shannon entropy index of a SymbolList. More...
 
static double variabilityFactorial (const IntSymbolListInterface &list)
 Compute the factorial diversity index of a site. More...
 
static double mutualInformation (const CruxSymbolListInterface &list1, const CruxSymbolListInterface &list2, bool resolveUnknowns)
 Compute the mutual information between two lists. More...
 
static double entropy (const CruxSymbolListInterface &list, bool resolveUnknowns)
 Compute the entropy of a site. This is an alias of method variabilityShannon. More...
 
static double jointEntropy (const CruxSymbolListInterface &list1, const CruxSymbolListInterface &list2, bool resolveUnknowns)
 Compute the joint entropy between two lists. More...
 
static double heterozygosity (const CruxSymbolListInterface &list)
 Compute the heterozygosity index of a list. More...
 
static size_t getNumberOfDistinctCharacters (const IntSymbolListInterface &list)
 Give the number of distinct characters at a list. More...
 
static size_t getMajorAlleleFrequency (const IntSymbolListInterface &list)
 return the number of occurrences of the most common allele. More...
 
static int getMajorAllele (const CruxSymbolListInterface &list)
 return the state corresponding to the most common allele. More...
 
static size_t getMinorAlleleFrequency (const IntSymbolListInterface &list)
 return the number of occurrences of the least common allele. More...
 
static int getMinorAllele (const CruxSymbolListInterface &list)
 return the state corresponding to the least common allele. More...
 
static bool hasSingleton (const IntSymbolListInterface &list)
 Tell if a list has singletons. More...
 
static bool isParsimonyInformativeSite (const IntSymbolListInterface &site)
 Tell if a site is a parsimony informative site. More...
 
static bool isTriplet (const IntSymbolListInterface &list)
 Tell if a list has more than 2 distinct characters. More...
 
static bool isDoubleton (const IntSymbolListInterface &list)
 Tell if a list has exactly 2 distinct characters. More...
 

Detailed Description

Utility functions for codon sites.

Definition at line 24 of file CodonSiteTools.h.

Constructor & Destructor Documentation

◆ CodonSiteTools()

bpp::CodonSiteTools::CodonSiteTools ( )
inline

Definition at line 28 of file CodonSiteTools.h.

◆ ~CodonSiteTools()

virtual bpp::CodonSiteTools::~CodonSiteTools ( )
inline virtual

Definition at line 29 of file CodonSiteTools.h.

Member Function Documentation

◆ areSymbolListsIdentical() [1/3]

static bool bpp::SymbolListTools::areSymbolListsIdentical ( const CruxSymbolListInterface & l1,
const CruxSymbolListInterface & l2
)
inline static inherited

◆ areSymbolListsIdentical() [2/3]

bool SymbolListTools::areSymbolListsIdentical ( const IntSymbolListInterface & list1,
const IntSymbolListInterface & list2
)
static inherited
Parameters
list1  The first site.
list2  The second site.
Returns
True if the two sites have the same content (and, of course, alphabet).

Definition at line 216 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::areSymbolListsIdentical().

◆ areSymbolListsIdentical() [3/3]

bool SymbolListTools::areSymbolListsIdentical ( const ProbabilisticSymbolListInterface & list1,
const ProbabilisticSymbolListInterface & list2
)
static inherited

◆ changeGapsToUnknownCharacters() [1/3]

static void bpp::SymbolListTools::changeGapsToUnknownCharacters ( CruxSymbolListInterface & l )
inline static inherited

◆ changeGapsToUnknownCharacters() [2/3]

void SymbolListTools::changeGapsToUnknownCharacters ( IntSymbolListInterface & l )
static inherited

Change all gap elements to unknown characters (or columns of 1).

Parameters
l  The input list of characters.

Definition at line 501 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::changeGapsToUnknownCharacters().

◆ changeGapsToUnknownCharacters() [3/3]

void SymbolListTools::changeGapsToUnknownCharacters ( ProbabilisticSymbolListInterface & l )
static inherited

Definition at line 614 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ changeUnresolvedCharactersToGaps() [1/3]

static void bpp::SymbolListTools::changeUnresolvedCharactersToGaps ( CruxSymbolListInterface & l )
inline static inherited

◆ changeUnresolvedCharactersToGaps() [2/3]

void SymbolListTools::changeUnresolvedCharactersToGaps ( IntSymbolListInterface & l )
static inherited

Change all unknown characters to gap elements (or columns of 0).

Parameters
l  The input list of characters.

Definition at line 511 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::changeUnresolvedCharactersToGaps().

◆ changeUnresolvedCharactersToGaps() [3/3]

void SymbolListTools::changeUnresolvedCharactersToGaps ( ProbabilisticSymbolListInterface & l )
static inherited

Definition at line 623 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ entropy()

static double bpp::SymbolListTools::entropy ( const CruxSymbolListInterface & list,
bool  resolveUnknowns
)
inline static inherited

Compute the entropy of a site. This is an alias of method variabilityShannon.

\[ I = - \sum_x f_x\cdot \ln(f_x) \]

where $f_x$ is the frequency of state $x$.

Author
J. Dutheil
Parameters
list  A list.
resolveUnknowns  Tell if unknown characters must be resolved.
Returns
The Shannon entropy index of this list.

Definition at line 817 of file SymbolListTools.h.

References bpp::SymbolListTools::variabilityShannon().
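
The formula above can be illustrated without the library. The following is a minimal stand-alone C++ sketch, assuming the site is stored as a plain string with one character per sequence and that no unknown characters need resolving. The name shannonEntropy is hypothetical and is not part of Bio++.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Shannon entropy I = -sum_x f_x * ln(f_x) over the states observed in a site.
// Stand-alone sketch: the site is a plain string, one character per sequence.
double shannonEntropy(const std::string& site)
{
  assert(!site.empty());  // frequencies are undefined for an empty site

  std::map<char, double> counts;
  for (char state : site)
    counts[state] += 1.0;

  double entropy = 0.0;
  const double n = static_cast<double>(site.size());
  for (const auto& entry : counts)
  {
    const double f = entry.second / n;  // frequency of this state
    entropy -= f * std::log(f);
  }
  return entropy;
}
```

A constant site gives 0, and a site split evenly between two states gives ln(2).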

◆ fixedDifferences()

vector< size_t > CodonSiteTools::fixedDifferences ( const Site & siteIn,
const Site & siteOut,
int  i,
int  j,
const GeneticCode & gCode
)
static

Return a vector with the number of fixed synonymous and non-synonymous differences per codon site.

Compute the number of synonymous and non-synonymous differences between the consensus codon of SiteIn (i) and SiteOut (j), which are fixed within each alignment. Example:

SiteIn
ATT
ATT
ATC
SiteOut
CTA
CTA
CTA

Here, the first position is a non-synonymous difference and is fixed; the third position is a synonymous difference but is not fixed (it is polymorphic in SiteIn). The returned vector is thus [0,1]. For complex codons, the path that gives the minimum number of non-synonymous changes is chosen: the argument minchange=true is passed to numberOfSynonymousDifferences, which this method uses. Otherwise, a non-integer number could be returned.

Rare variants (<= freqmin) can be excluded.

Parameters
siteIn  a Site
siteOut  a Site
i  an integer
j  an integer
gCode  a GeneticCode

Definition at line 661 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::Alphabet::equals(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::GeneticCode::getCodonAlphabet(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), bpp::AbstractTemplateSymbolList< T >::size(), and bpp::GeneticCode::sourceAlphabet().
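
The "fixed" versus "not fixed" distinction in the example can be sketched without Bio++. The stand-alone C++ sketch below assumes codons are stored as three-letter strings and that both sites are non-empty. It only reports which codon positions are fixed within both sites and differ between them; it does not classify changes as synonymous or non-synonymous, which requires a genetic code. The names isFixedAt and fixedDifferentPositions are hypothetical and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if every codon of the site carries the same nucleotide at position `pos`.
bool isFixedAt(const std::vector<std::string>& codons, std::size_t pos)
{
  assert(!codons.empty());
  for (const auto& codon : codons)
    if (codon[pos] != codons.front()[pos])
      return false;
  return true;
}

// Codon positions (0, 1 or 2) that are fixed within both sites and differ between them.
std::vector<std::size_t> fixedDifferentPositions(
    const std::vector<std::string>& siteIn,
    const std::vector<std::string>& siteOut)
{
  std::vector<std::size_t> positions;
  for (std::size_t pos = 0; pos < 3; ++pos)
  {
    if (isFixedAt(siteIn, pos) && isFixedAt(siteOut, pos)
        && siteIn.front()[pos] != siteOut.front()[pos])
      positions.push_back(pos);
  }
  return positions;
}
```

With SiteIn = {ATT, ATT, ATC} and SiteOut = {CTA, CTA, CTA}, only position 0 is reported: position 2 differs between the sites but is polymorphic in SiteIn.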

◆ generateCodonSiteWithoutRareVariant()

unique_ptr< Site > CodonSiteTools::generateCodonSiteWithoutRareVariant ( const Site & site,
const GeneticCode & gCode,
double  freqmin
)
static

Generate a codon site without rare variants.

Rare variants are replaced by the most frequent allele. This method is used to exclude rare variants in some analyses, as in the McDonald-Kreitman test (McDonald & Kreitman, 1991, Nature 351 pp 652-654). For an application, see for example Fay et al. (2001, Genetics 158 pp 1227-1234).

Parameters
site  a Site
gCode  The genetic code according to which stop codons are specified.
freqmin  a double; alleles with a frequency strictly lower than freqmin are replaced

Definition at line 129 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::SymbolListTools::getFrequencies(), bpp::AbstractTemplateSymbolList< T >::getValue(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), bpp::GeneticCode::isStop(), and bpp::AbstractTemplateSymbolList< T >::size().

Referenced by numberOfSubstitutions().
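
The replacement rule described above (alleles with a frequency strictly lower than freqmin are replaced by the most frequent allele) can be sketched as follows. This is a stand-alone illustration on codons stored as strings, not the library implementation: it ignores the genetic code and stop codons, and the tie-breaking rule for the most frequent codon is an assumption. The name withoutRareVariants is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Replace every codon whose frequency is strictly below freqmin by the most frequent codon.
std::vector<std::string> withoutRareVariants(const std::vector<std::string>& site, double freqmin)
{
  if (site.empty())
    return site;

  std::map<std::string, std::size_t> counts;
  for (const auto& codon : site)
    ++counts[codon];

  // Most frequent codon (assumption: ties go to the lexicographically smallest codon).
  std::string major = site.front();
  std::size_t best = 0;
  for (const auto& entry : counts)
  {
    if (entry.second > best)
    {
      major = entry.first;
      best = entry.second;
    }
  }

  std::vector<std::string> result;
  const double n = static_cast<double>(site.size());
  for (const auto& codon : site)
  {
    const double freq = static_cast<double>(counts[codon]) / n;
    result.push_back(freq < freqmin ? major : codon);
  }
  return result;
}
```

Because the comparison is strict, a codon whose frequency equals freqmin is kept.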

◆ getCounts() [1/6]

static void bpp::SymbolListTools::getCounts ( const CruxSymbolListInterface & list,
std::map< int, double > &  counts,
bool  resolveUnknowns = false
)
inline static inherited

Count all states in the list, optionally resolving unknown characters.

For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.

Author
J. Dutheil
Parameters
list  The list.
counts  The output map to store the counts (existing counts will be incremented).
resolveUnknowns  Tell if unknown characters must be resolved.
Returns
A map with all states and corresponding counts.

Definition at line 360 of file SymbolListTools.h.

References bpp::SymbolListTools::getCounts(), and bpp::SymbolListTools::getCountsResolveUnknowns().
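
The effect of resolveUnknowns can be illustrated with a stand-alone C++ sketch. It assumes a DNA site stored as a plain string in which 'N' is the only unknown character; the real method works on integer states of any alphabet. The name countStates is hypothetical.

```cpp
#include <map>
#include <string>

// Count nucleotide states. When resolveUnknowns is true, each 'N' adds 1/4 to A, C, G and T.
// Existing entries in `counts` are incremented, not reset.
void countStates(const std::string& list, std::map<char, double>& counts, bool resolveUnknowns = false)
{
  for (char state : list)
  {
    if (resolveUnknowns && state == 'N')
    {
      for (char base : std::string("ACGT"))
        counts[base] += 0.25;
    }
    else
    {
      counts[state] += 1.0;
    }
  }
}
```

Without resolution, 'N' is counted as a state of its own.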

◆ getCounts() [2/6]

static void bpp::SymbolListTools::getCounts ( const CruxSymbolListInterface & list1,
const CruxSymbolListInterface & list2,
std::map< int, std::map< int, double >> &  counts,
bool  resolveUnknowns
)
inline static inherited

Count all pairs of states for two lists of the same size, optionally resolving unknown characters.

For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.

NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
counts  The output map to store the counts (existing counts will be incremented).
resolveUnknowns  Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.
Returns
A map with all states and corresponding counts.

Definition at line 514 of file SymbolListTools.h.

References bpp::SymbolListTools::getCounts(), and bpp::SymbolListTools::getCountsResolveUnknowns().

◆ getCounts() [3/6]

template<class count_type >
static void bpp::SymbolListTools::getCounts ( const IntSymbolListInterface & list,
std::map< int, count_type > &  counts
)
inline static inherited

Count all states in the list.

Author
J. Dutheil
Parameters
list  The list.
counts  The output map to store the counts (existing counts will be incremented).

Definition at line 265 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::getCounts(), bpp::SequenceContainerTools::getFrequencies(), isSynonymousPolymorphic(), and numberOfNonSynonymousSubstitutions().

◆ getCounts() [4/6]

template<class count_type >
static void bpp::SymbolListTools::getCounts ( const IntSymbolListInterface & list1,
const IntSymbolListInterface & list2,
std::map< int, std::map< int, count_type >> &  counts
)
inline static inherited

Count all pairs of states for two lists of the same size.

NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
counts  The output map to store the counts (existing counts will be incremented).

Definition at line 412 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::size().
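
The nested output map can be illustrated with a stand-alone C++ sketch. It assumes both lists are plain strings of equal length, possibly over different alphabets; the name countStatePairs is hypothetical and not part of Bio++.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Count co-occurring state pairs. States of list1 index the outer map,
// states of list2 index the inner map. Existing counts are incremented.
void countStatePairs(const std::string& list1,
                     const std::string& list2,
                     std::map<char, std::map<char, std::size_t>>& counts)
{
  assert(list1.size() == list2.size());  // the two lists must have the same size
  for (std::size_t i = 0; i < list1.size(); ++i)
    ++counts[list1[i]][list2[i]];
}
```

For example, the lists "AAC" and "KKR" yield counts['A']['K'] = 2 and counts['C']['R'] = 1.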

◆ getCounts() [5/6]

static void bpp::SymbolListTools::getCounts ( const ProbabilisticSymbolListInterface & list,
std::map< int, double_t > &  counts
)
inline static inherited

Sum all states in the list.

Parameters
list  The list.
counts  The output map to store the sum for all states (existing counts will be summed).

Definition at line 282 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::size().

◆ getCounts() [6/6]

static void bpp::SymbolListTools::getCounts ( const ProbabilisticSymbolListInterface & list1,
const ProbabilisticSymbolListInterface & list2,
std::map< int, std::map< int, double >> &  counts
)
inline static inherited

Sum, along the lists, the joint probabilities for all pairs of states for two lists of the same size.

NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
counts  The output map to store the counts (existing counts will be summed).

Definition at line 437 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::size().

◆ getCountsResolveUnknowns() [1/4]

static void bpp::SymbolListTools::getCountsResolveUnknowns ( const IntSymbolListInterface & list,
std::map< int, double > &  counts
)
inline static inherited

Count all states in the list normalizing unknown characters.

For instance, (1,1,1,1) will be counted as (1/4,1/4,1/4,1/4).

Author
J. Dutheil
Parameters
list  The list.
counts  The output map to store the counts (existing counts will be incremented).
Returns
A map with all states and corresponding counts.

Definition at line 306 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::getCounts().

◆ getCountsResolveUnknowns() [2/4]

void SymbolListTools::getCountsResolveUnknowns ( const IntSymbolListInterface & list1,
const IntSymbolListInterface & list2,
std::map< int, std::map< int, double >> &  counts
)
static inherited

Count all pairs of states for two lists of the same size resolving unknown characters.

For instance, (1,1,1,1) will be counted as (1/4,1/4,1/4,1/4).

NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
counts  The output map to store the counts (existing counts will be incremented).
Returns
A map with all states and corresponding counts.

Definition at line 357 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

◆ getCountsResolveUnknowns() [3/4]

static void bpp::SymbolListTools::getCountsResolveUnknowns ( const ProbabilisticSymbolListInterface & list,
std::map< int, double > &  counts
)
inline static inherited

Count all states in the list normalizing unknown characters.

For instance, (1,1,1,1) will be counted as (1/4,1/4,1/4,1/4).

Author
J. Dutheil
Parameters
list  The list.
counts  The output map to store the counts (existing counts will be incremented).
Returns
A map with all states and corresponding counts.

Definition at line 331 of file SymbolListTools.h.

References bpp::CruxSymbolListInterface::size(), and bpp::VectorTools::sum().
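
The normalization of unknown columns, where (1,1,1,1) is counted as (1/4,1/4,1/4,1/4), can be sketched in stand-alone C++. The sketch assumes a probabilistic list stored as a vector of per-state weight columns, and it assumes that all-zero columns (gaps) contribute nothing; neither is confirmed by this page. The name countsResolveUnknowns is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Each element of `list` is one column of per-state weights.
// Every column is divided by its sum before being added to `counts`.
void countsResolveUnknowns(const std::vector<std::vector<double>>& list,
                           std::map<int, double>& counts)
{
  for (const auto& column : list)
  {
    const double sum = std::accumulate(column.begin(), column.end(), 0.0);
    if (sum <= 0.0)
      continue;  // assumption: a null column (gap) adds nothing
    for (std::size_t state = 0; state < column.size(); ++state)
      counts[static_cast<int>(state)] += column[state] / sum;
  }
}
```

A fully resolved column such as (1,0,0,0) is unchanged by the division and adds 1 to its state.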

◆ getCountsResolveUnknowns() [4/4]

void SymbolListTools::getCountsResolveUnknowns ( const ProbabilisticSymbolListInterface & list1,
const ProbabilisticSymbolListInterface & list2,
std::map< int, std::map< int, double >> &  counts
)
static inherited

Count all pairs of states for two lists of the same size resolving unknown characters.

For instance, (1,1,1,1) will be counted as (1/4,1/4,1/4,1/4).

NB: The two lists do not need to share the same alphabet! The states of the first list will be used as the first index in the output, and the ones from the second list as the second index.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
counts  The output map to store the counts (existing counts will be incremented).
Returns
A map with all states and corresponding counts.

Definition at line 522 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ getFrequencies() [1/2]

void SymbolListTools::getFrequencies ( const CruxSymbolListInterface & list,
std::map< int, double > &  frequencies,
bool  resolveUnknowns = false
)
static inherited

Get the frequencies of all states in the list.

Author
J. Dutheil
Parameters
list  The list.
resolveUnknowns  Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4.
frequencies  The output map with all states and corresponding frequencies. Existing frequencies will be erased if any.

Definition at line 380 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

Referenced by generateCodonSiteWithoutRareVariant(), bpp::SiteContainerTools::getConsensus(), meanNumberOfSynonymousPositions(), piNonSynonymous(), piSynonymous(), and bpp::SiteContainerTools::removeGapSites().
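
Unlike the getCounts family, which increments an existing map, this method erases the output map first. A stand-alone C++ sketch of that contract, assuming a site stored as a plain string and no resolution of unknown characters, could look like this. The name stateFrequencies is hypothetical.

```cpp
#include <map>
#include <string>

// Relative frequency of each state. Any previous content of `frequencies` is erased.
void stateFrequencies(const std::string& list, std::map<char, double>& frequencies)
{
  frequencies.clear();
  if (list.empty())
    return;

  std::map<char, double> counts;
  for (char state : list)
    counts[state] += 1.0;

  const double n = static_cast<double>(list.size());
  for (const auto& entry : counts)
    frequencies[entry.first] = entry.second / n;
}
```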

◆ getFrequencies() [2/2]

void SymbolListTools::getFrequencies ( const CruxSymbolListInterface & list1,
const CruxSymbolListInterface & list2,
std::map< int, std::map< int, double >> &  frequencies,
bool  resolveUnknowns = false
)
static inherited

Get all state pairs frequencies for two lists of the same size.

Author
J. Dutheil
Parameters
list1  The first list.
list2  The second list.
resolveUnknowns  Tell if unknown characters must be resolved. For instance, in DNA, N will be counted as A=1/4,T=1/4,C=1/4,G=1/4. For ProbabilisticSymbolList, (1,1,1,1) states will be counted as (1/4,1/4,1/4,1/4).
frequencies  The output map with all state pairs and corresponding frequencies. Existing frequencies will be erased if any.

Definition at line 396 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ getGCContent() [1/3]

static double bpp::SymbolListTools::getGCContent ( const CruxSymbolListInterface & list,
bool  ignoreUnresolved = true,
bool  ignoreGap = true
)
inline static inherited

Definition at line 609 of file SymbolListTools.h.

References bpp::SymbolListTools::getGCContent().

◆ getGCContent() [2/3]

double SymbolListTools::getGCContent ( const IntSymbolListInterface & list,
bool  ignoreUnresolved = true,
bool  ignoreGap = true
)
static inherited

Get the GC content of a symbol list.

Parameters
list  The list.
ignoreUnresolved  Do not count unresolved states (or columns that sum > 1). Otherwise, weight by each state probability in case of ambiguity (e.g. the R state counts for 0.5) (or columns are normalized).
ignoreGap  Do not count gaps (or null columns) in total.
Returns
The proportion of G and C states in the list.
Exceptions
AlphabetException  If the list is not made of nucleotide states.

Definition at line 415 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), bpp::TemplateCoreSymbolListInterface< T >::getValue(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::getGCContent().
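
The two flags can be illustrated with a stand-alone C++ sketch. It assumes a nucleotide string in which '-' is a gap and only a handful of IUPAC ambiguity codes are recognised, each weighted by the share of G/C among the bases it may stand for (so R counts for 0.5, as stated above). How the library treats other codes, and whether ignored states leave the denominator, are assumptions here. The name gcContent is hypothetical.

```cpp
#include <map>
#include <string>

// GC proportion of a nucleotide string (stand-alone sketch, not the Bio++ code).
double gcContent(const std::string& list, bool ignoreUnresolved = true, bool ignoreGap = true)
{
  // Share of G/C among the bases each ambiguity code may stand for.
  static const std::map<char, double> partial = {
    {'S', 1.0}, {'W', 0.0}, {'R', 0.5}, {'Y', 0.5}, {'K', 0.5}, {'M', 0.5}, {'N', 0.5}};

  double gc = 0.0;
  double total = 0.0;
  for (char state : list)
  {
    if (state == '-')
    {
      if (!ignoreGap)
        total += 1.0;  // gaps enlarge the denominator only
    }
    else if (state == 'G' || state == 'C')
    {
      gc += 1.0;
      total += 1.0;
    }
    else if (state == 'A' || state == 'T')
    {
      total += 1.0;
    }
    else if (!ignoreUnresolved)
    {
      const auto it = partial.find(state);
      if (it != partial.end())
      {
        gc += it->second;
        total += 1.0;
      }
    }
  }
  return total > 0.0 ? gc / total : 0.0;
}
```

For the list "GR", the default call skips R and returns 1, while gcContent("GR", false) weights R by 0.5 and returns 0.75.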

◆ getGCContent() [3/3]

double SymbolListTools::getGCContent ( const ProbabilisticSymbolListInterface & list,
bool  ignoreUnresolved = true,
bool  ignoreGap = true
)
static inherited

◆ getMajorAllele()

int SymbolListTools::getMajorAllele ( const CruxSymbolListInterface & list )
static inherited

Return the state corresponding to the most common allele.

Parameters
list  A list
Returns
The most frequent state.

Definition at line 817 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ getMajorAlleleFrequency()

size_t SymbolListTools::getMajorAlleleFrequency ( const IntSymbolListInterface & list )
static inherited

Return the number of occurrences of the most common allele.

Parameters
list  A list
Returns
The frequency (number of sequences) displaying the most frequent state.

Definition at line 795 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ getMinorAllele()

int SymbolListTools::getMinorAllele ( const CruxSymbolListInterface & list )
static inherited

Return the state corresponding to the least common allele.

Parameters
list  A list
Returns
The least frequent state.

Definition at line 866 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ getMinorAlleleFrequency()

size_t SymbolListTools::getMinorAlleleFrequency ( const IntSymbolListInterface & list )
static inherited

Return the number of occurrences of the least common allele.

Parameters
list  A list
Returns
The frequency (number of sequences) displaying the least frequent state.

Definition at line 844 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().
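
Note that the "frequency" returned by getMajorAlleleFrequency and getMinorAlleleFrequency is a number of sequences, not a proportion. A stand-alone C++ sketch of both quantities, assuming a site stored as a plain string and counting every character as a state, is given below. The names stateCounts, majorAlleleCount and minorAlleleCount are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>

// Occurrence count of each state in the list.
std::map<char, std::size_t> stateCounts(const std::string& list)
{
  std::map<char, std::size_t> counts;
  for (char state : list)
    ++counts[state];
  return counts;
}

// Number of sequences displaying the most frequent state (0 for an empty list).
std::size_t majorAlleleCount(const std::string& list)
{
  std::size_t best = 0;
  for (const auto& entry : stateCounts(list))
    best = std::max(best, entry.second);
  return best;
}

// Number of sequences displaying the least frequent state (0 for an empty list).
std::size_t minorAlleleCount(const std::string& list)
{
  std::size_t least = list.size();
  for (const auto& entry : stateCounts(list))
    least = std::min(least, entry.second);
  return least;
}
```

For a constant site both functions return the number of sequences.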

◆ getNumberOfDistinctCharacters()

size_t SymbolListTools::getNumberOfDistinctCharacters ( const IntSymbolListInterface & list )
static inherited

Give the number of distinct characters in a list.

Parameters
list  A list
Returns
The number of distinct characters in the given list.

Definition at line 774 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

Referenced by numberOfSubstitutions().

◆ getNumberOfDistinctPositions() [1/3]

static size_t bpp::SymbolListTools::getNumberOfDistinctPositions ( const CruxSymbolListInterface & l1,
const CruxSymbolListInterface & l2
)
inline static inherited

◆ getNumberOfDistinctPositions() [2/3]

size_t SymbolListTools::getNumberOfDistinctPositions ( const IntSymbolListInterface & l1,
const IntSymbolListInterface & l2
)
static inherited

Get the number of distinct positions.

The comparison is performed from position 0 up to the smaller of the two list sizes.

Parameters
l1  SymbolList 1.
l2  SymbolList 2.
Returns
The number of distinct positions.
Exceptions
AlphabetMismatchException  If the two lists do not have the same alphabet type.

Definition at line 469 of file SymbolListTools.cpp.

References count(), bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::getNumberOfDistinctPositions().
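
The truncation to the shorter list can be sketched in stand-alone C++. The sketch assumes two plain strings over the same alphabet and omits the alphabet check that raises AlphabetMismatchException in the library. The name numberOfDistinctPositions is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Number of positions at which the two lists differ, compared over their common length only.
std::size_t numberOfDistinctPositions(const std::string& l1, const std::string& l2)
{
  const std::size_t n = std::min(l1.size(), l2.size());
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i)
  {
    if (l1[i] != l2[i])
      ++count;
  }
  return count;
}
```

Positions beyond the end of the shorter list are never compared, so they do not count as differences.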

◆ getNumberOfDistinctPositions() [3/3]

size_t SymbolListTools::getNumberOfDistinctPositions ( const ProbabilisticSymbolListInterface & l1,
const ProbabilisticSymbolListInterface & l2
)
static inherited

◆ getNumberOfPositionsWithoutGap() [1/3]

static size_t bpp::SymbolListTools::getNumberOfPositionsWithoutGap ( const CruxSymbolListInterface & l1,
const CruxSymbolListInterface & l2
)
inline static inherited

◆ getNumberOfPositionsWithoutGap() [2/3]

size_t SymbolListTools::getNumberOfPositionsWithoutGap ( const IntSymbolListInterface & l1,
const IntSymbolListInterface & l2
)
static inherited

Get the number of positions without gap (or without null column).

The comparison in achieved from position 0 to the minimum size of the two vectors.

Parameters
l1 SymbolList 1.
l2 SymbolList 2.
Returns
The number of positions without gap (or columns with at least one non-zero value).
Exceptions
AlphabetMismatchException if the two lists do not have the same alphabet type.

Definition at line 485 of file SymbolListTools.cpp.

References count(), bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::getNumberOfPositionsWithoutGap().

◆ getNumberOfPositionsWithoutGap() [3/3]

size_t SymbolListTools::getNumberOfPositionsWithoutGap ( const ProbabilisticSymbolListInterface l1,
const ProbabilisticSymbolListInterface l2 
)
staticinherited

◆ hasGap() [1/3]

static bool bpp::SymbolListTools::hasGap ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 36 of file SymbolListTools.h.

References bpp::SymbolListTools::hasGap().

◆ hasGap() [2/3]

◆ hasGap() [3/3]

bool SymbolListTools::hasGap ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 33 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ hasGapOrStop()

bool CodonSiteTools::hasGapOrStop ( const Site site,
const GeneticCode gCode 
)
static

Tell whether a codon site contains gaps or stop codons.

Parameters
site a Site
gCode The genetic code according to which stop codons are specified.

Definition at line 26 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::AlphabetTools::isCodonAlphabet(), and bpp::AbstractTemplateSymbolList< T >::size().

◆ hasSingleton()

bool SymbolListTools::hasSingleton ( const IntSymbolListInterface list)
staticinherited

Tell if a list has singletons.

Parameters
list A list.
Returns
True if the list has singletons.

Definition at line 892 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ hasStop()

bool CodonSiteTools::hasStop ( const Site site,
const GeneticCode gCode 
)
static

Tell whether a codon site contains a stop codon.

Parameters
site a Site
gCode The genetic code according to which stop codons are specified.

Definition at line 41 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::AlphabetTools::isCodonAlphabet(), bpp::GeneticCode::isStop(), and bpp::AbstractTemplateSymbolList< T >::size().

Referenced by bpp::SiteContainerTools::getSitesWithoutStopCodon(), and bpp::SiteContainerTools::removeSitesWithStopCodon().

◆ hasUnknown() [1/3]

static bool bpp::SymbolListTools::hasUnknown ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 152 of file SymbolListTools.h.

References bpp::SymbolListTools::hasUnknown().

◆ hasUnknown() [2/3]

bool SymbolListTools::hasUnknown ( const IntSymbolListInterface site)
staticinherited
Parameters
site A site.
Returns
True if the site contains one or several unknown characters.

Definition at line 109 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::hasUnknown().

◆ hasUnknown() [3/3]

bool SymbolListTools::hasUnknown ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 120 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ hasUnresolved()

bool SymbolListTools::hasUnresolved ( const IntSymbolListInterface site)
staticinherited
Parameters
site A site.
Returns
True if the site contains one or several unresolved states.

Definition at line 46 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SiteContainerTools::changeUnresolvedCharactersToGaps().

◆ heterozygosity()

double SymbolListTools::heterozygosity ( const CruxSymbolListInterface list)
staticinherited

Compute the heterozygosity index of a list.

\[ H = 1 - \sum_x f_x^2 \]

where $f_x$ is the frequency of state $x$.
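
The formula above can be illustrated with a minimal standalone sketch. It does not use the Bio++ types: a plain `std::vector<int>` stands in for the symbol list, `heterozygositySketch` is a hypothetical name, and every value (including any gap or unknown code) is treated as an ordinary state, which may differ from what the library does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a symbol list: one integer state per sequence.
double heterozygositySketch(const std::vector<int>& states)
{
  assert(!states.empty());  // frequencies are undefined for an empty list

  // Count how many times each state occurs.
  std::map<int, std::size_t> counts;
  for (int s : states) ++counts[s];

  // H = 1 - sum_x f_x^2, with f_x = count_x / n.
  const double n = static_cast<double>(states.size());
  double sumSquares = 0.0;
  for (const auto& kv : counts)
  {
    const double f = static_cast<double>(kv.second) / n;
    sumSquares += f * f;
  }
  return 1.0 - sumSquares;
}
```

For a list with two states at equal frequency, such as `{0, 0, 1, 1}`, the sketch gives 1 - (0.25 + 0.25) = 0.5; a constant list gives 0.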

Parameters
list A list.
Returns
The heterozygosity index of this list.

Definition at line 760 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isComplete() [1/3]

static bool bpp::SymbolListTools::isComplete ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 174 of file SymbolListTools.h.

References bpp::SymbolListTools::isComplete().

◆ isComplete() [2/3]

◆ isComplete() [3/3]

bool SymbolListTools::isComplete ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 145 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isConstant() [1/3]

static bool bpp::SymbolListTools::isConstant ( const CruxSymbolListInterface site,
bool  ignoreUnknown = false,
bool  unresolvedRaisesException = true 
)
inlinestaticinherited

Definition at line 208 of file SymbolListTools.h.

References bpp::SymbolListTools::isConstant().

◆ isConstant() [2/3]

bool SymbolListTools::isConstant ( const IntSymbolListInterface site,
bool  ignoreUnknown = false,
bool  unresolvedRaisesException = true 
)
staticinherited

Tell if a site is constant, that is, whether it displays the same state in all sequences that do not present a gap.

Parameters
site A site.
ignoreUnknown If true, positions with unknown characters will be ignored. Otherwise, a site with a single state plus any uncertain state will not be considered constant.
unresolvedRaisesException In ambiguous cases (a gap-only site, for instance), throw an exception if true; otherwise return false.
Returns
True if the site is made of only one state.

Definition at line 258 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by fixedDifferences(), generateCodonSiteWithoutRareVariant(), bpp::SymbolListTools::isConstant(), isFourFoldDegenerated(), isMonoSitePolymorphic(), isSynonymousPolymorphic(), numberOfNonSynonymousSubstitutions(), numberOfSubstitutions(), piNonSynonymous(), and piSynonymous().

◆ isConstant() [3/3]

bool SymbolListTools::isConstant ( const ProbabilisticSymbolListInterface site,
bool  unresolvedRaisesException = true 
)
staticinherited

Definition at line 320 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isDoubleton()

bool SymbolListTools::isDoubleton ( const IntSymbolListInterface list)
staticinherited

Tell if a list has exactly 2 distinct characters.

Parameters
list A list.
Returns
True if the list has exactly 2 distinct characters.

Definition at line 946 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isFourFoldDegenerated()

bool CodonSiteTools::isFourFoldDegenerated ( const Site site,
const GeneticCode gCode 
)
static
Returns
True if all sequences have a fourfold degenerate codon at the site (that is, if a mutation in the third position does not change the amino acid).
Author
Benoit Nabholz, Annabelle Haudry
Parameters
site The site to analyze.
gCode The genetic code to use.


Definition at line 792 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::getValue(), bpp::SymbolListTools::isConstant(), bpp::GeneticCode::isFourFoldDegenerated(), isSynonymousPolymorphic(), and bpp::AbstractTemplateSymbolList< T >::size().

◆ isGapOnly() [1/3]

static bool bpp::SymbolListTools::isGapOnly ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 64 of file SymbolListTools.h.

References bpp::SymbolListTools::isGapOnly().

◆ isGapOnly() [2/3]

bool SymbolListTools::isGapOnly ( const IntSymbolListInterface site)
staticinherited

◆ isGapOnly() [3/3]

bool SymbolListTools::isGapOnly ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 71 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isGapOrUnresolvedOnly() [1/3]

static bool bpp::SymbolListTools::isGapOrUnresolvedOnly ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 108 of file SymbolListTools.h.

References bpp::SymbolListTools::isGapOrUnresolvedOnly().

◆ isGapOrUnresolvedOnly() [2/3]

bool SymbolListTools::isGapOrUnresolvedOnly ( const IntSymbolListInterface site)
staticinherited
Parameters
site A site.
Returns
True if the site contains only gaps or unresolved characters.

Definition at line 84 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::isGapOrUnresolvedOnly(), and bpp::SiteContainerTools::removeGapOrUnresolvedOnlySites().

◆ isGapOrUnresolvedOnly() [3/3]

bool SymbolListTools::isGapOrUnresolvedOnly ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 95 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isMonoSitePolymorphic()

bool CodonSiteTools::isMonoSitePolymorphic ( const Site site)
static

◆ isParsimonyInformativeSite()

bool SymbolListTools::isParsimonyInformativeSite ( const IntSymbolListInterface site)
staticinherited

Tell if a site is a parsimony informative site.

At least two distinct characters must be present.

Parameters
site A site.
Returns
True if the site is parsimony informative.

Definition at line 912 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ isSynonymousPolymorphic()

◆ isTriplet()

bool SymbolListTools::isTriplet ( const IntSymbolListInterface list)
staticinherited

Tell if a list has more than 2 distinct characters.

Parameters
list A list.
Returns
True if the list has more than 2 distinct characters.

Definition at line 935 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ jointEntropy()

double SymbolListTools::jointEntropy ( const CruxSymbolListInterface list1,
const CruxSymbolListInterface list2,
bool  resolveUnknowns 
)
staticinherited

Compute the joint entropy between two lists.

\[ H_{i,j} = - \sum_x \sum_y p_{x,y}\ln\left(p_{x,y}\right) \]

where $p_{x,y}$ is the frequency of the pair $(x,y)$.
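
As a rough illustration of this formula, the following standalone sketch computes the joint entropy of two equally long lists of integer states. The function name `jointEntropySketch` and the use of `std::vector<int>` are assumptions made for the example; the resolveUnknowns option is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: position k of list1 is paired with position k of list2.
double jointEntropySketch(const std::vector<int>& list1,
                          const std::vector<int>& list2)
{
  assert(!list1.empty() && list1.size() == list2.size());

  // Count each observed pair (x, y).
  std::map<std::pair<int, int>, std::size_t> pairCounts;
  for (std::size_t k = 0; k < list1.size(); ++k)
    ++pairCounts[std::make_pair(list1[k], list2[k])];

  // H = - sum_{x,y} p_{x,y} ln(p_{x,y}); unobserved pairs contribute nothing.
  const double n = static_cast<double>(list1.size());
  double h = 0.0;
  for (const auto& kv : pairCounts)
  {
    const double p = static_cast<double>(kv.second) / n;
    h -= p * std::log(p);
  }
  return h;
}
```

With `{0, 0, 1, 1}` and `{0, 1, 0, 1}`, the four possible pairs each have frequency 0.25, so the result is ln 4.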

Author
J. Dutheil
Parameters
list1 First list
list2 Second list
resolveUnknowns Tell whether unknown characters must be resolved.
Returns
The joint entropy for the pair of lists.

Definition at line 706 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

◆ meanNumberOfSynonymousPositions()

double CodonSiteTools::meanNumberOfSynonymousPositions ( const Site site,
const GeneticCode gCode,
double  ratio = 1 
)
static

Return the mean number of synonymous positions per codon site.

A site is considered x% synonymous if x% of the possible mutations are synonymous. The transition/transversion ratio can be taken into account through the ratio parameter. The mean is computed over the sequences of the site.

Unresolved and stop codons are counted as 0.

Parameters
site a Site
gCode a GeneticCode
ratio a double, set by default to 1

Definition at line 537 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::Alphabet::equals(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::GeneticCode::getCodonAlphabet(), bpp::SymbolListTools::getFrequencies(), bpp::AlphabetTools::isCodonAlphabet(), bpp::AbstractTemplateSymbolList< T >::size(), and bpp::GeneticCode::sourceAlphabet().

◆ mutualInformation()

double SymbolListTools::mutualInformation ( const CruxSymbolListInterface list1,
const CruxSymbolListInterface list2,
bool  resolveUnknowns 
)
staticinherited

Compute the mutual information between two lists.

\[ MI = \sum_x \sum_y p_{x,y}\ln\left(\frac{p_{x,y}}{p_x \cdot p_y}\right) \]

where $p_x$ and $p_y$ are the frequencies of states $x$ and $y$, and $p_{x,y}$ is the frequency of the pair $(x,y)$.
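
A minimal standalone sketch of this computation follows. It assumes two equally long lists of integer states held in `std::vector<int>`; the name `mutualInformationSketch` is hypothetical and the resolveUnknowns option is not modeled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: position k of list1 is paired with position k of list2.
double mutualInformationSketch(const std::vector<int>& list1,
                               const std::vector<int>& list2)
{
  assert(!list1.empty() && list1.size() == list2.size());
  const double n = static_cast<double>(list1.size());

  // Marginal frequencies p_x, p_y and joint frequencies p_{x,y}.
  std::map<int, double> px, py;
  std::map<std::pair<int, int>, double> pxy;
  for (std::size_t k = 0; k < list1.size(); ++k)
  {
    px[list1[k]] += 1.0 / n;
    py[list2[k]] += 1.0 / n;
    pxy[std::make_pair(list1[k], list2[k])] += 1.0 / n;
  }

  // MI = sum_{x,y} p_{x,y} ln(p_{x,y} / (p_x p_y)), over observed pairs only.
  double mi = 0.0;
  for (const auto& kv : pxy)
  {
    const double expected = px[kv.first.first] * py[kv.first.second];
    mi += kv.second * std::log(kv.second / expected);
  }
  return mi;
}
```

Two identical lists such as `{0, 0, 1, 1}` give ln 2, while `{0, 0, 1, 1}` against `{0, 1, 0, 1}` gives 0 because every joint frequency equals the product of its marginals.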

Author
J. Dutheil
Parameters
list1 First list
list2 Second list
resolveUnknowns Tell whether unknown characters must be resolved.
Returns
The mutual information for the pair of lists.

Definition at line 656 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

◆ numberOfDifferences()

size_t CodonSiteTools::numberOfDifferences ( int  i,
int  j,
const CodonAlphabet ca 
)
static

Compute the number of differences between two codons.
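
The idea can be sketched without the library by comparing the three nucleotide positions of two codons. In this illustration codons are written as 3-letter strings rather than as integer states of a CodonAlphabet, and `codonDifferences` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the positions (0 to 3) at which two codons differ.
std::size_t codonDifferences(const std::string& c1, const std::string& c2)
{
  assert(c1.size() == 3 && c2.size() == 3);
  std::size_t n = 0;
  for (std::size_t pos = 0; pos < 3; ++pos)
  {
    if (c1[pos] != c2[pos]) ++n;
  }
  return n;
}
```

For instance, ATT and AGC differ at the second and third positions, so the sketch returns 2.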

Parameters
i First codon, as an int
j Second codon, as an int
ca a CodonAlphabet

Definition at line 191 of file CodonSiteTools.cpp.

References bpp::CodonAlphabet::getFirstPosition(), bpp::CodonAlphabet::getSecondPosition(), and bpp::CodonAlphabet::getThirdPosition().

◆ numberOfGaps() [1/3]

static size_t bpp::SymbolListTools::numberOfGaps ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 86 of file SymbolListTools.h.

References bpp::SymbolListTools::numberOfGaps().

◆ numberOfGaps() [2/3]

size_t SymbolListTools::numberOfGaps ( const IntSymbolListInterface site)
staticinherited

◆ numberOfGaps() [3/3]

size_t SymbolListTools::numberOfGaps ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 172 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ numberOfNonSynonymousSubstitutions()

size_t CodonSiteTools::numberOfNonSynonymousSubstitutions ( const Site site,
const GeneticCode gCode,
double  freqmin = 0. 
)
static

Return the number of Non Synonymous substitutions per codon site.

It is assumed that the path linking amino acids only involves one substitution per step.

Rare variants (<= freqmin) can be excluded. For complex codons, the path that gives the minimum number of non-synonymous changes is chosen: minchange=true is passed to numberOfSynonymousDifferences(), which this method uses, because otherwise a non-integer number could be returned.

Parameters
site a Site
gCode a GeneticCode
freqmin a double; SNPs with a frequency strictly lower than freqmin are excluded (by default freqmin = 0).

Definition at line 614 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), count(), bpp::Alphabet::equals(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::GeneticCode::getCodonAlphabet(), bpp::SymbolListTools::getCounts(), bpp::SymbolListTools::hasGap(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), bpp::AbstractTemplateSymbolList< T >::size(), and bpp::GeneticCode::sourceAlphabet().

◆ numberOfSubstitutions()

size_t CodonSiteTools::numberOfSubstitutions ( const Site site,
const GeneticCode gCode,
double  freqmin = 0. 
)
static

Return the number of substitutions per codon site.

No recombination is assumed; that is, in complex codons homoplasy is assumed. Example:

ATT
ATT
ATT
ATC
ATC
AGT
AGT
AGC

Here, 3 substitutions are counted. Assuming that the last codon (AGC) is a recombinant between ATC and AGT would have led to counting only 2 substitutions.
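
One counting rule that reproduces both numbers is sketched below. It is an assumption made for illustration, not a statement of the library's exact procedure: take the larger of (distinct codons - 1) and the sum over the three positions of (distinct nucleotides - 1). Codons are plain 3-letter strings, `substitutionsSketch` is a hypothetical name, and the freqmin filtering and gap handling are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper working on 3-letter codon strings.
std::size_t substitutionsSketch(const std::vector<std::string>& codons)
{
  assert(!codons.empty());

  std::set<std::string> distinctCodons;
  std::set<char> distinctBases[3];
  for (const std::string& c : codons)
  {
    assert(c.size() == 3);
    distinctCodons.insert(c);
    for (std::size_t pos = 0; pos < 3; ++pos)
      distinctBases[pos].insert(c[pos]);
  }

  // Without recombination, each additional codon type needs its own substitution.
  const std::size_t perCodon = distinctCodons.size() - 1;

  // Count obtained when the three positions are treated independently.
  std::size_t perPosition = 0;
  for (const auto& bases : distinctBases)
    perPosition += bases.size() - 1;

  return std::max(perCodon, perPosition);
}
```

On the eight codons of the example there are 4 distinct codon types (3 substitutions), whereas the per-position count is 0 + 1 + 1 = 2, which is the value obtained if AGC is treated as a recombinant.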

Rare variants (<= freqmin) can be excluded.

Parameters
site a Site
gCode a GeneticCode
freqmin a double; SNPs with a frequency strictly lower than freqmin are excluded (by default freqmin = 0).

Definition at line 571 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), generateCodonSiteWithoutRareVariant(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::SymbolListTools::getNumberOfDistinctCharacters(), bpp::SymbolListTools::hasGap(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), and bpp::AbstractTemplateSymbolList< T >::size().

◆ numberOfSynonymousDifferences()

double CodonSiteTools::numberOfSynonymousDifferences ( int  i,
int  j,
const GeneticCode gCode,
bool  minchange = false 
)
static

Compute the number of synonymous differences between two codons.

For complex codons: if minchange = false (default option), the different paths are equally weighted. If minchange = true, the path with the minimum number of non-synonymous changes is chosen. Paths that include stop codons are excluded.

Parameters
i First codon, as an int
j Second codon, as an int
gCode a GeneticCode
minchange a boolean, set by default to false

Definition at line 205 of file CodonSiteTools.cpp.

References bpp::GeneticCode::areSynonymous(), bpp::GeneticCode::getCodonAlphabet(), bpp::GeneticCode::isStop(), bpp::VectorTools::max(), and bpp::VectorTools::sum().

◆ numberOfSynonymousPositions()

double CodonSiteTools::numberOfSynonymousPositions ( int  i,
const GeneticCode gCode,
double  ratio = 1.0 
)
static

Return the number of synonymous positions of a codon.

A site is considered x% synonymous if x% of the possible mutations are synonymous. The transition/transversion ratio can be taken into account through the ratio parameter.

Unresolved codons and stop codons return a value of 0.
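
The definition can be made concrete with a standalone sketch that uses the standard genetic code. Several points are assumptions of this example rather than documented behavior: codons are 3-letter strings of resolved bases, a transition is weighted ratio/(ratio+2) and each transversion 1/(ratio+2) so that the weights at one position sum to 1, and a mutation to a stop codon simply counts as non-synonymous. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Standard genetic code with bases ordered T, C, A, G at each position
// (first position varies slowest); '*' marks stop codons.
static const std::string kBases = "TCAG";
static const std::string kStandardCode =
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

char translateSketch(const std::string& codon)
{
  assert(codon.size() == 3);
  std::size_t index = 0;
  for (char base : codon)
  {
    const std::size_t b = kBases.find(base);
    assert(b != std::string::npos);  // resolved bases only
    index = 4 * index + b;
  }
  return kStandardCode[index];
}

bool isTransitionSketch(char a, char b)
{
  // Transitions stay within purines (A, G) or within pyrimidines (C, T).
  return (a == 'A' && b == 'G') || (a == 'G' && b == 'A') ||
         (a == 'C' && b == 'T') || (a == 'T' && b == 'C');
}

double synonymousPositionsSketch(const std::string& codon, double ratio = 1.0)
{
  const char aminoAcid = translateSketch(codon);
  if (aminoAcid == '*') return 0.0;  // stop codons count as 0

  double total = 0.0;
  for (std::size_t pos = 0; pos < 3; ++pos)
  {
    for (char base : kBases)
    {
      if (base == codon[pos]) continue;
      std::string mutant = codon;
      mutant[pos] = base;
      if (translateSketch(mutant) != aminoAcid) continue;
      // Assumed weighting: one transition and two transversions per position.
      total += isTransitionSketch(codon[pos], base) ? ratio / (ratio + 2.0)
                                                    : 1.0 / (ratio + 2.0);
    }
  }
  return total;
}
```

With ratio = 1, TTT (Phe) has a single synonymous mutation out of three at the third position and none elsewhere, giving 1/3, while GGG (Gly) is fourfold degenerate at the third position, giving 1.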

Parameters
i a codon, as an int
gCode a GeneticCode
ratio a double, set by default to 1

Definition at line 496 of file CodonSiteTools.cpp.

References bpp::GeneticCode::getCodonAlphabet(), bpp::GeneticCode::isStop(), and bpp::GeneticCode::translate().

◆ numberOfUnresolved() [1/3]

static size_t bpp::SymbolListTools::numberOfUnresolved ( const CruxSymbolListInterface site)
inlinestaticinherited

Definition at line 130 of file SymbolListTools.h.

References bpp::SymbolListTools::numberOfUnresolved().

◆ numberOfUnresolved() [2/3]

size_t SymbolListTools::numberOfUnresolved ( const IntSymbolListInterface site)
staticinherited

◆ numberOfUnresolved() [3/3]

size_t SymbolListTools::numberOfUnresolved ( const ProbabilisticSymbolListInterface site)
staticinherited

Definition at line 200 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ piNonSynonymous()

double CodonSiteTools::piNonSynonymous ( const Site site,
const GeneticCode gCode,
bool  minchange = false 
)
static

Compute the non-synonymous pi per codon site.

The following formula is used:

\[ \pi = \frac{n}{n-1}\sum_{i,j}x_{i}x_{j}P_{ij} \]

where $n$ is the number of sequences, $x_i$ and $x_j$ are the frequencies of each codon type occurring at the site, and $P_{ij}$ is the number of non-synonymous differences between these codons. Be careful: here, pi is not normalized by the number of non-synonymous sites.

If minchange = false (default option), the different paths are equally weighted. If minchange = true, the path with the minimum number of non-synonymous changes is chosen.
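
The estimator itself can be sketched independently of how the pairwise term is obtained. In the example below the sum runs over all ordered pairs of distinct codon types, which is one reading of the double sum, and the pairwise term is supplied by the caller. Because counting non-synonymous differences requires a genetic code, a simple stand-in (`positionDifferences`, the number of differing nucleotide positions) is used instead; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for the pairwise term: number of nucleotide positions
// at which two codons differ.
double positionDifferences(const std::string& a, const std::string& b)
{
  assert(a.size() == 3 && b.size() == 3);
  double d = 0.0;
  for (std::size_t pos = 0; pos < 3; ++pos)
    if (a[pos] != b[pos]) d += 1.0;
  return d;
}

// pi = n/(n-1) * sum_{i,j} x_i x_j P_ij over ordered pairs of codon types.
double piSketch(const std::vector<std::string>& codons,
                double (*pairwise)(const std::string&, const std::string&))
{
  const std::size_t n = codons.size();
  assert(n > 1);  // the n/(n-1) correction needs at least two sequences

  // Frequency x_i of each codon type at the site.
  std::map<std::string, double> freq;
  for (const std::string& c : codons)
    freq[c] += 1.0 / static_cast<double>(n);

  double sum = 0.0;
  for (const auto& a : freq)
    for (const auto& b : freq)
      if (a.first != b.first)  // identical types contribute no difference
        sum += a.second * b.second * pairwise(a.first, b.first);

  return sum * static_cast<double>(n) / static_cast<double>(n - 1);
}
```

For the site ATT, ATT, ATC, ATC the two types each have frequency 0.5 and differ at one position, so the sum is 2 x 0.25 = 0.5 and the result is 0.5 x 4/3 = 2/3, the average difference over the six sequence pairs.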

Parameters
site a Site
gCode a GeneticCode
minchange a boolean, set by default to false

Definition at line 459 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::Alphabet::equals(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::GeneticCode::getCodonAlphabet(), bpp::SymbolListTools::getFrequencies(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), bpp::AbstractTemplateSymbolList< T >::size(), and bpp::GeneticCode::sourceAlphabet().

◆ piSynonymous()

double CodonSiteTools::piSynonymous ( const Site site,
const GeneticCode gCode,
bool  minchange = false 
)
static

Compute the synonymous pi per codon site.

The following formula is used:

\[ \pi = \frac{n}{n-1}\sum_{i,j}x_{i}x_{j}P_{ij} \]

where $n$ is the number of sequences, $x_i$ and $x_j$ are the frequencies of each codon type occurring at the site, and $P_{ij}$ is the number of synonymous differences between these codons. Be careful: here, pi is not normalized by the number of synonymous sites.

If minchange = false (default option), the different paths are equally weighted. If minchange = true, the path with the minimum number of non-synonymous changes is chosen.

Parameters
site a Site
gCode a GeneticCode
minchange a boolean, set by default to false

Definition at line 427 of file CodonSiteTools.cpp.

References bpp::AbstractTemplateSymbolList< T >::alphabet(), bpp::Alphabet::equals(), bpp::AbstractTemplateSymbolList< T >::getAlphabet(), bpp::GeneticCode::getCodonAlphabet(), bpp::SymbolListTools::getFrequencies(), bpp::AlphabetTools::isCodonAlphabet(), bpp::SymbolListTools::isConstant(), bpp::AbstractTemplateSymbolList< T >::size(), and bpp::GeneticCode::sourceAlphabet().

◆ variabilityFactorial()

double SymbolListTools::variabilityFactorial ( const IntSymbolListInterface list)
staticinherited

Compute the factorial diversity index of a site.

\[ F = \frac{\log\left(\left(\sum_x p_x\right)!\right)}{\sum_x \log(p_x)!} \]

where $p_x$ is the number of times state $x$ is observed in the site.

Author
J. Dutheil
Parameters
list A list.
Returns
The factorial diversity index of this list.

Definition at line 744 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::size().

◆ variabilityShannon()

double SymbolListTools::variabilityShannon ( const CruxSymbolListInterface list,
bool  resolveUnknowns 
)
staticinherited

Compute the Shannon entropy index of a SymbolList.

\[ I = - \sum_x f_x\cdot \ln(f_x) \]

where $f_x$ is the frequency of state $x$.
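
A minimal standalone sketch of this index is given below. A `std::vector<int>` stands in for the SymbolList, the name `shannonSketch` is hypothetical, and the resolveUnknowns option is not modeled: every value is treated as an ordinary state.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for a symbol list: one integer state per sequence.
double shannonSketch(const std::vector<int>& states)
{
  assert(!states.empty());  // frequencies are undefined for an empty list

  // Count how many times each state occurs.
  std::map<int, std::size_t> counts;
  for (int s : states) ++counts[s];

  // I = - sum_x f_x ln(f_x), summed over the observed states only.
  const double n = static_cast<double>(states.size());
  double entropy = 0.0;
  for (const auto& kv : counts)
  {
    const double f = static_cast<double>(kv.second) / n;
    entropy -= f * std::log(f);
  }
  return entropy;
}
```

Two states at equal frequency give ln 2, and a constant list gives 0.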

Author
J. Dutheil
Parameters
list A list.
resolveUnknowns Tell whether unknown characters must be resolved.
Returns
The Shannon entropy index of this list.

Definition at line 632 of file SymbolListTools.cpp.

References bpp::CruxSymbolListInterface::getAlphabet(), and bpp::CruxSymbolListInterface::size().

Referenced by bpp::SymbolListTools::entropy().


The documentation for this class was generated from the following files: