bpp-seq3
3.0.0
|
The Alphabet interface. More...
#include <Bpp/Seq/Alphabet/Alphabet.h>
Public Member Functions | |
Alphabet () | |
virtual | ~Alphabet ()=default |
virtual std::string | getName (int state) const =0 |
Get the complete name of a state given its int description. More... | |
virtual std::string | getName (const std::string &state) const =0 |
Get the complete name of a state given its string description. More... | |
virtual int | getIntCodeAt (size_t stateIndex) const =0 |
virtual const std::string & | getCharCodeAt (size_t stateIndex) const =0 |
virtual size_t | getStateIndex (int state) const =0 |
virtual size_t | getStateIndex (const std::string &state) const =0 |
virtual std::string | getAlphabetType () const =0 |
Identification method. More... | |
virtual unsigned int | getStateCodingSize () const =0 |
Get the size of the string coding a state. More... | |
virtual bool | equals (const Alphabet &alphabet) const =0 |
Comparison of alphabets. More... | |
The Clonable interface | |
Alphabet * | clone () const =0 |
Tests | |
virtual bool | isIntInAlphabet (int state) const =0 |
Tell if a state (specified by its int description) is allowed by the alphabet. More... | |
virtual bool | isCharInAlphabet (const std::string &state) const =0 |
Tell if a state (specified by its string description) is allowed by the alphabet. More... | |
State access | |
virtual const AlphabetState & | getStateAt (size_t stateIndex) const =0 |
Get a state given its index. More... | |
virtual const AlphabetState & | getState (int state) const =0 |
Get a state given its int description. More... | |
virtual const AlphabetState & | getState (const std::string &state) const =0 |
Get a state given its string description. More... | |
Conversion methods | |
virtual std::string | intToChar (int state) const =0 |
Give the string description of a state given its int description. More... | |
virtual int | charToInt (const std::string &state) const =0 |
Give the int description of a state given its string description. More... | |
Sizes. | |
virtual size_t | getNumberOfStates () const =0 |
This is a convenient alias for getNumberOfChars(), returning a size_t instead of unsigned int. More... | |
virtual unsigned int | getNumberOfChars () const =0 |
Get the number of supported characters in this alphabet, including generic characters (e.g. return 20 for DNA alphabet). More... | |
virtual unsigned int | getNumberOfTypes () const =0 |
Get the number of distinct states in alphabet (e.g. return 15 for DNA alphabet). This is the number of integers used for state description. More... | |
virtual unsigned int | getSize () const =0 |
Get the number of resolved states in the alphabet (e.g. return 4 for DNA alphabet). This is the method you'll need in most cases. More... | |
Utility methods | |
virtual bool | isResolvedIn (int state1, int state2) const =0 |
Tells if a given (potentially unresolved) state can be resolved into another resolved state. More... | |
virtual std::vector< int > | getAlias (int state) const =0 |
Get all resolved states that match a generic state. More... | |
virtual std::vector< std::string > | getAlias (const std::string &state) const =0 |
Get all resolved states that match a generic state. More... | |
virtual int | getGeneric (const std::vector< int > &states) const =0 |
Get the generic state that matches a set of states. More... | |
virtual std::string | getGeneric (const std::vector< std::string > &states) const =0 |
Get the generic state that matches a set of states. More... | |
virtual const std::vector< int > & | getSupportedInts () const =0 |
virtual const std::vector< std::string > & | getSupportedChars () const =0 |
virtual const std::vector< std::string > & | getResolvedChars () const =0 |
virtual int | getUnknownCharacterCode () const =0 |
virtual int | getGapCharacterCode () const =0 |
virtual bool | isGap (int state) const =0 |
virtual bool | isGap (const std::string &state) const =0 |
virtual bool | isUnresolved (int state) const =0 |
virtual bool | isUnresolved (const std::string &state) const =0 |
The Alphabet interface.
An alphabet object defines all the states allowed for a particular type of sequence. Each state is coded both as a string and as an integer. The string description is the one found in the human-readable text description of sequences, typically in sequence files. For computational purposes, however, it is often more efficient to store sequences as vectors of integers. The Alphabet classes link the two descriptions through the methods intToChar() and charToInt(). The Alphabet interface also provides other methods, such as retrieving the full name of a state.
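The two-way mapping between string and int descriptions can be illustrated with a minimal sketch. The class `ToyAlphabet`, its `encode()` helper and the int codes it uses are hypothetical and are not part of Bio++; the sketch only mimics the roles of charToInt() and intToChar(), and it signals invalid states with `std::invalid_argument` rather than the library's own exceptions.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy alphabet, NOT the Bio++ implementation: it only mimics
// the charToInt()/intToChar() pair. The int codes are illustrative.
class ToyAlphabet
{
public:
  ToyAlphabet()
  {
    add("-", -1);
    add("A", 0);
    add("C", 1);
    add("G", 2);
    add("T", 3);
  }

  // String description -> int description; throws on unknown strings.
  int charToInt(const std::string& state) const
  {
    auto it = charToInt_.find(state);
    if (it == charToInt_.end())
      throw std::invalid_argument("Unknown char state: " + state);
    return it->second;
  }

  // Int description -> string description; throws on unknown ints.
  std::string intToChar(int state) const
  {
    auto it = intToChar_.find(state);
    if (it == intToChar_.end())
      throw std::invalid_argument("Unknown int state: " + std::to_string(state));
    return it->second;
  }

  // Encode a text sequence as a vector of ints, one letter per state.
  std::vector<int> encode(const std::string& sequence) const
  {
    std::vector<int> codes;
    codes.reserve(sequence.size());
    for (char c : sequence)
      codes.push_back(charToInt(std::string(1, c)));
    return codes;
  }

private:
  void add(const std::string& letter, int code)
  {
    charToInt_[letter] = code;
    intToChar_[code] = letter;
  }

  std::map<std::string, int> charToInt_;
  std::map<int, std::string> intToChar_;
};

// Usage: converting to int and back returns the original string.
inline void roundTripExample()
{
  ToyAlphabet alpha;
  assert(alpha.intToChar(alpha.charToInt("G")) == "G");
}
```

With a real alphabet, the same round trip would go through the intToChar() and charToInt() members of a concrete class such as bpp::DNA.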
An Alphabet object stores its states as AlphabetState objects, in a potentially arbitrary but consistent order. All states are indexed from 0 to getNumberOfChars() - 1. The number of states equals the number of string representations, but is usually higher than the number of int representations, because several characters can correspond to the same int code (for instance X, N and ? in nucleotide alphabets).
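The difference between counting string representations and counting int representations can be sketched as follows. `ToyState`, `makeToyStates()`, `numberOfChars()` and `numberOfTypes()` are hypothetical names introduced for illustration, and the int codes are assumptions rather than the values used by the Bio++ nucleic alphabets.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a state: one string and one int description.
struct ToyState
{
  std::string letter;
  int code;
};

// Toy state list; the position in the vector plays the role of the state index.
// The int codes are illustrative only.
inline std::vector<ToyState> makeToyStates()
{
  return {
    {"-", -1}, {"A", 0}, {"C", 1}, {"G", 2}, {"T", 3},
    {"N", 14}, {"X", 14}, {"?", 14}  // three spellings sharing one int code
  };
}

// Number of state indices: one per string description.
inline std::size_t numberOfChars(const std::vector<ToyState>& states)
{
  return states.size();
}

// Number of distinct int codes.
inline std::size_t numberOfTypes(const std::vector<ToyState>& states)
{
  std::set<int> codes;
  for (const auto& s : states)
    codes.insert(s.code);
  return codes.size();
}

// Usage: more string descriptions than int descriptions.
inline void indexingExample()
{
  const auto states = makeToyStates();
  assert(numberOfChars(states) == 8);  // eight string descriptions
  assert(numberOfTypes(states) == 6);  // but only six distinct int codes
}
```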
Alphabet objects may throw several exceptions derived from the AlphabetException class.
Definition at line 96 of file Alphabet.h.
|
inline |
Definition at line 101 of file Alphabet.h.
|
virtualdefault |
|
pure virtual |
Give the int description of a state given its string description.
state | The string description. |
BadCharException | When state is not a valid char description. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::LetterAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Implements bpp::Clonable.
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::NucleicAlphabet, bpp::LetterAlphabet, bpp::AbstractAlphabet, bpp::RNY, bpp::RNA, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DNA, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
|
pure virtual |
Comparison of alphabets.
Implemented in bpp::AbstractAlphabet.
Referenced by bpp::CodonSiteTools::fixedDifferences(), bpp::CodonSiteTools::isSynonymousPolymorphic(), bpp::CodonSiteTools::meanNumberOfSynonymousPositions(), bpp::CodonSiteTools::numberOfNonSynonymousSubstitutions(), bpp::CodonSiteTools::piNonSynonymous(), and bpp::CodonSiteTools::piSynonymous().
|
pure virtual |
Get all resolved states that match a generic state.
If the given state is not a generic code then the output vector will contain this unique code.
state | The alias to resolve. |
BadCharException | When state is not a valid char description. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::RNA, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::DNA, bpp::BinaryAlphabet, and bpp::AbstractAlphabet.
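As an illustration of alias resolution, here is a minimal sketch built on a small subset of the IUPAC nucleotide codes. The function `toyGetAlias()` is hypothetical and stands in for getAlias(); it reports invalid input with `std::invalid_argument` instead of BadCharException.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper mimicking getAlias() on a subset of IUPAC codes.
// A resolved state resolves to itself; a generic state to its members.
inline std::vector<std::string> toyGetAlias(const std::string& state)
{
  static const std::map<std::string, std::vector<std::string>> table = {
    {"A", {"A"}}, {"C", {"C"}}, {"G", {"G"}}, {"T", {"T"}},
    {"R", {"A", "G"}},           // purine
    {"Y", {"C", "T"}},           // pyrimidine
    {"N", {"A", "C", "G", "T"}}  // any nucleotide
  };
  auto it = table.find(state);
  if (it == table.end())
    throw std::invalid_argument("Not a valid char description: " + state);
  return it->second;
}

// Usage: a non-generic code yields a vector holding that single code.
inline void aliasExample()
{
  assert(toyGetAlias("A").size() == 1);
  assert(toyGetAlias("R").size() == 2);
}
```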
|
pure virtual |
Get all resolved states that match a generic state.
If the given state is not a generic code then the output vector will contain this unique code.
state | The alias to resolve. |
BadIntException | When state is not a valid integer. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::RNA, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::DNA, bpp::BinaryAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Identification method.
Used to tell if two alphabets describe the same type of sequences. For instance, this method is used by sequence containers to compare two alphabets and allow or deny addition of sequences.
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::RNA, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DNA, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::SequenceTools::combineSequences(), bpp::SequenceTools::concatenate(), bpp::AbstractAlphabet::equals(), bpp::SequenceTools::getPercentIdentity(), bpp::AbstractReverseTransliterator::reverse(), and bpp::AbstractTransliterator::translate().
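The way a container might use the identification string to accept or reject sequences can be sketched as below. `ToyTypedAlphabet` and `ToyContainer` are hypothetical, the type strings are illustrative, and the mismatch is reported with `std::invalid_argument`; none of this reproduces the actual Bio++ containers.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alphabet reduced to its identification string.
struct ToyTypedAlphabet
{
  std::string type;
  std::string getAlphabetType() const { return type; }
};

// Hypothetical container that only accepts sequences whose alphabet
// reports the same type as its own.
class ToyContainer
{
public:
  explicit ToyContainer(ToyTypedAlphabet alphabet) : alphabet_(std::move(alphabet)) {}

  void addSequence(const ToyTypedAlphabet& sequenceAlphabet, const std::string& sequence)
  {
    if (sequenceAlphabet.getAlphabetType() != alphabet_.getAlphabetType())
      throw std::invalid_argument("Alphabet mismatch: " + sequenceAlphabet.getAlphabetType());
    sequences_.push_back(sequence);
  }

  std::size_t size() const { return sequences_.size(); }

private:
  ToyTypedAlphabet alphabet_;
  std::vector<std::string> sequences_;
};

// Usage: a sequence with a matching alphabet type is accepted.
inline void containerExample()
{
  ToyContainer container(ToyTypedAlphabet{"DNA"});
  container.addSequence(ToyTypedAlphabet{"DNA"}, "ACGT");
  assert(container.size() == 1);
}
```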
|
pure virtual |
stateIndex | The index of the state to fetch. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Implemented in bpp::AbstractAlphabet.
Referenced by bpp::SequenceTools::getPercentIdentity().
|
pure virtual |
Get the generic state that matches a set of states.
If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.
states | A vector of states to resolve. |
BadIntException | When a state is not a valid integer. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNA, bpp::ProteicAlphabet, bpp::DNA, and bpp::AbstractAlphabet.
|
pure virtual |
Get the generic state that matches a set of states.
If the given states contain generic codes, each generic code is first resolved and then the new generic state is returned. If only a single resolved state is given, the function returns this state.
states | A vector of states to resolve. |
BadCharException | When a state is not a valid char description. |
CharStateNotSupportedException | When the alphabet does not support a char state for unresolved states. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNA, bpp::ProteicAlphabet, bpp::DNA, and bpp::AbstractAlphabet.
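The "resolve first, then generalize" behaviour described for getGeneric() can be sketched with bit masks over the IUPAC nucleotide codes. This is one possible technique chosen for illustration, not necessarily how Bio++ implements it; `toyMasks()` and `toyGetGeneric()` are hypothetical names, and errors are reported with `std::invalid_argument`.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// IUPAC nucleotide codes as bit masks: bit 0 = A, bit 1 = C, bit 2 = G, bit 3 = T.
inline const std::map<std::string, unsigned>& toyMasks()
{
  static const std::map<std::string, unsigned> masks = {
    {"A", 1u}, {"C", 2u}, {"G", 4u}, {"T", 8u},
    {"M", 3u}, {"R", 5u}, {"W", 9u}, {"S", 6u}, {"Y", 10u}, {"K", 12u},
    {"V", 7u}, {"H", 11u}, {"D", 13u}, {"B", 14u}, {"N", 15u}};
  return masks;
}

// Hypothetical counterpart of getGeneric() for string descriptions.
inline std::string toyGetGeneric(const std::vector<std::string>& states)
{
  unsigned combined = 0;
  for (const auto& s : states)
  {
    auto it = toyMasks().find(s);
    if (it == toyMasks().end())
      throw std::invalid_argument("Not a valid char description: " + s);
    combined |= it->second;  // OR-ing a generic code's mask adds all its resolved states
  }
  // Return the code whose mask covers exactly the combined set.
  for (const auto& entry : toyMasks())
    if (entry.second == combined)
      return entry.first;
  throw std::invalid_argument("No state matches an empty set of states");
}

// Usage: a single resolved state is returned unchanged.
inline void genericExample()
{
  assert(toyGetGeneric({"A"}) == "A");
  assert(toyGetGeneric({"A", "G"}) == "R");
}
```

Because generic inputs contribute every bit of their mask, passing `{"R", "C"}` gives the same result as passing `{"A", "G", "C"}`.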
|
pure virtual |
stateIndex | The index of the state to fetch. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Get the complete name of a state given its string description.
In case of several states with identical number (e.g. N and X for nucleic alphabets), this method will return the name of the first found in the vector.
state | The string description of the given state. |
BadCharException | When state is not a valid char description. |
Implemented in bpp::WordAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Get the complete name of a state given its int description.
In case of several states with identical number (e.g. N and X for nucleic alphabets), this method returns the name of the first found in the vector.
state | The int description of the given state. |
BadIntException | When state is not a valid integer. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Get the number of supported characters in this alphabet, including generic characters (e.g. return 20 for DNA alphabet).
Implemented in bpp::AbstractAlphabet.
Referenced by bpp::AlphabetTools::checkAlphabetCodingSize().
|
pure virtual |
This is a convenient alias for getNumberOfChars(), returning a size_t instead of unsigned int.
This function is typically used in loops over all states of an alphabet.
Implemented in bpp::AbstractAlphabet.
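A loop over all state indices might look like the sketch below. `ToyIndexedAlphabet` is a hypothetical interface reduced to three members named after getNumberOfStates(), getIntCodeAt() and getCharCodeAt(); `ToyBinary` and `listStates()` are likewise invented for illustration, and the codes are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reduced interface with only the members used by the loop.
class ToyIndexedAlphabet
{
public:
  virtual ~ToyIndexedAlphabet() = default;
  virtual std::size_t getNumberOfStates() const = 0;
  virtual int getIntCodeAt(std::size_t stateIndex) const = 0;
  virtual const std::string& getCharCodeAt(std::size_t stateIndex) const = 0;
};

// Toy implementation with four states; the codes are illustrative.
class ToyBinary : public ToyIndexedAlphabet
{
public:
  std::size_t getNumberOfStates() const override { return letters_.size(); }
  int getIntCodeAt(std::size_t i) const override { return codes_.at(i); }
  const std::string& getCharCodeAt(std::size_t i) const override { return letters_.at(i); }

private:
  std::vector<std::string> letters_{"-", "0", "1", "?"};
  std::vector<int> codes_{-1, 0, 1, 2};
};

// Loop over all state indices; size_t matches the index type, so no cast is needed.
inline std::vector<std::string> listStates(const ToyIndexedAlphabet& alphabet)
{
  std::vector<std::string> lines;
  for (std::size_t i = 0; i < alphabet.getNumberOfStates(); ++i)
    lines.push_back(alphabet.getCharCodeAt(i) + "=" + std::to_string(alphabet.getIntCodeAt(i)));
  return lines;
}

// Usage: one entry per state index.
inline void loopExample()
{
  ToyBinary binary;
  assert(listStates(binary).size() == binary.getNumberOfStates());
}
```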
|
pure virtual |
Get the number of distinct states in alphabet (e.g. return 15 for DNA alphabet). This is the number of integers used for state description.
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
Referenced by bpp::AlphabetTools::checkAlphabetCodingSize().
|
pure virtual |
Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Get the number of resolved states in the alphabet (e.g. return 4 for DNA alphabet). This is the method you'll need in most cases.
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
|
pure virtual |
Get a state given its string description.
state | The string description. |
BadCharException | When state is not a valid string. |
Implemented in bpp::ProteicAlphabet, bpp::NucleicAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Get a state given its int description.
Note: several states can share the same int values. This function will return one.
state | The int description. |
BadIntException | When state is not a valid integer. |
Implemented in bpp::ProteicAlphabet, bpp::NucleicAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Get a state given its index.
stateIndex | The index of the state. |
IndexOutOfBoundsException | When the index is not valid. |
Implemented in bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::AbstractAlphabet, and bpp::ProteicAlphabet.
Referenced by bpp::AlphabetTools::checkAlphabetCodingSize().
|
pure virtual |
Get the size of the string coding a state.
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Note for developers of new alphabets: we return a const reference here since the list is supposed to be stored within the class and should not be modified outside the class.
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
|
pure virtual |
Give the string description of a state given its int description.
state | The int description. |
BadIntException | When state is not a valid integer. |
Implemented in bpp::RNY, and bpp::AbstractAlphabet.
Referenced by bpp::AlphabetTools::checkAlphabetCodingSize(), bpp::AlphabetTools::getAlphabetCodingSize(), and bpp::RNY::getRNY().
|
pure virtual |
Tell if a state (specified by its string description) is allowed by the alphabet.
state | The string description. |
Implemented in bpp::LetterAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
state | The state to test. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
state | The state to test. |
Implemented in bpp::RNY, bpp::NumericAlphabet, and bpp::AbstractAlphabet.
|
pure virtual |
Tell if a state (specified by its int description) is allowed by the alphabet.
state | The int description. |
Implemented in bpp::AbstractAlphabet.
|
pure virtual |
Tells if a given (potentially unresolved) state can be resolved into another resolved state.
state1 | The alias to resolve. |
state2 | The candidate for resolution. |
BadIntException | When state is not a valid integer. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::RNA, bpp::ProteicAlphabet, bpp::DNA, bpp::BinaryAlphabet, and bpp::AbstractAlphabet.
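The resolution test can be sketched with a toy encoding in which each int code is a bit mask of the resolved states it covers. This encoding is an assumption made for the example and differs from the Bio++ int codes; the `toy` namespace and its functions are hypothetical, and rejecting an unresolved second argument is a design choice of this sketch.

```cpp
#include <cassert>
#include <stdexcept>

namespace toy
{
// Toy int codes, NOT the Bio++ ones: each state is a 4-bit mask.
constexpr int A = 1, C = 2, G = 4, T = 8;
constexpr int R = A | G;          // purine
constexpr int Y = C | T;          // pyrimidine
constexpr int N = A | C | G | T;  // any nucleotide

// A resolved state has exactly one of the four bits set.
inline bool isResolved(int state)
{
  return state == A || state == C || state == G || state == T;
}

// state1: the (possibly generic) state to resolve.
// state2: the resolved candidate.
inline bool isResolvedIn(int state1, int state2)
{
  if (state1 < 1 || state1 > N)
    throw std::invalid_argument("state1 is not a valid toy state");
  if (!isResolved(state2))
    throw std::invalid_argument("state2 must be a resolved toy state");
  return (state1 & state2) != 0;  // true when state2 is among state1's members
}

// Usage: R covers A and G, but not C.
inline void resolutionExample()
{
  assert(isResolvedIn(R, A));
  assert(!isResolvedIn(R, C));
}
}  // namespace toy
```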
|
pure virtual |
state | The state to test. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.
|
pure virtual |
state | The state to test. |
Implemented in bpp::WordAlphabet, bpp::CodonAlphabet, bpp::CaseMaskedAlphabet, bpp::AllelicAlphabet, bpp::RNY, bpp::ProteicAlphabet, bpp::NumericAlphabet, bpp::NucleicAlphabet, bpp::LexicalAlphabet, bpp::IntegerAlphabet, bpp::DefaultAlphabet, and bpp::BinaryAlphabet.