bpp-phyl3  3.0.0
bpp::HierarchicalClustering Class Reference

Hierarchical clustering. More...

#include <Bpp/Phyl/Distance/HierarchicalClustering.h>


Public Member Functions

 HierarchicalClustering (const std::string &method, bool verbose=false)
 Builds a new clustering object. More...
 
 HierarchicalClustering (const std::string &method, const DistanceMatrix &matrix, bool verbose=false)
 
virtual ~HierarchicalClustering ()
 
HierarchicalClustering * clone () const
 
std::string getName () const
 
virtual void setDistanceMatrix (const DistanceMatrix &matrix) override
 Set the distance matrix to use. More...
 
bool hasTree () const override
 
const Tree & tree () const override
 
void computeTree () override
 Compute the tree corresponding to the distance matrix. More...
 
void setVerbose (bool yn) override
 
bool isVerbose () const override
 

Static Public Attributes

static const std::string COMPLETE = "Complete"
 
static const std::string SINGLE = "Single"
 
static const std::string AVERAGE = "Average"
 
static const std::string MEDIAN = "Median"
 
static const std::string WARD = "Ward"
 
static const std::string CENTROID = "Centroid"
 

Protected Member Functions

std::vector< size_t > getBestPair ()
 Get the best pair of nodes to agglomerate. More...
 
std::vector< double > computeBranchLengthsForPair (const std::vector< size_t > &pair)
 Compute the branch lengths for two nodes to agglomerate. More...
 
double computeDistancesFromPair (const std::vector< size_t > &pair, const std::vector< double > &branchLengths, size_t pos)
 Updates the distance matrix according to a given pair and the corresponding branch lengths. More...
 
void finalStep (int idRoot)
 Method called when there are only three remaining nodes to agglomerate; creates the root node of the tree. More...
 
virtual Node * getLeafNode (int id, const std::string &name)
 Get a leaf node. More...
 
virtual Node * getParentNode (int id, Node *son1, Node *son2)
 Get an inner node. More...
 

Protected Attributes

std::string method_
 
DistanceMatrix matrix_
 
std::unique_ptr< Tree > tree_
 
std::map< size_t, Node * > currentNodes_
 
bool verbose_
 
bool rootTree_
 

Detailed Description

Hierarchical clustering.

This class implements the complete, single, average (= UPGMA), median, Ward and centroid linkage methods.

Definition at line 29 of file HierarchicalClustering.h.

Constructor & Destructor Documentation

◆ HierarchicalClustering() [1/2]

bpp::HierarchicalClustering::HierarchicalClustering ( const std::string &  method,
bool  verbose = false 
)
inline

Builds a new clustering object.

Parameters
method  The linkage method to use. Should be one of COMPLETE, SINGLE, AVERAGE, MEDIAN, WARD, CENTROID.
verbose  Whether progress information should be displayed.

Definition at line 50 of file HierarchicalClustering.h.

Referenced by clone().

◆ HierarchicalClustering() [2/2]

bpp::HierarchicalClustering::HierarchicalClustering ( const std::string &  method,
const DistanceMatrix &  matrix,
bool  verbose = false 
)
inline

◆ ~HierarchicalClustering()

virtual bpp::HierarchicalClustering::~HierarchicalClustering ( )
inline virtual

Definition at line 60 of file HierarchicalClustering.h.

Member Function Documentation

◆ clone()

HierarchicalClustering* bpp::HierarchicalClustering::clone ( ) const
inline virtual

◆ computeBranchLengthsForPair()

vector< double > HierarchicalClustering::computeBranchLengthsForPair ( const std::vector< size_t > &  pair)
protected virtual

Compute the branch lengths for two nodes to agglomerate.

+---l1-----N1
|
+---l2-----N2

This method computes l1 and l2 given N1 and N2.

Parameters
pair  The indices of the nodes to be agglomerated.
Returns
A size 2 vector with branch lengths.
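
As an illustration of what such a pair of branch lengths can look like, here is a minimal sketch assuming the usual ultrametric convention: the parent node is placed at half the distance between the two clusters, and each branch length is that height minus the height already accumulated below the child. The function name `ultrametricBranchLengths` and its parameters are hypothetical and are not part of Bio++; this page does not state the exact formula used by the class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of Bio++.
// pairDistance: current distance between the two clusters N1 and N2.
// height1, height2: heights (distance down to the leaves) of N1 and N2.
// Assumed convention: the parent sits at height pairDistance / 2.
std::vector<double> ultrametricBranchLengths(double pairDistance,
                                             double height1,
                                             double height2)
{
  assert(pairDistance >= 0.0);
  const double parentHeight = pairDistance / 2.0;
  // l1 and l2 as in the diagram above: parent height minus child height.
  return {parentHeight - height1, parentHeight - height2};
}
```

For two leaves (both heights zero) the two branch lengths are equal, which is what makes the resulting tree ultrametric under this convention.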

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 62 of file HierarchicalClustering.cpp.

◆ computeDistancesFromPair()

double HierarchicalClustering::computeDistancesFromPair ( const std::vector< size_t > &  pair,
const std::vector< double > &  branchLengths,
size_t  pos 
)
protected virtual

Updates the distance matrix according to a given pair and the corresponding branch lengths.

Parameters
pair  The indices of the nodes to be agglomerated.
branchLengths  The corresponding branch lengths.
pos  The index of the node whose distance must be updated.
Returns
The distance between the 'pos' node and the agglomerated pair.
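
The six linkage methods are commonly expressed through the textbook Lance-Williams recurrence, d(k, i∪j) = a_i d(k,i) + a_j d(k,j) + b d(i,j) + c |d(k,i) - d(k,j)|. The sketch below implements that textbook form only; it is an assumption that it resembles the library code, and the function `lanceWilliams` and its parameters are hypothetical. The method strings reuse the values of the static constants listed on this page. For the centroid, median and Ward rules the textbook coefficients are defined on squared distances.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Bio++: textbook Lance-Williams update.
// dki, dkj: distances from node k to the merged nodes i and j.
// dij: distance between i and j. ni, nj, nk: leaf counts of the clusters.
double lanceWilliams(const std::string& method,
                     double dki, double dkj, double dij,
                     double ni, double nj, double nk)
{
  assert(ni > 0 && nj > 0 && nk > 0);
  double ai = 0.5, aj = 0.5, beta = 0.0, gamma = 0.0;
  if (method == "Single") {
    gamma = -0.5;                       // yields min(dki, dkj)
  } else if (method == "Complete") {
    gamma = 0.5;                        // yields max(dki, dkj)
  } else if (method == "Average") {
    ai = ni / (ni + nj);                // size-weighted mean (UPGMA)
    aj = nj / (ni + nj);
  } else if (method == "Centroid") {
    ai = ni / (ni + nj);
    aj = nj / (ni + nj);
    beta = -ni * nj / ((ni + nj) * (ni + nj));
  } else if (method == "Median") {
    beta = -0.25;
  } else if (method == "Ward") {
    const double n = ni + nj + nk;
    ai = (ni + nk) / n;
    aj = (nj + nk) / n;
    beta = -nk / n;
  } else {
    throw std::invalid_argument("unknown linkage method: " + method);
  }
  return ai * dki + aj * dkj + beta * dij + gamma * std::fabs(dki - dkj);
}
```

With this form, single linkage reduces to the minimum of the two distances and complete linkage to the maximum, which is a quick sanity check on the coefficients.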

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 71 of file HierarchicalClustering.cpp.

References bpp::abs(), and bpp::pow().

◆ computeTree()

void AbstractAgglomerativeDistanceMethod::computeTree ( )
override virtual inherited

Compute the tree corresponding to the distance matrix.

This method implements the following algorithm:
1) Build all leaf nodes (getLeafNode method).
2) Get the best pair to agglomerate (getBestPair method).
3) Compute the branch lengths for this pair (computeBranchLengthsForPair method).
4) Build the parent node of the pair (getParentNode method).
5) For each remaining node, update distances from the pair (computeDistancesFromPair method).
6) Return to step 2 while there are more than 3 remaining nodes.
7) Perform the final step, and produce a rooted or unrooted tree.
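
The loop described above can be sketched in a self-contained way. The following is a simplified stand-in, not the Bio++ implementation: it supports average linkage only, omits branch lengths, and keeps merging until a single cluster remains instead of handing the last nodes to a separate final step. The function `agglomerate` and its plain `std::vector` inputs are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for the loop above; not the Bio++ implementation.
// dist must be a square, symmetric matrix with finite entries.
std::string agglomerate(std::vector<std::vector<double>> dist,
                        std::vector<std::string> labels)
{
  assert(!labels.empty() && dist.size() == labels.size());
  const std::size_t n = labels.size();
  std::vector<double> leaves(n, 1.0); // step 1: every leaf is its own cluster
  std::vector<bool> alive(n, true);

  for (std::size_t remaining = n; remaining > 1; --remaining) { // step 6
    // Step 2: pick the closest pair of live clusters.
    std::size_t bi = 0, bj = 0;
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = i + 1; j < n; ++j) {
        if (alive[i] && alive[j] && dist[i][j] < best) {
          best = dist[i][j];
          bi = i;
          bj = j;
        }
      }
    }
    assert(bi != bj); // holds as long as distances are finite

    // Steps 3-4: the parent replaces slot bi (branch lengths omitted).
    labels[bi] = "(" + labels[bi] + "," + labels[bj] + ")";

    // Step 5: size-weighted average distance to every other live cluster.
    for (std::size_t k = 0; k < n; ++k) {
      if (!alive[k] || k == bi || k == bj) continue;
      const double d = (leaves[bi] * dist[bi][k] + leaves[bj] * dist[bj][k]) /
                       (leaves[bi] + leaves[bj]);
      dist[bi][k] = dist[k][bi] = d;
    }
    leaves[bi] += leaves[bj];
    alive[bj] = false;
  }
  // Step 7 (simplified): slot 0 is never retired, so it holds the root.
  return labels[0] + ";";
}
```

Storing the parent in the slot of its first child keeps the distance matrix at a fixed size, at the cost of an `alive` mask; the class documented here instead tracks the live nodes in `currentNodes_`.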

Implements bpp::DistanceMethodInterface.

Reimplemented in bpp::BioNJ.

Definition at line 26 of file AbstractAgglomerativeDistanceMethod.cpp.

References bpp::ApplicationTools::displayGauge(), and bpp::Node::setDistanceToFather().

Referenced by HierarchicalClustering(), bpp::NeighborJoining::NeighborJoining(), and bpp::PGMA::PGMA().

◆ finalStep()

void HierarchicalClustering::finalStep ( int  idRoot)
protected virtual

Method called when there are only three remaining nodes to agglomerate; creates the root node of the tree.

Parameters
idRoot  The id of the root node.

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 131 of file HierarchicalClustering.cpp.

References bpp::Node::addSon(), and bpp::Node::setDistanceToFather().

◆ getBestPair()

vector< size_t > HierarchicalClustering::getBestPair ( )
protected virtual

Get the best pair of nodes to agglomerate.

Defines the criterion used to choose the next pair of nodes to agglomerate. This criterion uses the matrix_ distance matrix.

Returns
A size 2 vector with the indices of the nodes.
Exceptions
Exception  If an error occurred.
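
For a clustering method, a natural selection criterion is the smallest off-diagonal entry of the distance matrix. The sketch below shows that idea on a plain `std::vector` matrix; it is an assumption about the general approach rather than a copy of the library code, and the helper `closestPair` and its `active` index list are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Bio++: return the two indices, taken
// from 'active', whose distance in 'dist' is smallest. Ties keep the
// first pair encountered.
std::pair<std::size_t, std::size_t> closestPair(
    const std::vector<std::vector<double>>& dist,
    const std::vector<std::size_t>& active)
{
  assert(active.size() >= 2);
  double best = std::numeric_limits<double>::infinity();
  std::pair<std::size_t, std::size_t> bestPair{active[0], active[1]};
  for (std::size_t a = 0; a < active.size(); ++a) {
    for (std::size_t b = a + 1; b < active.size(); ++b) {
      const double d = dist[active[a]][active[b]];
      if (d < best) {
        best = d;
        bestPair = {active[a], active[b]};
      }
    }
  }
  return bestPair;
}
```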

Implements bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 18 of file HierarchicalClustering.cpp.

References bpp::numeric::log().

◆ getLeafNode()

Node * HierarchicalClustering::getLeafNode ( int  id,
const std::string &  name 
)
protected virtual

Get a leaf node.

Create a new node with the given id and name.

Parameters
id  The id of the node.
name  The name of the node.
Returns
A pointer to a new node object.

Reimplemented from bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 148 of file HierarchicalClustering.cpp.

References bpp::ClusterInfos::length, bpp::ClusterInfos::numberOfLeaves, and bpp::NodeTemplate< NodeInfos >::setInfos().

◆ getName()

std::string bpp::HierarchicalClustering::getName ( ) const
inline virtual
Returns
The name of the distance method.

Implements bpp::DistanceMethodInterface.

Definition at line 65 of file HierarchicalClustering.h.

References method_.

◆ getParentNode()

Node * HierarchicalClustering::getParentNode ( int  id,
Node *  son1,
Node *  son2 
)
protected virtual

Get an inner node.

Create a new node with the given id, and set its sons.

Parameters
id  The id of the node.
son1  The first son of the node.
son2  The second son of the node.
Returns
A pointer to a new node object.

Reimplemented from bpp::AbstractAgglomerativeDistanceMethod.

Definition at line 158 of file HierarchicalClustering.cpp.

References bpp::Node::addSon(), bpp::Node::getDistanceToFather(), bpp::ClusterInfos::length, and bpp::ClusterInfos::numberOfLeaves.

◆ hasTree()

bool bpp::AbstractAgglomerativeDistanceMethod::hasTree ( ) const
inline override virtual inherited
Returns
True if a tree has been computed.

Implements bpp::DistanceMethodInterface.

Definition at line 86 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::tree_.

◆ isVerbose()

bool bpp::AbstractAgglomerativeDistanceMethod::isVerbose ( ) const
inline override virtual inherited
Returns
True if verbose mode is enabled.

Implements bpp::DistanceMethodInterface.

Definition at line 114 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::verbose_.

◆ setDistanceMatrix()

void AbstractAgglomerativeDistanceMethod::setDistanceMatrix ( const DistanceMatrix &  matrix)
override virtual inherited

Set the distance matrix to use.

Parameters
matrix  The matrix to use.
Exceptions
Exception  If an incorrect matrix is provided (e.g. smaller than 3).

Implements bpp::DistanceMethodInterface.

Reimplemented in bpp::PGMA, bpp::NeighborJoining, and bpp::BioNJ.

Definition at line 17 of file AbstractAgglomerativeDistanceMethod.cpp.

References bpp::DistanceMatrix::reset(), and bpp::DistanceMatrix::size().

Referenced by bpp::AbstractAgglomerativeDistanceMethod::AbstractAgglomerativeDistanceMethod(), bpp::NeighborJoining::setDistanceMatrix(), and bpp::PGMA::setDistanceMatrix().

◆ setVerbose()

void bpp::AbstractAgglomerativeDistanceMethod::setVerbose ( bool  yn)
inline override virtual inherited
Parameters
yn  Enable/disable verbose mode.

Implements bpp::DistanceMethodInterface.

Definition at line 113 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::verbose_.

◆ tree()

const Tree& bpp::AbstractAgglomerativeDistanceMethod::tree ( ) const
inline override virtual inherited
Returns
A reference to the computed tree. Throws an exception if no tree was computed.

Implements bpp::DistanceMethodInterface.

Definition at line 91 of file AbstractAgglomerativeDistanceMethod.h.

References bpp::AbstractAgglomerativeDistanceMethod::tree_.

Member Data Documentation

◆ AVERAGE

const string HierarchicalClustering::AVERAGE = "Average"
static

Definition at line 35 of file HierarchicalClustering.h.

◆ CENTROID

const string HierarchicalClustering::CENTROID = "Centroid"
static

Definition at line 38 of file HierarchicalClustering.h.

◆ COMPLETE

const string HierarchicalClustering::COMPLETE = "Complete"
static

Definition at line 33 of file HierarchicalClustering.h.

◆ currentNodes_

std::map<size_t, Node*> bpp::AbstractAgglomerativeDistanceMethod::currentNodes_
protected inherited

◆ matrix_

DistanceMatrix bpp::AbstractAgglomerativeDistanceMethod::matrix_
protected inherited

◆ MEDIAN

const string HierarchicalClustering::MEDIAN = "Median"
static

Definition at line 36 of file HierarchicalClustering.h.

◆ method_

std::string bpp::HierarchicalClustering::method_
protected

Definition at line 41 of file HierarchicalClustering.h.

Referenced by getName().

◆ rootTree_

bool bpp::AbstractAgglomerativeDistanceMethod::rootTree_
protected inherited

◆ SINGLE

const string HierarchicalClustering::SINGLE = "Single"
static

Definition at line 34 of file HierarchicalClustering.h.

◆ tree_

◆ verbose_

◆ WARD

const string HierarchicalClustering::WARD = "Ward"
static

Definition at line 37 of file HierarchicalClustering.h.


The documentation for this class was generated from the following files: