% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cor_bakers_gamma.R
\name{cor_bakers_gamma}
\alias{cor_bakers_gamma}
\alias{cor_bakers_gamma.dendrogram}
\alias{cor_bakers_gamma.hclust}
\alias{cor_bakers_gamma.dendlist}
\alias{cor_bakers_gamma.default}
\title{Baker's Gamma correlation coefficient}
\usage{
cor_bakers_gamma(dend1, ...)

\method{cor_bakers_gamma}{default}(dend1, dend2, ...)

\method{cor_bakers_gamma}{dendrogram}(dend1, dend2,
  use_labels_not_values = TRUE, to_plot = FALSE,
  warn = dendextend_options("warn"), ...)

\method{cor_bakers_gamma}{hclust}(dend1, dend2, use_labels_not_values = TRUE,
  to_plot = FALSE, warn = dendextend_options("warn"), ...)

\method{cor_bakers_gamma}{dendlist}(dend1, which = c(1L, 2L), ...)
}
\arguments{
\item{dend1}{a tree (dendrogram/hclust/phylo)}

\item{...}{Passed to \link[dendextend]{cutree}.}

\item{dend2}{a tree (dendrogram/hclust/phylo)}

\item{use_labels_not_values}{logical (TRUE). Should labels be used in the
k matrix when using cutree? Setting this to FALSE makes the function a bit
faster, BUT it assumes that the two trees have exactly the same leaf order
values for each label. This can be ensured by using \link{match_order_by_labels}.}

\item{to_plot}{logical (FALSE). Passed to \link{bakers_gamma_for_2_k_matrix}}

\item{warn}{logical (default from dendextend_options("warn") is FALSE).
Should a warning be issued when using \link[dendextend]{cutree}?
It is safer to set this to TRUE, but to keep the noise down
the default is FALSE.}

\item{which}{an integer vector of length 2, indicating
which of the trees in the dendlist object should be compared (relevant for dendlist)}
}
\value{
Baker's Gamma association index between two trees (a number between -1 and 1)
}
\description{
Calculate Baker's Gamma correlation coefficient for two trees 
(also known as Goodman-Kruskal-gamma index).

Assumes the labels in the two trees fully match. If they do not,
please first use \link{intersect_trees} to match them.

WARNING: this can be quite slow for medium/large trees.
}
\details{
Baker's Gamma (see reference) is a measure of association (similarity)
between two trees of hierarchical clustering (dendrograms).

It is calculated by taking two items and finding the highest
possible level of k (number of cluster groups created when cutting the tree)
for which the two items still belong to the same cluster. That k is recorded,
and the same is done for these two items in the second tree.
There are n choose 2 such pairs of items from the items in
the tree, and all of these numbers are calculated for each of the two trees.
Then, these two sets of numbers (one set for each tree)
are paired according to the pairs of items compared, and a Spearman
correlation is calculated.

The value can range between -1 and 1, with values near 0 meaning that
the two trees are not statistically similar.
For an exact p-value one should resort to a permutation test. One such option
is to permute the labels of one tree many times, calculating
the distribution under the null hypothesis (keeping the tree topologies
constant).

Notice that this measure is not affected by the height of a branch but only
by its relative position compared with other branches.
}
\examples{

\dontrun{

set.seed(23235)
ss <- sample(1:150, 10 )
hc1 <- hclust(dist(iris[ss,-5]), "complete")
hc2 <- hclust(dist(iris[ss,-5]), "single")
dend1 <- as.dendrogram(hc1)
dend2 <- as.dendrogram(hc2)
#    cutree(dend1)   

cor_bakers_gamma(hc1, hc2)
cor_bakers_gamma(dend1, dend2)
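
# A minimal sketch of the calculation described in Details, using only
# functions from the stats and utils packages. highest_shared_k() is a
# hypothetical helper written for this illustration; it is not part of
# dendextend, and the implementation in the package may differ in its details.
x_demo <- iris[c(1, 5, 51, 60, 77, 101, 120, 135), -5]
hc_com <- hclust(dist(x_demo), "complete")
hc_sin <- hclust(dist(x_demo), "single")
highest_shared_k <- function(hc) {
   n <- length(hc$order)
   # one column per k = 1, ..., n; rows follow the order of the original data
   k_mat <- cutree(hc, k = 1:n)
   item_pairs <- combn(n, 2)
   # for every pair of items, the largest k at which both share a cluster
   apply(item_pairs, 2, function(p) max(which(k_mat[p[1], ] == k_mat[p[2], ])))
}
k1 <- highest_shared_k(hc_com)
k2 <- highest_shared_k(hc_sin)
length(k1) # one value per pair of items: choose(8, 2) = 28
# Spearman correlation of the paired k values; this should be comparable to
# cor_bakers_gamma(hc_com, hc_sin)
manual_gamma <- cor(k1, k2, method = "spearman")
manual_gamma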

dend1 <- match_order_by_labels(dend1, dend2) # if you are not sure
cor_bakers_gamma(dend1, dend2, use_labels_not_values = FALSE)   

library(microbenchmark)
microbenchmark(
   with_labels = cor_bakers_gamma(dend1, dend2, try_cutree_hclust=FALSE)   ,
   with_values = cor_bakers_gamma(dend1, dend2, 
               use_labels_not_values = FALSE, try_cutree_hclust=FALSE)   ,
   times=10
)
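
# A sketch of why branch heights do not matter: a monotone transformation of
# the merge heights leaves the cluster memberships at every k unchanged, and
# those memberships are all that the calculation described in Details uses.
# The object names below are chosen for this illustration only.
hc_orig <- hclust(dist(iris[c(1, 5, 51, 60, 77, 101, 120, 135), -5]), "average")
hc_sq <- hc_orig
hc_sq$height <- hc_orig$height^2 # changes the heights, keeps the merge order
n_items <- length(hc_orig$order)
same_clusters <- identical(cutree(hc_orig, k = 1:n_items),
                           cutree(hc_sq, k = 1:n_items))
same_clusters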


cor_bakers_gamma(dend1, dend1, use_labels_not_values = FALSE)   
cor_bakers_gamma(dend1, dend1, use_labels_not_values = TRUE)   
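
# A sketch of the permutation test mentioned in Details: the labels of one
# tree are shuffled many times while both topologies are kept fixed, and the
# observed value is compared with the resulting null distribution. The object
# names and the number of permutations (100) are arbitrary choices made for
# this illustration.
library(dendextend) # only needed when running this outside the package examples
set.seed(42)
x_perm <- iris[sample(1:150, 10), -5]
hc_x <- hclust(dist(x_perm), "complete")
hc_y <- hclust(dist(x_perm), "single")
observed_gamma <- cor_bakers_gamma(hc_x, hc_y)
null_gamma <- replicate(100, {
   hc_shuffled <- hc_y
   hc_shuffled$labels <- sample(hc_y$labels)
   cor_bakers_gamma(hc_x, hc_shuffled)
})
# one-sided permutation p-value, counting the observed value itself
p_value <- (1 + sum(null_gamma >= observed_gamma)) / (1 + length(null_gamma))
p_value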

}

}
\references{
Baker, F. B. (1974). Stability of Two Hierarchical Grouping Techniques Case
 I: Sensitivity to Data Errors. Journal of the American Statistical
 Association, 69(346), 440.
}
\seealso{
\link{cor_cophenetic}
}
