% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/bootstraping.R
\name{diversity_ci}
\alias{diversity_ci}
\title{Perform bootstrap statistics, then calculate and plot confidence intervals.}
\usage{
diversity_ci(
  tab,
  n = 1000,
  n.boot = 1L,
  ci = 95,
  total = TRUE,
  rarefy = FALSE,
  n.rare = 10,
  plot = TRUE,
  raw = TRUE,
  center = TRUE,
  ...
)
}
\arguments{
\item{tab}{a \code{\link[=genind]{genind()}}, \code{\link[=genclone]{genclone()}},
\code{\link[=snpclone]{snpclone()}}, or a matrix produced from
\code{\link[=mlg.table]{mlg.table()}}.}

\item{n}{an integer defining the number of bootstrap replicates (defaults to
1000).}

\item{n.boot}{an integer specifying the number of samples to be drawn in each
bootstrap replicate. If \code{n.boot} < 2 (default), the number of samples
drawn for each bootstrap replicate will be equal to the number of samples
in the data set. See Details.}

\item{ci}{the confidence level, as a percentage (defaults to 95).}

\item{total}{argument to be passed on to \code{\link[=mlg.table]{mlg.table()}} if
\code{tab} is a genind object.}

\item{rarefy}{if \code{TRUE}, bootstrapping will be performed at the smallest
population size or the value of \code{n.rare}, whichever is larger.
Defaults to \code{FALSE}, indicating that each population will be
bootstrapped at its own size.}

\item{n.rare}{an integer specifying the smallest size at which to resample
data. This is only used if \code{rarefy = TRUE}.}

\item{plot}{if \code{TRUE} (default), boxplots will be produced for each
population, grouped by statistic. Colored dots will indicate the observed
value. This plot can be retrieved by using \code{p <- last_plot()} from the
\pkg{ggplot2} package.}

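\item{raw}{if \code{TRUE} (default), a list containing four elements
(\code{obs}, \code{est}, \code{CI}, and \code{boot}) will be returned. If
\code{FALSE}, a data frame will be returned instead. See Value.}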

\item{center}{if \code{TRUE} (default), the confidence interval will be
centered around the observed statistic. Otherwise, if \code{FALSE}, the
confidence interval will be the bias-corrected normal CI as reported by
\code{\link[boot:boot.ci]{boot::boot.ci()}}.}

\item{...}{parameters to be passed on to \code{\link[boot:boot]{boot::boot()}} and
\code{\link[=diversity_stats]{diversity_stats()}}}
}
\value{
\subsection{raw = TRUE}{
\itemize{
\item \strong{obs} a matrix with observed statistics in columns,
populations in rows
\item \strong{est} a matrix with estimated statistics in columns,
populations in rows
\item \strong{CI} an array of 3 dimensions giving the lower and upper
bound, the index measured, and the population.
\item \strong{boot} a list containing the output of
\code{\link[boot:boot]{boot::boot()}} for each population.
}
}

\subsection{raw = FALSE}{ a data frame with the statistic observations,
estimates, and confidence intervals in columns, and populations in rows. Note
that the confidence intervals are converted to characters and rounded to
three decimal places. }
}
\description{
This function calculates bootstrap statistics and their confidence
intervals. Note that the calculation of confidence intervals is not perfect
(see Details and Note), so please be cautious when interpreting the results.
}
\details{
\subsection{Bootstrapping}{
For details on the bootstrapping procedures, see
\code{\link[=diversity_boot]{diversity_boot()}}. Default bootstrapping is performed by
sampling \strong{N} samples from a multinomial distribution weighted by the
relative multilocus genotype abundance per population where \strong{N} is
equal to the number of samples in the data set. If \strong{n.boot} is 2 or
greater, then \strong{n.boot} samples are taken at each bootstrap replicate.
When \code{rarefy = TRUE}, samples are taken without replacement at the
smallest population size (or at \code{n.rare}, whichever is larger). This
will provide confidence intervals for all but the smallest population.
}
\subsection{Confidence intervals}{
Confidence intervals are derived from the function
\code{\link[boot:norm.ci]{boot::norm.ci()}}. This function will attempt to correct for bias
between the observed value and the bootstrapped estimate. When
\code{center = TRUE} (default), the confidence interval is calculated from the
bootstrapped distribution and centered around the bias-corrected estimate
as prescribed in Marcon (2012). This method can lead to undesirable
properties, such as the confidence interval extending beyond the maximum
possible value. For rarefaction, the confidence interval is simply
determined by calculating the percentiles from the bootstrapped
distribution. If you want to calculate your own confidence intervals, you
can use the bootstrap replicates stored in the \verb{$boot} element
of the output.
}
\subsection{Rarefaction}{
Rarefaction in the sense of this function is simply sampling a subset of
the data at size \strong{n.rare}. The estimates derived from this method
have straightforward interpretations and allow you to compare diversity
across populations since you are controlling for sample size.
}
\subsection{Plotting}{ Results are plotted as boxplots with point
estimates. If there is no rarefaction applied, confidence intervals are
displayed around the point estimates. The boxplots represent the actual
values from the bootstrapping and will often appear below the estimates and
confidence intervals.
}
}
\note{
\subsection{Confidence interval calculation}{ Almost all of the statistics
supplied here have a maximum when all genotypes are equally represented.
This means that estimates from bootstrapped samples will always be biased
downward. In many cases, the confidence intervals from the bootstrapped
distribution will not contain the observed statistic. The confidence
intervals reported here are calculated by assuming the variance of the
bootstrapped distribution is the same as the variance around the observed
statistic. As different statistics have different properties, there will not
always be one clear method for calculating confidence intervals. A suggestion
for correction in Shannon's index is to center the CI around the observed
statistic (Marcon, 2012), but there are theoretical limitations to this.
For details, see \url{https://stats.stackexchange.com/q/156235/49413}.
}

\subsection{User-defined functions}{
While it is possible to use custom functions with this function, there are
three important things to remember when doing so:

\enumerate{
\item The function must return a single value.
\item The function must allow for both matrix and vector inputs.
\item The function name cannot match or partially match any arguments
from \code{\link[boot:boot]{boot::boot()}}.
}

Anonymous functions are okay \cr(e.g. \code{function(x) vegan::rarefy(t(as.matrix(x)), 10)}).
}
}
\examples{
library(poppr)
data(Pinf)
diversity_ci(Pinf, n = 100L)
\dontrun{
# With pretty results
diversity_ci(Pinf, n = 100L, raw = FALSE)

# This can be done in a parallel fashion (OSX uses "multicore", Windows uses "snow")
system.time(diversity_ci(Pinf, 10000L, parallel = "multicore", ncpus = 4L))
system.time(diversity_ci(Pinf, 10000L))

# We often get many requests for a clonal fraction statistic. As this is 
# simply the number of observed MLGs over the number of samples, we 
# recommend that people calculate it themselves. With this function, you
# can add it in:

CF <- function(x){
 x <- drop(as.matrix(x))
 if (length(dim(x)) > 1){
   res <- rowSums(x > 0)/rowSums(x)
 } else {
   res <- sum(x > 0)/sum(x)
 }
 return(res)
}
# Show pretty results

diversity_ci(Pinf, 1000L, CF = CF, center = TRUE, raw = FALSE)
diversity_ci(Pinf, 1000L, CF = CF, rarefy = TRUE, raw = FALSE)
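
# If you want to calculate your own confidence intervals, the bootstrap
# replicates are kept in the $boot element of the raw output. The lines below
# are a minimal sketch, assuming the default raw = TRUE output, where each
# element of $boot is the result of boot::boot() for one population and its
# $t matrix holds one row per replicate and one column per statistic.
# The object names res and first_pop are only illustrative.
res <- diversity_ci(Pinf, n = 100L, plot = FALSE)
first_pop <- res$boot[[1]]
# Percentile interval (2.5 and 97.5 percentiles) for every statistic in the
# first population
apply(first_pop$t, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)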
}
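
# The default resampling scheme described in Details can be sketched in base R.
# This is an illustration only, not the code used by diversity_boot(); the
# counts vector and the shannon() helper below are hypothetical.
set.seed(1)
counts <- c(10, 5, 3, 1, 1)  # hypothetical MLG counts for one population
shannon <- function(x) {
  p <- x[x > 0] / sum(x)     # relative abundance of the genotypes observed
  -sum(p * log(p))
}
# Each column is one bootstrap replicate: N draws from a multinomial
# distribution weighted by the relative MLG abundance
reps <- rmultinom(100, size = sum(counts), prob = counts / sum(counts))
boot_H <- apply(reps, 2, shannon)
# Compare the observed value with the mean of the bootstrap replicates
c(observed = shannon(counts), bootstrap_mean = mean(boot_H))
# Percentile interval taken directly from the bootstrap distribution
quantile(boot_H, probs = c(0.025, 0.975))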

}
\references{
Marcon, E., Herault, B., Baraloto, C. and Lang, G. (2012). The Decomposition
of Shannon’s Entropy and a Confidence Interval for Beta Diversity.
\emph{Oikos} 121(4): 516-522.
}
\seealso{
\code{\link[=diversity_boot]{diversity_boot()}} \code{\link[=diversity_stats]{diversity_stats()}}
\code{\link[=poppr]{poppr()}} \code{\link[boot:boot]{boot::boot()}} \code{\link[boot:norm.ci]{boot::norm.ci()}}
\code{\link[boot:boot.ci]{boot::boot.ci()}}
}
\author{
Zhian N. Kamvar
}
