\name{cc_outl}
\alias{cc_outl}

\title{
Flag Geographic Outliers in Species Distributions
}
\description{
Flags records that are outliers in geographic space according to the method defined via the \code{method} argument. Geographic outliers often represent erroneous coordinates, for example due to data entry errors, imprecise geo-references, or individuals in horticulture or captivity.
}
\usage{
cc_outl(x, lon = "decimallongitude", lat = "decimallatitude", species = "species", 
        method = "quantile", mltpl = 3, tdi = 1000, value = "clean", verbose = TRUE)
}

\arguments{
  \item{x}{
a data.frame containing geographical coordinates and species names.
}
  \item{lon}{
a character string. The column with the longitude coordinates. Default = \dQuote{decimallongitude}.
}
  \item{lat}{
a character string. The column with the latitude coordinates. Default = \dQuote{decimallatitude}.
}
  \item{species}{
a character string. The column with the species name. Default = \dQuote{species}.
}
  \item{method}{
a character string defining the method for outlier selection. One of \dQuote{distance}, \dQuote{quantile}, \dQuote{mad}. See details. Default = \dQuote{quantile}.
}
  \item{mltpl}{
numeric. The multiplier of the interquartile range (\code{method == 'quantile'}) or median absolute deviation (\code{method == 'mad'}) used to identify outliers. See details. Default = 3.
}
  \item{tdi}{
numeric. The minimum absolute distance, in km, between a record and all other records of the same species for the record to be flagged as an outlier (\code{method == 'distance'}). See details. Default = 1000.
}
  \item{value}{
a character string defining the output value, either \dQuote{clean} or \dQuote{flags}. See value. Default = \dQuote{clean}.
}
  \item{verbose}{
logical. If TRUE, reports the name of the test and the number of records flagged. Default = TRUE.
}
}
\details{
The method for outlier identification depends on the \code{method} argument. If \dQuote{quantile}: a boxplot method is used, and records are flagged as outliers if their \emph{mean} distance to all other records of the same species is larger than \code{mltpl} * the interquartile range of the mean distances of all records of this species. If \dQuote{mad}: the median absolute deviation is used, and a record is flagged as an outlier if its \emph{mean} distance to all other records of the same species is larger than the median of the mean distances of all records of the species plus/minus \code{mltpl} * the mad of these mean distances. If \dQuote{distance}: records are flagged as outliers if the \emph{minimum} distance to the nearest other record of the species is > \code{tdi} km.
}
\value{
Depending on the \code{value} argument, either a \code{data.frame} containing the records considered correct by the test (\dQuote{clean}) or a logical vector, with TRUE = test passed and FALSE = test failed/potentially problematic (\dQuote{flags}). Default = \dQuote{clean}.
}
\note{
See \url{https://github.com/azizka/CoordinateCleaner/wiki} for more details and tutorials.
}

\examples{
x <- data.frame(species = letters[1:10], 
                decimallongitude = runif(100, -180, 180), 
                decimallatitude = runif(100, -90, 90))
                
cc_outl(x)
cc_outl(x, method = "quantile", value = "flags")
cc_outl(x, method = "distance", value = "flags", tdi = 10000)
cc_outl(x, method = "distance", value = "flags", tdi = 1000)
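
# A minimal base-R sketch of the "distance" rule described in Details, for
# illustration only; it is not the package's implementation. The helper names
# (haversine_km, flag_by_distance) are hypothetical, and measuring distances as
# great-circle distances on a sphere of radius 6371 km is an assumption.
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Works on the records of a single species; TRUE = passed, FALSE = flagged
flag_by_distance <- function(lon, lat, tdi = 1000) {
  n <- length(lon)
  if (n < 2) return(rep(TRUE, n))
  # Pairwise distances in km; the diagonal (distance to itself) is ignored
  d <- outer(seq_len(n), seq_len(n),
             function(i, j) haversine_km(lon[i], lat[i], lon[j], lat[j]))
  diag(d) <- NA
  nearest <- apply(d, 1, min, na.rm = TRUE)
  nearest <= tdi
}

# Three clustered records and one that is far from all the others
flag_by_distance(lon = c(10, 10.5, 11, 100), lat = c(50, 50.2, 50.1, -20))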
}

\keyword{ Coordinate cleaning }
