outForest: Multivariate Outlier Detection and Replacement
Provides a random forest based implementation of the method
described in Chapter 7.1.2 (Regression model based anomaly detection)
of Chandola et al. (2009) <doi:10.1145/1541880.1541882>. It works as
follows: Each numeric variable is regressed onto all other variables
by a random forest. If the scaled absolute difference between observed
value and out-of-bag prediction of the corresponding random forest is
suspiciously large, then a value is considered an outlier. The package
offers different options to replace such outliers, e.g. by realistic
values found via predictive mean matching. Once the method is trained
on a reference data, it can be applied to new data.
Please use the canonical form
to link to this page.