## Dray S., Josse J. (2015).
*Principal component analysis with missing values: a
comparative survey of methods*.
Plant Ecology, 216:657–667.
DOI: 10.1007/s11258-014-0406-z.

Principal component analysis (PCA) is a standard technique
to summarize the main structures of a data table containing
the measurements of several quantitative variables for a
number of individuals. Here, we study the case where some
of the data values are missing and propose a review of
methods which accommodate PCA to missing data. In plant
ecology, this statistical challenge relates to the current
effort to compile global plant functional trait databases
producing matrices with a large amount of missing values.
We present several techniques to consider or estimate
(impute) missing values in PCA and compare them using
theoretical considerations. We carried out a simulation
study to evaluate the relative merits of the different
approaches in various situations (correlation structure,
number of variables and individuals, and percentage of
missing values) and also applied them to a real data set.
Lastly, we discuss the advantages and drawbacks of these
approaches, the potential pitfalls, and the challenges
that remain to be addressed.
