**Next message:**Charline Laurent: "(no subject)"**Previous message:**Daniel Chessel: "Re: Co-inertie RLQ et son RV"**In reply to:**Alexandra Lima: "MGPCA & Discriminant Analysis"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ]

Answer to Alexandra Lima via all adelisters

Alexandra Lima wrote: "I would like to do a MGPCA - Multiple Group Principal Components Analysis (Thorpe, 1988), as I'm interested in removing a 'size effect', but I'm not sure if this is possible with ADE."

The question is not so simple; there are three levels to it.

First level: the PCA definition.

(A) Estimation of the principal axes of a Gaussian distribution.

(B) Geometric analysis of the shape of a multivariate scatter plot.

This conceptual difference does not change the calculation, and both viewpoints use the same programs (dudi.pca, prcomp, princomp, ...).

When the individuals belong to groups, the aim may be:

1) to distinguish the groups (discrimination), or

2) to ordinate the groups simultaneously.

In strategy A, this leads to the common principal components (CPCs) or partial CPCs. These methods assume that either all components or only some of them are common to all groups, the discrepancies being due mainly to sampling error (Airoldi, J.-P., and B. Flury. 1988. An application of common principal component analysis to cranial morphometry of Microtus californicus and M. ochrogaster (Mammalia, Rodentia). Journal of Zoology 216:21-36).

In strategy B, this leads to the multiple group principal component analysis, or MGPCA (Thorpe, 1983), which can also be viewed as a within-classes analysis. The same question of the simultaneous ordination of several groups has been raised in morphometry, ecology, economics, ..., and in ade4 this corresponds to MGPCA.

Second level: different types of analyses, linked to different needs.

Let X(i,j,k) be the value of variable j for individual i belonging to group k.

M(-,j,-) is the global mean of variable j, and M(-,j,k) is the mean of variable j within group k.

The same notation is used for the standard deviations: S(-,j,-) and S(-,j,k).

We can compute:

- a classical centered PCA: X(i,j,k) - M(-,j,-)

- a classical normed PCA: (X(i,j,k) - M(-,j,-)) / S(-,j,-)

- a classical within PCA: X(i,j,k) - M(-,j,k)

- a normed within PCA: (X(i,j,k) - M(-,j,k)) / S(-,j,-)

- a within-group normalized PCA: (X(i,j,k) - M(-,j,k)) / S(-,j,k)

- a partial normed PCA, where the variables are standardized after centering by groups: (X(i,j,k) - M(-,j,k)) / S*(-,j,-), S*(-,j,-) being the standard deviation of variable j computed on the group-centered values.

The within-group normalized PCA is used when, for a given variable, the variances differ strongly between groups.

The partial normed PCA is used when the within-group variances differ strongly between variables.
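To make the list above concrete, here is a minimal numpy sketch (not ade4 itself; the variable names are my own) that builds the six transformed tables from a data matrix X and a group vector g:

```python
import numpy as np

# Sketch of the six transforms above: X has one row per individual,
# one column per variable; g gives the group of each row.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
g = np.repeat([0, 1, 2], 10)              # three groups of 10 individuals

M_global = X.mean(axis=0)                 # M(-,j,-)
S_global = X.std(axis=0)                  # S(-,j,-)

# Per-row group means M(-,j,k) and group standard deviations S(-,j,k)
M_group = np.vstack([X[g == k].mean(axis=0) for k in np.unique(g)])[g]
S_group = np.vstack([X[g == k].std(axis=0) for k in np.unique(g)])[g]

centered       = X - M_global                 # classical centered PCA table
normed         = (X - M_global) / S_global    # classical normed PCA table
within         = X - M_group                  # classical within PCA table
normed_within  = (X - M_group) / S_global     # normed within PCA table
within_normed  = (X - M_group) / S_group      # within-group normalized PCA
S_star = within.std(axis=0)                   # S*(-,j,-): SD after group centering
partial_normed = within / S_star              # partial normed PCA table

# Sanity checks: group means vanish after within-group centering, and
# every column of the partial normed table has unit variance.
assert np.allclose(within[g == 0].mean(axis=0), 0)
assert np.allclose(partial_normed.std(axis=0), 1)
```

Each table would then be submitted to an ordinary (uncentered, unscaled) PCA, since the centering and scaling are already done by hand.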

Third level: separating size and shape, i.e. removing the size effect.

It is a complicated problem, with several kinds of answers:

- removing the first factor,

- using the residuals of regressions,

- using a double centering on the logarithms of the values, ...
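The last of these answers can be sketched in a few lines of numpy (an illustration under an assumed multiplicative size model, not the authors' code): double centering of log(X) removes any factor that is constant across the variables of an individual.

```python
import numpy as np

# Hypothetical multiplicative model: each measurement is an individual
# "size" factor times a shape part.
rng = np.random.default_rng(1)
size = rng.lognormal(size=(20, 1))        # hypothetical size factor
shape = rng.lognormal(size=(20, 5))       # hypothetical shape part
X = size * shape

# Double centering on the logarithms: remove row means, column means,
# and add back the grand mean.
L = np.log(X)
L_dc = L - L.mean(axis=1, keepdims=True) - L.mean(axis=0) + L.mean()

# The result is identical to what we would get without the size factor,
# because a term constant across columns is annihilated by the row centering.
S = np.log(shape)
S_dc = S - S.mean(axis=1, keepdims=True) - S.mean(axis=0) + S.mean()
assert np.allclose(L_dc, S_dc)
```

After this transformation, every row and every column of the table sums to zero, and a centered PCA of L_dc describes shape only.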

With ade4, first of all, you can compute:

1) a normed PCA: > pca1 = dudi.pca(X)

2) a simple within analysis: > wit1 = within(pca1, fac)

(where fac is the factor giving the group of each individual).

To remove the size effect using the first factor, you can:

1) compute a normed PCA: > pca1 = dudi.pca(X)

2) remove the first factor everywhere: > Y = apply(pca1$tab, 2, function(x) residuals(lm(x ~ pca1$l1[,1])))

3) compute a centered PCA on the table Y: > pca2 = dudi.pca(Y, scale = FALSE)

4) and compute a simple within analysis: > wit2 = within(pca2, fac)
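The steps above can be sketched in plain numpy as well (an illustration of the same idea, not the ade4 code; the names are mine). Because the first-factor scores have unit norm and zero mean, regressing each column on them reduces to a projection:

```python
import numpy as np

# Simulated data with a shared "size" axis added to every variable.
rng = np.random.default_rng(2)
X = rng.normal(size=(25, 4)) + rng.normal(size=(25, 1))

# Step 1: normed PCA table (centered and scaled columns).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
f1 = U[:, 0]                      # scores on the first factor (unit norm)

# Step 2: residuals of the regression of each column of Z on f1.
# Since Z is column-centered and f1 is centered with unit norm, the
# least-squares residuals are Z minus its projection on f1.
Y = Z - np.outer(f1, f1 @ Z)

# Y carries no trace of the first factor, and stays column-centered:
assert np.allclose(f1 @ Y, 0)
assert np.allclose(Y.mean(axis=0), 0)

# Step 3: a centered PCA of Y is then just the SVD of Y.
U2, s2, Vt2 = np.linalg.svd(Y, full_matrices=False)
```

Step 4, the within analysis, would then proceed on the rows of Y grouped by the factor, exactly as in the ade4 recipe.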

Many other solutions can be applied; that depends on your own aims and abilities.

And, last but not least, it is far from being a simple software problem!

D. Chessel & A. Dufour


This archive was generated by hypermail 2b30 : Tue Sep 07 2004 - 13:30:56 MEST