Multivariate analyses in soil microbial ecology: a new paradigm

On-line reproduction of the paper by
Jean Thioulouse, Yves Prin and Robin Duponnois (PDF file)*.

This web page allows to redo all the computations and graphical displays thanks to the Rweb system.
You can play with the R code and click the "Do it again !" buttons.
R code samples and data files can be obtained here: EES paper data files

*Thioulouse J, Prin Y, Duponnois R (2012). Multivariate analyses in soil microbial ecology: a new paradigm. Environmental and Ecological Statistics. (Accepted) (PDF file).

Note: Axes direction can be inverted when compared to figures obtained on other computers. The sign of eigenvectors is arbitrary and depends on processor numerical precision. These inversions do not change the interpretation of graphs.

Abstract

Mycorrhizal symbiosis is a key component of a sustainable soil-plant system, governing the cycles of major plant nutrients and vegetation cover. The mycorrhizosphere includes plants roots, the mycorrhizal fungi, and a complex microbial compartment. A large number of methods have been used to characterize the genetic and functional diversity of these soil microbial communities.We present here a review of the multivariate data analysis methods that have been used in 16 research articles published in the 2005-2009 period. "Descriptive" multivariate data analysis methods have been priviledged over classical "predictive" methods and univariate statistical tests. Data sets, multivariate data analysis methods, graphical outputs and interpretation results are presented and explained in details on several examples coming from some of the 16 articles. These multivariate and graphical methods are available in the ade4 package for the R statistical software. The discussion underlines the importance of using appropriate statistical methods to obtain a good description of soil microbiological and biochemical indicators and a good evaluation of soil quality.

Key words: Mycorrhizal symbiosis, Soil microbial diversity, Descriptive multivariate data analysis, BGA, Coinertia analysis, ade4

1. Introduction

The paper by Thioulouse et al. (2012) uses three multivariate data analysis methods: PCA (Principal Components Analysis), BGA (Between Group Analysis), and COIA (Co-Inertia Analysis). PCA is most the basic mutivariate data analyis method. BGA can be applied when samples belong to several groups. This is the case for example when we want to compare the effect of different treatments, like different levels of amendment, or different rates of mycorrhizal inoculation, on plant growth or on microbial communities. CoIA is useful to analyze the relationships between two tables having the same samples in rows. It can be used for example to explore the relationships between in situ catabolic potential (ISCP, representing bacterial functional diversity) and plant growth variables, or between soil variables and DNA fingerprints.

There are four figures in the paper that show examples of use of these methods. These four figures are reprinted from papers published by the authors, and this reproducibility Web page proposes to redo these figures and the corresponding computations, using the R software.

The first three figures display the results of BGA on plant parameters (Figure 1), and on SIR (Substrate Induced Respiration) responses (Figure 2 and 3). SIR is a method that allows to assess the functional diversity of microbial soil communities. It is based on the profiles of CO2 production of soil bacterial communities in presence of a range of simple organic compounds (amino-acids, organic acids, glucides). These profiles are called ISCP (for "in situ catabolic potential").

2. BGA: Figure 1

Figure 1 is reprinted from Figure 1 in Thioulouse et al. 2012. In the original paper (Faye A, Krasova-Wade T, Thiao M, Thioulouse J, Neyra M, Prin Y, Galiana A, Ndoye I, Dreyfus B, Duponnois R (2009) Controlled ectomycorrhization of an exotic legume tree species Acacia holosericea affects the structure of root nodule bacteria community and their symbiotic effectiveness on Faidherbia albida, a native sahelian acacia. Soil Biology and Biochemistry 41:1245-1252), authors use BGA to show that the biomass increase of Faidherbia albida seedlings is positively linked to the inoculation of Bradyrhizobia spp. Furthermore, this effect varies according to the origin of Bradyrhizobia isolates. Bradyrhizobia strains were isolated from a controlled mycorrhization experiment with an exotic Acacia species (A. holosericea) and an ectomycorrhizal fungus, Pisolithus albus IR100. This plantation was located in Senegal.

Three origins of isolates were compared, and four variables were measured on F. albida seedlings: shoot and root biomass (SB and RB) and total number and dry weight of nodules (TN and DW). The three isolate origins were:

1. Bacterial strains isolated from the soil of a plantation of A. holosericea previously inoculated with the ectomycorrhizal fungus P. albus IR100 (IR100S in Figure 1)

2. Bacterial strains isolated from the soil of a plantation of A. holosericea uninoculated with the ectomycorrhizal fungus (NIS in Figure 1)

3. Bacterial strains isolated from the soil of the F. albida parkland surrounding the A. holosericea plantation (PS in Figure 1).

On Figure 1, the three origins were represented with convex hulls surrounding the corresponding samples. A multivariate permutation test showed that the differences were statistically significant (p < 0.01), and the use of convex hulls on Figure 1 helped underline these differences. Faye et al. (2009) concluded that exotic plant species introduction (A. holosericea is an Australian Acacia) can drastically affect the structure and symbiotic effectiveness of native Bradyrhizobia populations and noted that this could limit the natural regeneration of native (Sahelian) plant species such as F. albida.

Figure 1: Between-group analysis (BGA) of Faidherbia albida growth (shoot and root biomass: SB and RB, respectively) and nodule formation (total number and dry weight of nodules per plant: TN and WN, respectively). A: Plot of variable loadings. B: Plot of sample scores. The scale is given by the value d in the upper right corner: this value represents the length of the side of background grid squares. The second principal component opposes the shoot biomass (up) to the nodule dry weight (down). The plot of sample scores (B) is split in three groups, according to the origin of the Bradyrhizobia isolates: PS, soil of F. albida parkland collected outside the A. holosericea plantation, NIS, soil of plantation with not inoculated trees, and IR100S, soil of plantation with IR100-inoculated trees. The circle inside each convex hull gives the position of the gravity center of each group. (Reprinted from Faye et al. (2009) with kind permission from Elsevier).


This code piece (code1.R), reads the data table (Tab1.txt) and computes a PCA on it (dudi.pca function, output object acp1).

The factor defining the three soil treatments (facrhizo) is read from file FacRhizo.txt and is used to compute the BGA (bca function, output object bets).

The first (upper) graph is plotted using function s.arrow on variables (columns) coordinates (bets$co). The second (lower) graph is a superimposition of two graphs: one is drawn with the s.label function to draw sample labels (bets$ls), and the other with the s.chull function to draw the convex hulls surrouding the three groups (PS = Parkland samples, NIS = soil from plantations with non-inoculated trees, IR: soil from plantations with trees inoculated by the ectomycorrhizal fungus Pisolithus albus IR100).

The direction of axis 2 (vertical axis) is inverted when compared to the figures of Thioulouse et al 2012. This inversion does not change the interpretation as the two graphs are inverted in the same way.

2. BGA: Figure 2

Figure 2 from Thioulouse et al. 2012 is reprinted from Kisa et al. (2007). In this paper, authors used BGA to show that the functional diversity of soil microbial communities (measured by ISCP) is altered by the exotic tree species Eucalyptus camaldulensis, and that the inoculation of an arbuscular mycorrhizal fungus (Glomus intraradices) can counterbalance this negative influence.

Figure 2 shows SIR substrates (top) and the position of soil samples on which SIR profiles were measured (bottom). The five pointed irregular stars on this figure show the five experimental repeats for each treatment and their mean position (circle at the center of the star). The permutations test of BGA confirmed that the difference between the four groups was highly significant (p < 0.001). The effect of Eucalyptus camaldulensis on bacterial functional diversity (difference between WEC and FA), and the influence of Glomus intraradices inoculation, are indeed very clear. Kisa et al. (2007) conclude that arbuscular mycorrhizal symbiosis with Glomus intraradices can counterbalance the negative influence of exotic tree species on soil microbial communities.

Figure 2: Between-group analysis (BGA) of the SIR responses with respect to the soil treatments. WEC, without Eucalyptus camaldulensis seedlings. FA, preplanting fertilizer application. 1, l-phenylalanine; 2, l-glutamine; 3, l-serine; 4, l-arginine; 5, l-asparagine; 6, l-histidine; 7, l-lysine; 8, l-glutamic acid; 9, l-tyrosine; 10, l-cystein; 11, d-glucose; 12, d-mannose; 13, sucrose; 14, d-glucosamine; 15, N-methyld-glucamine; 16, succinamide; 17, ascorbic acid; 18, citric acid; 19, fumaric acid; 20, gluconic acid; 21, quinic acid; 22, malonic acid; 23, formic acid; 24, ketoglutaric acid; 25, ketobutyric acid; 26, succinic acid; 27, tartaric acid; 28, uric acid; 29, oxalic acid; 30, gallic acid; 31, malic acid; 32, DL-hydroxy-butyric acid. (Reprinted from Kisa et al. (2007) with kind permission from John Wiley and Sons).


This code piece (code2.R), reads the data table (Tab2.txt) and computes a PCA on it (dudi.pca function, output object acp1).

The factor defining the four soil treatments (fTr) is read from file CodeTr.txt and is used to compute the BGA (bca function, output object bets).

The first (upper) graph is plotted using function s.label on substrates (columns) coordinates (bets$co). The second (lower) graph is drawn with the s.class function to draw the four categories of samples (FA = preplanting fertilizer application, Gi = inoculation with Glomus intraradices, WEC = soil from plantations without Eucalyptus camaldulensis, and C = Control).

3. BGA: Figure 3

Figure 3 from Thioulouse et al. 2012 is reprinted from Ouahmane et al. (2009). The aim of this analysis was to show that the inoculation of Pinus halepensis with the ectomycorrhizal fungus Pisolithus sp. strain PH4 had a strong effect on soil microbial functional diversity and on rock phosphate (Khodjari Rock Phosphate, KRP) solubilisation. The first axis of BGA very clearly shows the effect of PH4 inoculation on functional diversity (ISCP profiles), so using the second axis to draw a factor map was not appropriate. In the upper part of the graph, substrate labels are ordered according to the substrate score on the first BGA axis. In the lower part, Gauss curves are adjusted to the parameters (mean and standard deviation) of sample scores in each treatment. The mean and standard deviation of the fives samples belonging to each of the four treatments (Control, KRP, PH4, PH4+PN) are computed and the corresponding Gauss curves are drawn.

This presentation shows, for each treatment, the optimal substrates (position of Gauss curves) and the functional diversity (Gauss curve width). The permutation test showed that the difference between treatments was highly significant (p < 0.001).

Figure 3: Graphical display (biplot) of BGA first axis showing the SIR with respect to the soil treatments. The upper part of the figure shows the scores of the 31 substrates on the first BGA axis. The four Gauss curves in the lower part of the figure represent the mean and the variance of the scores of the soil samples on the first BGA axis. Control: un-inoculated and un-amended soil, KRP: soil amended with Khouribga Rock Phosphate, PH4: soil inoculated with Pisolithus sp. strain PH4, PH4+PN: soil amended with Khouribga Rock Phosphate and inoculated with Pisolithus sp. strain PH4. (Reprinted from Ouahmane et al. (2009) with kind permission from Springer Science+Business Media)


This code piece (code3.R), reads the data table (Tab3.txt) and computes a PCA on it (dudi.pca function, output object acp1).

The factor defining the four soil treatments (fsoil) is read from file CodeSoil.txt and is used to compute the BGA (bca function, output object bets).

The first graph is plotted using function sco.label on substrates (columns) coordinates on the first axis (bets$co[,1]). The second graph is drawn with the sco.gauss function to draw the four Gauss curves and the corresponding group labels.

4. Coinertia Analysis: Figure 4

The aim of graphical display in CoIA is to reveal the relationships between the two data tables. CoIA provides four sets of coordinates: one set for the rows and one set for the columns of the two tables. Figure 4 from Thioulouse et al. 2012 is reprinted from Ouahmane et al. (2007). This figure shows an example of the four graphics that can be drawn with these four sets of coordinates. In this example, the objective of the authors was to show the difference between native (AM) and allochtonous (Glomus intraradices) arbuscular mycorrhizal fungi inoculation on soil bacterial functional diversity and on rock phosphate alteration. They used CoIA to analyze the relationship between a table of SIR profiles and a table of plant variables. In the SIR profiles table, the columns correspond to the 28 substrates, and the rows to the 18 soil samples. In the plant variables table, the columns are Cupressus atlantica seedling height (H), shoot and root biomass (SB, RB), leaf P content (P), and mycorrhizal colonization (MC). The rows correspond to the same 18 soils as in the first table. The CoIA permutation test showed that the relationship between these two tables was highly significant.

Figure 4: Co-inertia analysis of the SIR responses of the soils inoculated with Glomus intraradices or the mixture of native arbuscular mycorrhizal fungi and/or Khouribga Rock Phosphate, plant growth, phosphorus leaf content and mycorrhizal colonization. A: Factor map of plant growth and mycorrhizal colonization variables (H, height; SB, shoot biomass; RB, root biomass; P, leaf P content; MC,mycorrhizal colonization). B: Factor map of SIR responses (d-mannose, 1; l-serine, 2; l- histidine, 3; l-tyrosine, 4; gluconic acid, 5; uric acid, 6; l-lysine, 7; l-glutamic acid, 8; sucrose, 9; succinamide, 10; cyclohexane, 11; l-glutamine, 12; citric acid, 13; ketobutyric acid, 14; tartaric acid, 15; DL-hydroxybutyric acid, 16; N-methyl-d-glucosamine, 17; d-glucose, 18; quinic acid, 19; l-asparagine, 20; succinic acid, 21; malic acid, 22; oxalic acid, 23; fumaric acid, 24; ascorbic acid, 25; malonic acid, 26; ketoglutaric acid, 27; l-arginine, 28). c Factor map of plant growth and mycorrhizal colonization (C, control (not inoculated); CP, Khouribga Rock Phosphate amendment; GI, Glomus intraradices inoculation; GIP, Glomus intraradices inoculation and Khouribga Rock Phosphate amendment; CAM, mixture of native arbuscular mycorrhizal fungi inoculation; CAMP, CAMinoculation and Khouribga Rock Phosphate amendment). d Factormap of SIR responses soil samples (for the legend, see c). (Reprinted from Ouahmane et al. (2007) with kind permission from Elsevier)

The first graph (Figure 4A) is the factor map of plant variables. The five variables are all oriented toward the left of the graph. This is a "size effect", meaning that the left side of the graph corresponds to samples where plant growth is better, while conversely the right side corresponds to a lesser growth of Cupressus atlantica seedlings.

On Figure 4C, we can see that this better growth is correlated with the inoculation of native arbuscular mycorrhizal fungi (CAM), and that this effect is even stronger when rock phosphate amendment is done (CAMP). Inoculation with the allochtonous fungi Glomus intraradices (GI) is also correlated with a better plant growth, but there is no additional rock phosphate amendment effect (GIP).

On Figure 4B, and 4D, we can see that plant growth is also linked to the functional diversity of the soil microbial community. The SIR substrates located on the left side of Figure 4B (particularly organic acids) correspond to a better plant growth, and are correlated with the inoculation of native arbuscular mycorrhizal fungi (CAM), alone or combined with rock phosphate amendment (CAMP). The effect of Glomus intraradices inoculation (GI) alone or combined with rock phosphate amendment (GIP) is also positive on plant growth, but clearly less than the effect of native arbuscular mycorrhizal fungi. Rock phosphate amendment alone (CP) is also positive, but less than in combination with fungi inoculation.

Authors concluded that the use of native arbuscular mycorrhizal fungi and their selective effect on soil microflora have to be considered in order to optimize the sustainable re-establishment of plant species in a degraded soil


This code piece (code4.R), reads the two data tables (Tab4SIR.txt and Tab4Cupr.txt) and computes a PCA on each of them (dudi.pca function, output objects acp1 and acp2).

The factor defining the six soil treatments (fsoil) is read from file Treat.txt)

The coinertia function is used to compute the COIA (output object coin1).

The four graphs are plotted using function s.arrow (plant growth and mycorrhizal colonization variables: coin1$li), s.label (SIR responses: coin1$co), and s.class (samples, grouped by treatment: coin1$lX and coin1$lY).

Note: The direction of axis 2 (vertical axis) is inverted when compared to the figures presented in Thioulouse et al. 2012. This inversion does not change the interpretation of graphs, as all the graphs are inverted in the same way.


If you have any problems or comments, please contact Jean Thioulouse.