genet {ade4}    R Documentation

A class of data: tables of populations and alleles

Description

Genetic data come in many formats. The ade4 functions that handle genetic data use the class genet. An object of class genet is a list containing at least one data frame whose rows are groups of individuals (populations) and whose columns are alleles, grouped into blocks by locus. The data frame contains allelic frequencies, stored as proportions between 0 and 1 (as in the output of casi.genet$tab in the examples).
The function char2genet reads tables that cross diploid individuals, arranged by groups (populations), with polymorphic loci. Data frames containing only character strings are transformed into tables of allelic frequencies of class genet. On input, a row is an individual, a variable is a locus and a value is a character string: for example '012028' for a heterozygote carrying alleles 012 and 028, '020020' for a homozygote carrying two copies of allele 020, and '000000' for an untyped locus (missing data).
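
The string coding described above can be illustrated with a few lines of base R. This is only a minimal sketch of the convention, not the code used by char2genet; the helper name decode_genotype is hypothetical.

```r
# Hypothetical helper: split a 6-character genotype string into its two
# 3-character alleles. "000000" is treated as missing data.
decode_genotype <- function(g) {
  g <- trimws(g)                      # tolerate padding around the string
  stopifnot(nchar(g) == 6)
  if (g == "000000") return(c(NA_character_, NA_character_))
  c(substr(g, 1, 3), substr(g, 4, 6))
}

decode_genotype("012028")  # heterozygote: alleles "012" and "028"
decode_genotype("020020")  # homozygote: allele "020" twice
decode_genotype("000000")  # missing data: NA NA
```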
The function count2genet reads data frames containing allele counts by population, with the allelic forms classified by locus.
The function freq2genet reads data frames containing allelic frequencies by population, with the allelic forms classified by locus.
In these two cases, name the variables with character strings of the form xx.yyy, where xx is the name of the locus and yyy the name of an allelic form at this locus. Because analyses of this kind of data need compact labels, these functions store the names of the populations, loci and allelic forms in vectors and re-code them simply: P for populations, L for loci and 1, ..., m for the alleles.
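
As an illustration of the xx.yyy naming convention and of the compact re-coding, the following base R sketch splits variable names into locus and allele and builds labels of the form L01.1, L01.2, ... The variable names are hypothetical, and the two-digit padding is an assumption of this sketch; the functions of ade4 may build their labels differently.

```r
# Hypothetical variable names following the xx.yyy convention (locus.allele)
vars <- sort(c("Aat.080", "Aat.100", "Es1.094", "Es1.100",
               "Np.080", "Np.085", "Np.100"))

# Split each name at the dot into locus (xx) and allelic form (yyy)
parts  <- strsplit(vars, ".", fixed = TRUE)
locus  <- vapply(parts, `[`, character(1), 1)
allele <- vapply(parts, `[`, character(1), 2)

# Compact labels: L01, L02, ... for loci and 1, ..., m for alleles in a locus
loc.fac   <- factor(locus)
loc.code  <- sprintf("L%02d", as.integer(loc.fac))
all.index <- ave(seq_along(vars), loc.fac, FUN = seq_along)
compact   <- paste(loc.code, all.index, sep = ".")

# Named vector mapping each compact label to the original name
setNames(vars, compact)
```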

Usage

char2genet(X, pop, complete)
count2genet(PopAllCount)
freq2genet(PopAllFreq)

Arguments

X a data frame of character strings (individuals in rows, loci in columns); each value is either '000000' (missing data) or a string of 6 characters coding two alleles of 3 characters each
pop a factor with as many elements as X has rows, classifying the individuals by population
complete a logical value indicating whether the complete individual typing (components comp and comp.pop) is returned; FALSE by default
PopAllCount a data frame containing integers: the occurrences of each allelic form (column) in each population (row)
PopAllFreq a data frame containing values between 0 and 1: the frequencies of each allelic form (column) in each population (row)

Details

Because many formats for genetic data appear in the literature, a list of class genet contains at least a table of allelic frequencies and an attribute loc.blocks. The populations (rows) and the variables (columns) are sorted in alphabetical order. In the component comp, each individual is re-coded, for each locus with m alleles, by a vector of length m: 0,...,1,...,1,...,0 for a heterozygote and 0,...,2,...,0 for a homozygote.
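
The re-coding of one individual at one locus can be sketched in base R as follows. The helper recode_genotype is hypothetical, and returning NA for a missing genotype is an assumption of this sketch, not a statement about how char2genet handles missing data.

```r
# Hypothetical helper: re-code one diploid genotype at a locus whose m
# alleles are known into a vector of length m counting the allele copies.
recode_genotype <- function(g, alleles) {
  if (g == "000000") return(rep(NA_integer_, length(alleles)))  # assumption
  pair <- c(substr(g, 1, 3), substr(g, 4, 6))
  # count how many copies of each allele the individual carries
  as.integer(table(factor(pair, levels = alleles)))
}

# A locus with m = 4 alleles
alleles <- c("050", "080", "100", "125")
recode_genotype("100125", alleles)  # heterozygote: 0 0 1 1
recode_genotype("100100", alleles)  # homozygote:   0 0 2 0
```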

Value

char2genet returns a list of class genet with:

$tab a table of frequencies with populations in rows and alleles in columns
$center the global frequency of each allelic form, computed over all the individuals typed at each locus
$pop.names a vector containing the names of the populations present in the data, re-coded P01, P02, ...
$all.names a vector containing the names of the alleles present in the data, re-coded L01.1, L01.2, ...
$loc.blocks a vector containing the number of alleles per locus
$loc.fac a factor assigning each allele to its locus
$loc.names a vector containing the names of the loci present in the data, re-coded L01, ..., L99
$pop.loc a data frame containing, for each population and locus, the number of typed individuals used to compute the frequencies
$comp the complete individual typing with the code 02000 or 01001, if the option complete is TRUE
$comp.pop a factor indicating the population of each individual, if the option complete is TRUE
count2genet and freq2genet return a list of class genet that does not contain the components pop.loc, comp and comp.pop.
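
How the frequency table, the number of typed individuals and the global frequencies relate to allele counts can be sketched in base R. The counts below are hypothetical (one locus, two alleles, two populations) and the code is an illustration under that assumption, not the implementation used by ade4.

```r
# Hypothetical allele-copy counts: populations in rows, alleles in columns
counts <- matrix(c(15, 45,
                    7, 15),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("P1", "P2"), c("L01.1", "L01.2")))

# Each typed diploid individual contributes two allele copies
typed <- rowSums(counts) / 2             # individuals typed per population

# Frequencies within each population (each row sums to 1 for the locus)
tab <- counts / rowSums(counts)

# Global frequencies, pooling all individuals
center <- colSums(counts) / sum(counts)

# Unweighted mean of the population frequencies, generally different from
# the pooled value when the populations have unequal sizes
colMeans(tab)
```

With unequal population sizes, the pooled frequencies and the unweighted column means differ, which is why the examples print both casi.genet$center and apply(casi.genet$tab,2,mean).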

Author(s)

Daniel Chessel

Examples

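data(casitas)
casitas[24,] # one individual: a genotype string per locus, '000000' = missing
# population factor, in row order: 24 "dome", 11 "cast", 9 "musc", 30 "casi"
casitas.pop <- as.factor(rep(c("dome", "cast", "musc", "casi"), c(24,11,9,30)))
casi.genet <- char2genet(casitas, casitas.pop, complete=TRUE)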
names(casi.genet$tab) 
casi.genet$tab[,1:8] 
casi.genet$pop.names
casi.genet$loc.names
casi.genet$all.names
casi.genet$loc.blocks # number of allelic forms by loci
casi.genet$loc.fac # factor classifying the allelic forms by locus
casi.genet$pop.loc # table populations loci
names(casi.genet$comp)
casi.genet$comp[1:4,]
casi.genet$comp.pop
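casi.genet$center # frequencies pooled over all individuals
apply(casi.genet$tab,2,mean) # unweighted mean of the population frequencies, for comparison
casi.genet$pop.loc[,"L15"] # individuals typed at locus L15 in each population
casi.genet$tab[, c("L15.1","L15.2")] # frequencies of the two alleles of L15
class(casi.genet)
# correspondence analysis of the complete individual typing
casitas.coa <- dudi.coa(casi.genet$comp, scannf = FALSE)
# factorial map of the individuals, grouped by population
s.class(casitas.coa$li,casi.genet$comp.pop)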

Worked out examples


> library(ade4)
> ### Name: genet
> ### Title: A class of data: tables of populations and alleles
> ### Aliases: genet char2genet count2genet freq2genet
> ### Keywords: multivariate
> 
> ### ** Examples
> 
> data(casitas)
> casitas[24,]
      Aat    Amy    Es1    Es2   Es10    Hbb   Gpd1   Idh1   Mod1   Mod2    Mpi
24 100100 080100 094094 100100 000000 000000 000000 125125 110110 100100 100100
       Np   Pgm1   Pgm2    Sod
24 000000 100100 100100 100100
> casitas.pop <- as.factor(rep(c("dome", "cast", "musc", "casi"), c(24,11,9,30)))
> casi.genet <- char2genet(casitas, casitas.pop, complete=TRUE)
> names(casi.genet$tab) 
 [1] "L01.1" "L01.2" "L02.1" "L02.2" "L03.1" "L03.2" "L04.1" "L04.2" "L05.1"
[10] "L05.2" "L05.3" "L06.1" "L06.2" "L06.3" "L07.1" "L07.2" "L08.1" "L08.2"
[19] "L08.3" "L08.4" "L09.1" "L09.2" "L09.3" "L10.1" "L10.2" "L11.1" "L11.2"
[28] "L12.1" "L12.2" "L12.3" "L12.4" "L13.1" "L13.2" "L13.3" "L14.1" "L14.2"
[37] "L15.1" "L15.2"
> casi.genet$tab[,1:8] 
       L01.1     L01.2     L02.1     L02.2     L03.1     L03.2     L04.1
P1 0.2500000 0.7500000 0.8333333 0.1666667 0.8500000 0.1500000 0.0000000
P2 0.3181818 0.6818182 1.0000000 0.0000000 0.2272727 0.7727273 0.0000000
P3 0.0000000 1.0000000 0.8125000 0.1875000 1.0000000 0.0000000 0.0000000
P4 0.0000000 1.0000000 0.0000000 1.0000000 0.0000000 1.0000000 0.9444444
        L04.2
P1 1.00000000
P2 1.00000000
P3 1.00000000
P4 0.05555556
> casi.genet$pop.names
    P1     P2     P3     P4 
"casi" "cast" "dome" "musc" 
> casi.genet$loc.names
   L01    L02    L03    L04    L05    L06    L07    L08    L09    L10    L11 
 "Aat"  "Amy"  "Es1" "Es10"  "Es2" "Gpd1"  "Hbb" "Idh1" "Mod1" "Mod2"  "Mpi" 
   L12    L13    L14    L15 
  "Np" "Pgm1" "Pgm2"  "Sod" 
> casi.genet$all.names
     L01.1      L01.2      L02.1      L02.2      L03.1      L03.2      L04.1 
 "Aat.080"  "Aat.100"  "Amy.080"  "Amy.100"  "Es1.094"  "Es1.100" "Es10.060" 
     L04.2      L05.1      L05.2      L05.3      L06.1      L06.2      L06.3 
"Es10.100"  "Es2.095"  "Es2.098"  "Es2.100" "Gpd1.095" "Gpd1.100" "Gpd1.105" 
     L07.1      L07.2      L08.1      L08.2      L08.3      L08.4      L09.1 
 "Hbb.110"  "Hbb.120" "Idh1.050" "Idh1.080" "Idh1.100" "Idh1.125" "Mod1.100" 
     L09.2      L09.3      L10.1      L10.2      L11.1      L11.2      L12.1 
"Mod1.110" "Mod1.120" "Mod2.100" "Mod2.120"  "Mpi.100"  "Mpi.120"   "Np.080" 
     L12.2      L12.3      L12.4      L13.1      L13.2      L13.3      L14.1 
  "Np.085"   "Np.090"   "Np.100" "Pgm1.060" "Pgm1.080" "Pgm1.100" "Pgm2.080" 
     L14.2      L15.1      L15.2 
"Pgm2.100"  "Sod.080"  "Sod.100" 
> casi.genet$loc.blocks # number of allelic forms by loci
L01 L02 L03 L04 L05 L06 L07 L08 L09 L10 L11 L12 L13 L14 L15 
  2   2   2   2   3   3   2   4   3   2   2   4   3   2   2 
> casi.genet$loc.fac # factor classifying the allelic forms by locus
L01.1 L01.2 L02.1 L02.2 L03.1 L03.2 L04.1 L04.2 L05.1 L05.2 L05.3 L06.1 L06.2 
  L01   L01   L02   L02   L03   L03   L04   L04   L05   L05   L05   L06   L06 
L06.3 L07.1 L07.2 L08.1 L08.2 L08.3 L08.4 L09.1 L09.2 L09.3 L10.1 L10.2 L11.1 
  L06   L07   L07   L08   L08   L08   L08   L09   L09   L09   L10   L10   L11 
L11.2 L12.1 L12.2 L12.3 L12.4 L13.1 L13.2 L13.3 L14.1 L14.2 L15.1 L15.2 
  L11   L12   L12   L12   L12   L13   L13   L13   L14   L14   L15   L15 
Levels: L01 L02 L03 L04 L05 L06 L07 L08 L09 L10 L11 L12 L13 L14 L15
> casi.genet$pop.loc # table populations loci
   L01 L02 L03 L04 L05 L06 L07 L08 L09 L10 L11 L12 L13 L14 L15
P1  30  30  30  30  30  30  30  30  30  30  30  30  30  30  30
P2  11  11  11  11  11  11  11  11  11  11  11  11  11  11  11
P3  24  24  24  23  24  23  23  24  24  24  24  23  24  24  24
P4   9   9   9   9   9   8   9   9   9   9   8   9   8   8   9
> names(casi.genet$comp)
 [1] "L01.1" "L01.2" "L02.1" "L02.2" "L03.1" "L03.2" "L04.1" "L04.2" "L05.1"
[10] "L05.2" "L05.3" "L06.1" "L06.2" "L06.3" "L07.1" "L07.2" "L08.1" "L08.2"
[19] "L08.3" "L08.4" "L09.1" "L09.2" "L09.3" "L10.1" "L10.2" "L11.1" "L11.2"
[28] "L12.1" "L12.2" "L12.3" "L12.4" "L13.1" "L13.2" "L13.3" "L14.1" "L14.2"
[37] "L15.1" "L15.2"
> casi.genet$comp[1:4,]
   L01.1 L01.2 L02.1 L02.2 L03.1 L03.2 L04.1 L04.2 L05.1 L05.2 L05.3 L06.1
01     0     2     2     0     1     1     0     2     2     0     0     0
02     1     1     2     0     2     0     0     2     0     0     2     0
03     1     1     1     1     2     0     0     2     2     0     0     0
04     0     2     1     1     2     0     0     2     0     0     2     2
   L06.2 L06.3 L07.1 L07.2 L08.1 L08.2 L08.3 L08.4 L09.1 L09.2 L09.3 L10.1
01     0     2     1     1     0     0     1     1     0     2     0     0
02     0     2     1     1     0     0     2     0     0     2     0     1
03     2     0     1     1     0     0     2     0     0     2     0     0
04     0     0     2     0     0     0     2     0     0     2     0     0
   L10.2 L11.1 L11.2 L12.1 L12.2 L12.3 L12.4 L13.1 L13.2 L13.3 L14.1 L14.2
01     2     2     0     0     0     0     2     0     0     2     0     2
02     1     2     0     0     0     0     2     0     0     2     0     2
03     2     2     0     0     0     0     2     0     0     2     0     2
04     2     2     0     0     0     0     2     0     0     2     0     2
   L15.1 L15.2
01     0     2
02     0     2
03     0     2
04     0     2
> casi.genet$comp.pop
 [1] P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1
[26] P1 P1 P1 P1 P1 P2 P2 P2 P2 P2 P2 P2 P2 P2 P2 P2 P3 P3 P3 P3 P3 P3 P3 P3 P3
[51] P3 P3 P3 P3 P3 P3 P3 P3 P3 P3 P3 P3 P3 P3 P4 P4 P4 P4 P4 P4 P4 P4
Levels: P1 P2 P3 P4
> casi.genet$center
      L01.1       L01.2       L02.1       L02.2       L03.1       L03.2 
0.148648649 0.851351351 0.750000000 0.250000000 0.702702703 0.297297297 
      L04.1       L04.2       L05.1       L05.2       L05.3       L06.1 
0.116438356 0.883561644 0.081081081 0.195945946 0.722972973 0.375000000 
      L06.2       L06.3       L07.1       L07.2       L08.1       L08.2 
0.534722222 0.090277778 0.465753425 0.534246575 0.101351351 0.006756757 
      L08.3       L08.4       L09.1       L09.2       L09.3       L10.1 
0.608108108 0.283783784 0.128378378 0.797297297 0.074324324 0.689189189 
      L10.2       L11.1       L11.2       L12.1       L12.2       L12.3 
0.310810811 0.794520548 0.205479452 0.061643836 0.089041096 0.061643836 
      L12.4       L13.1       L13.2       L13.3       L14.1       L14.2 
0.787671233 0.027397260 0.130136986 0.842465753 0.082191781 0.917808219 
      L15.1       L15.2 
0.121621622 0.878378378 
> apply(casi.genet$tab,2,mean)
     L01.1      L01.2      L02.1      L02.2      L03.1      L03.2      L04.1 
0.14204545 0.85795455 0.66145833 0.33854167 0.51931818 0.48068182 0.23611111 
     L04.2      L05.1      L05.2      L05.3      L06.1      L06.2      L06.3 
0.76388889 0.05719697 0.37247475 0.57032828 0.56666667 0.37916667 0.05416667 
     L07.1      L07.2      L08.1      L08.2      L08.3      L08.4      L09.1 
0.54431818 0.45568182 0.13446970 0.01136364 0.47935606 0.37481061 0.09365530 
     L09.2      L09.3      L10.1      L10.2      L11.1      L11.2      L12.1 
0.75356692 0.15277778 0.80833333 0.19166667 0.59943182 0.40056818 0.12500000 
     L12.2      L12.3      L12.4      L13.1      L13.2      L13.3      L14.1 
0.14772727 0.12500000 0.60227273 0.06250000 0.26704545 0.67045455 0.10729167 
     L14.2      L15.1      L15.2 
0.89270833 0.25000000 0.75000000 
> casi.genet$pop.loc[,"L15"]
[1] 30 11 24  9
> casi.genet$tab[, c("L15.1","L15.2")]
   L15.1 L15.2
P1     0     1
P2     0     1
P3     0     1
P4     1     0
> class(casi.genet)
[1] "genet" "list" 
> casitas.coa <- dudi.coa(casi.genet$comp, scannf = FALSE)
> s.class(casitas.coa$li,casi.genet$comp.pop)
