genet {ade4} | R Documentation |
Genetic data come in many formats. The ade4 functions that handle genetic data use the class genet.
An object of class genet is a list containing at least one data frame whose rows are groups of individuals (populations) and whose columns are alleles, arranged in blocks by locus.
This data frame contains allelic frequencies expressed as percentages.
The function char2genet reads tables crossing diploid individuals, arranged by groups (populations), with polymorphic loci. Data frames containing only character strings are transformed into tables of allelic frequencies of class genet.
In the input, each row is an individual, each variable is a locus and each value is a character string: for example '012028' for a heterozygote carrying alleles 012 and 028, '020020' for a homozygote carrying two copies of allele 020, and '000000' for an untyped locus (missing data).
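The string coding described above can be illustrated with a small base R sketch. The helper split_genotype is hypothetical (it is not part of ade4) and assumes, as in the examples, that each allele is written with exactly 3 characters.

```r
# Hypothetical helper (not an ade4 function): split a 6-character genotype
# string into its two 3-character allele codes.
split_genotype <- function(g) {
  g <- trimws(g)                 # tolerate surrounding blanks
  stopifnot(nchar(g) == 6)
  if (g == "000000") {
    # '000000' codes an untyped locus (missing data)
    return(c(NA_character_, NA_character_))
  }
  c(substr(g, 1, 3), substr(g, 4, 6))
}

het <- split_genotype("012028")  # heterozygote: alleles 012 and 028
hom <- split_genotype("020020")  # homozygote: two copies of allele 020
mis <- split_genotype("000000")  # missing data
```

A genotype is homozygous when both codes are equal, which is easy to check with het[1] == het[2].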
The function count2genet reads data frames containing allelic counts by population, with allelic forms grouped by locus.
The function freq2genet reads data frames containing allelic frequencies by population, with allelic forms grouped by locus.
In both cases, name the variables with character strings of the form xx.yyy, where xx is the name of the locus and yyy is the name of an allelic form at that locus.
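As an illustration of the xx.yyy naming convention, the sketch below builds a toy table of allelic counts in base R, recovers the locus and allele parts of each column name, and turns counts into within-locus frequencies. The locus names (Pgm, Est), the population names and the counts are made up, and the frequency step is only one plausible reading of what a count table implies; it is not taken from the ade4 source.

```r
# Toy allelic counts (invented numbers): 2 populations, 2 loci.
counts <- data.frame(
  Pgm.A = c(10, 4), Pgm.B = c(6, 12),
  Est.F = c(3, 8), Est.M = c(9, 2), Est.S = c(4, 6),
  row.names = c("north", "south")
)

# Split each column name xx.yyy at the first dot.
locus  <- sub("\\..*$", "", names(counts))     # xx: locus name
allele <- sub("^[^.]*\\.", "", names(counts))  # yyy: allelic form

# Total count per population and locus (populations in rows, loci in columns).
locus_totals <- t(rowsum(t(counts), group = locus))

# Frequency of each allelic form within its own locus.
freq <- as.matrix(counts) / locus_totals[, locus]
```

With this layout the frequencies of the allelic forms of one locus sum to 1 in every population, so each row of freq sums to the number of loci.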
Because analyses of this kind of data need compact labels, these functions store the names of the populations, the loci and the allelic forms in vectors and re-code them in a simple way: P for populations, L for loci and 1, ..., m for the alleles.
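The re-coding scheme can be sketched in base R as follows. The original names are invented, and the two-digit zero padding is an assumption based on the labels P01, L01 and L01.1 shown in the value section; this is not the ade4 implementation.

```r
# Invented original names, sorted alphabetically.
pop_names <- sort(c("south", "north"))
loc_names <- sort(c("Pgm", "Est"))
loc_blocks <- c(3, 2)  # number of allelic forms at each locus (assumed)

# Compact codes: P01, P02, ... and L01, L02, ...
pop_codes <- sprintf("P%02d", seq_along(pop_names))
loc_codes <- sprintf("L%02d", seq_along(loc_names))

# Allele codes: locus code, a dot, then 1, ..., m within each locus.
all_codes <- paste(rep(loc_codes, loc_blocks), sequence(loc_blocks), sep = ".")

# Factor assigning each allele code to its locus code.
loc_fac <- factor(rep(loc_codes, loc_blocks))
```

Keeping pop_names and loc_names next to the codes makes it possible to map a compact label back to the original name, for example loc_names[loc_codes == "L02"].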
char2genet(X, pop, complete)
count2genet(PopAllCount)
freq2genet(PopAllFreq)
X |
a data frame of character strings (individuals in rows, loci in columns); each value is either '000000' or a 6-character string coding two alleles |
pop |
a factor with as many elements as X has rows |
complete |
a logical value indicating whether the complete output is returned; FALSE by default |
PopAllCount |
a data frame containing integers: the occurrences of each allelic form (column) in each population (row) |
PopAllFreq |
a data frame containing values between 0 and 1: the frequencies of each allelic form (column) in each population (row) |
Because many formats for genetic data are published in the literature, a list of class genet contains at least a table of allelic frequencies and an attribute loc.blocks. The populations (rows) and the variables (columns) are sorted in alphabetical order.
In the component comp, each individual at a locus with m alleles is re-coded as a vector of length m: 0,...,1,...,1,...,0 for a heterozygote and 0,...,2,...,0 for a homozygote.
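The vector coding of one individual at one locus can be sketched in base R. The helper recode_genotype is hypothetical (not an ade4 function), assumes 3-character allele codes, and does not treat the missing code '000000' specially.

```r
# Hypothetical helper: number of copies of each allele carried by one
# diploid genotype, in the order given by `alleles`.
recode_genotype <- function(g, alleles) {
  carried <- c(substr(g, 1, 3), substr(g, 4, 6))
  as.integer(table(factor(carried, levels = alleles)))
}

alleles <- c("012", "020", "028")               # the m = 3 alleles of a locus
het_vec <- recode_genotype("012028", alleles)   # heterozygote
hom_vec <- recode_genotype("020020", alleles)   # homozygote
```

Here het_vec holds a 1 at the positions of alleles 012 and 028, and hom_vec holds a single 2 at the position of allele 020; both vectors sum to 2 because the individuals are diploid.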
char2genet returns a list of class genet with:
$tab |
a table of allelic frequencies with populations in rows and alleles in columns |
$center |
the global frequency of each allelic form, computed over all individuals typed at each locus |
$pop.names |
a vector containing the names of populations present in the data re-coded P01, P02, ... |
$all.names |
a vector containing the names of the alleles present in the data re-coded L01.1, L01.2, ... |
$loc.blocks |
a vector containing the number of alleles per locus |
$loc.fac |
a factor assigning each allele to its locus |
$loc.names |
a vector containing the names of loci present in the data re-coded L01, ..., L99 |
$pop.loc |
a data frame containing the number of genes used to compute the frequencies, by population and locus |
$comp |
the complete individual typing, coded for example 02000 or 01001, if the option complete is TRUE |
$comp.pop |
a factor indicating the population of each individual, if the option complete is TRUE |
count2genet and freq2genet return a list of class genet that does not contain the components pop.loc and complete.
Daniel Chessel
data(casitas)
casitas[24,]
casitas.pop <- as.factor(rep(c("dome", "cast", "musc", "casi"), c(24,11,9,30)))
casi.genet <- char2genet(casitas, casitas.pop, complete=TRUE)
names(casi.genet$tab)
casi.genet$tab[,1:8]
casi.genet$pop.names
casi.genet$loc.names
casi.genet$all.names
casi.genet$loc.blocks # number of allelic forms by loci
casi.genet$loc.fac # factor classifying the allelic forms by locus
casi.genet$pop.loc # table populations loci
names(casi.genet$comp)
casi.genet$comp[1:4,]
casi.genet$comp.pop
casi.genet$center
apply(casi.genet$tab,2,mean)
casi.genet$pop.loc[,"L15"]
casi.genet$tab[, c("L15.1","L15.2")]
class(casi.genet)
casitas.coa <- dudi.coa(casi.genet$comp, scannf = FALSE)
s.class(casitas.coa$li,casi.genet$comp.pop)