Re: creating $samples for AMOVA from spreadsheet data

From: Stéphane Dray (dray@biomserv.univ-lyon1.fr)
Date: Thu Oct 27 2005 - 19:13:07 MEST

    Hello Patrick,
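    I think the commands you want are the following. The key change is to
    build the factor from data.short$pop rather than from the old pop vector: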

    data.short <- data[data$pop %in% c(1:3),]
    data.short

    pop <- as.factor(data.short$pop)
    samples <- as.data.frame(model.matrix(~-1 + pop))
    names(samples) <- levels(pop)
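The original attempt returned the old output because the `pop` object in the workspace still held all 16 values; subsetting `data` does not change it. The following is a minimal, self-contained sketch of the corrected workflow. The data set is a cut-down stand-in for the one in the original post, and `pop.short` and the `set.seed()` call are my own additions for illustration:

```r
# Rebuild a small data set like the one in the original post
# (set.seed is only there to make the sketch reproducible).
set.seed(1)
data <- data.frame(
  region = rep(c("east", "west"), each = 8),
  pop    = rep(1:4, each = 4),
  ind    = rep(1:4, times = 4),
  l.1    = sample(c(0, 1), 16, replace = TRUE)
)

# Keep populations 1 to 3 only
data.short <- data[data$pop %in% 1:3, ]

# Build the factor from the subset itself, not from a leftover global 'pop'
pop.short <- as.factor(data.short$pop)
samples <- as.data.frame(model.matrix(~ -1 + pop.short))
names(samples) <- levels(pop.short)

dim(samples)      # 12 rows (individuals), 3 columns (populations)
colSums(samples)  # 4 individuals in each of populations 1, 2 and 3
```

Using a separate name such as `pop.short` avoids silently reusing the full-length `pop` vector later in the session.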

    One important thing to keep in mind when you subset factors:
    a=as.factor(rep(c("d","e","f"),c(3,3,4)))
    a
     [1] d d d e e e f f f f
    Levels: d e f

    If you subset a factor, by default all levels are kept:
    b=a[1:6]
    b
    [1] d d d e e e
    Levels: d e f

    Use factor() to rebuild the factor, or the drop argument of the subsetting
    operator, to remove the unused levels:

    > b2=factor(b)
    > b2
    [1] d d d e e e
    Levels: d e
    > b3=a[1:6,drop=TRUE]
    > b3
    [1] d d d e e e
    Levels: d e
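To see why unused levels matter here, the sketch below shows that a level kept after subsetting becomes an all-zero column in the indicator matrix. It also uses `droplevels()`, which is not part of the original message; as far as I know it is available in base R from version 2.12.0 onward and does the same job as `factor(b)`, including on whole data frames. The names `b4`, `m.keep`, `m.drop`, `df` and `df.short` are only illustrative:

```r
a <- as.factor(rep(c("d", "e", "f"), c(3, 3, 4)))
b <- a[1:6]                       # level "f" is kept although it is unused

# The unused level becomes an all-zero column in the indicator matrix
m.keep <- model.matrix(~ -1 + b)
colSums(m.keep)

# droplevels() removes unused levels, like factor(b) or a[1:6, drop = TRUE]
b4 <- droplevels(b)
m.drop <- model.matrix(~ -1 + b4)
ncol(m.drop)

# It also works on a data frame, dropping unused levels in every factor column
df <- data.frame(pop = a, x = seq_along(a))
df.short <- droplevels(df[df$pop != "f", ])
levels(df.short$pop)
```

If the population column is stored as a factor, dropping the unused levels before calling model.matrix() keeps empty populations out of the samples table.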

    For your other questions, I am sure that Sandrine Pavoine will give you a
    more complete answer.
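In the meantime, here is a rough sketch for the duplicate-genotype question, using base R only. It assumes that the samples table should have one row per distinct genotype and one column per population, with the number of copies as entries; please check this against the ade4 documentation for amova(). The names `loci`, `geno` and `unique.geno` are hypothetical, and the data set is a reduced version of the one in the original post:

```r
set.seed(42)  # only to make the sketch reproducible
data <- data.frame(
  pop = rep(1:4, each = 4),
  l.1 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0),
  l.2 = sample(c(0, 1), 16, replace = TRUE),
  l.3 = sample(c(0, 1), 16, replace = TRUE)
)
loci <- c("l.1", "l.2", "l.3")   # marker columns (assumed naming)

# One string per individual, e.g. "1-0-1"; identical strings are duplicates
geno <- do.call(paste, c(data[loci], sep = "-"))

# Rows: distinct genotypes, columns: populations, entries: number of copies
samples <- as.data.frame.matrix(table(geno, data$pop))

# One row of marker data per distinct genotype, in the same order as 'samples'
unique.geno <- data[match(rownames(samples), geno), loci]
rownames(unique.geno) <- rownames(samples)

samples
unique.geno
```

Under the same assumption, the distance matrix would have to be recomputed on `unique.geno` so that its rows match the rows of `samples`.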

    Patrick Kuss wrote:

    >Good evening,
    >
    >I have been using R (ade4) and amova() to analyze my RAPD data from plants.
    >Creating the adequate $distance and $structure df works fine. However I have a
    >problem creating the $samples df from my original data.
    >
    >My data set looks something like this:
    >
    >region <- rep(c("east","west"),each=8)
    >pop <- c(rep(1:4,each=4))
    >ind <- c(rep(1:4,4))
    >l.1 <- c(1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0)
    >l.2 <- sample(c(0,1),16,replace=T)
    >l.3 <- sample(c(0,1),16,replace=T)
    >l.4 <- sample(c(0,1),16,replace=T)
    >l.5 <- sample(c(0,1),16,replace=T)
    >l.6 <- sample(c(0,1),16,replace=T)
    >data <- data.frame(region,pop,ind,l.1,l.2,l.3,l.4,l.5,l.6)
    >data
    >
    >Creating the $samples df works fine that way
    >
    >pop <- as.factor(pop)
    >samples <- as.data.frame(model.matrix(~-1 + pop))
    >names(samples) <- levels(pop)
    >
    >But if I subset my original dataset:
    >
    >data.short <- data[data$pop %in% c(1:3),]
    >data.short
    >
    >and try to create the $samples again I still get the original $samples output.
    >
    >pop <- as.factor(pop)
    >samples <- as.data.frame(model.matrix(~-1 + pop))
    >names(samples) <- levels(pop)
    >
    >Any ideas how I can solve this? Additionally, how can I erase duplicate
    >genotypes and note double presence in the $samples df?
    >
    >I am happy for any comments. Cheers
    >
    >Patrick
    >
    >
    >
    >--
    >Patrick Kuss
    >PhD-student
    >Institute of Botany
    >University of Basel
    >Schönbeinstr. 6
    >CH-4056 Basel
    >+41 61 267 2976
    >

    -- 
    Stéphane DRAY (dray@biomserv.univ-lyon1.fr )
    Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - Lyon I
    43, Bd du 11 Novembre 1918, 69622 Villeurbanne Cedex, France
    Tel: 33 4 72 43 27 57       Fax: 33 4 72 43 13 88
    http://www.steph280.freesurf.fr/ 
    


