alfacinha

Simulating the evolution of biological sequences with neighbor-dependent substitutions.

This package simulates the evolution of sequences along phylogenetic trees.

In this tutorial, we show how to build a sequence, assign it to the root of a tree, and simulate its evolution along the tree down to the leaves.

Building a sequence

A sequence can be either read from a file (in FASTA format) or built from scratch.

import sequence
s=sequence.Sequence(fic="sequence.fst")
print s
or
import sequence
s=sequence.Sequence()
import compte
c=compte.Proportion(fic="create.prop")  # a file with the proportions for all letters (here uniform distribution)
print c
s.read_prop(c,long=1000)  # create a sequence of length 1000 with the proportions given by 'c'
print s.fasta()
s.write_fasta("store-sequence.fst",mode="write")
You can view both files, sequence.fst and create.prop, here.
help(evol)  # for more information on sequence manipulation.
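
To make the "built from scratch" route concrete, the sketch below draws a random sequence from a table of letter proportions and formats it as a FASTA record. It is plain Python 3 using only the standard library and is not alfacinha's implementation: the function names random_sequence and to_fasta, the record name "root", and the 60-column line width are all assumptions made for illustration.

```python
import random


def random_sequence(proportions, length, seed=None):
    """Draw `length` letters independently, using the given proportions as weights."""
    rng = random.Random(seed)
    letters = sorted(proportions)
    weights = [proportions[letter] for letter in letters]
    return "".join(rng.choices(letters, weights=weights, k=length))


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with lines of at most `width` letters."""
    lines = [">" + name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


# Uniform proportions, as in the create.prop example (values assumed here).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
seq = random_sequence(uniform, 1000, seed=1)
record = to_fasta("root", seq)
```

Passing a seed makes the draw reproducible, which is convenient when comparing simulation runs.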

Loading an evolutionary model

import model
m=model.Model(fic="model.prop") # The CpG methylation-deamination process is the only neighbour-dependent substitution process modelled here.
print m
You can view file model.prop here.
help(model)  # for more information on evolutionary models.
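
The idea of a neighbour-dependent substitution can be illustrated with a small sketch: the rate at which a site mutates depends on the letter next to it. The code below is plain Python 3 and does not use alfacinha or the model.prop format. The function name site_rates and the two rate values are hypothetical, chosen only to show that a C followed by G (and the G that follows it) mutates faster than other sites.

```python
def site_rates(seq, base_rate=1.0, cpg_rate=10.0):
    """Return the total substitution rate of each site in `seq`.

    Every site mutates at `base_rate`. A C followed by G, and a G preceded
    by C, gain an extra `cpg_rate` (standing for C->T and G->A in a CpG).
    Both rate values are illustrative assumptions.
    """
    rates = []
    for i, letter in enumerate(seq):
        rate = base_rate
        if letter == "C" and seq[i + 1:i + 2] == "G":
            rate += cpg_rate  # C on the left side of a CpG dinucleotide
        if letter == "G" and i > 0 and seq[i - 1] == "C":
            rate += cpg_rate  # G on the right side of a CpG dinucleotide
        rates.append(rate)
    return rates


rates = site_rates("ACGTCA")
# Only the C and G of the single CpG (positions 1 and 2) are accelerated.
```

Because the rate of a site changes whenever its neighbour changes, sites cannot be simulated independently of each other, which is what distinguishes this kind of model from a standard site-independent one.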

Loading a phylogenetic tree

A tree can either be defined by a specific string or be read from a file, both in NEWICK format.
import tree
t=tree.Node(newick="(Bovine:0.69395,(Gibbon:0.36079,(Orang:0.33636,(Gorilla:0.17147,(Chimp:0.19268, Human:0.11927):0.08386):0.06124):0.15057):0.54939,Mouse:1.21460):0.10;")
t.get_leaf_labels()
print t.matrix() # print the distance matrix of the tree
or
import tree
t=tree.Node(fic="tree.newick")
t.get_leaf_labels()
print t.matrix() # print the distance matrix of the tree
You can view file tree.newick here.
help(tree)  # for more information on manipulating trees.
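
For readers who want to see what a distance matrix over a Newick tree contains, here is a minimal sketch in plain Python 3. It is independent of alfacinha's tree module: parse_newick and leaf_distances are hypothetical helper names, the parser handles only simple Newick strings (no quoted labels or comments), and the distance between two leaves is taken to be the sum of branch lengths on the path joining them.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested (label, length, children) tuples."""
    text = "".join(text.split()).rstrip(";")  # drop whitespace and the final ";"
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(node())
            pos += 1  # skip ")"
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label = text[start:pos]
        length = 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1  # skip ":"
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return (label, length, children)

    return node()


def leaf_distances(tree):
    """Return {(leaf_a, leaf_b): path length} for every ordered pair of distinct leaves."""
    pairs = {}

    def depths(node):
        label, _, children = node
        if not children:
            return {label: 0.0}
        merged = {}
        for child in children:
            # Distance from each leaf under `child` up to the current node.
            below = {leaf: d + child[1] for leaf, d in depths(child).items()}
            for a, da in merged.items():
                for b, db in below.items():
                    pairs[(a, b)] = pairs[(b, a)] = da + db
            merged.update(below)
        return merged

    depths(tree)
    return pairs


newick = ("(Bovine:0.69395,(Gibbon:0.36079,(Orang:0.33636,(Gorilla:0.17147,"
          "(Chimp:0.19268, Human:0.11927):0.08386):0.06124):0.15057):0.54939,"
          "Mouse:1.21460):0.10;")
distances = leaf_distances(parse_newick(newick))
```

With the example tree, distances[("Chimp", "Human")] is the sum of the two sister branches, 0.19268 + 0.11927. The branch length attached to the root does not contribute to any leaf-to-leaf distance.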

Simulating the evolution of a sequence

With the objects created earlier, we are now able to evolve a sequence along the tree we have loaded, and according to the specified evolutionary model.
t.evolve_seq(s,m) # assign the sequence to the root and simulate evolution
t.write_phylip("aligned-sequences.phylip",mode="append") # export aligned sequences in phylip format
Note that you do not need to build a tree to simulate evolution on a sequence. Once the sequence is built and the evolutionary model is loaded, you can evolve the sequence directly:
old=s.copy() 
s.evolve(m,0.04) # simulate 0.04 substitutions per site on sequence s
s.compare(old) # count the number of observed differences
s.write_fasta("evolved-sqce.fst",mode="append")
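
The copy, evolve, and compare steps above can be mimicked with a short standalone sketch in plain Python 3. This is not how alfacinha simulates substitutions: it assumes a deliberately simple site-independent model in which a fixed number of substitution events (rate times length, rounded) hit uniformly chosen sites, and the names evolve and count_differences are hypothetical.

```python
import random


def evolve(seq, subs_per_site, seed=None, alphabet="ACGT"):
    """Apply round(subs_per_site * len(seq)) substitutions at random sites.

    Simplifying assumptions: sites are chosen uniformly, and each event
    replaces the current letter by a different one chosen uniformly.
    Returns the evolved sequence and the number of events applied.
    """
    rng = random.Random(seed)
    letters = list(seq)
    events = round(subs_per_site * len(letters))
    for _ in range(events):
        i = rng.randrange(len(letters))
        letters[i] = rng.choice([x for x in alphabet if x != letters[i]])
    return "".join(letters), events


def count_differences(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


old = "ACGT" * 250  # a 1000-letter starting sequence
new, events = evolve(old, 0.04, seed=7)
observed = count_differences(old, new)
```

The number of observed differences can be smaller than the number of substitution events, because several events may hit the same site and a later event may restore the original letter.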

Need help?

If you have any problems or comments about alfacinha, please send an email to Leonor Palmeira.

