Exercise 5 - Molecular Phylogenies
Aims: compute different phylogenies and interpret the results
1- Insulin phylogeny
Start the phylojava program: in the command shell, type:
phylojava &
Open the multiple alignment of vertebrate insulin proteins in MASE
format that you created during Exercise 3 (if needed, get one such
file here).
Compute the molecular phylogeny of vertebrate insulin proteins using
the Neighbour-Joining (NJ) method and Kimura's distance. Bootstrap the
tree with 500 replicates. The correspondence between SWISS-PROT
sequences and species names is available here.
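To make the distance step concrete, here is a minimal Python sketch of a protein distance correction. It assumes that "Kimura's distance" refers to Kimura's 1983 empirical formula d = -ln(1 - p - 0.2 p^2), where p is the observed fraction of differing residues; whether phylojava uses exactly this formula is an assumption. The function names and the two toy sequences are hypothetical and are not taken from the insulin alignment.

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over sites where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def kimura_protein_distance(seq_a, seq_b):
    """Kimura's empirical correction: d = -ln(1 - p - 0.2 * p**2)."""
    p = p_distance(seq_a, seq_b)
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(inner)


# Hypothetical aligned fragments, for illustration only.
toy_a = "MALWMRLLPL"
toy_b = "MALWTRLVPL"
print(p_distance(toy_a, toy_b), kimura_protein_distance(toy_a, toy_b))
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions at the same site. A matrix of such pairwise distances is what the NJ method takes as input.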
Identify the different gene duplications in the tree. In which species
did they occur? Does the tree correspond to the expected vertebrate phylogeny?
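The bootstrap requested above resamples alignment columns with replacement and rebuilds a tree from each pseudo-alignment. The following is a minimal sketch of the resampling step only, assuming the alignment is held as a dictionary mapping sequence names to equal-length strings. The function name, the dictionary layout, and the toy alignment are hypothetical; this is not how phylojava implements it.

```python
import random


def bootstrap_alignments(alignment, n_replicates, seed=None):
    """Yield pseudo-alignments built by sampling columns with replacement."""
    names = list(alignment)
    if not names:
        raise ValueError("empty alignment")
    n_sites = len(alignment[names[0]])
    if any(len(alignment[name]) != n_sites for name in names):
        raise ValueError("sequences must all have the same aligned length")
    rng = random.Random(seed)
    for _ in range(n_replicates):
        # The same column indices are used for every sequence, so columns stay intact.
        columns = [rng.randrange(n_sites) for _ in range(n_sites)]
        yield {name: "".join(alignment[name][c] for c in columns) for name in names}


# Hypothetical toy alignment, for illustration only.
toy = {"seq1": "MALWMR", "seq2": "MALWTR", "seq3": "MSLWTK"}
replicates = list(bootstrap_alignments(toy, 500, seed=1))
print(len(replicates), replicates[0])
```

The bootstrap value of a branch is then the percentage of replicate trees in which the same group of sequences appears.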
2- Universal phylogeny
File lsufrags.mase contains a set of prealigned rRNA sequences from the
large (LSU) and the small (SSU) subunits:
- Save it in text format on your computer.
- Load it with Phylojava.
- Visualize the set of reliably aligned sites called all sequences.
- Build the universal phylogeny using the Kimura 2-P evolutionary
distance, and bootstrap it with 500 replicates.
- Is the position of the Euglena chloroplast sequence expected?
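The Kimura 2-P distance used above treats transitions and transversions separately. As a minimal sketch, with P the observed proportion of transitions and Q the proportion of transversions, the distance is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). The function name and the toy sequences below are hypothetical, and skipping every site that contains a gap or an ambiguous base is an assumption about site handling, not a description of phylojava.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    # rRNA sequences may be written with U; treat U as T.
    seq_a = seq_a.upper().replace("U", "T")
    seq_b = seq_b.upper().replace("U", "T")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine <-> purine or pyrimidine <-> pyrimidine
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# Hypothetical toy sequences: one transition (A/G) and one transversion (A/C).
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"))
```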
3- A 250 MY old bacterium: is it possible?
Vreeland et al. have published the isolation of a 250-million-year-old
bacterium from a salt crystal. Their data are reproduced in a file of
aligned bacterial 16S rRNA sequences: permians.mase.
- Compare the results of the parsimony method and of the distance + NJ method.
- What very important information is conveyed by the branch lengths
in the NJ analysis?
- What do you think of the conclusions of Vreeland et al.?
The results of Vreeland et al. have been severely criticized by Graur and Pupko, who concluded that the isolated bacterium
is most probably recent in age.
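One way to reason about branch lengths and age is a back-of-the-envelope molecular clock check. The sketch below assumes a strict clock: if an isolate stopped evolving when it was trapped, its living relatives kept accumulating substitutions, so the distance between them should be at least the rate multiplied by the time elapsed. The function names are hypothetical, and the rate and distances used are purely illustrative placeholders, not values from Vreeland et al. or from Graur and Pupko.

```python
def min_expected_distance(rate_per_site_per_year, dormancy_years):
    """Lower bound on the distance between a dormant isolate and a living relative.

    Under a strict clock, the living lineage alone accumulates
    rate * time substitutions per site while the isolate is frozen.
    """
    return rate_per_site_per_year * dormancy_years


def is_compatible_with_age(observed_distance, rate_per_site_per_year, claimed_age_years):
    """True if the observed distance is at least the clock-based lower bound."""
    return observed_distance >= min_expected_distance(
        rate_per_site_per_year, claimed_age_years
    )


# Illustrative placeholders only: a hypothetical rate and hypothetical distances.
hypothetical_rate = 1e-9  # substitutions per site per year (assumed)
claimed_age = 250e6       # years
print(min_expected_distance(hypothetical_rate, claimed_age))
print(is_compatible_with_age(0.01, hypothetical_rate, claimed_age))
```

The real comparison should use the branch lengths read from your own NJ tree; the check only shows the form of the argument.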
4- The evolutionary origins of HIV-1 and HIV-2 viruses
Gao et al. have published a phylogenetic
analysis of the pol gene of HIV-1 and HIV-2 viruses and of their
simian homologs (SIV).
File hivpol.mase contains public protein sequences with which it is
possible to attempt to reproduce their results. Sequence FIV/Oma (Feline
Immunodeficiency Virus) is used as an outgroup for the analysis. File
hivpol-dna.mase reproduces the same alignment at the DNA level:
- Identify which simian species are at the origin of the HIV-1 and HIV-2 viruses.
- Run a PhyML analysis on the protein sequences with the JTT substitution matrix
and 1 rate category (running time ~ 2 min).
- Conduct analyses on Ka and on Ks distances when possible.
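Ka and Ks distances separate nonsynonymous from synonymous changes, which is why they require the DNA alignment. The sketch below only illustrates that distinction: it classifies each differing codon pair as synonymous or nonsynonymous using the standard genetic code. It is a crude count, not a Ka/Ks estimator (real estimators normalize by the numbers of synonymous and nonsynonymous sites and correct for multiple substitutions). The function name and toy sequences are hypothetical, and the sequences are assumed to be in frame.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(dna_a, dna_b):
    """Return (synonymous, nonsynonymous) counts of differing codon pairs."""
    synonymous = nonsynonymous = 0
    usable = min(len(dna_a), len(dna_b))
    for i in range(0, usable - 2, 3):
        codon_a = dna_a[i:i + 3].upper()
        codon_b = dna_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # skip codons containing gaps or ambiguous bases
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy sequences: CTT/CTC both encode Leu, AAA/AGA encode Lys/Arg.
print(count_codon_differences("ATGCTTAAA", "ATGCTCAGA"))
```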
File hivpol.pdf contains the complete article by Gao et al.