Exercise 1 - Gene Prediction
Aim: identify genes in human genomic sequences
We will use three different approaches to identify genes within a human
genomic sequence. Compare the results of each method.
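To make the comparison concrete, it can help to write down the exon coordinates reported by each method and count how many bases they share. The following is only a minimal sketch: the function names and the coordinates are hypothetical, and it assumes each prediction has been reduced by hand to a list of 1-based, inclusive (start, end) pairs on the genomic sequence.

```python
from typing import Dict, List, Set, Tuple

# 1-based, inclusive (start, end) coordinates on the genomic sequence
Interval = Tuple[int, int]


def covered_positions(exons: List[Interval]) -> Set[int]:
    """Return the set of genomic positions covered by a list of exons."""
    positions: Set[int] = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def overlap_summary(a: List[Interval], b: List[Interval]) -> Dict[str, int]:
    """Compare two exon predictions at the base level and at the exon level."""
    pa, pb = covered_positions(a), covered_positions(b)
    return {
        "shared_bp": len(pa & pb),          # bases predicted by both methods
        "only_a_bp": len(pa - pb),          # bases predicted only by method a
        "only_b_bp": len(pb - pa),          # bases predicted only by method b
        "exact_exons": len(set(a) & set(b)),  # exons with identical boundaries
    }


# Hypothetical coordinates, for illustration only (not real results)
ab_initio = [(100, 200), (400, 520), (800, 900)]
est_based = [(100, 200), (410, 520)]
print(overlap_summary(ab_initio, est_based))
```

Base-level overlap shows whether two methods find the same regions, while the exact-exon count shows whether they also agree on the exon boundaries.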
Identify and mask repeated sequences
This will be performed with RepeatMasker (***); three sites are available:
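Once the masked sequence has been returned, a quick check of how much of it was masked can be useful before going on. The sketch below is an illustration only, not part of RepeatMasker. It assumes that masked bases appear as N or X (hard masking) or in lowercase (soft masking), and that the unmasked input was in uppercase; the function name is hypothetical.

```python
def masked_fraction(sequence: str) -> float:
    """Fraction of bases that are masked.

    Assumption: masked bases are written as N/X or in lowercase, and the
    original (unmasked) sequence was entirely uppercase.
    """
    bases = [c for c in sequence if not c.isspace()]
    if not bases:
        return 0.0
    masked = sum(1 for c in bases if c in "NXnx" or c.islower())
    return masked / len(bases)


# Toy sequence, for illustration only: 4 hard-masked and 4 soft-masked bases of 16
print(masked_fraction("ACGTNNNNacgtACGT"))
```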
1- Ab initio methods
We will use Grail (*** ) and GenScan (***):
2- Search for transcribed regions (EST, cDNA) in genomic DNA
- Search for ESTs matching the masked genomic sequence with BLASTN,
using the est_human database (***)
- Retrieve the first 20 ESTs using the Get selected sequences utility.
(***)
- Assemble these ESTs with CAP3
(***)
- Align the cDNAs and genomic DNA with SIM4
(*** *** )
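The selection of the first 20 ESTs can also be sketched in code. This is only an illustration under an assumption that the exercise itself does not make: that the BLASTN hits have been saved in the tab-separated tabular format, where the second column holds the subject (EST) identifier. The function name and the identifiers in the example are hypothetical.

```python
from typing import List


def first_hits(tabular_text: str, limit: int = 20) -> List[str]:
    """Return the first `limit` distinct subject IDs, in the order reported.

    Assumption: tab-separated BLAST tabular output, with the query ID in
    column 1 and the subject ID in column 2. Comment lines start with '#'.
    """
    seen: List[str] = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        subject = fields[1]
        # An EST can produce several alignments; keep it only once
        if subject not in seen:
            seen.append(subject)
            if len(seen) == limit:
                break
    return seen


# Made-up hit lines, for illustration only
example = (
    "# hypothetical BLASTN output\n"
    "query1\tEST_A\t98.5\t300\n"
    "query1\tEST_A\t97.0\t120\n"
    "query1\tEST_B\t96.2\t250\n"
    "query1\tEST_C\t95.0\t200\n"
)
print(first_hits(example, limit=2))
```

The resulting list of identifiers is what would then be retrieved and passed on to the assembly step.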
3- Comparative approach: search for protein-coding regions by similarity
- Search for proteins matching the masked genomic sequence with BLASTX,
using the swissprot database (***)
- Retrieve the closest homologue (***)
- Align this protein to the genomic DNA with Wise2 at Sanger or Pasteur;
use the parameter: genewise protein to genomic DNA
(***)
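BLASTX compares the genomic sequence to proteins by first translating it in all six reading frames. The sketch below shows what that translation step looks like, using the standard genetic code. It is a simplified illustration and not the BLASTX implementation: the function names are hypothetical, and any codon containing a masked or ambiguous base is written as X.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return dna.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown or masked codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Translate a DNA sequence in the three forward and three reverse frames."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Toy sequence, for illustration only
for name, protein in sorted(six_frames("ATGGCCTAA").items()):
    print(name, protein)
```

In a genomic sequence the coding exons of one gene can fall in different frames of the same strand, which is why a spliced alignment tool such as genewise is used afterwards to join them into a single gene model.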
Now repeat the same exercise with two other human genomic sequences: