Annotation - Exofish
Exofish (Exon Finding by Sequence Homology) is a genomic tool based on the assumption that coding regions are more conserved through evolution than non-coding regions. Comparing the genomes of two species should therefore reveal coding regions as regions of homology. Exofish uses sequences of Tetraodon nigroviridis and is calibrated to annotate the human genome.
DNA sequences are compared using the tblastx algorithm, which produces a large set of alignments. A selection matrix is then used to retain only the alignments that fall in coding regions. Ecores (Evolutionary Conserved Regions) are built from the selected alignments.
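The filtering and ecore-building steps can be illustrated with a minimal Python sketch. It is not the Exofish implementation: the text does not give the content of the selection matrix, so the length and identity thresholds below are hypothetical, as are the names Alignment, SELECTION_MATRIX, passes_selection and build_ecores. The sketch also assumes that an ecore is simply the union of overlapping selected alignments on the human sequence.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Alignment:
    """One tblastx alignment, reduced to the fields this sketch needs."""
    start: int       # start on the human sequence (0-based, inclusive)
    end: int         # end on the human sequence (exclusive)
    length: int      # alignment length in amino acids
    identity: float  # percent identity


# HYPOTHETICAL selection matrix: each row is (minimum length, minimum % identity).
# The real Exofish thresholds are not given in the text.
SELECTION_MATRIX: List[Tuple[int, float]] = [
    (20, 90.0),
    (30, 70.0),
    (50, 50.0),
]


def passes_selection(aln: Alignment) -> bool:
    """Keep an alignment if it satisfies at least one row of the matrix."""
    return any(
        aln.length >= min_len and aln.identity >= min_id
        for min_len, min_id in SELECTION_MATRIX
    )


def build_ecores(alignments: List[Alignment]) -> List[Tuple[int, int]]:
    """Merge selected alignments that overlap or touch into ecore intervals."""
    selected = sorted(
        (a for a in alignments if passes_selection(a)),
        key=lambda a: a.start,
    )
    ecores: List[Tuple[int, int]] = []
    for aln in selected:
        if ecores and aln.start <= ecores[-1][1]:
            # Overlaps the current ecore: extend it if needed.
            last_start, last_end = ecores[-1]
            ecores[-1] = (last_start, max(last_end, aln.end))
        else:
            # No overlap: start a new ecore.
            ecores.append((aln.start, aln.end))
    return ecores


if __name__ == "__main__":
    example = [
        Alignment(100, 160, 20, 95.0),   # kept by the (20, 90.0) row
        Alignment(150, 240, 30, 75.0),   # kept by the (30, 70.0) row
        Alignment(500, 560, 20, 60.0),   # rejected: too short for its identity
        Alignment(900, 1050, 50, 55.0),  # kept by the (50, 50.0) row
    ]
    print(build_ecores(example))  # [(100, 240), (900, 1050)]
```

In this toy example, the first two alignments overlap and are merged into a single ecore, while the third is discarded by the hypothetical matrix before any ecore is built.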
Exofish was tested on various sets of human genes. The percentage of false positive results is less than 5%, and the percentage of genes detected is up to 80%. Exofish allowed Genoscope to carry out the first re-evaluation of the gene content of the human genome, suggesting that it contains 28,000 to 34,000 genes rather than the 50,000 to 90,000 genes of previous estimates.