Hoppsigen
(Homologous Processed Pseudogenes database rel 3.0)
Hoppsigen is a nucleic acid sequence database of homologous processed pseudogenes. The
database is developed at the PBIL (Pôle
Bioinformatique Lyonnais).
What is a processed pseudogene?
Processed pseudogenes are retroelements (like SINEs and LINEs). They are generated in
two ways:
- Reverse transcription of a messenger RNA in germ line cells.
- Duplication of an existing processed pseudogene in germ line cells.
Compared with their functional homologous genes, processed pseudogenes have therefore
lost their introns and have no promoter (except those that are
inserted near an existing promoter sequence).
Main interest for molecular evolution
Genes giving rise to processed pseudogenes have to be expressed in germ line
cells, so identifying processed pseudogenes could help to determine precisely which
genes are expressed in these cells. Since they lack promoters, processed pseudogenes
are generally no longer transcribed and quickly accumulate mutations. For this reason
we consider these sequences non-functional: they are no longer subject to selective
constraints. They can therefore be used to estimate silent substitution rates along
genomes in relation to the GC content of the genomic regions where they were inserted.
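As an illustration of the last point, the GC content of the region surrounding an insertion can be computed as sketched below. This is only a minimal sketch and not part of Hoppsigen: the function names, the half-open coordinate convention, and the `flank` window size are assumptions chosen for the example.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases in sequence")
    return gc / (gc + at)


def flanking_gc(genomic_seq, start, end, flank=1000):
    """GC content of the regions flanking a pseudogene located at [start, end).

    Hypothetical helper: the pseudogene itself is excluded so that only the
    genomic context of the insertion is measured. `flank` is an arbitrary
    window size chosen for illustration.
    """
    left = genomic_seq[max(0, start - flank):start]
    right = genomic_seq[end:end + flank]
    return gc_content(left + right)
```

For example, `flanking_gc(region, 5000, 6200, flank=2000)` would measure the 2 kb on each side of an element occupying positions 5000 to 6199 of `region`.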
Processed pseudogene identification
Processed pseudogenes were identified by searching the complete mouse and human
genomes for sequences similar to intron-containing genes. We successively used
two methods, BLAST (Altschul et al., 1997) and SIM (Huang and Miller, 1991),
to identify all retroelements generated by reverse transcription of intron-containing
genes. Then, within these retroelements, we detected the non-functional sequences and
annotated them as processed pseudogenes.
A database of retroelements and processed pseudogenes
We have identified 12236 human retroelements and 11391 mouse retroelements. These
retroelements were annotated and stored in the HOPPSIGEN database (Homologous
processed pseudogenes). Sequences were grouped into families according to their
homology. The database contains 9821 families of human (4941) and mouse (4880)
retroelements. 5961 human retroelements and 5000 mouse retroelements were annotated
as processed pseudogenes. The database also contains the
functional genes from ENSEMBL 8.3 that are homologous to Hoppsigen retroelements.
For each family we have computed a multiple alignment between
the functional gene and its homologous retroelements using
CLUSTAL W 1.7
(Thompson et al., 1994). We also computed a phylogenetic tree for this
alignment using the neighbor-joining (NJ) method implemented in CLUSTAL W with
Kimura two-parameter (K2) distances (Kimura, 1982).
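For readers unfamiliar with K2 distances, the sketch below computes the standard Kimura two-parameter distance between two aligned sequences, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. It is an illustration only, not the CLUSTAL W code: the function name is hypothetical, and skipping every column that contains a gap or an ambiguous base is an assumption that may differ from what CLUSTAL W does.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned (same length)")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: columns with gaps or ambiguous bases are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2 correction")
    return -0.5 * math.log(term)
```

A matrix of such pairwise distances over all sequences of a family is the kind of input a neighbor-joining tree reconstruction works from.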
Querying the database
Hoppsigen is structured under the
ACNUC sequence database
management system. The Query
program makes it possible to browse the database flat files and to select sets of
homologous pseudogenes from different species according to specific criteria. If you
want to get the flat files, please contact
me.
The database is also available over the Internet, through the
WWW-Query
query engine. To access the database, you must first select search in nucleic
databases or search by families. Then you must choose Hoppsigen in the list of
available databases.
After choosing the database, you can make requests using various keywords
or, if a sequence name or a list of sequence names is known, retrieve the
corresponding sequences.
Here is a list of simple keywords:
- ppgene: this keyword allows you to select all retroelements. By combining it
with a species or a taxon, it is possible to extract the retroelements of one or a few
species.
- CDS: to select only the coding sequences of the functional parent gene.
- CDE: to select only the region of the pseudogene homologous to the CDS.
- 5'FL (or 3'FL): to select the regions of the pseudogenes similar to
the 5'NCR (or 3'NCR) of the functional gene.
A complete list of keywords and annotation descriptions is available in this
file.
Contact
If you encounter problems when installing or using
Hoppsigen, please contact Adel
Khelifi or Dominique
Mouchiroud. We also welcome any comments or suggestions on the database
and/or its interface.
If you want to use Hoppsigen in any published work, please send an e-mail beforehand to
Adel Khelifi.