Database of Homologous Sequences from Complete Genomes
HOGENOM release 02 November 2004
HOGENOM release 03 available soon (October 2005)
Release information:
Protein
Nucleotide
Previous release: here
Query Hogenom
Hogenom
HOGENOM is a database of homologous genes from fully sequenced organisms, structured under
the ACNUC sequence database management system.
It allows one to select sets of homologous genes among species,
and to visualize multiple alignments and phylogenetic trees. HOGENOM
is thus particularly useful for comparative sequence analysis, phylogeny and
molecular evolution studies. More generally, HOGENOM gives an overall view
of what is known about a particular gene family.
Content
The database itself contains all protein sequences from the EBI
proteomes data,
with some modifications to the annotations.
It also contains all the corresponding nucleotide sequences from
Genome Reviews and EMBL.
Homologous proteins are classified into families, and multiple alignments and
phylogenetic trees are computed for each family.
Sequences and related information have been structured in an ACNUC database.
A brief description of how the database is built is available
here.
The present version of HOGENOM is release 02 (November 2004). It has been built
using sequences from the European Bioinformatics Institute
proteome data (22 July 2004). It
contains a total of 626,687 protein sequences
(and 772,359 CDS) classified in
200,639 families.
Among all the proteins included in this release, 480,015 (76.6%) are classified
into 56,842 families containing at least two sequences, 143,796 (22.9%) are unique
in their family and 2,876 (0.4%) partial proteins are not attached to a family.
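The breakdown above can be checked with a few lines of arithmetic. The following Python sketch uses only the counts quoted in the text. The variable and function names are illustrative and are not part of HOGENOM.

```python
# Counts quoted in the text for HOGENOM release 02.
TOTAL_PROTEINS = 626_687
counts = {
    "in families of two or more sequences": 480_015,
    "unique in their family": 143_796,
    "partial, not attached to a family": 2_876,
}


def shares(counts, total):
    """Return each category's share of `total`, in percent."""
    return {label: 100.0 * n / total for label, n in counts.items()}


# The three categories should account for every protein in the release.
assert sum(counts.values()) == TOTAL_PROTEINS

for label, pct in shares(counts, TOTAL_PROTEINS).items():
    print(f"{label}: {pct:.2f}%")
```

The same function can be applied to any other partition of the protein set, provided the counts add up to the total.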
New feature: Cross references to Ensembl are now available in Hogenom protein annotations for human, mouse,
rat, fly, and Caenorhabditis elegans.
Species | Homo sapiens | Mus musculus | Rattus norvegicus | Drosophila melanogaster | Caenorhabditis elegans
Hogenom | 28 576 | 27 340 | 6 610 | 16 317 | 21 781
Ensembl | 22 292 | 25 383 | 22 160 | 13 526 | 19 874
Hogenom associated to Ensembl | 89 % | 93 % | 85 % | 88 % | 52 %
Ensembl associated to Hogenom | 84 % | 80 % | 23 % | 95 % | 57 %
Example:
ID ANPA_HUMAN STANDARD; PRT; 1061 AA.
AC P16066;
DT 01-APR-1990 (Rel. 14, Created)
DT 01-APR-1990 (Rel. 14, Last sequence update)
...
...
DR ENSEMBL:Homo_sapiens;ENSG00000169418;ENST00000306672;ENSP00000305386.
...
...
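The DR line in the example can be split into its four fields with a short script. This is a minimal Python sketch, assuming that every Ensembl cross reference follows the layout of the single line shown above (species, gene, transcript and protein identifiers separated by semicolons). The names `EnsemblXref` and `parse_ensembl_xref` are hypothetical and are not part of any HOGENOM or ACNUC tool.

```python
import re
from typing import NamedTuple, Optional


class EnsemblXref(NamedTuple):
    species: str
    gene_id: str
    transcript_id: str
    protein_id: str


# ASSUMPTION: layout inferred from the one example line in the text.
# Any amount of whitespace is accepted between "DR" and "ENSEMBL:".
_DR_ENSEMBL = re.compile(
    r"^DR\s+ENSEMBL:([^;]+);([^;]+);([^;]+);([^;.]+)\.?\s*$"
)


def parse_ensembl_xref(line: str) -> Optional[EnsemblXref]:
    """Return the parsed cross reference, or None if the line does not match."""
    match = _DR_ENSEMBL.match(line.strip())
    return EnsemblXref(*match.groups()) if match else None


example = (
    "DR ENSEMBL:Homo_sapiens;ENSG00000169418;"
    "ENST00000306672;ENSP00000305386."
)
xref = parse_ensembl_xref(example)
print(xref)
```

Lines that are not Ensembl cross references (ID, AC, DT and so on) return None, so the function can be applied to every line of an entry.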
Fully Sequenced Organisms
There are 182 organisms in HOGENOM, among which 13 are eukarya
(Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster,
Encephalitozoon cuniculi, Eremothecium gossypii, Guillardia theta,
Homo sapiens, Mus musculus, Oryza sativa, Plasmodium falciparum,
Rattus norvegicus, Saccharomyces cerevisiae, Schizosaccharomyces pombe)
and 20 are archaea.
Among all the proteins, 24% (150,459) belong to eukarya, 69% (434,524) belong to bacteria and 6% (41,704) belong to archaea.
WWW access
It is possible to query the database on this server through the
WWW-Query and Cross-Taxa systems. Note that
HOGENOM is split into two databases on this server: HOGENPROT
contains the protein sequences from EBI proteome data, while HOGENNUCL
contains the nucleotide sequences from EMBL.
FamFetch Graphical User Interface
FamFetch is a graphical interface specifically developed to query databases of gene families.
FamFetch is based on a client/server architecture. To access
the database you only need to install the FamFetch
application on your computer. This program, written in Java, integrates a GUI that allows
users to easily access and visualize:
- The list of the families available in the database.
- The sequence (protein or nucleotide) of the genes
defining these families.
- The alignments built with these families.
- The phylogenetic trees computed with these alignments.
In FamFetch phylogenetic trees, genes are colored according to the species
from which they come.
The user can modify the color table according to the taxa (at any taxonomic level) of interest.
This color table is saved in a preferences file (named .famfetch on UNIX, FamFetch.Prefs on MacOS, HobacFetch.ini on Windows systems).
The color table that is installed by default with FamFetch is dedicated to prokaryotes (for the HOBACGEN database).
You can replace this preferences file with the one we have prepared for HOGENOM, which is available here.
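When scripting the replacement of the preferences file, the file name has to be chosen per platform. The Python sketch below maps a platform to the file names given above. The location of the file is an assumption (the user's home directory), since the text does not say where FamFetch stores it, and the mapping of "MacOS" to Python's `darwin` platform is also an assumption. The function names are hypothetical.

```python
import sys
from pathlib import Path

# File names as given in the text, keyed by sys.platform prefix.
# ASSUMPTION: "MacOS" in the text corresponds to Python's "darwin".
PREFS_FILE_NAMES = {
    "win": "HobacFetch.ini",
    "darwin": "FamFetch.Prefs",
}
DEFAULT_PREFS_NAME = ".famfetch"  # UNIX systems


def prefs_file_name(platform: str = sys.platform) -> str:
    """Return the FamFetch preferences file name for a platform string."""
    for prefix, name in PREFS_FILE_NAMES.items():
        if platform.startswith(prefix):
            return name
    return DEFAULT_PREFS_NAME


def guess_prefs_path(platform: str = sys.platform, base: Path = None) -> Path:
    """Build a candidate path for the preferences file.

    ASSUMPTION: the file lives in the user's home directory unless
    `base` is given; check your own installation before overwriting.
    """
    return (base or Path.home()) / prefs_file_name(platform)


print(guess_prefs_path())
```

It is prudent to back up the existing file before copying the HOGENOM version over it.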
Server mirroring
You don't need to install the server itself to use HOGENOM from your
computer, as the client is enough for that purpose. On the other hand,
you may want to set up your own server in order to speed up your
database access and to offer that service to potential users in your
geographic area. To install a HOGENOM server, you first need to
register. From the
registration results page, you will have access to the server installation
procedure.
The whole database is available from our FTP server at URL:
ftp://pbil.univ-lyon1.fr/pub/hogenom/
Note that it is much more efficient to use a dedicated FTP client to download the
database rather than an Internet Web browser.
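For scripted downloads, a small FTP client can stand in for a browser. The following Python sketch uses the standard `ftplib` module to fetch the plain files listed at the top level of the directory above. It assumes that the server accepts anonymous logins and that the files of interest sit directly in that directory; neither is stated in the text. The function names are hypothetical.

```python
from ftplib import FTP, error_perm
from pathlib import Path
from urllib.parse import urlparse

HOGENOM_FTP_URL = "ftp://pbil.univ-lyon1.fr/pub/hogenom/"


def split_ftp_url(url):
    """Split an ftp:// URL into (host, remote directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError(f"not an FTP URL: {url}")
    return parts.hostname, parts.path or "/"


def download_directory(url, dest):
    """Download the top-level files of an FTP directory into `dest`.

    ASSUMPTION: anonymous login is allowed. Entries that cannot be
    retrieved as files (for example subdirectories) are skipped.
    """
    host, remote_dir = split_ftp_url(url)
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    fetched = []
    with FTP(host) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(remote_dir)
        for name in ftp.nlst():
            local_name = Path(name).name
            if local_name in ("", ".", ".."):
                continue
            target = dest / local_name
            try:
                with open(target, "wb") as out:
                    ftp.retrbinary(f"RETR {name}", out.write)
            except error_perm:
                target.unlink()  # remove the empty placeholder file
                continue
            fetched.append(local_name)
    return fetched


# Example call (requires network access):
# download_directory(HOGENOM_FTP_URL, "hogenom_release")
```

The download is left as a commented call so that nothing is transferred until the destination directory has been chosen.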
Important note: the SWISS-PROT entries found in HOGENOM are
copyrighted. They are produced through a collaboration between the Swiss Institute
of Bioinformatics and the European Bioinformatics Institute. There are no restrictions
on their use by non-profit institutions as long as their content is in no way modified.
Usage by and for commercial entities requires a
license agreement
(send an email to license@isb-sib.ch).
Contact and reference
If you encounter any problems when installing or using HOGENOM, please
contact Laurent Duret.
We also welcome any comments or suggestions on the database and/or its
interface.
Acknowledgements
This project is supported