HOBACGEN: Homologous Bacterial Genes Database
WARNING: HOBACGEN has now been replaced by HOGENOM.
HOGENOM contains data from all complete genomes, including, of
course, bacteria and archaea. The HOBACGEN database is therefore no longer
maintained.
HOBACGEN is a database system that contains all the protein sequences of bacteria
organized into families. It allows one to select sets of homologous genes from
bacterial species and to visualize multiple alignments and phylogenetic trees.
Thus HOBACGEN is particularly useful for comparative genomics, phylogeny and
molecular evolution studies on bacteria.
The database contains all sequences of bacteria (eubacteria and archaea) and yeast
taken from SWISS-PROT + TrEMBL (now the UniProt Knowledgebase), with some annotation
modifications. It also contains all the
corresponding nucleotide sequences in EMBL. Homologous proteins are classified into
families, and multiple alignments and phylogenetic trees are computed for each family.
A description of how the database is built is available.
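As an illustration of what classifying homologous proteins into families can look like, here is a minimal Python sketch. It assumes the homology relationships are already available as pairs of protein identifiers and groups them by single linkage with a union-find structure. The function name, the identifiers and the single-linkage rule are all hypothetical choices made for this example; this page does not state which procedure HOBACGEN actually uses.

```python
from collections import defaultdict


def group_into_families(proteins, homologous_pairs):
    """Group proteins into families by single linkage (illustrative only).

    proteins: iterable of protein identifiers (hypothetical IDs).
    homologous_pairs: iterable of (id_a, id_b) pairs assumed to be homologous.
    """
    # Each protein starts as the representative of its own family.
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the family representative,
        # shortening the path along the way (path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Merge the families of every pair declared homologous.
    for a, b in homologous_pairs:
        parent[find(a)] = find(b)

    # Collect the members of each family under its representative.
    families = defaultdict(list)
    for p in proteins:
        families[find(p)].append(p)
    return sorted(sorted(members) for members in families.values())


# Hypothetical identifiers: P1-P3 are linked, P4 and P5 have no homolog.
proteins = ["P1", "P2", "P3", "P4", "P5"]
pairs = [("P1", "P2"), ("P2", "P3")]
print(group_into_families(proteins, pairs))
```

In this toy input, the linked proteins end up in one family, while proteins with no homolog form single-member families, comparable to the sequences described on this page as unique in their family.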
The present version of HOBACGEN is release 10 (February 2002). It was built
using sequences from SWISS-PROT 40, TrEMBL 19 and TrEMBL_NEW of January 25, 2002 (now the UniProt Knowledgebase). It
contains a total of 260,025 proteins.
Among all the proteins included in this release, 193,747 (74.5%) are classified
into 23,961 families containing at least two sequences, 58,445 (22.5%)
are unique in their family, and 7,833 (3%) partial proteins are not attached
to a family.
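The figures above can be checked with a few lines of Python. This sketch only reuses the counts quoted in the text; the variable and function names are chosen for the example.

```python
# Figures quoted in the text for HOBACGEN release 10.
TOTAL_PROTEINS = 260_025
BREAKDOWN = {
    "in families of two or more sequences": 193_747,
    "unique in their family": 58_445,
    "partial, not attached to a family": 7_833,
}


def percentages(counts, total):
    """Return each count as a percentage of total, rounded to one decimal."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# The three categories should account for every protein in the release.
assert sum(BREAKDOWN.values()) == TOTAL_PROTEINS

for label, pct in percentages(BREAKDOWN, TOTAL_PROTEINS).items():
    print(f"{label}: {pct}%")
```

The three categories sum exactly to the stated total of 260,025 proteins, and the rounded shares match the percentages given in the text.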
Important note: the SWISS-PROT entries, such as those found in HOBACGEN, are
copyrighted. They are produced through a collaboration between the Swiss Institute
of Bioinformatics and the European Bioinformatics Institute. There are no restrictions
on their use by non-profit institutions as long as their content is in no way modified.
Usage by and for commercial entities requires a license agreement
(send an email to email@example.com).
Graphical User Interface
The HOBACGEN interface is based on a client/server architecture. To access
the database you only need to install the FamFetch
application on your computer. This program, written in Java, integrates a GUI that allows
users to easily access and visualize:
- The list of the families available in the database.
- The sequences (protein or nucleotide) of the genes
defining these families.
- The alignments built with these families.
- The phylogenetic trees computed with these alignments.
It is also possible to query the database on this server through the
WWW-Query system. Note that
HOBACGEN is split into two databases on this server: HOBACGEN
contains the protein sequences from SWISS-PROT + TrEMBL (now the UniProt Knowledgebase) while HOBACCGENDNA
contains the nucleotide sequences from EMBL.
You do not need to install the server itself to have HOBACGEN running on your
computer, as the client is enough for that purpose. On the other hand,
you may want to set up your own server to speed up your
database access and to offer that service to potential users in your
geographic area. To install a HOBACGEN server, you first need to
register. From the
registration results page, you will have access to the server installation
Contact and reference
If you encounter any problems when installing or using HOBACGEN, please
contact Guy Perrière.
We also welcome any comments or suggestions on the database.
If you use HOBACGEN in any published work, please cite the following reference:
Perrière, G., Duret, L. and Gouy, M. (2000) HOBACGEN: database
system for comparative genomics in bacteria. Genome Res., 10,
This project is supported by the European Commission
(contract no. QLRI-CT-2001-00015, RTD programme
"Quality of Life and Management of Living
Resources") as part of the Integr8 project.