Annotation Modifications


The annotations of the SWISS-PROT + TrEMBL and EMBL subsets used in HOVERGEN are slightly modified to include complementary data related to families and protein domains.

SWISS-PROT + TrEMBL annotations

First we add for each entry a line in the CC field that gives the number of the family the sequence belongs to:
CC   -!- GENE_FAMILY: HBG017522.
This number is also stored among the keywords associated with the corresponding entry in the ACNUC database structure. As a result, all the sequences belonging to a family can be retrieved with this number using the retrieval system Query or its on-line version, WWW-Query.
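
For readers who process flat-file entries outside of Query, the family number can also be read directly from the CC line. The following is a minimal sketch, not part of HOVERGEN itself: the function name gene_family is hypothetical, and the "HBG" followed by digits shape of the number is assumed from the single example shown above.

```python
import re

# Matches the HOVERGEN family comment line, e.g.
# "CC   -!- GENE_FAMILY: HBG017522."
# Assumption: family numbers look like "HBG" followed by digits,
# as in the one example given in the text.
GENE_FAMILY_RE = re.compile(r"^CC\s+-!-\s+GENE_FAMILY:\s*(HBG\d+)\.?\s*$")


def gene_family(entry_lines):
    """Return the family number found in an entry, or None if absent.

    `entry_lines` is an iterable of the text lines of one entry.
    """
    for line in entry_lines:
        match = GENE_FAMILY_RE.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    entry = ["CC   -!- GENE_FAMILY: HBG017522."]
    print(gene_family(entry))
```

The returned string is the same number that serves as a keyword when querying the ACNUC database.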

We also add data on the location of the PRODOM domains found in a given entry. These data are integrated in the FT field:

FT   PRODOM    begin    end       domain_ID       domain_#
FT   PRODOM        2     44       p99.1_24658     3
FT   PRODOM       45    361       p99.1_8718      8
FT   PRODOM      362    398       p99.1_133440    1
FT   PRODOM      399    494       p99.1_9971      7
The prefix of domain_ID (p99.1) indicates the PRODOM release from which these annotations were derived. The suffix (e.g. 24658) indicates the identifier of the domain in the corresponding PRODOM release.

The domain_# column gives the number of occurrences of this domain in the PRODOM database. Soon these features will be available as subsequences under the ACNUC structure.
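
The PRODOM feature lines described above can be split into their parts with a few lines of code. This is only an illustrative sketch: the names ProdomFeature and parse_prodom_line are hypothetical, and the parser assumes whitespace-separated columns in the order shown (begin, end, domain_ID, domain_#), with the release prefix separated from the domain identifier by the last underscore.

```python
from typing import NamedTuple, Optional


class ProdomFeature(NamedTuple):
    begin: int        # first position of the domain in the sequence
    end: int          # last position of the domain in the sequence
    release: str      # PRODOM release prefix, e.g. "p99.1"
    identifier: str   # domain identifier within that release, e.g. "24658"
    occurrences: int  # domain_#: occurrences of the domain in PRODOM


def parse_prodom_line(line: str) -> Optional[ProdomFeature]:
    """Parse one 'FT   PRODOM ...' line; return None for any other line."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "FT" or fields[1] != "PRODOM":
        return None
    if not (fields[2].isdigit() and fields[3].isdigit() and fields[5].isdigit()):
        # Skips the column-header line (begin, end, domain_ID, domain_#).
        return None
    # Assumption: the release prefix and the identifier are separated
    # by the last underscore of domain_ID.
    release, _, identifier = fields[4].rpartition("_")
    return ProdomFeature(int(fields[2]), int(fields[3]),
                         release, identifier, int(fields[5]))


if __name__ == "__main__":
    print(parse_prodom_line("FT   PRODOM        2     44       p99.1_24658     3"))
```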

Note: Another type of annotation is present in HOVERGEN. It relates to the use of a duplicated SWISS-PROT + TrEMBL database:

CC   -!- modified from 143G_HUMAN.

EMBL annotations

In this subset we add for each coding sequence a qualifier that gives the number of the family the gene belongs to:
FT                   /gene_family="HBG017522"
We also include in HOVERGEN descriptions of non-coding regions, for example:
FT   3'ncr           2278..2368
These subsequences can be selected and extracted from the database in the same way as CDS, using WWW-Query (see Help).
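
The two EMBL additions described in this section can be recognized in a feature table as sketched below. This is a hedged illustration rather than a HOVERGEN tool: the function name parse_embl_addition is hypothetical, only the 3'ncr key shown above is handled (other non-coding keys are not listed here), and only simple begin..end locations are assumed.

```python
import re

# /gene_family qualifier added to coding sequences, e.g.
# FT                   /gene_family="HBG017522"
GENE_FAMILY_QUALIFIER_RE = re.compile(r'^FT\s+/gene_family="([^"]+)"\s*$')

# Non-coding region feature, e.g.
# FT   3'ncr           2278..2368
# Assumption: a simple "begin..end" location with no join or complement.
NCR_RE = re.compile(r"^FT\s+(3'ncr)\s+(\d+)\.\.(\d+)\s*$")


def parse_embl_addition(line):
    """Return a tuple describing a HOVERGEN-added line, or None.

    ("gene_family", number) for the qualifier line,
    (key, begin, end) for a non-coding region feature line.
    """
    match = GENE_FAMILY_QUALIFIER_RE.match(line)
    if match:
        return ("gene_family", match.group(1))
    match = NCR_RE.match(line)
    if match:
        return (match.group(1), int(match.group(2)), int(match.group(3)))
    return None


if __name__ == "__main__":
    print(parse_embl_addition('FT                   /gene_family="HBG017522"'))
    print(parse_embl_addition("FT   3'ncr           2278..2368"))
```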