C API for remote access
Download the source code, run make, and link against the resulting libraa.a library. A programming example is given below.
API functions in alphabetical order :
codaa,
raa_acnucclose,
raa_acnucopen,
raa_acnucopen_alt,
raa_alllistranks,
raa_bcount, raa_bit0, raa_bit1,
raa_btest,
raa_copylist, raa_countfilles,
raa_decode_address,
raa_extract_1_seq,
raa_extract_interrupt,
raa_fcode,
raa_getattributes,
raa_getemptylist,
raa_getlistrank,
raa_getliststate,
raa_get_taxon_info,
raa_gfrag, raa_ghelp,
raa_iknum, raa_isenum,
raa_knowndbs,
raa_loadtaxonomy,
raa_modifylist,
raa_next_annots,
raa_nexteltinlist,
raa_nexteltinlist_annots,
raa_nextmatchkey,
print_raa_long,
raa_open_socket,
raa_opendb,
raa_opendb_pw,
raa_prep_acnuc_query,
raa_prep_extract,
raa_proc_query,
raa_read_annots,
raa_readacc,
raa_readext,
raa_readkey,
raa_readlng,
raa_readloc,
raa_readshrt,
raa_readsmj,
raa_readspec,
raa_readsub,
raa_read_first_rec,
raa_releaselist,
raa_residuecount,
raa_savelist,
raa_seq_to_annots,
raa_seqrank_attributes,
raa_setlistname,
raa_setliststate,
raa_showannots,
raa_translate_cds,
raa_translate_init_codon,
raa_zerolist,
scan_raa_long
API functions by theme : #include "raa_acnuc.h"
- Find, open, close database(s).
- Query database.
- raa_proc_query, processes a query expressed with the acnuc query language and creates the list of matching sequences or species or keywords;
- raa_nexteltinlist or raa_nexteltinlist_annots, loops through elements of list.
- raa_nextmatchkey, finds next keyword matching a template.
- raa_modifylist, modifies a sequence list by various criteria.
- raa_savelist, save names of list elements in local file.
- raa_getlistrank, raa_alllistranks, get list rank from list name, ranks of all currently defined lists.
- raa_releaselist, raa_zerolist, deletes or empties a list.
- Read sequence data and annotations.
- raa_getattributes, raa_seqrank_attributes, get sequence and/or attributes from a sequence name or accession number or ACNUC rank
- raa_gfrag, read sequence data
- raa_seq_to_annots, gain access to annotations from sequence rank.
- raa_read_annots and raa_next_annots, read successive annotation lines
- raa_showannots, processes selected annotation lines
- raa_prep_extract, raa_extract_1_seq, raa_extract_interrupt, extracts all seqs from a list to a local file
- raa_prep_coordinates, raa_1_coordinate_set, get coordinates in their parent seq of various subsequences and feature items
- raa_translate_cds, raa_translate_init_codon, codaa, translate a protein coding sequence or a codon
- Use species, keywords, accession nos.
- raa_get_taxon_info, get information about a taxon specified by name, acnuc rank or ncbi ID.
- raa_iknum, get db rank of species or keyword.
- raa_fcode, get db rank of accession no., author, reference, type.
- raa_isenum, get db rank of sequence name.
- Utility functions.
Programming example
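#include <stdio.h>
#include <stdlib.h>
#include "raa_acnuc.h"
int main(int argc, char **argv)
{
  raa_db_access *raa;
  int err, list, count, num, length;
  char *seq, *name;
  /* open the embl database on the remote server */
  err = raa_acnucopen_alt("pbil.univ-lyon1.fr", 5558, "embl", "myprog", &raa);
  if(err != 0) exit(1);
  /* build the list of sequences matching the query */
  err = raa_proc_query(raa, "j=nar and y=2000", NULL, "mylist", &list, &count, NULL, NULL);
  if(err != 0) { raa_acnucclose(raa); exit(1); }
  /* loop through list elements, starting the search at position 0 */
  num = 0;
  while((num = raa_nexteltinlist(raa, num, list, &name, &length)) != 0) {
    seq = (char *)malloc((length + 1)*sizeof(char));
    if(seq == NULL) break;
    raa_gfrag(raa, num, 1, length, seq);
    printf("Name:%s Sequence:%s\n", name, seq);
    free(seq);
  }
  raa_acnucclose(raa);
  return 0;
}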
Link this toy program named prog.c with:
setenv RAADIR name-of-dir-containing-libraa.a
gcc -o prog prog.c -I$RAADIR -L$RAADIR -lraa -lz
Typedefs :
raa_db_access: a structure containing all information related to a connection with a remote acnuc database.
typedef long long raa_long; /* scan_raa_long/print_raa_long converts from/to string decimal form */
typedef enum { raa_sub = 0, raa_loc, raa_key, raa_spec, raa_shrt, raa_lng, raa_ext, raa_smj,
raa_aut, raa_bib, raa_txt, raa_acc } raa_file;
Public fields of the raa_db_access structure :
typedef struct _raa_db_access {/* all information related to a connection with a remote acnuc database */
char *dbname; /* name of connected acnuc database */
FILE *raa_sockfdr, *raa_sockfdw; /* variables for read/write from/to connection socket */
int genbank, embl, swissprot, nbrf; /* one is true according to format of connected db */
int nseq; /* total number of sequences (and subseqs) in db */
int longa;
int maxa;
/* max widths of several db textual fields */
int L_MNEMO, WIDTH_SP, WIDTH_KW, WIDTH_SMJ, WIDTH_AUT, WIDTH_BIB, ACC_LENGTH, lrtxt;
raa_node **sp_tree; /* NULL or the full taxonomy tree */
int max_tid; /* largest correct taxon ID value */
int *tid_to_rank; /* NULL or ncbi taxon ID to acnuc rank table */
int SUBINLNG; /* true number of sequence numbers in a struct rlng record */
struct rlng {
int sub[SUBINLNG];
int next;
} *rlng_buffer;
/* supports working with selected parts of sequence annotations */
int tot_key_annots; /* number of elements of each of next three arrays */
/* uppercase names of annotation records in connected database; key_annots[0] is "ALL" */
char **key_annots;
char **key_annots_min; /* same in lowercase */
/* each element is true if annotation record is wanted; want_key_annots[0]=TRUE means all records wanted */
unsigned char *want_key_annots;
} raa_db_access;
- raa_acnucopen opens access to a remote acnuc database using the racnuc, or, if undefined, acnuc, environment variable.
int raa_acnucopen(char *clientid, raa_db_access **praa);
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value : those of raa_open_socket and raa_opendb,
or 8 if environment variables racnuc and acnuc are undefined or inadequately defined.
The value of the racnuc (or acnuc) environment variable should have the form pbil.univ-lyon1.fr:5558/embl
- raa_acnucopen_alt opens access to a remote acnuc database using explicit address information.
int raa_acnucopen_alt(char *server_ip, int s_num , char *db_name, char *clientid, raa_db_access **praa);
- server_ip : ip name of the acnuc server (e.g., "pbil.univ-lyon1.fr")
- s_num : socket number, normally 5558
- db_name : name of the database (e.g. "embl")
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value : those of raa_open_socket and raa_opendb.
- raa_open_socket opens access to the remote acnuc server
int raa_open_socket(char *serverName, int port, char *clientid, raa_db_access **praa);
- serverName : ip name of the acnuc server (e.g., "pbil.univ-lyon1.fr")
- port: port number (e.g. 5558)
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value: 0 if OK;
1 if problem with remote host name;
2 if cannot create connection with remote host;
7 if not enough memory.
- raa_opendb opens an acnuc database after raa_open_socket call
int raa_opendb(raa_db_access *raa, char *dbname);
- raa: value of the remote acnuc connection
- dbname : database name (e.g., "embl")
- return value: 0 iff OK; 3 if database is unknown by remote host;
4 if database is currently unavailable on remote host;
5 if a database was previously opened and was not closed;
9 if no socket was previously opened by raa_open_socket.
- raa_opendb_pw opens a password-protected database after raa_open_socket call
int raa_opendb_pw(raa_db_access *raa, char *db_name, void *ptr, char *(*getpasswordf)(void *) );
- raa: value of the remote acnuc connection
- db_name : database name (e.g., "embl")
- ptr : NULL or pointer to data transmitted to the getpasswordf function
- getpasswordf : pointer to password-providing function that returns the password as a writable static string
- return value: as raa_opendb, or 6 to indicate failed password-based authorization.
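The return codes listed above for raa_acnucopen, raa_open_socket, raa_opendb and raa_opendb_pw can be turned into readable messages. The following is a minimal sketch; describe_open_error is a hypothetical helper that is not part of libraa, and the message texts simply restate the descriptions given in this document.

```c
/* Hypothetical helper, not part of libraa: maps the return codes documented
   for the database-opening functions to short messages. */
const char *describe_open_error(int err)
{
    switch (err) {
    case 0: return "OK";
    case 1: return "problem with remote host name";
    case 2: return "cannot create connection with remote host";
    case 3: return "database is unknown by remote host";
    case 4: return "database is currently unavailable on remote host";
    case 5: return "a database was previously opened and was not closed";
    case 6: return "failed password-based authorization";
    case 7: return "not enough memory";
    case 8: return "environment variables racnuc and acnuc undefined or inadequately defined";
    case 9: return "no socket was previously opened by raa_open_socket";
    default: return "unknown error code";
    }
}
```

A caller could then write, for instance, fprintf(stderr, "%s\n", describe_open_error(err)) after a failed raa_acnucopen_alt call.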
- raa_decode_address to decode an ACNUC-specific URL
int raa_decode_address(char *address, char **server_ip, int *s_num, char **db_name);
- address : same form as "pbil.univ-lyon1.fr:5558/swissprot" or "pbil.univ-lyon1.fr:5558"
- server_ip : upon return, the ip name part of the address (pbil.univ-lyon1.fr)
- s_num : upon return, the socket number part (5558)
- db_name : upon return, the db name (swissprot) or NULL if absent
- return value : 0 iff OK
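To make the address format concrete, here is a sketch of how a string such as "pbil.univ-lyon1.fr:5558/swissprot" splits into its three parts. It is not the implementation of raa_decode_address: split_acnuc_address is a hypothetical standalone helper that writes into caller-supplied buffers, and its checks (non-empty host, port between 1 and 65535) are assumptions of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libraa: splits "host:port" or "host:port/db".
   Returns 0 on success, nonzero on malformed input or too-small buffers.
   db is set to the empty string when the address has no database part. */
int split_acnuc_address(const char *address, char *host, size_t hostsz,
                        int *port, char *db, size_t dbsz)
{
    const char *colon = strchr(address, ':');
    size_t hlen;
    char *end;
    long p;

    if (colon == NULL || colon == address) return 1;   /* no host part */
    hlen = (size_t)(colon - address);
    if (hlen >= hostsz || dbsz == 0) return 1;
    memcpy(host, address, hlen);
    host[hlen] = '\0';

    p = strtol(colon + 1, &end, 10);                     /* socket number */
    if (end == colon + 1 || p <= 0 || p > 65535) return 1;
    *port = (int)p;

    db[0] = '\0';
    if (*end == '/') {                                   /* optional db name */
        if (strlen(end + 1) >= dbsz) return 1;
        strcpy(db, end + 1);
    } else if (*end != '\0') {
        return 1;                                        /* trailing garbage */
    }
    return 0;
}
```

With the library itself, the same decomposition is obtained from raa_decode_address, which returns the parts through its pointer arguments.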
- raa_acnucclose to close access to the db
void raa_acnucclose(raa_db_access *raa);
- raa: value of the remote acnuc connection
- raa_getattributes to get sequence and/or attributes from a sequence name or accession number.
char *raa_getattributes(raa_db_access *raa, const char *id,
int *prank, int *plength, int *pframe, int *pgc, char **pacc, char **pdesc,
char **pspecies, char **pseq);
- raa: value of the remote acnuc connection
- id: a name or accession number
- prank: NULL or, upon return, pointer to ACNUC rank of sequence
- plength: NULL or, upon return, pointer to length of sequence
- pframe: NULL or, upon return, pointer to reading frame (0,1,2) of sequence
- pgc: NULL or, upon return, pointer to genetic code (ACNUC's) of sequence
- pacc: NULL or, upon return, pointer to primary accession no. of sequence in private memory
- pdesc: NULL or, upon return, pointer to one-line description of sequence in private memory
- pspecies: NULL or, upon return, pointer to species name of sequence in private memory
- pseq: NULL or, upon return, pointer to complete sequence in private memory
- return value: NULL if id not found, or sequence name
- raa_seqrank_attributes to get sequence and/or attributes from a sequence rank.
char *raa_seqrank_attributes(raa_db_access *raa,
int rank, int *plength, int *pframe, int *pgc, char **pacc, char **pdesc,
char **pspecies, char **pseq);
- raa: value of the remote acnuc connection
- rank: the ACNUC rank of sequence
- plength: NULL or, upon return, pointer to length of sequence
- pframe: NULL or, upon return, pointer to reading frame (0,1,2) of sequence
- pgc: NULL or, upon return, pointer to genetic code (ACNUC's) of sequence
- pacc: NULL or, upon return, pointer to primary accession no. of sequence in private memory
- pdesc: NULL or, upon return, pointer to one-line description of sequence in private memory
- pspecies: NULL or, upon return, pointer to species name of sequence in private memory
- pseq: NULL or, upon return, pointer to complete sequence in private memory
- return value: NULL if rank not found, or sequence name
- raa_gfrag to read a sequence fragment
int raa_gfrag(raa_db_access *raa, int nsub, int first, int lfrag, char *dseq);
- raa: value of the remote acnuc connection
- nsub : rank of sequence
- first : first residue to read (counting from 1)
- lfrag : number of residues to read
- dseq : character array, allocated by caller, to be filled with residues
- return value : number of residues read (can be 0)
- raa_seq_to_annots gets the address of the start of annotations for a sequence
void raa_seq_to_annots(raa_db_access *raa, int numseq, raa_long *faddr, int *div);
- raa: value of the remote acnuc connection
- numseq : rank of sequence
- *faddr : returned filled with the offset within flat file of beginning of annotations
- div : returned filled with rank of division containing the sequence
- raa_read_annots reads the first line of annotations of a sequence
char *raa_read_annots(raa_db_access *raa, raa_long faddr, int div);
- raa: value of the remote acnuc connection
- faddr : offset within flat file of beginning of annotations (typically from raa_seq_to_annots)
- div : rank of division containing the sequence
- return value : pointer to line read in static memory (NULL if error)
- raa_next_annots reads the next line of annotations of a sequence
char *raa_next_annots(raa_db_access *raa, NULL);
or
char *raa_next_annots(raa_db_access *raa, raa_long *faddr);
- raa: value of the remote acnuc connection
- *faddr : returned filled with offset within flat file of beginning of line read (can be used later by raa_read_annots)
- return value : pointer to line read in static memory (NULL if error)
- raa_iknum gets the rank of a species or a keyword
int raa_iknum(raa_db_access *raa, char *name, raa_file cas);
- raa: value of the remote acnuc connection
- name : a species or a keyword (case is not significant)
- cas : raa_spec for a species or raa_key for a keyword
- return value : rank of given name (0 if absent)
- raa_isenum gets the rank of a sequence from its name
int raa_isenum(raa_db_access *raa, char *name);
- raa: value of the remote acnuc connection
- name : a sequence name (case is not significant)
- return value : rank of given name (0 if absent)
- raa_proc_query processes a query and creates the list of matching seqs, species or keywords
int raa_proc_query(raa_db_access *raa, char *query, char **message, char *listname,
int *numlist, int *count, int *locus, int *type);
- raa: value of the remote acnuc connection
- query : a string containing a query following the acnuc query language
- message : if != NULL, returned filled with an error message in case of error in malloc'ed memory
- listname : the name to be given to the list (case is not significant)
- numlist : upon return, if no error, the rank of the created list
- count : if != NULL, returned filled with the number of elements in the created list
- locus : if != NULL, returned filled with TRUE if list contains parent sequences only
- type : if != NULL, returned filled with 'S', 'K', or 'E' for a list of seqs, keywords, or species
- return value : 0 if OK, or an error number
- raa_nexteltinlist returns the next element of a list
int raa_nexteltinlist(raa_db_access *raa, int first, int lrank, char **name, int *length);
- raa: value of the remote acnuc connection
- first : elements of list are searched after this position (initiate this to 0)
- lrank : rank of the list
- name : if != NULL, returned filled with the name of the element in static memory
- length : if != NULL, returned filled with the element length (for seq list only)
- return value : the rank of the next element in the list, or 0 if none
- raa_nexteltinlist_annots returns the next element of a sequence list and related information
int raa_nexteltinlist_annots(raa_db_access *raa, int first, int lrank, char **name, int *length, raa_long *offset, int *div);
- raa: value of the remote acnuc connection
- first : elements of list are searched after this position (initiate this to 0)
- lrank : rank of the sequence list
- name : if != NULL, returned filled with the name of the sequence in static memory
- length : if != NULL, returned filled with the sequence length
- offset : if != NULL, returned filled with the annotation offset of the seq
- div : if != NULL, returned filled with the division rank of the seq
- return value : the rank of the next sequence in the list, or 0 if none
- raa_nextmatchkey returns the next keyword matching a given pattern
int raa_nextmatchkey(raa_db_access *raa, int num, char *pattern, char **matching);
- raa: value of the remote acnuc connection
- num : rank beyond which the next matching keyword is sought (set num=2 the first time)
- pattern : must contain at least once the wild card character @ (used only if num = 2)
- matching : if not NULL, filled with matching keyword in private memory
- return value : rank of next matching keyword, or 0 if no more matching keyword.
- raa_bcount counts the number of elements in a list
int raa_bcount(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- return value : the number of elements in the list
- raa_bit1 adds an element to a list
void raa_bit1(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to add
- raa_bit0 removes an element from a list
void raa_bit0(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to remove
- raa_btest tests presence of element in a list
int raa_btest(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to test
- return value: TRUE iff element num is in list lrank
- raa_copylist duplicates a list
void raa_copylist(raa_db_access *raa, int rank_from, int rank_to);
- raa: value of the remote acnuc connection
- rank_from : rank of the list to copy
- rank_to : rank of the destination list that must have been previously allocated by e.g., raa_getemptylist
- raa_zerolist empties a list
void raa_zerolist(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank : rank of the list to empty, that must have been previously allocated by e.g., raa_getemptylist.
- raa_setliststate sets the state of a list
void raa_setliststate(raa_db_access *raa, int lrank, int locus, int type);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- locus : TRUE iff list contains only parent sequences
- type : 'S', 'K', or 'E' for list of seqs, keywords, or species.
- raa_getliststate gets the state of a list
char *raa_getliststate(raa_db_access *raa, int lrank, int *locus, int *type, int *count);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- locus : if != NULL, returned filled with TRUE iff list contains only parent sequences
- type : if != NULL, returned filled with 'S', 'K', or 'E' for list of seqs, keywords, or species.
- count : if != NULL, returned filled with the number of elements in list
- return value : the list name in static memory, or NULL if error
- raa_getemptylist finds an empty list and sets its name
int raa_getemptylist(raa_db_access *raa, char *lname);
- raa: value of the remote acnuc connection
- lname: name to give to the list;
- return value : the list rank, or 0 if none is available;
- if no list named lname existed, a new, empty one is created;
- if a list named lname already existed, its rank is returned and no change is done to that list.
- raa_setlistname sets the name of a list
int raa_setlistname(raa_db_access *raa, int lrank, char *name);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- name : name to give to the list
- return value :
- 0 : OK
- 1 : a list with that name already existed and was deleted
- -1 : no list with that rank exists
- raa_getlistrank gets the rank of a list from its name
int raa_getlistrank(raa_db_access *raa, char *name);
- raa: value of the remote acnuc connection
- name: name of the list
- return value : > 0 if OK, 0 if no list with that name exists
- raa_releaselist releases a list
int raa_releaselist(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : 0 if OK, != 0 if no list with that rank exists
- raa_residuecount counts residues in a list
char *raa_residuecount(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : total number of residues (nucleotides or amino acids) in all sequences of the list as a char string in static memory (caution: may require a 64-bit integer to be sscanf'ed).
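Because the residue count comes back as a decimal string that may exceed the range of a 32-bit int, it is safest to decode it into a long long. The sketch below uses the standard strtoll function; parse_residue_count is a hypothetical helper, not a libraa function, and rejecting negative values is an assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libraa: decodes the decimal string returned
   by raa_residuecount into a 64-bit integer.
   Returns 0 on success, nonzero if txt is not a valid non-negative count. */
int parse_residue_count(const char *txt, long long *out)
{
    char *end;
    long long v;

    errno = 0;
    v = strtoll(txt, &end, 10);
    if (end == txt || errno == ERANGE || v < 0) return 1;
    *out = v;
    return 0;
}
```

Typical use would be parse_residue_count(raa_residuecount(raa, lrank), &total), followed by printing total with the %lld format.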
- raa_countfilles counts the number of subsequences present in a list
int raa_countfilles(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : the number of subsequences present
- raa_alllistranks gets the ranks of all currently defined lists
int raa_alllistranks(raa_db_access *raa, int **ranks);
- raa: value of the remote acnuc connection
- ranks: upon return, an array of ranks of used lists in malloc'ed memory
- return value : the number of used lists
- raa_fcode returns the rank of a record of an index file from its key
int raa_fcode(raa_db_access *raa, raa_file cas, char *name);
- raa: value of the remote acnuc connection
- cas: one of raa_aut raa_bib raa_acc raa_smj raa_sub
- name: the record key
- return value : the rank of the corresponding key
- raa_read_first_rec returns the total number of records in an index file
int raa_read_first_rec(raa_db_access *raa, raa_file cas);
- raa: value of the remote acnuc connection
- cas: an index file expressed using the raa_file enumeration
- return value : the total number of records in the index file
- int atoi_u(const char *p);
decodes a string as an unsigned decimal integer
- raa_readsub reads a SUBSEQ record
char *raa_readsub(raa_db_access *raa, int num, int *plength, int *ptype, int *pext, int *plkey, int *plocus, int *pframe, int *pgencode);
- raa: value of the remote acnuc connection
- num: seq rank
- plength: upon return, filled, if != NULL, with seq length
- ptype : upon return, filled, if != NULL, with rank of seq type
- pext : upon return, filled, if != NULL, with
- > 0 indicates a subsequence and pext is a record # in EXTRACT
- ≤ 0 indicates a parent sequence and -pext is the start of long list of subsequences
- plkey : upon return, filled, if != NULL, with start of short list of keywords
- plocus : upon return, filled, if != NULL, with LOCUS rank for a parent sequence or 0 for a subsequence
- pframe : upon return, filled, if != NULL, with reading frame (0,1, or 2)
- pgencode : upon return, filled, if != NULL, with genetic code (0 is standard)
- return value : sequence name in static memory or NULL if error
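The sign convention of the pext field is easy to get wrong, so here is a small sketch that decodes it according to the description above. The pext_info type and decode_pext function are hypothetical names introduced for illustration only; they are not part of libraa.

```c
/* Hypothetical type, not part of libraa: the decoded meaning of pext. */
typedef struct {
    int is_subsequence;  /* 1 for a subsequence, 0 for a parent sequence */
    int extract_record;  /* record # in EXTRACT (subsequences only, else 0) */
    int longlist_start;  /* start of long list of subsequences (parents only, else 0) */
} pext_info;

/* Decodes the value returned through the pext argument of raa_readsub:
   > 0 means a subsequence, <= 0 means a parent sequence. */
pext_info decode_pext(int pext)
{
    pext_info info = {0, 0, 0};
    if (pext > 0) {
        info.is_subsequence = 1;
        info.extract_record = pext;
    } else {
        info.longlist_start = -pext;
    }
    return info;
}
```

For a subsequence, extract_record can then be passed to raa_readext; for a parent sequence, longlist_start is the record to give to raa_readlng.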
- raa_readloc reads a LOCUS record
char *raa_readloc(raa_db_access *raa, int num, int *sub, int *pnuc, int *spec, int *host,
int *plref, int *molec, int *placc, int *org);
- raa: value of the remote acnuc connection
- num: rank of LOCUS record
- sub, pnuc, spec, host, plref, molec, placc, org: fields of the record (any pointer can be NULL for no value returned)
- return value: the date as a private character string
- raa_readspec reads a SPECIES record
char *raa_readspec(raa_db_access *raa, int num, char **plibel, int *plsub, int *desc, int *syno, int *plhost);
- raa: value of the remote acnuc connection
- num: rank of SPECIES record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plsub, desc, syno, plhost: returned with data read from the record
- return value: pointer to name of species in private memory
- raa_readkey reads a KEYWORDS record
char *raa_readkey(raa_db_access *raa, int num, char **plibel, int *plsub, int *desc, int *syno);
- raa: value of the remote acnuc connection
- num: rank of KEYWORDS record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plsub, desc, syno: if != NULL, returned with data read from the record
- return value: pointer to name of keyword in private memory
- raa_readsmj reads an SMJYT record
char *raa_readsmj(raa_db_access *raa, int num, char **plibel, int *plong);
- raa: value of the remote acnuc connection
- num: rank of SMJYT record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plong: if != NULL, upon return points to data read from record
- return value: name as a private string
- raa_readacc reads an ACCESS record
char *raa_readacc(raa_db_access *raa, int num, int *plsub);
- raa: value of the remote acnuc connection
- num: rank of ACCESS record
- plsub: if != NULL, upon return points to data read from record
- return value: name as a private string
- raa_readext reads an EXTRACT record
int raa_readext(raa_db_access *raa, int num, int *mere, int *deb, int *fin);
- raa: value of the remote acnuc connection
- num: rank of record
- mere, deb, fin: if != NULL, upon return point to data read from record
- return value: 0 or rank of next chained record
- raa_readlng reads a LONGL record
int raa_readlng(raa_db_access *raa, int point);
- raa: value of the remote acnuc connection
- point: rank of record
- return value: 0 or rank of next chained record
The read data is placed in structure pointed by raa->rlng_buffer
- raa_readshrt reads a short list element
unsigned raa_readshrt(raa_db_access *raa, unsigned point, int *val);
- raa: value of the remote acnuc connection
- point: rank of element
- val: upon return, the element value
- return value: 0 or the rank of the next short list element
- raa_long scan_raa_long(char *txt);
decodes a string as a number of type raa_long (capable of storing large file offset)
- char *print_raa_long(raa_long val, char *buffer);
encodes a number of type raa_long (capable of storing large file offset) as a string in buffer argument.
Returns buffer.
- char *raa_ghelp(raa_db_access *raa, char *hname, char *topic);
- raa: value of the remote acnuc connection
- hname: one of "HELP" or "HELP_WIN"
- topic: name of a help topic
Returns the whole text of the topic from HELP or HELP_WIN as one string in private memory
- raa_savelist saves in a local file names or acc. nos of members of a list
int raa_savelist(raa_db_access *raa, int lrank, FILE *out, int use_acc, char *prefix);
- raa: value of the remote acnuc connection
- lrank: rank of list to be saved in a file (can be seq, species or keyw list)
- out: opened file where to save list members
- use_acc: if TRUE, save accession numbers, if FALSE, save names of list members
- prefix: if != NULL, write prefix before each name of each member in file out
- return value: 0 iff ok
- raa_modifylist modifies list according to length or date or by scanning the annotation of its elements
int raa_modifylist(raa_db_access *raa, int lrank, char *type, char *operation, int *pnewlist,
int (*check_interrupt)(void), int *p_processed);
- raa: value of the remote acnuc connection
- lrank: rank of list to be modified (must be a seq list)
- type: "length" or "date" or "scan"
- operation: (for length) ">10000" or "<500"
(for date) ">1/jan/2003" or " < 29/FEB/96"
(for scan) "string-to-be-searched-for".
The prep_getannots command must be sent to the server with functions sock_fputs and read_sock before using the scan operation
- pnewlist: upon return, points to rank of newly created list containing result of operation
- check_interrupt: NULL or pointer to a function that will be called iteratively by the function and that should return TRUE iff caller wants to interrupt the modification operation
- p_processed: NULL, or pointer to value that will be set, upon return, to the number of list elements processed by the function until interruption or completion (for scan operation only)
- return value: 0 if ok; 2 syntax error in operation; 3 creation of new list is impossible
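The operation strings for the "length" and "date" types follow a simple pattern: a comparison sign followed by a number or a day/month/year date. The sketch below builds a date operation in the style of the ">1/jan/2003" example; build_date_operation is a hypothetical helper, not part of libraa, and the three-letter English month abbreviations other than jan and feb are an assumption extrapolated from the two examples above.

```c
#include <stdio.h>

/* Hypothetical helper, not part of libraa: writes a date operation such as
   ">1/jan/2003" into buf for use with raa_modifylist(raa, lrank, "date", buf, ...).
   Month names beyond jan and feb are assumed, not taken from the documentation.
   Returns 0 on success, nonzero on bad arguments or too-small buffer. */
int build_date_operation(char *buf, size_t bufsz, char cmp,
                         int day, int month, int year)
{
    static const char *months[] = { "jan", "feb", "mar", "apr", "may", "jun",
                                    "jul", "aug", "sep", "oct", "nov", "dec" };
    int n;

    if (cmp != '<' && cmp != '>') return 1;
    if (month < 1 || month > 12 || day < 1 || day > 31) return 1;
    n = snprintf(buf, bufsz, "%c%d/%s/%d", cmp, day, months[month - 1], year);
    return (n < 0 || (size_t)n >= bufsz) ? 1 : 0;
}
```

Building the string programmatically avoids the syntax errors that raa_modifylist reports with return value 2.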
- raa_knowndbs gets name and description of all known databases
int raa_knowndbs(raa_db_access *raa, char ***pnames, char ***pdescriptions);
- raa: value of the remote acnuc connection
- pnames: points to array of strings loaded with database names (in malloc'ed memory)
- pdescriptions: points to array of strings loaded with database descriptions (in malloc'ed memory)
- return value: number of elements in tables pnames and pdescriptions
- raa_prep_extract prepares for extraction of all members of a sequence list to a local file
void *raa_prep_extract(raa_db_access *raa, char *format, FILE *out, char *operation, char *feature_name, char *bounds, char *min_bounds, char **pmessage, int lrank);
- raa: value of the remote acnuc connection
- format: "fasta" or "flat" (e.g., genbank, embl) or "acnuc"
- out: FILE * variable to which extracted data should be sent. Left open after end of extraction operation.
- operation: "simple", "translation" (translates on the fly CDS sequences), "feature" (extracts fragment corresponding to given feature name), "fragment", or "region"
- feature_name: NULL or name of desired feature
- bounds, min_bounds: NULL unless operation is "fragment" or "region"
- pmessage: returned set to NULL or to error message
- lrank: rank of sequence list
- return value: NULL iff error
- raa_extract_1_seq successively extracts one sequence from list.
int raa_extract_1_seq(void *opaque);
- opaque: value returned by the previous call to raa_prep_extract
- return value: number of extracted sequences (0 is possible),
or -1 when all of list was processed.
Must call this function until -1 is returned, unless raa_extract_interrupt was called.
- raa_extract_interrupt cleanly interrupts an extraction before its full completion.
int raa_extract_interrupt(raa_db_access *raa, void *opaque);
- raa: value of the remote acnuc connection
- opaque: value returned by the previous call to raa_prep_extract
- return value: number of extracted sequences (0 is possible).
- sock_fputs send a character string to server
int sock_fputs(raa_db_access *raa, char *line);
- raa: value of the remote acnuc connection
- return value: 0 iff success
- sock_flush flush output to server
int sock_flush(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: 0 iff success
Very rarely needed, because read_sock calls sock_flush.
- read_sock read a character line received from server
char *read_sock(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: a full line of data received from server in private memory, or NULL if communication with server is lost.
- raa_error_mess_proc A global variable that points to a function called when connection gets lost
void (*raa_error_mess_proc)(raa_db_access *raa, char *message);
This function should call raa_acnucclose.
Usage example :
void my_error_proc(raa_db_access *raa, char *message)
{
fprintf(stderr,"%s from database %s\n", message, raa->dbname);
raa_acnucclose(raa);
}
raa_error_mess_proc = my_error_proc;
When no such function is assigned to raa_error_mess_proc, exit(0) is called after connection loss.
- raa_translate_cds translates a protein coding sequence
char *raa_translate_cds(raa_db_access *raa, int numseq);
- raa: value of the remote acnuc connection
- numseq: rank in db of a protein coding sequence (a CDS feature entry)
- return value: the resulting protein sequence, using the adequate genetic code and initiation codon translation, in private memory, or NULL if error.
- raa_translate_init_codon translates the first codon of a CDS
char raa_translate_init_codon(raa_db_access *raa, int numseq);
- raa: value of the remote acnuc connection
- numseq: rank in db of a protein coding sequence (a CDS feature entry)
- return value: the resulting amino acid
- codaa translates a codon
char codaa(char *codon, int gc);
- codon: a trinucleotide
- gc: the genetic code to be used (typically returned by raa_readsub)
- return value: the resulting amino acid
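As an illustration of what translating a codon involves, the sketch below looks up a trinucleotide in the standard genetic code only. It is not the code of codaa and does not model ACNUC's genetic code numbers; translate_codon_standard is a hypothetical name, and returning 'X' for an unrecognized codon is a choice of this sketch, not documented behavior of the library.

```c
#include <ctype.h>

/* Index of a base in the order T, C, A, G (U accepted as T); -1 if invalid. */
static int base_index(char c)
{
    switch (toupper((unsigned char)c)) {
    case 'T': case 'U': return 0;
    case 'C': return 1;
    case 'A': return 2;
    case 'G': return 3;
    default:  return -1;
    }
}

/* Hypothetical helper, not part of libraa: translates one codon with the
   standard genetic code. Returns '*' for stop codons and 'X' when the codon
   contains a character other than A, C, G, T or U. */
char translate_codon_standard(const char *codon)
{
    /* amino acids for the 64 codons, bases ordered T, C, A, G at each position */
    static const char table[] =
        "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
        "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG";
    int i, j, k;

    if ((i = base_index(codon[0])) < 0) return 'X';
    if ((j = base_index(codon[1])) < 0) return 'X';
    if ((k = base_index(codon[2])) < 0) return 'X';
    return table[16 * i + 4 * j + k];
}
```

With the library, codaa should be preferred because it takes the genetic code of the sequence into account through its gc argument.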
- raa_prep_coordinates prepares a coordinate extraction
void *raa_prep_coordinates(raa_db_access *raa, int lrank, int seqnum,
char *operation, char *feature_name, char *bounds, char *min_bounds);
See extractseqs for the semantics of this function.
- raa: value of the remote acnuc connection
- lrank: the rank of a sequence list
- seqnum: the acnuc number of a (sub)sequence
only one of the first 2 arguments is non zero.
- operation: "simple","fragment","feature","region"
- feature_name: the name of a feature key (e.g.: "cds", "tRNA")
- bounds: syntax by examples: "10,40" "-10, 40" "-10,e+10" "e-10,E+100" where e/E means sequence end
- min_bounds: minimum extension of required fragment, same syntax as bounds argument
- return value: NULL if error, or an opaque pointer to be transmitted to raa_1_coordinate_set
- raa_1_coordinate_set gets one set of sequence coordinates
int *raa_1_coordinate_set(void *v);
- v: the opaque pointer returned by a previous call to raa_prep_coordinates
- return value: NULL or an int array containing 3*C + 1 values, where C is the 1st array element and other
triples of elements are sequence-number, first-coordinate, last-coordinate.
This function must be called repeatedly until it returns NULL.
first-coordinate > last-coordinate indicates the complementary strand of the parent sequence.
Usage example:
int mylist, *table, i, j, count; void *v; raa_db_access *raa;
raa_proc_query(raa, "sp=bos taurus", NULL, "bos", &mylist, NULL, NULL, NULL);
v = raa_prep_coordinates(raa, mylist, 0, "region", "CDS", "-10000,-1", "-2000,-1");
if(v == NULL) exit(1);
while( (table = raa_1_coordinate_set(v) ) != NULL) {
count = table[0] ;
j = 0;
for(i=0; i < count; i++) {
printf("%d %d %d\n", table[j+1], /* acnuc number of the sequence */
table[j+2], /* start position in this sequence */
table[j+3]); /* end position in this sequence */
j += 3;
}
}
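The layout of the array returned by raa_1_coordinate_set can also be handled by small helpers that work on the table alone. The sketch below is only an illustration: the function names are hypothetical, and it assumes that first-coordinate and last-coordinate are both inclusive, so that a pair covers |last - first| + 1 residues.

```c
#include <stdlib.h>

/* Nonzero when a (first, last) pair lies on the complementary strand
   of the parent sequence, i.e. when first > last. */
int on_complementary_strand(int first, int last)
{
    return first > last;
}

/* Total number of residues covered by one coordinate set.
   table[0] is the triple count C; triple k (0-based) occupies
   table[3*k+1] (sequence number), table[3*k+2] (first coordinate)
   and table[3*k+3] (last coordinate). */
long coordinate_set_length(const int *table)
{
    long total = 0;
    int count = table[0];
    for (int k = 0; k < count; k++) {
        int first = table[3 * k + 2];
        int last = table[3 * k + 3];
        /* assumed inclusive bounds, either strand */
        total += labs((long)last - (long)first) + 1;
    }
    return total;
}
```

Inside the while loop of the usage example, coordinate_set_length(table) could be called on each returned table before the next call to raa_1_coordinate_set.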
- raa_get_taxon_info gets information about a taxon specified by name, acnuc rank or ncbi ID
char *raa_get_taxon_info(raa_db_access *raa, char *name, int rank, int tid,
int *p_rank, int *p_tid, int *p_parent, struct raa_pair **p_desc_list);
- raa: value of the remote acnuc connection
- name: NULL or a taxon name (case is not significant)
- rank: used only if name==NULL, the acnuc rank of a taxon
- tid: used only if name==NULL && rank==0, an ncbi taxon ID
- p_rank: if p_rank != NULL, *p_rank returned with the taxon acnuc rank
- p_tid: if p_tid != NULL, *p_tid returned with the taxon ncbi ID
- p_parent: if p_parent != NULL, *p_parent returned with acnuc rank of taxon's parent in species tree
- p_desc_list: if p_desc_list != NULL, *p_desc_list returned with first element of chain of taxon's descendants in tree. All descendants of taxon Escherichia can be found with:
struct raa_pair *pair;
raa_get_taxon_info(raa, "Escherichia", 0, 0, NULL, NULL, NULL, &pair);
while(pair != NULL) {
printf("Name: %s Rank:%d TID:%d\n", pair->value->name, pair->value->rank, pair->value->tid);
pair = pair->next;
}
- return value: taxon name or NULL if any error
This function may take a few seconds on its first call, but all subsequent calls are fast.
- raa_loadtaxonomy loads full species taxonomy of the current database as a tree structure in memory.
int raa_loadtaxonomy(raa_db_access *raa, char *rootname,
int (*progress_function)(int percent, void *data), void *progress_arg,
int (*need_interrupt_function)(void *data), void *interrupt_arg);
- raa: value of the remote acnuc connection
- rootname: (read only) name to be given to the species tree root
- progress_function: NULL or function that gets called every time tree loading progresses by 1% and that should return TRUE when the calling program wants an opportunity to request interruption
- progress_arg: transmitted as 2nd argument of progress_function
- need_interrupt_function: NULL or function that gets called after progress_function returned TRUE and that should return TRUE when interruption of tree loading is desired
- interrupt_arg: argument transmitted to need_interrupt_function
- return value: 0 iff no error
This call initializes the sp_tree, tid_to_rank and max_tid fields of the raa_db_access structure.
sp_tree[i] is the species tree node representing taxon of acnuc rank i.
sp_tree[2] is the root of this tree.
tid_to_rank[i] is the acnuc rank of the taxon with ncbi ID i (0 ≤ i ≤ max_tid), or 0 if no such acnuc taxon exists.
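The two callbacks described above can be sketched as a pair of functions sharing one state structure through their void * argument. Everything named here (load_state, report_progress, want_interrupt) is hypothetical, and the sketch assumes that TRUE is any nonzero int.

```c
#include <stdio.h>

/* Hypothetical state shared by both callbacks */
struct load_state {
    int last_percent;   /* most recent percentage reported */
    int stop_requested; /* set elsewhere (e.g. by a GUI) to cancel loading */
};

/* Matches int (*progress_function)(int percent, void *data).
   Records every report, and returns 1 every 10% so that the
   interrupt check gets a chance to run. */
int report_progress(int percent, void *data)
{
    struct load_state *st = data;
    st->last_percent = percent;
    if (percent % 10 != 0) return 0;
    fprintf(stderr, "taxonomy: %d%%\n", percent);
    return 1;
}

/* Matches int (*need_interrupt_function)(void *data).
   Returns nonzero when loading should be abandoned. */
int want_interrupt(void *data)
{
    const struct load_state *st = data;
    return st->stop_requested;
}
```

With an open connection raa and a struct load_state st, the call would then presumably read raa_loadtaxonomy(raa, "root", report_progress, &st, want_interrupt, &st).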
- raa_node One node of the species tree
typedef struct raa_node {
char *name; /* taxon name */
char *libel; /* taxon libel */
char *libel_upcase; /* taxon libel converted to upper case */
int rank; /* taxon acnuc rank */
int tid; /* taxon ncbi ID */
int count; /* number of seqs attached to taxon or below in database */
struct raa_node *parent; /* taxon's parent in species tree, or NULL if node is species tree root */
struct raa_pair {
struct raa_node *value; /* one descendant */
struct raa_pair *next; /* NULL or points to next descendant */
} *list_desc; /* points to chained list of taxon's descendants in species tree */
/* taxon's next synonym, as a closed loop where a single member has parent != NULL */
struct raa_node *syno;
} raa_node; /* one species tree node */
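The parent field is enough to walk from any node up to the tree root. The sketch below shows this on a reduced copy of the structure that keeps only the fields it needs, so that it compiles on its own; when building against the library, this local typedef would be dropped in favour of raa_acnuc.h. The function names taxon_depth and format_lineage are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Reduced stand-in for raa_node: only the fields used below */
typedef struct raa_node {
    char *name;              /* taxon name */
    int rank;                /* taxon acnuc rank */
    int tid;                 /* taxon ncbi ID */
    struct raa_node *parent; /* NULL for the tree root */
} raa_node;

/* Number of parent links between node and the root (root has depth 0). */
int taxon_depth(const raa_node *node)
{
    int depth = 0;
    while (node->parent != NULL) {
        node = node->parent;
        depth++;
    }
    return depth;
}

/* Write the lineage "root; ...; node" into buf.
   Returns 0 on success, -1 if buf is too small or the chain of
   parents is longer than the local array. */
int format_lineage(const raa_node *node, char *buf, size_t size)
{
    const raa_node *chain[256];
    int n = 0;
    size_t used = 0;
    for (; node != NULL && n < 256; node = node->parent) chain[n++] = node;
    if (node != NULL || size == 0) return -1;
    buf[0] = '\0';
    while (n-- > 0) { /* emit root first */
        int w = snprintf(buf + used, size - used, "%s%s",
                         chain[n]->name, n > 0 ? "; " : "");
        if (w < 0 || (size_t)w >= size - used) return -1;
        used += (size_t)w;
    }
    return 0;
}
```

According to the comment on the syno field, only one member of a synonym loop has a non-NULL parent, so a walk started from another member of the loop would stop at once; following syno first may be needed in that case.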
- raa_prep_acnuc_query initializes the list of annotation record names.
int raa_prep_acnuc_query(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: number of available lists on server or -1 if error
This call initializes the tot_key_annots, key_annots, key_annots_min and want_key_annots fields of the raa_db_access structure.
- raa_showannots Processes selected annotation lines of a sequence.
void raa_showannots(raa_db_access *raa, int seqnum, char **featurekey_name,
int featurekey_count, int *new_choice, void (*outoneline)(char *, void *), void *pdata);
- raa: value of the remote acnuc connection
- seqnum: rank of target sequence
- featurekey_name: NULL or array of names of desired feature keys (e.g. CDS, TRNA) if only part of the feature table is targeted
- featurekey_count: 0 or number of elements in array featurekey_name
- *new_choice: (input) TRUE iff choice of desired annotation lines has changed since previous raa_showannots call; (output) FALSE
- outoneline: function that gets called for each matching annotation line with 2 arguments: the matching line and a pointer to some data
- pdata: NULL or data pointer transmitted as 2nd argument to outoneline function calls
Targeted annotation records are specified by the want_key_annots field of the raa_db_access structure: set want_key_annots[i] to TRUE iff annotation record named key_annots[i] is targeted. Alternatively, set want_key_annots[0] to TRUE to target all annotation records.
The featurekey_name array makes it possible to further restrict the targeted parts of the feature table.
Function raa_prep_acnuc_query should be called once before using raa_showannots any number of times.
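A possible outoneline callback is sketched below: it does not print anything but accumulates simple statistics in a structure received through pdata. The names annot_stats and tally_line are hypothetical, and the sketch assumes that each line is passed as a NUL-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical accumulator handed to the callback through pdata */
struct annot_stats {
    int lines;      /* number of annotation lines received */
    size_t longest; /* length of the longest line seen */
};

/* Matches void (*outoneline)(char *, void *):
   first argument is the matching line, second is pdata. */
void tally_line(char *line, void *pdata)
{
    struct annot_stats *stats = pdata;
    size_t len = strlen(line);
    stats->lines++;
    if (len > stats->longest) stats->longest = len;
}
```

After raa_prep_acnuc_query(raa) and setting raa->want_key_annots[0] to TRUE, a call along the lines of raa_showannots(raa, seqnum, NULL, 0, &new_choice, tally_line, &stats) would then fill stats with one tally_line call per annotation line.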