Python API for remote ACNUC access

Download source code.
Build with : make python
Use with : export PYTHONPATH='directory of acnucmodule.so' ; python
>>> import acnuc


All functions in alphabetical order:
acnucclose, acnucopen, alllistranks, bcount, bit0, bit1, btest, copylist, countfilles, extract_1_seq, extract_interrupt, fcode, followshrt2, get_current_db, get_list_open_dbs, get_next_desc, get_taxon_info, getannots, getattributes, getemptylist, getlistrank, getliststate, gfrag, iknum, isenum, knowndbs, maxlists, modifylist, next_annots, next_annots_offset, nexteltinlist, nexteltinlist_annots, nextmatchkey, open_socket, opendb, opendb_pw, prep_extract, prep_getannots, proc_query, proc_requete, py_codaa, py_read_sock, py_sock_flush, py_sock_fputs, read_annots, read_first_rec, readacc, readaut, readbib, readext, readkey, readlng, readloc, readshrt, readsmj, readspec, readsub, releaselist, residuecount, savelist, seq_to_annots, set_current_db, setlistname, setliststate, translate_cds, translate_init_codon, versionstring, zerolist

Function descriptions:

    acnucclose(...)
        closes access to the db.
    
    acnucopen(...)
        opens access to a remote acnuc database using address info.
        
        Optional keywords:
             db_name : name of the database (default 'embl')
             port : port number (default 5558)
             server_ip : ip name of the acnuc server (default 'pbil.univ-lyon1.fr')
             maxlists : maximum number of distinct lists allowed (default 50);
                        call maxlists() afterwards to get the effective maximum number of lists
        Return value :
             0 iff OK
             1 if problem with remote host name
             2 if cannot create connection with remote host
             3 if database is unknown by remote host
             4 if database is currently unavailable on remote host
             5 if a database was previously opened and was not closed
             6 if authorization failed for password-protected database
             7 if not enough memory
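
A minimal sketch of turning the return codes above into Python exceptions. The helper name open_acnuc and the ACNUCOPEN_ERRORS table are illustrative and not part of the module; passing the options by keyword is an assumption based on the "Optional keywords" wording.

```python
from contextlib import contextmanager

# Messages paraphrase the return codes documented for acnucopen().
ACNUCOPEN_ERRORS = {
    1: "problem with remote host name",
    2: "cannot create connection with remote host",
    3: "database is unknown by remote host",
    4: "database is currently unavailable on remote host",
    5: "a database was previously opened and was not closed",
    6: "authorization failed for password-protected database",
    7: "not enough memory",
}


@contextmanager
def open_acnuc(acnuc, **options):
    """Open a database, yield the module, and always close the db again.

    `acnuc` is the imported module; `options` are forwarded unchanged
    (db_name, port, server_ip, maxlists). Keyword passing is assumed.
    """
    code = acnuc.acnucopen(**options)
    if code != 0:
        message = ACNUCOPEN_ERRORS.get(code, "unknown error")
        raise RuntimeError("acnucopen failed (%d): %s" % (code, message))
    try:
        yield acnuc
    finally:
        acnuc.acnucclose()


# Usage (needs the real module and network access):
#   import acnuc
#   with open_acnuc(acnuc, db_name="embl") as db:
#       print(db.maxlists())
```

Taking the module as a parameter keeps the helper importable on machines where acnucmodule.so has not been built.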
    
    alllistranks(...)
        all currently defined lists.
        
        Return value:
            list of ranks of all currently defined lists
    
    bcount(...)
        counts the number of elements in a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value:
            number of elements in the list
    
    bit0(...)
        removes an element from a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to remove
    
    bit1(...)
        adds an element to a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to add
    
    btest(...)
        tests presence of element in a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to test
        
        Return value:
            1 if element num is in list lrank, 0 otherwise
    
    copylist(...)
        duplicates a list.
        
        Keywords :
            lfrom : rank of the list to copy
            lto : rank of the destination list that must have been previously allocated
    
    countfilles(...)
        counts the number of subsequences present in a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value :
            number of subsequences present
    
    extract_1_seq(...)
        successively extracts one sequence from list [see prep_extract()].
        
        Return value :
            number of extracted sequences (0 is possible), or -1 when
                   all of list was processed. Must call this function
                   until -1 is returned or call extract_interrupt().
    
    extract_interrupt(...)
        interrupts an extraction in progress. Can be called after prep_extract()
        and before repeated calls to extract_1_seq() have returned -1.
    
    fcode(...)
        returns the rank of a record of an index file from its key.
        
        Keywords :
            name : record key
            type : 'AUT', 'BIB','ACC', 'SMJ' or 'SUB'
        
        Return value :
            rank of the corresponding key
    
    followshrt2(...)
        reads a short list element.
        
        Keywords :
            num : start of list
            rank : rank within list (start with 0)
            kind : kind of list: 'sub_of_bib', 'spec_of_loc', 'bib_of_loc', 'aut_of_bib', 'bib_of_aut',
                   'sub_of_acc', 'key_of_sub', 'acc_of_loc'
        
        Return dictionary :
            "val" : list element
            "num" : adequate to access next list element (0 when list is finished)
            "rank" : adequate to access next list element
    
    get_current_db(...)
        returns a pair of data identifying the currently opened db:
        the db name as a string and an opaque string.
    
    get_list_open_dbs(...)
        list all currently opened dbs.
        
        Return value :
            list of data pairs related to each opened db
    
    get_next_desc(...)
        returns one descendant species in species tree.
        
        Argument:
             integer returned by a previous call to get_taxon_info in 'desc',
             or by a previous call to get_next_desc in 'next',
             or value 0 to access the first top-level taxon
        Return dictionary :
            "name" : name of descendant taxon
            "rank" : rank of descendant taxon 
            "next" : integer giving access to next taxon's descendants (0 indicates end)
    
    get_taxon_info(...)
        returns information about given taxon (by name or rank or ncbi's ID) in species tree.
        
        Optional keywords:
             name : name of the taxon (case insensitive, default (null) )
             rank : rank of the taxon (default 0)
             tid  : ncbi's ID number of the taxon (default 0)
        Return dictionary :
            "name" : name of taxon
            "rank" : rank of taxon (0 indicates taxon name does not exist)
            "tid"  : ncbi's ID of taxon (0 indicates taxon name does not exist)
            "parent" : rank of parent taxon in species tree (2 indicates top-level taxon)
            "label" : string containing genetic code and taxonomic depth information
            "desc" : integer giving access to taxon's descendants in species tree (0 means none)
    
    getannots(...)
        Write to a local file selected annotation items from a sequence or all members of a sequence list.
        
        Argument:
            "fname" : Local filename (or '-' to indicate standard output)
        Optional keywords (use only one of them):
            "nsub" : rank of a sequence
            "lrank" : rank of a list of sequences
        Return value :
            0 indicates OK, non-zero indicates error.
        Use function prep_getannots() to define which part(s) of the sequence annotations are received.
        Use this function with caution on very large sequence lists because it can download a large volume of data.
    
    getattributes(...)
        gets various sequence attributes from name or acc. no.
        
        Keyword:
            ident : a sequence name or an accession number
        Optional keyword:
            fullsequence : 1 to get full sequence, 0 not to get sequence data (default), 2 to get also the full protein sequence
        Return dictionary :
            "name" : the sequence name
            "rank" : the sequence rank in the database
            "length" : the sequence length
            "frame" : the sequence reading frame (0,1,2)
            "gc" : the sequence genetic code (0 means standard)
                     (see list of acnuc-defined codes)
            "accession" : the sequence primary accession number
            "description" : one-line description of the sequence
            "species" : species name of the sequence
            "sequence" : full sequence (can be long!), or None if fullsequence = 0
            "protein" : full protein sequence, or None if fullsequence != 2
    
    getemptylist(...)
        finds an empty list.
        
        Keyword :
            name : name to give to the list
        Return value :
            list rank, or 0 if none is available; the list is set to zero
    
    getlistrank(...)
        gets the rank of a list from its name.
        
        Keyword :
            name : name of the list
        
        Return value :
            >0 if OK, 0 if no list with that name exists
    
    getliststate(...)
        gets the state of a list.
        
        Keyword :
            lrank: rank of the list
        
        Return dictionary :
            "locus" : 'T' iff list contains only parent sequences, 'F' otherwise
            "type" : 'SQ', 'KW', or 'SP' for list of seqs, keywords, or species
            "count" : number of elements in list
            "name" : list name
    
    gfrag(...)
        reads a sequence fragment.
        
        Keywords:
            nsub : rank of sequence
            length : number of residues to read
            start : first residue to read (counting from 1, which is the default value)
        Return dictionary :
            "length" : number of residues read (can be 0)
            "seq" : string filled with residues
    
    iknum(...)
        gets rank of a species (SP) or a keyword (KW). 
        
        Keywords:
            name : a species or keyword (case is not significant)
            cas : "KW" or "SP"
        
        Return value:
            rank of given name (0 if absent)
    
    isenum(...)
        gets rank of a sequence from its name.
        
        Keyword :
            name : a sequence name (case is not significant; must not contain \n)
        
        Return value :
            rank of given name (0 if absent)
    
    knowndbs(...)
        gets name and description of all known databases.
        
        Return value :
            dictionary of {name of database :  (status , description of database)}
            where status can be 'ON' or 'OFF' to indicate if the db is currently available
    
    maxlists(...)
        Maximum number of lists.
        
        Return value :
             Maximum number of lists that can be created
        See acnucopen() for how to change this value from its default.
    
    modifylist(...)
        modifies list according to length or date of its elements.
        
        Keywords :
            lrank : rank of list to be modified
            type : 'length', 'date'
            operation : (for length) '>10000' or '<500'
                        (for date) '>1/jan/2003' or '< 29/FEB/96'
        Return value :
            If no error, rank of newly created list containing result of operation.
    
    next_annots(...)
        reads the annotations line following the last one read.
        
        Return string :
            "seq" : line read (empty if error)
    
    next_annots_offset(...)
        reads the next line of annotations of a sequence and gives its address.
        
        Return tuple ( (div,offset),line ) :
            (div,offset) : data pair identifying the address of line read (can be used later as argument to read_annots())
            line : line read (empty if error)
    
    nexteltinlist(...)
        returns the next element of a list.
        
        Keywords :
            lrank : rank of the list
            first : elements of the list are searched after this position (initiate this to 0)
        
        Return dictionary :
             "name" : name of the element
             "length" : element length (for seq list only)
             "next" : rank of the next element in the list, or 0 if none
    
    nexteltinlist_annots(...)
        returns the next element of a sequence list and related information.
        
        Keywords :
            first : elements of the list are searched after this position
                    (initiate this to 0)
            lrank : rank of the sequence list
        
        Return dictionary :
            "name" : name of the sequence
            "length" : sequence length
            "offset" : annotation offset of the seq
            "div" : division rank of the seq
            "next" : rank of the next sequence in the list, or 0 if none
    
    nextmatchkey(...)
        returns the next keyword matching a given pattern.
        
        Keywords :
            num : rank beyond which next matching keyword is sought
                  (set num=2 the first time)
            pattern : must contain at least once the wild card character @ 
                      (used only if num = 2)
        
        Return dictionary:
            "num" : rank of next matching keyword, or 0 if none matches
            "name" : this matching keyword, if any
    
    open_socket(...)
        opens access to the remote acnuc server.
        
        Optional keywords:
             server_ip : ip name of the acnuc server (default 'pbil.univ-lyon1.fr')
             port : port number (default 5558)
        Return value :
             0 iff OK
             1 if problem with remote host name
             2 if cannot create connection with remote host
             7 if not enough memory
    
    opendb(...)
        opens an acnuc database after open_socket call.
        
        Optional keyword:
             db_name : name of the database (default 'embl')
        
        Return value :
             0 iff OK
             3 if database is unknown by remote host
             4 if database is currently unavailable on remote host
             5 if a database was previously opened and was not closed
             6 if authorization failed for password-protected database
             9 if no socket was previously opened by open_socket
    
    opendb_pw(...)
        opens a password-protected database after open_socket call.
        
        Keywords:
             db_name : name of the database
             psswrd : password 
        
        Return value :
             same as opendb
    
    prep_extract(...)
        prepares for extraction of all members of a sequence list
         to a local file :
        
        Keywords :
            format : 'fasta' or 'flat' (e.g., genbank, embl) or 'acnuc'
            fname :  name of output local file (appended if exists already) ('-' represents the standard output)
            operation : 'simple', 'translate' (translates on the fly
                        CDS sequences), 'feature' (extracts fragment
                        corresponding to given feature name), 'fragment',
                        or 'region'
            lrank: rank of sequence list
        Optional keywords :
            feature_name : name of desired feature
            bornes, min_bornes: NULL unless operation is 'fragment' or 'region'
        Return value: 
            0 if OK, 1 if any error.
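
The extraction protocol (prep_extract(), then extract_1_seq() until it returns -1, or extract_interrupt() to stop early) can be sketched as below. The helper name extract_list and its defaults are illustrative, and passing the keywords by name is an assumption.

```python
def extract_list(acnuc, lrank, fname, fmt="fasta", operation="simple"):
    """Extract all members of sequence list `lrank` to the local file `fname`.

    Returns the total number of sequences reported by extract_1_seq().
    """
    status = acnuc.prep_extract(format=fmt, fname=fname,
                                operation=operation, lrank=lrank)
    if status != 0:
        raise RuntimeError("prep_extract failed")
    total = 0
    finished = False
    try:
        while True:
            n = acnuc.extract_1_seq()
            if n == -1:            # the whole list was processed
                finished = True
                break
            total += n             # 0 is a legal per-call count
    finally:
        if not finished:
            # Leave the server in a clean state if the loop was cut short.
            acnuc.extract_interrupt()
    return total


# Usage (needs an open database and an existing sequence list):
#   n = extract_list(acnuc, lrank, "out.fasta")
```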
    
    prep_getannots(...)
        Defines what sort of annotation data will be received by future getannots() commands.
        
        Argument:
            String containing desired annotation items and feature table entries, separated by commas
            Example for EMBL formatted-database: 'KW,PR,FT|CDS,FT|rRNA'
            Example for GenBank formatted-database: 'KEYWORDS,DBLINK,FEATURES|CDS,FEATURES|rRNA'
        Return value :
            0 indicates OK, non-zero indicates error.
    
    proc_query(...)
        processes a query.
        
        Keywords :
            query : a string containing a query following the acnuc query language.
            name : the name of the list (case is not significant)
        
        Return dictionary :
            "lrank" : the rank of the created list
            "count" : the number of elements in the created list
            "locus" : 1 if list contains parent sequences only, 0 otherwise
            "type" : 'S', 'K', or 'E' for a list of seqs, keywords, or species
        
        In case of error, return dictionary:
            "error" : error message
    
    proc_requete(...)
        synonym of proc_query.
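
Because proc_query() reports failure through an "error" key rather than an exception, callers may want a thin wrapper. A sketch, with the helper name run_query and the default list name chosen for illustration, and keyword passing assumed:

```python
def run_query(acnuc, query, name="list1"):
    """Run a query and return (lrank, count), raising ValueError on error."""
    result = acnuc.proc_query(query=query, name=name)
    if "error" in result:
        raise ValueError("query %r failed: %s" % (query, result["error"]))
    return result["lrank"], result["count"]


# Usage (needs an open database; write the query in the acnuc query language):
#   lrank, count = run_query(acnuc, "<your query>", name="mylist")
#   print(count, "elements in list", lrank)
#   acnuc.releaselist(lrank=lrank)
```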
    
    py_codaa(...)
        translates a codon (trinucleotide) into an amino acid.
        
        Keywords:
            codon : a codon (trinucleotide)
            gc : genetic code (0 is default value and is standard genetic code)
                 (see list of acnuc-defined codes)
        Return character :
            the amino acid
    
    py_read_sock(...)
        reads a line of characters received from server.
        
        Return value :
            a full line of data received from server. The function exits if
                   communication with server is lost.
    
    py_sock_flush(...)
        flushes output to server.
        
        Return value :
            0 iff success
    
    py_sock_fputs(...)
        sends a character string to server according to remote acnuc access protocol.
        
        Argument :
            character string
        
        Return value :
            ≥ 0 if success
    
    read_annots(...)
        reads one line of annotations at given address.
        
        Argument tuple:
            (div,offset) : data pair identifying the address of the annotations line,
                           typically returned by seq_to_annots()
        Return string:
            line read (empty if error)
    
    read_first_rec(...)
        returns the total number of records in an index file.
        
        Keyword :
            type : an index file, either 'AUT', 'BIB', 'ACC', 'SMJ', 'SUB',
                      'LOC', 'KEY', 'SPEC', 'SHRT', 'LNG', 'EXT', 'TXT'
        
        Return value :
            total number of records in the index file
    
    readacc(...)
        reads an ACCESS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of the record
            "plsub" : start of short list of attached sequences
    
    readaut(...)
        reads an AUTHOR record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of the author
            "plref" : start of short list of attached references
    
    readbib(...)
        reads a BIBLIO record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of the reference
            "plsub" : start of short list of attached sequences
            "plaut" : start of short list of attached authors
            "journal" : rank in SMJYT of journal of publication
            "year" : rank in SMJYT of year of publication
    
    readext(...)
        reads an EXTRACT record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "mere" : rank of parent sequence of this part of the subsequence
            "deb" : position in parent sequence of the beginning of this part of the subsequence
            "fin" : position in parent sequence of the end of this part of the subsequence
                    deb > fin means this part of the subsequence is on the complementary strand of the parent
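
A subsequence is described by a chain of EXTRACT records. The sketch below collects its parts, reading a positive "ext" from readsub() as the rank of the first EXTRACT record; that reading, the keyword passing, and the helper name subsequence_parts are assumptions.

```python
def subsequence_parts(acnuc, nsub):
    """Return [(mere, deb, fin, strand), ...] for subsequence `nsub`.

    Returns an empty list for a parent sequence (ext <= 0).
    """
    ext = acnuc.readsub(num=nsub)["ext"]
    if ext <= 0:
        return []
    parts = []
    num = ext
    while num != 0:
        rec = acnuc.readext(num=num)
        # deb > fin means the part lies on the complementary strand.
        strand = "-" if rec["deb"] > rec["fin"] else "+"
        parts.append((rec["mere"], rec["deb"], rec["fin"], strand))
        num = rec["next"]          # 0 if no further chained record
    return parts


# Usage (needs an open database):
#   for mere, deb, fin, strand in subsequence_parts(acnuc, nsub):
#       print(mere, deb, fin, strand)
```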
    
    readkey(...)
        reads a KEYWORDS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of keyword
            "libel" : label
            "plsub" : 
            "desc" : 
            "syno" :
    
    readlng(...)
        reads a long list record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "val" : list of 63 integer values that, when non null, are members of the list
    
    readloc(...)
        reads a LOCUS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "date" : date
            "sub" :
            "pnuc" : 
            "spec" : 
            "host" : 
            "plref" : 
            "molec" : 
            "placc" :
            "org" :
    
    readshrt(...)
        reads a short list element.
        
        Keyword :
            num : rank of element
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "val" : element value
    
    readsmj(...)
        reads an SMJYT record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "libel" : label
            "name" : name of the record
            "plong" : start of long list of attached sequences
    
    readspec(...)
        reads a SPECIES record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of species
            "libel" : label
            "plsub" : 
            "desc" : 
            "syno" : 
            "host" :
    
    readsub(...)
        reads a SUBSEQ record.
        
        Keyword :
            num : sequence rank
        
        Return dictionary :
            "name" : sequence name
            "length" : sequence length
            "type" : rank of seq type
            "lkey" : start of short list of keywords
            "locus" : LOCUS rank for a parent sequence or 0 for a subsequence
            "frame" : reading frame
            "gc" : genetic code (0 if standard)
                     (see list of acnuc-defined codes)
            "ext" : > 0 indicates a subsequence and ext is a record
                        number in EXTRACT
                      <= 0 indicates a parent sequence and 
                        -ext is the start of long list of subsequences
    
    releaselist(...)
        releases a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value :
            0 if OK, !=0 if no list with that rank exists
    
    residuecount(...)
        counts the number of residues (nucl. or aa) in all seqs of a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value (with type 'str') :
            total number of residues
    
    savelist(...)
        saves in a local file names or acc. nos of members of a list.
        
        Keywords :
            lrank : rank of list to be saved
            file : name of the file to save list members
        
        Optional keywords :
            type : 'w' write, or 'a' (default value) append, in the file
            use_acc : if 0, save name of list members;
                      otherwise (default) accession numbers
            prefix : write prefix (default empty string) before each name
                     of each member in file
    
    seq_to_annots(...)
        gets address of start of annotations for a sequence.
        
        Keyword:
            nsub : rank of sequence
        
        Return tuple :
            (div,offset) : data pair identifying the address where seq annotations begin,
                           typically used as argument to read_annots()
    
    set_current_db(...)
        sets the currently accessed db to a new db.
        
        Argument :
            data pair related to the desired db, e.g., obtained from a previous call to get_current_db()
        
        Return value : none
    
    setlistname(...)
        sets the name of a list.
        
        Keywords :
            lrank : rank of the list
            name : name to give to the list
        
        Return value :
            0 : OK
            1 : a list with that name already existed and was deleted
            -1 : no list with that rank exists
    
    setliststate(...)
        sets the state of a list.
        
        Keywords :
            lrank: rank of the list
            locus : 'T' iff list contains only parent sequences,
                    'F' otherwise
            type : 'SQ', 'KW', or 'SP' for list of seqs, keywords, or species
    
    translate_cds(...)
        translates all of a protein-coding sequence into protein
           using adequate genetic code and reading frame.
        
        Keyword:
            nsub : rank of sequence
        Return string :
            string filled with residues
    
    translate_init_codon(...)
        translates the first codon of a protein-coding sequence into amino acid.
        
        Keyword:
            nsub : rank of sequence
        Return character :
            the amino acid
    
    versionstring(...)
        Return a string identifying the version of the currently opened db
    
    zerolist(...)
        empties a list.
        
        Keyword :
            lrank : rank of the list to empty that must have been previously allocated