Python API for remote ACNUC access

Download source code.
Build with : make python
Use with : export PYTHONPATH='directory of acnucmodule.so' ; python
>>> import acnuc


All functions in alphabetical order:
acnucclose, acnucopen, alllistranks, bcount, bit0, bit1, btest, copylist, countfilles, extract_1_seq, extract_interrupt, fcode, get_current_db, get_list_open_dbs, get_next_desc, get_taxon_info, getattributes, getemptylist, getlistrank, getliststate, gfrag, iknum, isenum, knowndbs, modifylist, next_annots, next_annots_offset, nexteltinlist, nexteltinlist_annots, nextmatchkey, open_socket, opendb, opendb_pw, prep_extract, proc_requete, py_codaa, py_read_sock, py_sock_flush, py_sock_fputs, read_annots, read_first_rec, readacc, readext, readkey, readlng, readloc, readshrt, readsmj, readspec, readsub, releaselist, residuecount, savelist, seq_to_annots, set_current_db, setlistname, setliststate, translate_cds, translate_init_codon, zerolist

Function descriptions:

    acnucclose(...)
        closes access to the db.
    
    acnucopen(...)
        opens access to a remote acnuc database using address info.
        
        Optional keywords:
             db_name : name of the database (default 'embl')
             port : port number (default 5558)
             server_ip : ip name of the acnuc server (default 'pbil.univ-lyon1.fr')
        Return value :
             0 iff OK
             1 if problem with remote host name
             2 if cannot create connection with remote host
             3 if database is unknown by remote host
             4 if database is currently unavailable on remote host
             5 if a database was previously opened and was not closed
             6 if authorization failed for password-protected database
             7 if not enough memory
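
        Example: a minimal sketch of turning the return values listed above
        into an exception. The helper name open_or_raise and the table
        ACNUCOPEN_ERRORS are hypothetical (not part of the module); `api`
        is assumed to be the imported acnuc module, and the keywords are
        assumed to be accepted exactly as named above.

```python
# Messages copied from the acnucopen() return values documented above.
ACNUCOPEN_ERRORS = {
    1: "problem with remote host name",
    2: "cannot create connection with remote host",
    3: "database is unknown by remote host",
    4: "database is currently unavailable on remote host",
    5: "a database was previously opened and was not closed",
    6: "authorization failed for password-protected database",
    7: "not enough memory",
}


def open_or_raise(api, db_name="embl", server_ip="pbil.univ-lyon1.fr", port=5558):
    """Open a database through `api` (the imported acnuc module) or raise."""
    code = api.acnucopen(db_name=db_name, server_ip=server_ip, port=port)
    if code != 0:
        message = ACNUCOPEN_ERRORS.get(code, "undocumented return value")
        raise RuntimeError("acnucopen failed (%d): %s" % (code, message))
```

        After `import acnuc`, a call such as open_or_raise(acnuc) would open
        the default database; acnucclose is then used to close it.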
    
    alllistranks(...)
        all currently defined lists.
        
        Return value:
            list of ranks of all currently defined lists
    
    bcount(...)
        counts the number of elements in a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value:
            number of elements in the list
    
    bit0(...)
        removes an element from a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to remove
    
    bit1(...)
        adds an element to a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to add
    
    btest(...)
        tests presence of element in a list.
        
        Keywords :
            lrank : rank of the list
            num: rank of the element to test
        
        Return value:
            1 if element num is in list lrank, 0 otherwise
    
    copylist(...)
        duplicates a list.
        
        Keywords :
            lfrom : rank of the list to copy
            lto : rank of the destination list that must have been previously allocated
    
    countfilles(...)
        counts the number of subsequences present in a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value :
            number of subsequences present
    
    extract_1_seq(...)
        successively extracts one sequence from list [see prep_extract()].
        
        Return value :
            number of extracted sequences (0 is possible), or -1 when
                   all of list was processed. Must call this function
                   until -1 is returned or call extract_interrupt.
    
    extract_interrupt(...)
        interrupts an extraction in progress. Can be called after prep_extract
        and before repeated calls to extract_1_seq have returned -1.
    
    fcode(...)
        returns the rank of a record of an index file from its key.
        
        Keywords :
            name : record key
            type : 'AUT', 'BIB','ACC', 'SMJ' or 'SUB'
        
        Return value :
            rank of the corresponding key
    
    get_current_db(...)
        returns a pair of data identifying the currently opened db.
        
        Return value :
            pair made of the db name as a string and an opaque string
    
    get_list_open_dbs(...)
        list all currently opened dbs.
        
        Return value :
            list of data pairs related to each opened db
    
    get_next_desc(...)
        returns one descendant species in species tree.
        
        Argument:
             integer returned by a previous call to get_taxon_info in 'desc',
             or by a previous call to get_next_desc in 'next',
             or value 0 to access the first top-level taxon
        Return dictionary :
            "name" : name of descendant taxon
            "rank" : rank of descendant taxon 
            "next" : integer giving access to next taxon's descendants (0 indicates end)
    
    get_taxon_info(...)
        returns information about given taxon (by name or rank or ncbi's ID) in species tree.
        
        Optional keywords:
             name : name of the taxon (case insensitive, default (null) )
             rank : rank of the taxon (default 0)
             tid  : ncbi's ID number of the taxon (default 0)
        Return dictionary :
            "name" : name of taxon
            "rank" : rank of taxon (0 indicates taxon name does not exist)
            "tid"  : ncbi's ID of taxon (0 indicates taxon name does not exist)
            "parent" : rank of parent taxon in species tree (2 indicates top-level taxon)
            "desc" : integer giving access to taxon's descendants in species tree (0 means none)
    
    getattributes(...)
        gets various sequence attributes from name or acc. no.
        
        Keyword:
            ident : a sequence name or an accession number
        Optional keyword:
            fullsequence : 1 to get full sequence, 0 not to get sequence data (default)
        Return dictionary :
            "name" : the sequence name
            "length" : the sequence length
            "frame" : the sequence reading frame (0,1,2)
            "gc" : the sequence genetic code (0 means standard)
            "accession" : the sequence primary accession number
            "description" : one-line description of the sequence
            "species" : species name of the sequence
            "sequence" : full sequence (can be long!), or None if fullsequence=0
    
    getemptylist(...)
        finds an empty list.
        
        
        Keyword :
            name : name to give to the list
        
        Return value :
            list rank, or 0 if none is available; the list is set to zero
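
        Example: a sketch of building a list by hand with getemptylist and
        bit1. The helper name make_list is hypothetical; `api` is assumed to
        be the imported acnuc module, and the element ranks are assumed to
        come from calls such as isenum or iknum.

```python
def make_list(api, name, element_ranks):
    """Allocate an empty list called `name` and add the given element ranks."""
    lrank = api.getemptylist(name=name)
    if lrank == 0:
        # 0 is documented as "none is available"
        raise RuntimeError("no empty list is available")
    for num in element_ranks:
        api.bit1(lrank=lrank, num=num)
    return lrank
```

        The returned rank can then be passed to bcount, btest, or
        releaselist.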
    
    getlistrank(...)
        gets the rank of a list from its name.
        
        Keyword :
            name : name of the list
        
        Return value :
            >0 if OK, 0 if no list with that name exists
    
    getliststate(...)
        gets the state of a list.
        
        Keyword :
            lrank: rank of the list
        
        Return dictionary :
            "locus" : 'T' iff list contains only parent sequences, 'F' otherwise
            "type" : 'SQ', 'KW', or 'SP' for list of seqs, keywords, or species
            "count" : number of elements in list
            "name" : list name
    
    gfrag(...)
        reads a sequence fragment.
        
        Keywords:
            nsub : rank of sequence
            length : number of residues to read
            start : first residue to read (counting from 1, which is the default value)
        Return dictionary :
            "length" : number of residues read (can be 0)
            "seq" : string filled with residues
    
    iknum(...)
        gets rank of a species (SP) or a keyword (KW). 
        
        Keywords:
            name : a species or keyword (case is not significant)
            cas : "KW" or "SP"
        
        Return value:
            rank of given name (0 if absent)
    
    isenum(...)
        gets rank of a sequence from its name.
        
        Keyword :
            name : a sequence name (case is not significant but must not contain \n)
        
        Return value :
            rank of given name (0 if absent)
    
    knowndbs(...)
        gets name and description of all known databases.
        
        Return value :
            dictionary of {name of database :  (status , description of database)}
            where status can be 'ON' or 'OFF' to indicate if the db is currently available
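
        Example: a sketch of selecting the databases that are currently
        available from the dictionary returned by knowndbs. The helper name
        available_databases is hypothetical; `api` is assumed to be the
        imported acnuc module.

```python
def available_databases(api):
    """Return the sorted names of databases whose status is 'ON'."""
    known = api.knowndbs()
    # Each value is a (status, description) pair, as documented above.
    return sorted(name for name, (status, _description) in known.items()
                  if status == "ON")
```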
    
    modifylist(...)
        modifies list according to length or date of its elements.
        
        Keywords :
            lrank : rank of list to be modified
            type : 'length', 'date'
            operation : (for length) '>10000' or '<500'
                        (for date) '>1/jan/2003' or '< 29/FEB/96'
        Return value :
            If no error, rank of newly created list containing result of operation.
    
    next_annots(...)
        reads the annotations line following the last one read.
        
        Return string :
            "seq" : line read (empty if error)
    
    next_annots_offset(...)
        reads the next line of annotations of a sequence and gives its address.
        
        Return tuple ( (div,offset),line ) :
            (div,offset) : data pair identifying the address of line read (can be used later as argument to read_annots)
            line : line read (empty if error)
    
    nexteltinlist(...)
        returns the next element of a list.
        
        Keywords :
            lrank : rank of the list
            first : elements of the list are searched after this position (set this to 0 on the first call)
        Return dictionary :
             "name" : name of the element
             "length" : element length (for seq list only)
             "next" : rank of the next element in the list, or 0 if none
    
    nexteltinlist_annots(...)
        returns the next element of a sequence list and related information.
        
        Keywords :
            first : elements of the list are searched after this position
                    (set this to 0 on the first call)
            lrank : rank of the sequence list
        
        Return dictionary :
            "name" : name of the sequence
            "length" : sequence length
            "offset" : annotation offset of the seq
            "div" : division rank of the seq
            "next" : rank of the next sequence in the list, or 0 if none
    
    nextmatchkey(...)
        returns the next keyword matching a given pattern.
        
        Keywords :
            num : rank beyond which next matching keyword is sought
                  (set num=2 the first time)
            pattern : must contain at least once the wild card character @ 
                      (used only if num = 2)
        
        Return dictionary:
            "num" : rank of next matching keyword, or 0 if none matches
            "name" : this matching keyword, if any
    
    open_socket(...)
        opens access to the remote acnuc server.
        
        Optional keywords:
             server_ip : ip name of the acnuc server (default 'pbil.univ-lyon1.fr')
             port : port number (default 5558)
        Return value :
             0 iff OK
             1 if problem with remote host name
             2 if cannot create connection with remote host
             7 if not enough memory
    
    opendb(...)
        opens an acnuc database after open_socket call.
        
        Optional keyword:
             db_name : name of the database (default 'embl')
        
        Return value :
             0 iff OK
             3 if database is unknown by remote host
             4 if database is currently unavailable on remote host
             5 if a database was previously opened and was not closed
             6 if authorization failed for password-protected database
             9 if no socket was previously opened by open_socket
    
    opendb_pw(...)
        opens a password-protected database after open_socket call.
        
        Keywords:
             db_name : name of the database
             psswrd : password 
        
        Return value :
             same as opendb
    
    prep_extract(...)
        prepares for extraction of all members of a sequence list
        to a local file.
        
        Keywords :
            format : 'fasta' or 'flat' (e.g., genbank, embl) or 'acnuc'
            fname :  name of output local file (appended if exists already)
            operation : 'simple', 'translate' (translates on the fly
                        CDS sequences), 'feature' (extracts fragment
                        corresponding to given feature name), 'fragment',
                        or 'region'
            lrank: rank of sequence list
        
        Optional keywords :
            feature_name : name of desired feature
            bornes, min_bornes: NULL unless operation is 'fragment' or 'region'
            
        Return value: 
            0 if OK, 1 if any error.
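
        Example: a sketch of the extraction protocol, that is, one call to
        prep_extract followed by calls to extract_1_seq until -1 is
        returned. The helper name extract_all is hypothetical; `api` is
        assumed to be the imported acnuc module, and extract_1_seq is
        assumed to take no argument since none is documented.

```python
def extract_all(api, lrank, fname, fmt="fasta", operation="simple"):
    """Extract every member of list `lrank` to `fname`; return the total count."""
    status = api.prep_extract(format=fmt, fname=fname,
                              operation=operation, lrank=lrank)
    if status != 0:
        raise RuntimeError("prep_extract reported an error")
    total = 0
    while True:
        extracted = api.extract_1_seq()
        if extracted == -1:
            break  # the whole list was processed
        total += extracted  # 0 is a possible value
    return total
```

        A caller that needs to stop early would call extract_interrupt
        instead of continuing the loop.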
    
    proc_requete(...)
        processes a query.
        
        Keywords :
            query : a string containing a query following the acnuc query language
            name : the name of the list (case is not significant)
        
        Return dictionary :
            "lrank" : the rank of the created list
            "count" : the number of elements in the created list
            "locus" : 1 if list contains parent sequences only, 0 otherwise
            "type" : 'S', 'K', or 'E' for a list of seqs, keywords, or species
        
        In case of error, return dictionary:
            "error" : error message
    
    py_codaa(...)
        translates a codon (trinucleotide) into an amino acid.
        
        Keywords:
            codon : a codon (trinucleotide)
            gc : genetic code (0 is default value and is standard genetic code)
        Return character :
            the amino acid
    
    py_read_sock(...)
        reads a line of characters received from server.
        
        Return value :
            a full line of data received from server. The function exits if
                   communication with server is lost.
    
    py_sock_flush(...)
        flushes output to server.
        
        Return value :
            0 iff success
    
    py_sock_fputs(...)
        sends a character string to server according to protocol given in http://pbil.univ-lyon1.fr/databases/acnuc/remote_acnuc.html.
        
        Argument :
            character string
        
        Return value : 0 iff success
    
    read_annots(...)
        reads one line of annotations at given address.
        
        Argument tuple:
            (div,offset) : data pair identifying the address of the annotations line,
                           typically returned by seq_to_annots
        Return string:
            line read (empty if error)
    
    read_first_rec(...)
        returns the total number of records in an index file.
        
        Keyword :
            type : an index file, either 'AUT', 'BIB', 'ACC', 'SMJ', 'SUB',
                      'LOC', 'KEY', 'SPEC', 'SHRT', 'LNG', 'EXT', 'TXT'
        
        Return value :
            total number of records in the index file
    
    readacc(...)
        reads an ACCESS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of the record
            "plsub" : point to data read from record
    
    readext(...)
        reads an EXTRACT record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "mere" : point to data read from record
            "deb" : 
            "fin" :
    
    readkey(...)
        reads a KEYWORDS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "libel" : label
            "plsub" : 
            "desc" : 
            "syno" :
    
    readlng(...)
        reads a long list record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "val" : list of 63 integer values that, when non null, are members of the list
    
    readloc(...)
        reads a LOCUS record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "date" : date
            "sub" :
            "pnuc" : 
            "spec" : 
            "host" : 
            "plref" : 
            "molec" : 
            "placc" :
            "org" :
    
    readshrt(...)
        reads a short list element.
        
        Keyword :
            num : rank of element
        
        Return dictionary :
            "next" : rank of next chained record, 0 if none
            "val" : element value
    
    readsmj(...)
        reads an SMJYT record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "libel" : label
            "name" : name of the record
            "plong" : point to data read from record
    
    readspec(...)
        reads a SPECIES record.
        
        Keyword :
            num : rank of record
        
        Return dictionary :
            "name" : name of species
            "libel" : label
            "plsub" : 
            "desc" : 
            "syno" : 
            "host" :
    
    readsub(...)
        reads a SUBSEQ record.
        
        Keyword :
            num : sequence rank
        
        Return dictionary :
            "name" : sequence name
            "length" : sequence length
            "type" : rank of seq type
            "lkey" : start of short list of keywords
            "locus" : LOCUS rank for a parent sequence or 0 for a subsequence
            "frame" : reading frame
            "gc" : genetic code (0 if standard)
            "ext" : > 0 indicates a subsequence and ext is a record
                        number in EXTRACT
                      <= 0 indicates a parent sequence and 
                        -ext is the start of long list of subsequences
    
    releaselist(...)
        releases a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value :
            0 if OK, !=0 if no list with that rank exists
    
    residuecount(...)
        counts the number of residues (nucl. or aa) in all seqs of a list.
        
        Keyword :
            lrank : rank of the list
        
        Return value (with type 'str') :
            total number of residues
    
    savelist(...)
        saves in a local file names or acc. nos of members of a list.
        
        Keywords :
            lrank : rank of list to be saved
            file : name of the file to save list members
        
        Optional keywords :
            type : 'w' write, or 'a' (default value) append, in the file
            use_acc : if 0, save name of list members;
                      otherwise (default) accession numbers
            prefix : write prefix (default empty string) before each name
                     of each member in file
    
    seq_to_annots(...)
        gets address of start of annotations for a sequence.
        
        Keyword:
            nsub : rank of sequence
        
        Return tuple :
            (div,offset) : data pair identifying the address where seq annotations begin,
                           typically used as argument to read_annots
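
        Example: a sketch of reading the first annotation lines of a
        sequence by combining seq_to_annots, read_annots, and next_annots.
        The helper name first_annotation_lines and its max_lines parameter
        are hypothetical; `api` is assumed to be the imported acnuc module.
        It is also assumed that read_annots takes the (div,offset) pair as
        its single argument and that next_annots returns the line as a
        string; an empty line is treated as the documented error value.

```python
def first_annotation_lines(api, nsub, max_lines=5):
    """Return up to `max_lines` annotation lines of sequence `nsub`."""
    address = api.seq_to_annots(nsub=nsub)  # (div, offset) pair
    lines = []
    line = api.read_annots(address)
    while line and len(lines) < max_lines:
        lines.append(line)
        line = api.next_annots()  # line following the last one read
    return lines
```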
    
    set_current_db(...)
        sets the currently accessed db to a new db.
        
        Argument :
            data pair related to the desired db, e.g., obtained from a previous call to get_current_db()
        
        Return value : none
    
    setlistname(...)
        sets the name of a list.
        
        Keywords :
            lrank : rank of the list
            name : name to give to the list
        
        Return value :
            0 : OK
            1 : a list with that name already existed and was deleted
            -1 : no list with that rank exists
    
    setliststate(...)
        sets the state of a list.
        
        Keywords :
            lrank: rank of the list
            locus : 'T' iff list contains only parent sequences,
                    'F' otherwise
            type : 'SQ', 'KW', or 'SP' for list of seqs, keywords, or species
    
    translate_cds(...)
        translates all of a protein-coding sequence into protein
           using adequate genetic code and reading frame.
        
        Keywords:
            nsub : rank of sequence
        Return string :
            string filled with residues
    
    translate_init_codon(...)
        translates the first codon of a protein-coding sequence into amino acid.
        
        Keywords:
            nsub : rank of sequence
        Return character :
            the amino acid
    
    zerolist(...)
        empties a list.
        
        Keyword :
            lrank : rank of the list to empty that must have been previously allocated