EMBL: AY088514

ID   AY088514; SV 1; linear; mRNA; STD; PLN; 699 BP.
AC   AY088514;
DT   14-JUN-2002 (Rel. 72, Created)
DT   24-FEB-2006 (Rel. 86, Last updated, Version 4)
DE   Arabidopsis thaliana clone 742 mRNA, complete sequence.
OS   Arabidopsis thaliana (thale cress)
OC   Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
OC   Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae;
OC   rosids; malvids; Brassicales; Brassicaceae; Camelineae; Arabidopsis.
RN   [1]
RP   1-699
RX   PUBMED; 12093376.
RA   Haas B.J., Volfovsky N., Town C.D., Troukhan M., Alexandrov N.,
RA   Feldmann K.A., Flavell R.B., White O., Salzberg S.L.;
RT   "Full-length messenger RNA sequences greatly improve genome annotation";
RL   Genome Biol. 3(6):RESEARCH0029-RESEARCH0029(2002).
RN   [2]
RP   1-699
RA   Alexandrov N.A., Troukhan M.E., Brover V.V., Flavell R.B., Feldmann K.A.;
RT   "Features of Arabidopsis genes and genome discovered using full-length
RT   cDNAs";
RL   Plant Mol. Biol. 60(1):71-87(2006).
RN   [3]
RP   1-699
RA   Brover V., Troukhan M., Alexandrov N., Lu Y.-P., Flavell R., Feldmann K.;
RT   ;
RL   Submitted (11-MAR-2002) to the INSDC.
RL   Ceres, Inc, 3007 Malibu Canyon Road, Malibu, CA 90265, USA
DR   MD5; 83859f32d3d3ac63d1de570917f067ce.
CC   This clone sequence is one of 5,000 Ceres full-length cDNAs made
CC   available to TIGR and GenBank. The following quality assessment of
CC   this set was done by comparison with known proteins: two percent of
CC   the clones are estimated to be 5'-truncated; less than one percent
CC   are 3'-truncated; approximately two percent represent alternative
CC   splice variants, including unspliced introns and spliced exons; one
CC   percent may contain premature stop codons; five percent may have
CC   frame shifts in a coding region. A sequence is considered to be
CC   5'-truncated if it lacks the translation initiation start (ATG). A
CC   sequence is considered to be 3'-truncated if it lacks the
CC   C-terminal end of the encoded protein. Please note that these cDNA
CC   sequences are derived from the Ws or LAer ecotypes and therefore
CC   may contain polymorphisms when compared to sequences from Col-0.
CC   Genset carried out the library production and sequencing of the
CC   full-length clones. Ceres, Inc. carried out the clustering of the
CC   5' sequences, selection of clones, and sequence assembly.
FH   Key             Location/Qualifiers
FT   source          1..699
FT                   /organism="Arabidopsis thaliana"
FT                   /mol_type="mRNA"
FT                   /clone="742"
FT                   /db_xref="taxon:3702"
FT   CDS_pept        110..478
FT                   /codon_start=1
FT                   /product="copia-like retroelement pol polyprotein"
FT                   /db_xref="GOA:Q9STR3"
FT                   /db_xref="InterPro:IPR006808"
FT                   /db_xref="UniProtKB/TrEMBL:Q9STR3"
FT                   /protein_id="AAM66049.1"
FT                   YCAGEIVGRGFTFTGYYP"
SQ   Sequence 699 BP; 204 A; 142 C; 154 G; 199 T; 0 other;
     aaaaagctcc caagcctaga gattctgttg ttgatccgat tcccgctctc cttctccggt        60
     agtacaatct ccgtcgccgg ttacatcagc cagggaagag ttttaagaga tggcatcaaa       120
     gttgctacaa ttgaaatcca aggcatgtga agcatcaaag tttgtttcca agcatggaac       180
     aacttactac aaacagttgc tggataagaa caagatgtat atccaggagc cagctactat       240
     agagaaatgc aatgaattgt ctaagcagct tctctacact cgtcttgcta gcattcctgg       300
     gcgttctgag tcattttgga aggaagtcaa tcacgtgaag ggtttatgga agaatcgggc       360
     ggatttgaag gttgaagatg ctggaatagc tgcacttttt ggtctggaat gctttgcgtg       420
     gtactgtgca ggagagattg taggaagggg cttcactttc accggctact acccttgaag       480
     aaagaaagca accccctcat aaacaaacat gcgtttcaga tccattgaat accgaaaagc       540
     ctttttatat cataattagc ggcttattat ctatctcact gctgacatta gaagcaaagt       600
     taatttctgt ttctctggat aatttataca caaaagtttt gaaggtgagg catcatcact       660
     cgtgctagct ctaataatat cagtcgacta atgttggcc                              699
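
The CDS feature above gives the span 110..478 with /codon_start=1, and only the tail of its translation is shown in this copy of the record. The following is a minimal Python sketch, assuming the standard genetic code applies (the record carries no /transl_table qualifier), of how that span can be translated from the SQ lines and compared with the visible tail. The names parse_sq, codon_table, and translate_cds are hypothetical helpers written for this illustration; they are not part of any EMBL or ENA tool.

```python
import re

# Sequence lines copied verbatim from the SQ block of this record,
# including the trailing position counters.
SQ_LINES = """\
     aaaaagctcc caagcctaga gattctgttg ttgatccgat tcccgctctc cttctccggt        60
     agtacaatct ccgtcgccgg ttacatcagc cagggaagag ttttaagaga tggcatcaaa       120
     gttgctacaa ttgaaatcca aggcatgtga agcatcaaag tttgtttcca agcatggaac       180
     aacttactac aaacagttgc tggataagaa caagatgtat atccaggagc cagctactat       240
     agagaaatgc aatgaattgt ctaagcagct tctctacact cgtcttgcta gcattcctgg       300
     gcgttctgag tcattttgga aggaagtcaa tcacgtgaag ggtttatgga agaatcgggc       360
     ggatttgaag gttgaagatg ctggaatagc tgcacttttt ggtctggaat gctttgcgtg       420
     gtactgtgca ggagagattg taggaagggg cttcactttc accggctact acccttgaag       480
     aaagaaagca accccctcat aaacaaacat gcgtttcaga tccattgaat accgaaaagc       540
     ctttttatat cataattagc ggcttattat ctatctcact gctgacatta gaagcaaagt       600
     taatttctgt ttctctggat aatttataca caaaagtttt gaaggtgagg catcatcact       660
     cgtgctagct ctaataatat cagtcgacta atgttggcc                              699
"""


def parse_sq(text):
    """Join sequence lines, dropping spaces and the trailing position counters."""
    return "".join(re.findall(r"[a-z]+", text.lower()))


def codon_table():
    """Build the standard genetic code (assumed here) in T, C, A, G order."""
    bases = "tcag"
    aminos = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    codons = (a + b + c for a in bases for b in bases for c in bases)
    return dict(zip(codons, aminos))


def translate_cds(seq, start, end):
    """Translate the 1-based, inclusive span start..end; '*' marks a stop codon."""
    cds = seq[start - 1:end]
    table = codon_table()
    return "".join(table[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


sequence = parse_sq(SQ_LINES)
protein = translate_cds(sequence, 110, 478)

# Tail of the translation as it appears in the FT block of this record.
RECORD_TAIL = "YCAGEIVGRGFTFTGYYP"

print("sequence length:", len(sequence))
print("translation ends with record tail:", protein.rstrip("*").endswith(RECORD_TAIL))
```

The sketch keeps the stop codon in the output as a trailing '*' so that the end of the span can be checked; the /translation qualifier itself does not include it.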

