EMBL: HSINS01

ID   HSINS01    standard; DNA; HUM; 4044 BP.
XX
AC   J00265;
XX
SV   J00265.1
XX
DT   16-JUL-1988 (Rel. 16, Created)
DT   13-FEB-2001 (Rel. 66, Last updated, Version 6)
XX
DE   Human insulin gene, complete cds.
XX
KW   GC rich region; insulin; polymorphic variation; tandem repeat.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   2414-2610
RX   MEDLINE; 80054779.
RX   PUBMED; 503234.
RA   Bell G.I., Swain W.F., Pictet R., Cordell B., Goodman H.M., Rutter W.J.;
RT   "Nucleotide sequence of a cDNA clone encoding human preproinsulin";
RL   Nature 282(5738):525-527(1979).
XX
RN   [2]
RP   1925-3715
RX   MEDLINE; 80120725.
RA   Bell G.I., Pictet R.L., Rutter W.J., Cordell B., Tischer E., Goodman H.M.;
RT   "Sequence of the human insulin gene";
RL   Nature 284(5751):26-32(1980).
XX
RN   [3]
RP   2411-2610
RX   MEDLINE; 80147417.
RX   PUBMED; 6927840.
RA   Sures I., Goeddel D.V., Gray A., Ullrich A.;
RT   "Nucleotide sequence of human preproinsulin complementary DNA";
RL   Science 208(4439):57-59(1980).
XX
RN   [4]
RP   1928-3651
RX   MEDLINE; 80236313.
RA   Ullrich A., Dull T.J., Gray A., Brosius J., Sures I.;
RT   "Genetic variation in the human insulin gene";
RL   Science 209(4456):612-615(1980).
XX
RN   [5]
RP   1-4044
RX   MEDLINE; 81053754.
RX   PUBMED; 6253909.
RA   Bell G.I., Pictet R., Rutter W.J.;
RT   "Analysis of the regions flanking the human insulin gene and sequence of an
RT   Alu family member";
RL   Nucleic Acids Res. 8(18):4091-4109(1980).
XX
RN   [6]
RP   1-2227
RX   MEDLINE; 82125365.
RX   PUBMED; 7035959.
RA   Bell G.I., Selby M.J., Rutter W.J.;
RT   "The highly polymorphic region near the human insulin gene is composed of
RT   simple tandemly repeating sequences";
RL   Nature 295(5844):31-35(1982).
XX
RN   [7]
RP   917-1428, 1828-2185, 3615-4044
RX   MEDLINE; 82221404.
RX   PUBMED; 6283472.
RA   Ullrich A., Dull T.J., Gray A., Philips J.A., Peter S.;
RT   "Variation in the sequence and modification state of the human insulin gene
RT   flanking regions";
RL   Nucleic Acids Res. 10(7):2225-2240(1982).
XX
DR   EPD; EP07109; HS_INS.
DR   GDB; 119349; INS.
DR   SWISS-PROT; P01308; INS_HUMAN.
DR   TRANSFAC; R02707; HS$INS_02.
DR   TRANSFAC; R02708; HS$INS_01.
DR   TRANSFAC; R02709; HS$INS_03.
DR   TRANSFAC; R02710; HS$INS_04.
DR   TRANSFAC; R02712; HS$INS_06.
DR   TRANSFAC; R04457; HS$INS_07.
XX
CC   The human insulin gene region consists of three exons and two
CC   introns coding for a signal peptide, a b-chain, a c-peptide, and an
CC   a-chain. Present evidence favors a single insulin gene per haploid
CC   genome; however, allelic and polymorphic variation are conspicuous.
CC   The two major alleles studied thus far are denoted alpha and beta.
CC   The 5' flanks for these are so different, largely because of the
CC   presence of tandem repeats not found elsewhere in the human genome,
CC   that separate entries have been made for this region (see J00266
CC   and J00267). Thus differences in the first 2000 bases are not
CC   annotated below. This sequence heterogeneity is generated largely,
CC   though not exclusively, by a family of G+C-rich oligonucleotides
CC   whose consensus sequence is ACAGGGGTGTGGGG. In the 5' sequence
CC   reported below (from [5]), these occur most obviously between bases
CC   1340 and 1823. While the variation in the 5' flank may be
CC   significant for gene expression, it has not been associated to date
CC   with diabetic conditions. [4],[5],[6] discuss this variation in
CC   detail. Variation in the form of base modification is observed in
CC   the 3' flanking sequence ([6]). Conflicts between [5],[6] in this
CC   region may ultimately prove to be polymorphic variations. This
CC   sequence of 4044 bases (which most closely represents the beta
CC   allele) was communicated with revisions by G.I.Bell. An additional
CC   stretch of about 950 bases in the 3' flank, which has not been
CC   published, is available through G.I.Bell or this library. See other
CC   loci beginning  and other loci with ins as the 4th-6th
CC   characters of the locus name.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..4044
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /dev_stage="foetus"
FT                   /tissue_type="liver"
FT                   /map="11p15.5"
FT   exon            2186..2227
FT                   /note="G00-119-349"
FT                   /number=1
FT                   /gene="INS"
FT   intron          2228..2406
FT                   /note="G00-119-349"
FT                   /number=1
FT                   /gene="INS"
FT   variation       2401
FT                   /note="a in alpha-allele; t in beta-allele ([4])"
FT                   /gene="INS"
FT   CDS_pept        join(2424..2610,3397..3542)
FT                   /codon_start=1
FT                   /db_xref="SWISS-PROT:P01308"
FT                   /note="precursor"
FT                   /gene="INS"
FT                   /product="insulin"
FT                   /protein_id="AAA59172.1"
FT                   /translation="MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGE
FT                   RGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQ
FT                   LENYCN"
FT   sig_peptide     2424..2495
FT                   /note="G00-119-349"
FT                   /gene="INS"
FT   mat_peptide     join(2496..2610,3397..3539)
FT                   /gene="INS"
FT                   /product="c peptide; G00-119-349"
FT   intron          2611..3396
FT                   /note="G00-119-349"
FT                   /number=2
FT                   /gene="INS"
FT   variation       3229
FT                   /note="c in alpha-allele; g in beta-allele ([4])"
FT                   /gene="INS"
FT   exon            3397..>3615
FT                   /note="G00-119-349"
FT                   /number=2
FT                   /gene="INS"
FT   variation       3551
FT                   /note="c in alpha-allele; t in beta-allele ([4])"
FT                   /gene="INS"
FT   variation       3564
FT                   /note="c in alpha-allele; a in beta-allele ([4])"
FT                   /gene="INS"
XX
SQ   Sequence 4044 BP; 680 A; 1239 C; 1417 G; 708 T; 0 other;
     ctcgaggggc ctagacattg ccctccagag agagcaccca acaccctcca ggcttgaccg        60
     gccagggtgt ccccttccta ccttggagag agcagcccca gggcatcctg cagggggtgc       120
     tgggacacca gctggccttc aaggtctctg cctccctcca gccaccccac tacacgctgc       180
     tgggatcctg gatctcagct ccctggccga caacactggc aaactcctac tcatccacga       240
     aggccctcct gggcatggtg gtccttccca gcctggcagt ctgttcctca cacaccttgt       300
     tagtgcccag cccctgaggt tgcagctggg ggtgtctctg aagggctgtg agcccccagg       360
     aagccctggg gaagtgcctg ccttgcctcc ccccggccct gccagcgcct ggctctgccc       420
     tcctacctgg gctcccccca tccagcctcc ctccctacac actcctctca aggaggcacc       480
     catgtcctct ccagctgccg ggcctcagag cactgtggcg tcctggggca gccaccgcat       540
     gtcctgctgt ggcatggctc agggtggaaa gggcggaagg gaggggtcct gcagatagct       600
     ggtgcccact accaaacccg ctcggggcag gagagccaaa ggctgggtgt gtgcagagcg       660
     gccccgagag gttccgaggc tgaggccagg gtgggacata gggatgcgag gggccggggc       720
     acaggatact ccaacctgcc tgcccccatg gtctcatcct cctgcttctg ggacctcctg       780
     atcctgcccc tggtgctaag aggcaggtaa ggggctgcag gcagcagggc tcggagccca       840
     tgccccctca ccatgggtca ggctggacct ccaggtgcct gttctgggga gctgggaggg       900
     ccggaggggt gtaccccagg ggctcagccc agatgacact atgggggtga tggtgtcatg       960
     ggacctggcc aggagagggg agatgggctc ccagaagagg agtgggggct gagagggtgc      1020
     ctggggggcc aggacggagc tgggccagtg cacagcttcc cacacctgcc cacccccaga      1080
     gtcctgccgc cacccccaga tcacacggaa gatgaggtcc gagtggcctg ctgaggactt      1140
     gctgcttgtc cccaggtccc caggtcatgc cctccttctg ccaccctggg gagctgaggg      1200
     cctcagctgg ggctgctgtc ctaaggcagg gtgggaacta ggcagccagc agggagggga      1260
     cccctccctc actcccactc tcccaccccc accaccttgg cccatccatg gcggcatctt      1320
     gggccatccg ggactgggga caggggtcct ggggacaggg gtccggggac agggtcctgg      1380
     ggacaggggt gtggggacag gggtctgggg acaggggtgt ggggacaggg gtgtggggac      1440
     aggggtctgg ggacaggggt gtggggacag gggtccgggg acaggggtgt ggggacaggg      1500
     gtctggggac aggggtgtgg ggacaggggt gtggggacag gggtctgggg acaggggtgt      1560
     ggggacaggg gtcctgggga caggggtgtg gggacagggg tgtggggaca ggggtgtggg      1620
     gacaggggtg tggggacagg ggtcctgggg ataggggtgt ggggacaggg gtgtggggac      1680
     aggggtcccg gggacagggg tgtggggaca ggggtgtggg gacaggggtc ctggggacag      1740
     gggtctgagg acaggggtgt gggcacaggg gtcctgggga caggggtcct ggggacaggg      1800
     gtcctgggga caggggtctg gggacagcag cgcaaagagc cccgccctgc agcctccagc      1860
     tctcctggtc taatgtggaa agtggcccag gtgagggctt tgctctcctg gagacatttg      1920
     cccccagctg tgagcaggga caggtctggc caccgggccc ctggttaaga ctctaatgac      1980
     ccgctggtcc tgaggaagag gtgctgacga ccaaggagat cttcccacag acccagcacc      2040
     agggaaatgg tccggaaatt gcagcctcag cccccagcca tctgccgacc cccccacccc      2100
     gccctaatgg gccaggcggc aggggttgac aggtagggga gatgggctct gagactataa      2160
     agccagcggg ggcccagcag ccctcagccc tccaggacag gctgcatcag aagaggccat      2220
     caagcaggtc tgttccaagg gcctttgcgt caggtgggct cagggttcca gggtggctgg      2280
     accccaggcc ccagctctgc agcagggagg acgtggctgg gctcgtgaag catgtggggg      2340
     tgagcccagg ggccccaagg cagggcacct ggccttcagc ctgcctcagc cctgcctgtc      2400
     tcccagatca ctgtccttct gccatggccc tgtggatgcg cctcctgccc ctgctggcgc      2460
     tgctggccct ctggggacct gacccagccg cagcctttgt gaaccaacac ctgtgcggct      2520
     cacacctggt ggaagctctc tacctagtgt gcggggaacg aggcttcttc tacacaccca      2580
     agacccgccg ggaggcagag gacctgcagg gtgagccaac cgcccattgc tgcccctggc      2640
     cgcccccagc caccccctgc tcctggcgct cccacccagc atgggcagaa gggggcagga      2700
     ggctgccacc cagcaggggg tcaggtgcac ttttttaaaa agaagttctc ttggtcacgt      2760
     cctaaaagtg accagctccc tgtggcccag tcagaatctc agcctgagga cggtgttggc      2820
     ttcggcagcc ccgagataca tcagagggtg ggcacgctcc tccctccact cgcccctcaa      2880
     acaaatgccc cgcagcccat ttctccaccc tcatttgatg accgcagatt caagtgtttt      2940
     gttaagtaaa gtcctgggtg acctggggtc acagggtgcc ccacgctgcc tgcctctggg      3000
     cgaacacccc atcacgcccg gaggagggcg tggctgcctg cctgagtggg ccagacccct      3060
     gtcgccagcc tcacggcagc tccatagtca ggagatgggg aagatgctgg ggacaggccc      3120
     tggggagaag tactgggatc acctgttcag gctcccactg tgacgctgcc ccggggcggg      3180
     ggaaggaggt gggacatgtg ggcgttgggg cctgtaggtc cacacccagt gtgggtgacc      3240
     ctccctctaa cctgggtcca gcccggctgg agatgggtgg gagtgcgacc tagggctggc      3300
     gggcaggcgg gcactgtgtc tccctgactg tgtcctcctg tgtccctctg cctcgccgct      3360
     gttccggaac ctgctctgcg cggcacgtcc tggcagtggg gcaggtggag ctgggcgggg      3420
     gccctggtgc aggcagcctg cagcccttgg ccctggaggg gtccctgcag aagcgtggca      3480
     ttgtggaaca atgctgtacc agcatctgct ccctctacca gctggagaac tactgcaact      3540
     agacgcagcc tgcaggcagc cccacacccg ccgcctcctg caccgagaga gatggaataa      3600
     agcccttgaa ccagccctgc tgtgccgtct gtgtgtcttg ggggccctgg gccaagcccc      3660
     acttcccggc actgttgtga gcccctccca gctctctcca cgctctctgg gtgcccacag      3720
     gtgccaacgc cggccaggcc cagcatgcag tggctctccc caaagcggcc atgcctgttg      3780
     gctgcctgct gcccccaccc tgtggctcag ggtccagtat gggagcttcg ggggtctctg      3840
     aggggccagg gatggtgggg ccactgagaa gtgacttctt gttcagtagc tctggactct      3900
     tggagtcccc agagaccttg ttcaggaaag ggaatgagaa cattccagca attttccccc      3960
     cacctagccc tcccaggttc tatttttaga gttatttctg atggagtccc tgtggaggga      4020
     ggaggctggg ctgagggagg gggt                                             4044
//
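
The CDS_pept location in the feature table, join(2424..2610,3397..3542), can be checked against the /translation qualifier with a few lines of Python. This is a minimal sketch, not part of the record: the helper names `parse_join` and `spliced_length` are hypothetical, and the regular expression only handles simple `start..end` spans (optionally with a `>` partial marker), not the full feature-location syntax.

```python
import re

def parse_join(location):
    """Return (start, end) pairs, 1-based inclusive, from a simple location string.

    Assumption: only plain 'a..b' spans, optionally written 'a..>b', are present.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.>?(\d+)", location)]

def spliced_length(location):
    """Total number of bases covered by all spans of the location."""
    return sum(end - start + 1 for start, end in parse_join(location))

# Values copied from the CDS_pept feature of this entry.
cds = "join(2424..2610,3397..3542)"
translation = (
    "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGE"
    "RGFFYTPKTRREAEDLQVGQVELGGGPGAGSLQPLALEGSLQKRGIVEQCCTSICSLYQ"
    "LENYCN"
)

nt = spliced_length(cds)   # bases in the spliced coding sequence
codons = nt // 3           # includes the terminal stop codon

print(nt, codons, len(translation))
```

The two spans contribute 187 and 146 bases, so the spliced coding sequence is 333 bases, or 111 codons: the 110 residues of the translation plus the stop codon.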
