EMBL: M29691

ID   M29691; SV 1; linear; genomic DNA; STD; PRO; 6278 BP.
XX
AC   M29691; M22854;
XX
DT   19-APR-1990 (Rel. 23, Created)
DT   17-APR-2005 (Rel. 83, Last updated, Version 5)
XX
DE   Bacillus subtilis (clone pED4) comG-(1,2,3,4,5,6,and 7) proteins in comG
DE   operon, complete cds.
XX
KW   exogenous DNA binding.
XX
OS   Bacillus subtilis
OC   Bacteria; Firmicutes; Bacilli; Bacillales; Bacillaceae; Bacillus.
XX
RN   [1]
RP   1-6278
RX   PUBMED; 2507524.
RA   Albano M., Breitling R., Dubnau D.A.;
RT   "Nucleotide sequence and genetic organization of the Bacillus subtilis comG
RT   operon";
RL   J. Bacteriol. 171(10):5386-5404(1989).
XX
DR   MD5; eade5c75aaf31aa48b8b912d0b18dfeb.
DR   EuropePMC; PMC107223; 9573156.
XX
CC   On Dec 14, 1995 this sequence version replaced gi:142705.
CC   Draft entry and computer readable sequence for [1] kindly provided
CC   by D.A.Dubnau, 02-MAR-1989.
CC   ORF1 is required for the ability of competent cultures to resolve
CC   into two populations with different cell densities on Renografin
CC   gradients, as well as for full expression of comE, another late
CC   competence locus. ORF1 shows significant similarity to the virB
CC   ORF11 protein from Agrobacterium tumefaciens, which is probably
CC   involved in T-DNA transfer.  The N-terminal sequences of comG ORF3
CC   and, to a lesser extent, the comG ORF4 and 5 proteins are similar
CC   to a class of pilin proteins from members of the genera
CC   Bacteroides, Pseudomonas, Neisseria and Moraxella.  All of the comG
CC   proteins, except ORF1, possess hydrophobic domains that are
CC   potentially capable of spanning the bacterial membrane and may be
CC   part of the DNA transport machinery.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..6278
FT                   /organism="Bacillus subtilis"
FT                   /strain="168"
FT                   /mol_type="genomic DNA"
FT                   /clone="pED4"
FT                   /db_xref="taxon:1423"
FT   CDS_pept        <1..887
FT                   /codon_start=2
FT                   /transl_table=11
FT                   /product="unknown protein"
FT                   /note="ORF1; putative"
FT                   /db_xref="GOA:P40948"
FT                   /db_xref="InterPro:IPR002523"
FT                   /db_xref="UniProtKB/Swiss-Prot:P40948"
FT                   /protein_id="AAA83366.1"
FT                   /translation="DLIHFSHWPQCEKWFENNHHVNFLRVDTTETENEAVFGSIVYDQG
FT                   LGEEKDHTVFHFYITRQYFFTINFDFSILREIKGKEVVRQMERADNAIEGFLILLGELM
FT                   NAYLIGVDEFEVKLRKLRWQIKDDNSKSILNRVHLLRHELMIWKNLILSAKKIEMALKE
FT                   TFLPQNEGKKDYQRTQLKIDRGFTYISEFEGELNNLLHSEEVITSHRGNEIVKALTIFT
FT                   TLFTPITALGALWGMNFSVMPELNWKYGYLFSLLLIVTSTVLIYLYLRKKGWTGDMLQE
FT                   RKKKKKPRKRRTL"
FT   regulatory      1100..1124
FT                   /note="putative"
FT                   /regulatory_class="terminator"
FT   regulatory      1243..1248
FT                   /note="comG -35 region"
FT                   /regulatory_class="minus_35_signal"
FT   regulatory      1266..1271
FT                   /note="comG -10 region"
FT                   /regulatory_class="minus_10_signal"
FT   mRNA            1278..>6278
FT                   /note="comG mRNA"
FT   regulatory      1281..1287
FT                   /gene="comG1"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   gene            1281..2368
FT                   /gene="comG1"
FT   CDS_pept        1298..2368
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG1"
FT                   /db_xref="GOA:P25953"
FT                   /db_xref="InterPro:IPR001482"
FT                   /db_xref="InterPro:IPR003593"
FT                   /db_xref="InterPro:IPR027417"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25953"
FT                   /protein_id="AAA83367.1"
FT                   /translation="MDSIEKVSKNLIEEAYLTKASDIHIVPRERDAIIHFRVDHALLKK
FT                   RDMKKEECVRLISHFKFLSAMDIGERRKPQNGSLTLKLKEGNVHLRMSTLPTINEESLV
FT                   IRVMPQYNIPSIDKLSLFPKTGATLLSFLKHSHGMLIFTGPTGSGKTTTLYSLVQYAKK
FT                   HFNRNIVTLEDPVETRDEDVLQVQVNEKAGVTYSAGLKAILRHDPDMIILGEIRDAETA
FT                   EIAVRAAMTGHLVLTSLHTRDAKGAIYRLLEFGINMNEIEQTVIAIAAQRLVDLACPFC
FT                   ENGCSSVYCRQSRNTRRASVYELLYGKNLQQCIQEAKGNHANYQYQTLRQIIRKGIALG
FT                   YLTTNNYDRWVYHEKD"
FT   gene            2404..3391
FT                   /gene="comG2"
FT   regulatory      2404..2410
FT                   /gene="comG2"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   CDS_pept        2420..3391
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG2"
FT                   /db_xref="GOA:P25954"
FT                   /db_xref="InterPro:IPR001992"
FT                   /db_xref="InterPro:IPR003004"
FT                   /db_xref="InterPro:IPR018076"
FT                   /db_xref="InterPro:IPR042094"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25954"
FT                   /protein_id="AAA83368.1"
FT                   /translation="MTAGGYTLLDGLRLMELQMNKRQAADLTDSVTCLREGAPFYQVLK
FT                   SLSFHKEAVGICYFAETHGELPASMIQSGELLERKIAQADQLKRVLRYPLFLIFTVAVM
FT                   FYMLQSIIIPQFSGIYQSMNMETSRSTDMLFAFFQHIDLVIILLVLFTAGIGIYYWLVF
FT                   KKKSPARQMLICIRIPLVGKLVKLFNSYFFSLQLSSLLKSGLSIYDSLNAFKHQTFLPF
FT                   YRCEAEQLIERLKAGESIESAICGSLFYETDLSKVISHGQLSGRLDRELFTYSQFILQR
FT                   LEHKAQKWTGILQPMIYGFVAAMILLVYLSMLVPMYQMMNQM"
FT   gene            3391..3701
FT                   /gene="comG3"
FT   regulatory      3391..3398
FT                   /gene="comG3"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   CDS_pept        3405..3701
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG3"
FT                   /db_xref="GOA:P25955"
FT                   /db_xref="InterPro:IPR012902"
FT                   /db_xref="InterPro:IPR016940"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25955"
FT                   /protein_id="AAA83369.1"
FT                   /translation="MNEKGFTLVEMLIVLFIISILLLITIPNVTKHNQTIQKKGCEGLQ
FT                   NMVKAQMTAFELDHEGQTPSLADLQSEGYVKKDAVCPNGKRIIITGGEVKVEH"
FT   regulatory      3678..3682
FT                   /gene="comG3"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   gene            3691..4122
FT                   /gene="comG4"
FT   CDS_pept        3691..4122
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG4"
FT                   /db_xref="GOA:P25956"
FT                   /db_xref="InterPro:IPR002416"
FT                   /db_xref="InterPro:IPR012902"
FT                   /db_xref="InterPro:IPR016785"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25956"
FT                   /protein_id="AAA83370.1"
FT                   /translation="MNIKLNEEKGFTLLESLLVLSLASILLVAVFTTLPPAYDNTAVRQ
FT                   AASQLKNDIMLTQQTAISRQQRTKILFHKKEYQLVIGDTVIERPYATGLSIELLTLKDR
FT                   LEFNEKGHPNAGGKIRVKGHAVYDITVYLGSGRVNVERK"
FT   regulatory      4096..4100
FT                   /gene="comG4"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   gene            4106..4453
FT                   /gene="comG5"
FT   CDS_pept        4106..4453
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG5"
FT                   /db_xref="GOA:P25957"
FT                   /db_xref="InterPro:IPR012902"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25957"
FT                   /protein_id="AAA83371.1"
FT                   /translation="MWRENKGFSTIETMSALSLWLFVLLTVVPLWDKLMADEKMAESRE
FT                   IGYQMMNESISKYVMSGEGAASKTITKNNHIYAMKWEEEGEYENVCIKAAAYKEKSFCL
FT                   SILQTEWLHAS"
FT   gene            4351..4862
FT                   /gene="comG6"
FT   regulatory      4351..4355
FT                   /gene="comG6"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   CDS_pept        4365..4862
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG6"
FT                   /note="putative"
FT                   /db_xref="GOA:P25958"
FT                   /db_xref="InterPro:IPR016977"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25958"
FT                   /protein_id="AAA83372.1"
FT                   /translation="MKTYVSKPQLIKKNHFASAFCRQNGYTLLNVLFSLSVFLLISGSL
FT                   AAIIHLFLSRQQEHDGFTQQEWMISIEQMMNECKESQAVKTAEHGSVLICTNLSGQDIR
FT                   FDIYHSMIRKRVDGKGHVPILDHITAMKADIENGVVLLKIESEDQKVYQTAFPVYSYLG
FT                   GG"
FT   gene            4851..5237
FT                   /gene="comG7"
FT   regulatory      4851..4855
FT                   /gene="comG7"
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   CDS_pept        4863..5237
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /gene="comG7"
FT                   /db_xref="GOA:P25959"
FT                   /db_xref="InterPro:IPR020372"
FT                   /db_xref="UniProtKB/Swiss-Prot:P25959"
FT                   /protein_id="AAA83373.1"
FT                   /translation="MYRTRGFIYPAVLFVSALVLLIVNFVAAQYISRCMFEKETKELYI
FT                   GENLLQNGVLLSIRHVLEERKGQEGTQQFLYGRVSYYIHDTSIKEQKEINLRVSTDSGT
FT                   ERTAQIVFDQKQKKLLRWTE"
FT   regulatory      5501..5533
FT                   /regulatory_class="terminator"
FT   regulatory      5696..5715
FT                   /regulatory_class="terminator"
FT   regulatory      6115..6119
FT                   /note="putative"
FT                   /regulatory_class="ribosome_binding_site"
FT   CDS_pept        6126..>6278
FT                   /codon_start=1
FT                   /transl_table=11
FT                   /product="unknown protein"
FT                   /note="ORF2; putative"
FT                   /db_xref="GOA:P40949"
FT                   /db_xref="InterPro:IPR023833"
FT                   /db_xref="InterPro:IPR023848"
FT                   /db_xref="UniProtKB/Swiss-Prot:P40949"
FT                   /protein_id="AAA83374.1"
FT                   /translation="MFRLFHNQQKAKTKLKVLLIFQLSVIFSLTAAICLQFSMIQALLF
FT                   MILKHL"
XX
SQ   Sequence 6278 BP; 1960 A; 1121 C; 1384 G; 1813 T; 0 other;
     agatctaatc catttttctc actggcctca gtgtgaaaag tggtttgaaa ataaccatca        60
     cgttaatttt ttgcgagtag atacaactga aacggaaaat gaagcagtat ttgggtcgat       120
     tgtttatgat caggggcttg gtgaagaaaa agaccatact gtttttcact tttatatcac       180
     cagacaatat ttttttacaa tcaactttga cttttcaatt ttgagagaga ttaaaggcaa       240
     agaagttgtt cggcaaatgg aaagagcgga caatgcgata gaggggtttt taattcttct       300
     cggcgaacta atgaatgcgt atttaatcgg tgttgatgaa tttgaagtca agctgagaaa       360
     gctcagatgg caaattaaag acgacaatag caaaagcatt ttaaaccgcg tccatctcct       420
     gcgccatgaa ctgatgattt ggaaaaattt gatattaagc gctaaaaaaa ttgaaatggc       480
     gttgaaagaa acctttttac ctcaaaatga agggaaaaag gattatcagc ggacacaact       540
     gaagattgac aggggattta catacatcag cgaatttgaa ggggagctta acaatctgct       600
     gcattcagag gaagtcatta cctcacatag ggggaatgaa attgtaaaag cgctgaccat       660
     tttcacgacg ctttttactc cgattacagc tctgggtgcc ttatggggga tgaacttttc       720
     agtgatgccg gaactgaatt ggaaatacgg atatctcttt tccctcttat tgattgtcac       780
     atctacagtt ctgatctatc tctatttgag aaaaaaaggc tggacgggag atatgctgca       840
     ggagcggaag aagaaaaaga aacctcgaaa aaggcggact ctataggatg tttcatattt       900
     tgtgcagcgt gccccgcttt ttcaccagac atatcagggt gaccggatac gatgtcaagg       960
     ggcttatgac agagcattaa atccgcagtt tatcgattct tgaaaatgac caaatgaccg      1020
     gtattgttgc attaggcgat ctttccgttg agaaagatac tggtcaataa gcgaaaacag      1080
     cataatgaaa atggaatcta gcaggcatgg tgaccatgtc tgctttttta tttataggga      1140
     aaattataat gacaggggta cattcagttg aaagtctttt ttcttgccag aaagaattgg      1200
     tttttcagca tataacatct cacaaaatca cgttttccct gtttgattac cttttcttct      1260
     ttttctacaa tatgcgttga aaggagaggg aatcaaattg gattcaatag aaaaggtaag      1320
     caaaaacttg attgaagagg catatctaac aaaggcttct gatattcaca ttgtgccgag      1380
     ggagcgggac gctatcattc attttcgggt cgatcatgcc ttgctgaaaa aaagggacat      1440
     gaaaaaagaa gagtgcgtaa gactgatttc acattttaaa tttctttcag caatggatat      1500
     aggtgaaagg cgaaagccgc aaaacggttc gcttacgtta aagttgaaag agggaaatgt      1560
     tcatttaaga atgtcaacgc tgcccacaat taatgaagaa agcctcgtga tcagagtgat      1620
     gccccaatac aatatccctt cgattgataa attgtcgcta tttccgaaga caggagccac      1680
     attactctcg tttttaaaac attcccatgg catgctcatt tttaccgggc cgactggttc      1740
     agggaagact accacattat actctctcgt tcaatatgca aaaaaacact ttaatcgaaa      1800
     tattgtcaca ttagaggacc ctgttgaaac aagggacgaa gatgttcttc aggttcaggt      1860
     gaatgaaaaa gccggtgtaa cttattccgc aggtctgaaa gcaattttgc gccatgaccc      1920
     cgatatgatt attttaggtg agatcagaga cgcggaaaca gctgaaattg cggtgcgggc      1980
     agcgatgacg ggacatctgg tactaacgag ccttcatacg agagacgcaa agggcgcaat      2040
     ttacagactg cttgaattcg gtatcaatat gaatgaaatc gaacagactg tcattgcaat      2100
     agcggctcag cgcttggttg atttggcttg cccgttttgt gaaaacggat gttcatcagt      2160
     gtattgccga cagtcacgaa atactaggag agctagcgtt tatgagcttc tatacgggaa      2220
     aaatcttcag caatgtatcc aggaggcaaa aggaaatcat gcaaattacc aatatcaaac      2280
     gcttcgtcaa attatcagaa aaggaattgc gctcggctat ttaacgacaa acaactatga      2340
     ccggtgggtt tatcatgaaa aagattagaa agtctggttg ttaaaggatc aagccaggtt      2400
     attaaagagg ctcggtgaaa tgactgcggg cggatataca cttctggatg gattacgcct      2460
     gatggaactt cagatgaata agaggcaggc ggctgacttg actgattcgg tcacttgttt      2520
     gagggaaggg gctccgtttt atcaagtact aaagagtttg tcatttcata aggaagccgt      2580
     aggtatttgt tattttgctg aaacacatgg tgaactgcct gcttcaatga tccagagcgg      2640
     agagctgctg gaacgaaaaa ttgcacaggc agaccagctg aaaagagtgc tgcgctatcc      2700
     gcttttcctc atctttacgg tcgctgtcat gttttatatg ttacagtcca tcatcattcc      2760
     tcagttttcc ggtatctatc aatcgatgaa tatggaaacc tcacgttcaa ccgatatgct      2820
     ttttgctttt tttcagcata ttgatcttgt gatcattttg cttgttcttt ttacagcagg      2880
     tatcgggatt tattattggc ttgtgtttaa gaaaaaatca cctgcccggc aaatgctgat      2940
     ttgtatcagg attcctttgg ttggaaagct tgtaaagctg tttaacagct actttttttc      3000
     tttgcagcta agcagccttt taaaatcagg cctctcaatt tatgacagcc ttaatgcatt      3060
     taaacatcaa acgtttctcc ctttctaccg ctgcgaggct gaacaattga ttgaacggct      3120
     aaaagccggt gagtcaattg aatccgctat ttgtggaagc cttttttatg aaactgattt      3180
     atcaaaagtc atatctcacg gccagctgag cggccgattg gatcgggagc ttttcacata      3240
     cagccaattc atattacagc ggctggaaca caaagcgcaa aaatggacag gcatccttca      3300
     gccaatgatt tatggatttg ttgcagcgat gatcttactt gtgtatttat ctatgcttgt      3360
     gcctatgtat cagatgatga atcaaatgtg aaaggaagag gctgatgaat gagaaaggat      3420
     ttacacttgt tgaaatgtta atcgtgctct ttattatttc gattttgctt ttaattacga      3480
     taccgaacgt cacgaaacat aatcaaacca ttcaaaaaaa gggctgtgaa ggcttacaaa      3540
     acatggttaa ggcacaaatg actgcatttg agcttgatca tgaaggacaa actccgagcc      3600
     ttgccgattt acagtcagag ggctatgtga aaaaggatgc tgtctgtcca aatggtaagc      3660
     gcattatcat caccggcgga gaagttaagg ttgaacatta aattaaacga ggagaagggg      3720
     tttacccttt tagaaagttt gcttgtgtta agccttgcct ctatcctcct ggtggccgtc      3780
     ttcactacac ttcctcctgc ttatgacaat acagctgtcc gacaggcagc aagtcagctg      3840
     aaaaatgata ttatgctcac acagcagact gctatttccc gtcaacaaag aacaaaaatt      3900
     ctctttcata aaaaagaata tcaattagtc attggtgata cggttattga acgtccgtat      3960
     gcaacgggac tttctataga actgctgaca ttaaaagacc gtttggaatt taatgagaaa      4020
     gggcacccga atgcaggcgg aaaaatacga gtaaaaggcc atgccgttta tgacataaca      4080
     gtttatctag ggagcgggag agtcaatgtg gagagaaaat aaaggttttt ctacaataga      4140
     aacaatgtct gcgctaagcc tgtggctgtt tgtgctgctg acagtcgtcc ccttgtggga      4200
     caagctgatg gctgatgaaa aaatggcgga atcacgagaa attggctatc agatgatgaa      4260
     tgagagcatt agcaaatatg tcatgagtgg tgaaggagcc gcgtcaaaaa cgattacaaa      4320
     gaacaatcat atctatgcaa tgaagtggga ggaggagggc gaatatgaaa acgtatgtat      4380
     caaagccgca gcttataaag aaaaatcatt ttgcctcagc attttgcaga cagaatggct      4440
     acacgcttct taacgtatta ttttcgctct cagtcttttt gctcatatca ggatcgttag      4500
     ctgcgattat ccatctgttt ttgtctcgac agcaggaaca tgacggtttc acacagcagg      4560
     aatggatgat ttcgatagaa cagatgatga atgaatgcaa ggaatcacag gcagttaaga      4620
     cagccgagca tgggagcgtg ttaatctgca ccaatctttc cggacaagac atccgttttg      4680
     acatttatca ttcaatgata agaaaaagag tggatggcaa agggcatgtt ccgattttag      4740
     atcatattac tgccatgaaa gctgatattg aaaatggtgt tgttttgctg aaaattgaga      4800
     gtgaagacca aaaagtgtat caaactgctt ttccagtcta ttcgtattta ggaggggggt      4860
     gaatgtatcg tacaagaggg tttatttatc cagctgttct ttttgtgtca gcgcttgtgc      4920
     tgttaatcgt gaactttgtt gctgctcaat atatttcacg ctgcatgttt gagaaggaaa      4980
     caaaagagtt atacatagga gagaatttgc ttcaaaatgg ggtgcttctt tcgattcggc      5040
     atgttctaga ggaacggaaa ggccaggagg gtacgcagca atttctatat ggacgggttt      5100
     cttattacat tcatgataca tcgataaaag aacaaaaaga aatcaactta agagtgtcaa      5160
     cggattcggg aacagaaaga actgcacaga tcgtgtttga ccaaaaacag aaaaaactgc      5220
     tgagatggac agaataaaac agtgtaaagg gtataaaaaa agtcatgtga gacaacactc      5280
     ataataattg aatgatgagg tgatcacgtg aaaacgaatg attatgttaa atatatgacg      5340
     cagcaatttg tcaaatatat agatactccg agagatgagc gaaaagaacg aaaagaggtg      5400
     cggaaagaaa caaaaacgcc tgtttcccag cagtggttcg gtattttacc ctatggcttc      5460
     cgactttggc tgaaacggaa aaaataaccg caaataaacg aataaggtcc ttcaaaaaat      5520
     ggaggacctt attgatattc ttctaatatg gcaattttat tgaccttttg gctataagga      5580
     tcaaatgaaa tcgtcacaaa aacgccgaat tcttttgacc cttccctcag agttaaatgg      5640
     tattgcttca ctgcttcatc ttttctttta cggtcccata ctttttgttt gaacagtacc      5700
     tgtgcgagcg ggtacctttt ttttgcttct tttacagcaa tctcttccca tttggacatg      5760
     tggcgggcgg ttacaagcgg tgtttcttct gcgtgagcgg ctgtggtgcc aaagacgaga      5820
     agagatagac aaatcacaca ttgtttgatc atcatgctgt cacctttctt tgtttattat      5880
     taccaaataa taatgggata tgcatttaac ttctcacata acaatcccaa aaatttctaa      5940
     aaaattgaaa aaatgagcaa tactgagcaa gactttgtaa tatgatgaaa acattctttt      6000
     aaacgaacaa aatgagcgat ttcggtgttt ttaaatctat aaatcgttga ttatactcta      6060
     tttgtgaagt tctttaaaga gaacgattgt catatcaagt tacagtgttt tacaggaggt      6120
     aagatatgtt tcgattgttt cacaatcagc aaaaggcgaa gacgaaactg aaagttctgc      6180
     ttatctttca gctttcagtc attttcagtc tgactgccgc aatatgctta caattttcga      6240
     tgatacaagc gctgcttttc atgatattga aacatttg                              6278
//