Supplementary table 1


Table 1: Contingency table for the distribution of replication-related effects on the GC-skew and AT-skew in Bacteria and Archaea. The columns of the table indicate the sign of the nucleotide skews on the leading strand, induced by the replication mechanism. For instance, G > C means that the leading strand is enriched in G over C, thus the sign of the GC-skew on the leading strand is positive, while G = C means that no significant effect of replication on the GC-skew could be detected. The cells of the table contain the number of chromosomes that we found to present the respective GC-skew and AT-skew signs on the leading strand.
                                 G>C   G>C   G>C   G=C   G=C   G=C   G<C   G<C   G<C
                                 A>T   A=T   A<T   A>T   A=T   A<T   A>T   A=T   A<T
All Bacteria                      33    73   205     1    33    13     1     1     0
Acidobacteria                      0     0     0     0     1     0     0     0     0
Actinobacteria                     0     2    12     1     2     3     1     1     0
Aquificales                        0     0     0     0     1     0     0     0     0
Bacteroidetes                      0     2     4     0     0     0     0     0     0
Chlorobi                           0     0     3     0     0     0     0     0     0
Chlamydiae                         1     2     8     0     0     0     0     0     0
Chloroflexi                        0     0     2     0     0     0     0     0     0
Cyanobacteria                      0     0     9     0     7     1     0     0     0
Deinococcus Thermus                0     0     1     0     1     3     0     0     0
Firmicutes Bacilli                27    27     4     0     0     0     0     0     0
Firmicutes Clostridia              3     2     2     0     0     0     0     0     0
Firmicutes Mollicutes              0     3     1     0    10     2     0     0     0
Fusobacteria                       1     0     0     0     0     0     0     0     0
Nitrospirae                        0     0     0     0     0     0     0     0     0
Planctomycetes                     0     0     1     0     0     0     0     0     0
α Proteobacteria                   0     6    42     0     5     2     0     0     0
β Proteobacteria                   0     1    36     0     0     0     0     0     0
δ Proteobacteria                   0     1    10     0     0     0     0     0     0
ε Proteobacteria                   0     3     6     0     0     0     0     0     0
γ Proteobacteria                   1    22    57     0     6     2     0     0     0
Spirochaetes                       0     1     7     0     0     0     0     0     0
Thermotogae                        0     1     0     0     0     0     0     0     0

All Archaea                        0     4     6     0    11     6     0     1     1
Crenarchaeota Thermoprotei         0     0     0     0     5     0     0     0     0
Euryarchaeota Archaeoglobi         0     0     0     0     1     0     0     0     0
Euryarchaeota Halobacteria         0     0     0     0     2     1     0     1     1
Euryarchaeota Methanobacteria      0     1     0     0     1     0     0     0     0
Euryarchaeota Methanomicrobia      0     0     3     0     0     2     0     0     0
Euryarchaeota Methanococci         0     1     1     0     0     0     0     0     0
Euryarchaeota Methanopyri          0     0     0     0     0     1     0     0     0
Euryarchaeota Thermoplasmata       0     1     1     0     1     0     0     0     0
Euryarchaeota Thermococci          0     1     1     0     0     2     0     0     0
Nanoarchaeota Nanoarchaeum         0     0     0     0     1     0     0     0     0
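
The sign convention used in the column headers of Table 1 can be illustrated with a minimal sketch. It uses the usual skew definition, (X - Y) / (X + Y), computed on the nucleotide counts of one strand. The function names and the `tol` parameter are hypothetical, and the tolerance is only a crude stand-in for "no significant effect detected"; it is not the significance procedure used to build the table.

```python
def skew(seq, x, y):
    """Return (count(x) - count(y)) / (count(x) + count(y)) for a sequence.

    Returns 0.0 when neither nucleotide occurs in the sequence.
    """
    seq = seq.upper()
    nx, ny = seq.count(x), seq.count(y)
    total = nx + ny
    return (nx - ny) / total if total else 0.0


def sign_label(value, x, y, tol=0.0):
    """Map a skew value to a label such as 'G>C', 'G=C' or 'G<C'.

    `tol` is a hypothetical tolerance, not a statistical test.
    """
    if value > tol:
        return f"{x}>{y}"
    if value < -tol:
        return f"{x}<{y}"
    return f"{x}={y}"


leading = "GGGCATAT"  # toy leading-strand sequence, for illustration only
gc = skew(leading, "G", "C")
at = skew(leading, "A", "T")
print(sign_label(gc, "G", "C"), sign_label(at, "A", "T"))
```

For the toy sequence, G outnumbers C while A and T are balanced, so the sequence would be counted in the "G>C, A=T" column.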

Supplementary table 2


Table 2: Comparison between our approach and the methods proposed by Tillier and Collins (2000) and Lobry and Sueoka (2002), for 173 chromosomes for which we have performed significance computations. Cases where the p-values are not available (NA) for these two methods correspond to chromosomes where the origin and terminus could not be determined with certainty.
  #  species          accession  AT  T&C, 2000  L&S, 2002  GC  T&C, 2000  L&S, 2002
  1  A. bacterium El  NC_008009  0   1          1          0   1          1
  2  A. sp ADP1       NC_005966  0   1          1          1   1          1
  3  A. pernix        NC_000854  0   NA         NA         0   NA         NA
  4  A. tumefaciens   NC_003062  1   1          1          1   1          1
  5  A. tumefaciens   NC_003063  1   1          1          1   1          1
  6  A. variabilis A  NC_007413  0   0          0          0   0          0
  7  A. dehalogenans  NC_007760  0   0          0          1   0          1
  8  A. marginale St  NC_004842  1   1          1          1   1          1
  9  A. aeolicus      NC_000918  0   0          0          0   0          0
 10  A. fulgidus      NC_000917  0   NA         NA         0   NA         NA
 11  A. yellows witc  NC_007716  1   1          1          1   1          1
 12  A. sp EbN1       NC_006513  1   1          1          1   1          1
 13  B. anthracis Am  NC_003997  1   1          1          1   1          1
 14  B. fragilis NCT  NC_003228  1   1          1          1   1          1
 15  B. henselae Hou  NC_005956  1   1          1          1   1          1
 16  B. cicadellinic  NC_007984  0   NA         NA         0   NA         NA
 17  B. bacteriovoru  NC_005363  1   1          1          1   1          1
 18  B. longum        NC_004307  0   1          1          0   1          1
 19  B. bronchisepti  NC_002927  1   1          1          1   1          1
 20  B. burgdorferi   NC_001318  1   1          1          1   1          1
 21  B. japonicum     NC_004463  1   1          1          1   1          1
 22  B. abortus 9-94  NC_006932  1   1          1          1   1          1
 23  B. abortus 9-94  NC_006933  1   1          1          1   1          1
 24  B. aphidicola    NC_004545  1   1          1          1   1          1
 25  B. 383           NC_007509  1   1          1          1   1          1
 26  B. 383           NC_007510  1   1          1          1   1          1
 27  B. 383           NC_007511  1   1          1          1   1          1
 28  C. jejuni        NC_002163  1   1          1          1   1          1
 29  C. Blochmannia   NC_005061  1   1          1          1   1          1
 30  C. hydrogenofor  NC_007503  0   0          0          1   0          1
 31  C. crescentus    NC_002696  1   1          1          1   1          1
 32  C. muridarum     NC_002620  1   1          1          1   1          1
 33  C. abortus S26   NC_004552  1   1          1          1   1          1
 34  C. chlorochroma  NC_007514  1   1          1          1   1          1
 35  C. violaceum     NC_005085  1   1          1          1   1          1
 36  C. salexigens D  NC_007963  1   1          1          1   1          1
 37  C. acetobutylic  NC_003030  1   1          1          1   1          1
 38  C. psychrerythr  NC_003910  1   1          1          1   1          1
 39  C. diphtheriae   NC_002935  1   1          1          1   1          1
 40  C. burnetii      NC_002971  0   1          1          1   1          1
 41  C. bacterium Ye  NC_007775  0   0          0          0   0          0
 42  D. aromatica RC  NC_007298  1   1          1          1   1          1
 43  D. CBDB1         NC_007356  1   1          1          1   1          1
 44  D. geothermalis  NC_008025  1   1          1          1   1          1
 45  D. hafniense Y5  NC_007907  0   0          0          1   0          1
 46  D. psychrophila  NC_006138  1   1          1          1   1          1
 47  D. desulfurican  NC_007519  1   1          1          1   1          1
 48  E. canis Jake    NC_007354  1   1          1          1   1          1
 49  E. faecalis V58  NC_004668  1   1          1          1   1          1
 50  E. carotovora a  NC_004547  1   1          1          1   1          1
 51  E. litoralis HT  NC_007722  1   1          1          1   1          1
 52  E. coli 536      NC_008253  0   1          1          1   1          1
 53  F. tularensis h  NC_007880  1   1          1          1   1          1
 54  F. CcI3          NC_007777  1   1          1          1   1          1
 55  F. nucleatum     NC_003454  1   1          1          1   1          1
 56  G. kaustophilus  NC_006510  1   1          1          1   1          1
 57  G. metallireduc  NC_007517  1   1          1          1   1          1
 58  G. violaceus     NC_005125  0   NA         NA         0   NA         NA
 59  G. oxydans 621H  NC_006677  1   1          1          1   1          1
 60  H. ducreyi 3500  NC_002940  1   1          1          1   1          1
 61  H. chejuensis K  NC_007645  1   1          1          1   1          1
 62  H. marismortui   NC_006396  1   1          1          0   1          1
 63  H. marismortui   NC_006397  0   NA         NA         0   NA         NA
 64  H. sp            NC_002607  0   NA         NA         1   NA         NA
 65  H. walsbyi       NC_008212  0   NA         NA         0   NA         NA
 66  H. acinonychis   NC_008229  0   0          1          1   0          1
 67  I. loihiensis L  NC_006512  0   1          1          1   1          1
 68  J. CCS1          NC_007802  1   1          1          1   1          1
 69  L. acidophilus   NC_006814  0   0          0          1   0          1
 70  C. hutchinsonii  NC_008255  0   1          0          1   1          1
 71  L. lactis        NC_002662  1   1          1          1   1          1
 72  L. intracellula  NC_008011  1   1          1          1   1          1
 73  L. pneumophila   NC_006369  1   1          1          1   1          1
 74  L. xyli xyli CT  NC_006087  1   1          1          1   1          1
 75  L. interrogans   NC_005823  1   1          1          1   1          1
 76  L. interrogans   NC_005824  0   1          1          1   1          1
 77  L. innocua       NC_003212  1   1          1          1   1          1
 78  M. magneticum A  NC_007626  1   1          1          1   1          1
 79  M. succinicipro  NC_006300  1   1          1          1   1          1
 80  M. florum L1     NC_006055  0   1          1          0   1          0
 81  M. BNC1          NC_008254  1   1          1          0   1          1
 82  M. thermoautotr  NC_000916  0   1          1          0   1          1
 83  M. burtonii DSM  NC_007955  1   1          1          1   1          1
 84  M. jannaschii    NC_000909  0   0          0          1   0          0
 85  M. kandleri      NC_003551  1   1          1          0   1          0
 86  M. acetivorans   NC_003552  1   1          1          1   1          1
 87  M. stadtmanae    NC_007681  0   0          1          1   0          1
 88  M. hungatei JF-  NC_007796  1   1          1          1   1          1
 89  M. flagellatus   NC_007947  1   1          1          1   1          1
 90  M. capsulatus B  NC_002977  1   1          1          1   1          1
 91  M. thermoacetic  NC_007644  1   0          1          1   0          1
 92  M. MCS           NC_008146  1   1          1          1   1          1
 93  M. capricolum A  NC_007633  0   0          0          0   0          1
 94  M. xanthus DK 1  NC_008095  1   1          1          1   1          1
 95  N. equitans      NC_005213  0   NA         NA         0   NA         NA
 96  N. pharaonis     NC_007426  1   1          1          1   1          1
 97  N. gonorrhoeae   NC_002946  1   1          1          1   1          1
 98  N. sennetsu Miy  NC_007798  1   1          1          1   1          1
 99  N. hamburgensis  NC_007964  1   1          1          1   1          1
100  N. oceani ATCC   NC_007484  1   1          1          1   1          1
101  N. europaea      NC_004757  1   1          1          1   1          1
102  N. multiformis   NC_007614  1   1          1          1   1          1
103  N. farcinica IF  NC_006361  0   NA         NA         0   NA         NA
104  N. sp            NC_003272  0   0          0          0   0          0
105  N. aromaticivor  NC_007794  1   1          1          1   1          1
106  O. iheyensis     NC_004193  1   1          1          1   1          1
107  O. yellows phyt  NC_005303  1   1          1          0   1          1
108  P. sp UWE25      NC_005861  1   1          1          1   1          1
109  P. multocida     NC_002663  1   1          1          1   1          1
110  P. carbinolicus  NC_007498  1   1          1          1   1          1
111  P. luteolum DSM  NC_007512  1   1          1          1   1          1
112  P. profundum SS  NC_006370  1   1          1          1   1          1
113  P. profundum SS  NC_006371  1   1          1          1   1          1
114  P. luminescens   NC_005126  1   1          1          1   1          1
115  P. torridus DSM  NC_005877  0   NA         NA         0   NA         NA
116  P. sp            NC_005027  1   1          1          1   1          1
117  P. JS666         NC_007948  1   1          1          1   1          1
118  P. gingivalis W  NC_002950  1   1          1          1   1          1
119  P. marinus CCMP  NC_005042  1   1          1          1   1          1
120  P. acnes KPA171  NC_006085  1   1          1          1   1          1
121  P. atlantica T6  NC_008228  1   1          1          1   1          1
122  P. aeruginosa    NC_002516  1   1          1          1   1          1
123  P. arcticum 273  NC_007204  0   0          1          1   0          1
124  P. aerophilum    NC_003364  0   NA         NA         0   NA         NA
125  P. abyssi        NC_000868  1   1          1          1   1          1
126  R. eutropha JMP  NC_007347  1   1          1          1   1          1
127  R. eutropha JMP  NC_007348  1   1          1          1   1          1
128  R. etli CFN 42   NC_007761  0   NA         NA         0   NA         NA
129  R. sphaeroides   NC_007493  0   1          1          1   1          1
130  R. sphaeroides   NC_007494  0   NA         NA         0   NA         NA
131  R. ferrireducen  NC_007908  1   1          1          1   1          1
132  R. palustris Bi  NC_007925  1   1          1          1   1          1
133  R. rubrum ATCC   NC_007643  1   1          1          1   1          1
134  R. bellii RML36  NC_007940  0   1          0          1   1          1
135  R. denitrifican  NC_008209  1   1          1          1   1          1
136  R. xylanophilus  NC_008148  1   1          1          1   1          1
137  S. degradans 2-  NC_007912  1   1          1          1   1          1
138  S. ruber DSM 13  NC_007677  0   0          1          1   0          1
139  S. enterica Cho  NC_006905  0   1          1          1   1          1
140  S. denitrifican  NC_007954  1   1          1          1   1          1
141  S. boydii Sb227  NC_007613  0   1          0          0   1          1
142  S. TM1040        NC_008044  1   1          1          1   1          1
143  S. meliloti      NC_003047  1   1          1          1   1          1
144  S. glossinidius  NC_007712  1   1          1          1   1          1
145  S. alaskensis R  NC_008048  1   1          1          1   1          1
146  S. aureus COL    NC_002951  1   1          1          1   1          1
147  S. agalactiae 2  NC_004116  0   0          0          1   0          1
148  S. avermitilis   NC_003155  1   1          0          0   1          1
149  S. acidocaldari  NC_007181  0   NA         NA         0   NA         NA
150  S. thermophilum  NC_006177  1   1          1          1   1          1
151  S. CC9605        NC_007516  1   1          1          1   1          1
152  S. PCC6803       NC_000911  0   0          0          0   0          0
153  S. aciditrophic  NC_007759  1   1          1          1   1          1
154  T. tengcongensi  NC_003869  1   1          1          1   1          1
155  T. fusca YX      NC_007333  1   1          1          1   1          1
156  T. kodakaraensi  NC_006624  1   1          1          0   1          0
157  T. acidophilum   NC_002578  0   1          1          1   1          1
158  T. elongatus     NC_004113  0   0          0          0   0          1
159  T. maritima      NC_000853  0   1          1          1   1          1
160  T. thermophilus  NC_005835  1   1          1          0   1          1
161  T. denitrifican  NC_007404  1   1          1          1   1          1
162  T. crunogena XC  NC_007520  1   1          1          1   1          1
163  T. denticola AT  NC_002967  1   1          1          1   1          1
164  T. whipplei TW0  NC_004551  1   1          1          0   1          0
165  U. urealyticum   NC_002162  0   0          0          1   0          1
166  V. cholerae      NC_002505  1   1          1          1   1          1
167  V. cholerae      NC_002506  1   1          1          1   1          1
168  W. brevipalpis   NC_004344  0   NA         NA         0   NA         NA
169  W. endosymbiont  NC_006833  0   NA         NA         0   NA         NA
170  W. succinogenes  NC_005090  1   1          1          1   1          1
171  X. campestris    NC_003902  1   1          1          1   1          1
172  X. fastidiosa    NC_002488  1   1          1          1   1          1
173  Y. pestis Antiq  NC_008150  1   1          1          1   1          1
174  Z. mobilis ZM4   NC_006526  0   1          1          0   1          1
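
One simple way to summarise a comparison like Table 2 is to count, for each pair of methods, how many chromosomes receive the same call, skipping the NA entries. The sketch below uses the GC entries of four rows of the table (NC_008009, NC_000854, NC_003062, NC_007760). It assumes that the unlabelled GC column holds the calls of our approach; the `agreement` helper is hypothetical, and the counts it prints describe only these four rows, not the whole table.

```python
def agreement(calls_a, calls_b):
    """Count matching calls between two methods, ignoring NA (None) entries.

    Returns (number of agreements, number of chromosomes compared).
    """
    pairs = [(a, b) for a, b in zip(calls_a, calls_b)
             if a is not None and b is not None]
    agree = sum(1 for a, b in pairs if a == b)
    return agree, len(pairs)


# GC columns for NC_008009, NC_000854, NC_003062 and NC_007760.
# Assumption: the first GC column is the call made by our approach.
ours = [0, 0, 1, 1]
tc_2000 = [1, None, 1, 0]   # Tillier and Collins, 2000 (None = NA)
ls_2002 = [1, None, 1, 1]   # Lobry and Sueoka, 2002 (None = NA)

print(agreement(ours, tc_2000))
print(agreement(ours, ls_2002))
```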

Supplementary figure 1

Figure 1: Results from simulations in which the replication effect is constant (r=0.1), the standard deviation of the nucleotide skews is constant for both groups of genes (sd=0.01), the breakpoint location is constant (fr=0.25), and the coding-sequence-related effect (m) varies. The y-axis represents the distance between the real and the estimated breakpoints.
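
The kind of simulation summarised in these figures can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures. It assumes a linear series of `n` genes, a random coding orientation for each gene (the two groups of genes), an additive model skew = m * orientation + r * replication side + Gaussian noise, and a single breakpoint at fraction `fr`. The estimator, which picks the split that minimises the within-group residual sum of squares, is likewise an assumption, and all function names are hypothetical.

```python
import random


def simulate_skews(n=200, m=0.1, r=0.1, sd=0.01, fr=0.25, seed=0):
    """Simulate per-gene skews; return (genes, true breakpoint index)."""
    rng = random.Random(seed)
    bp = int(round(fr * n))
    genes = []
    for i in range(n):
        orientation = rng.choice((-1, 1))   # coding-sequence group
        side = 1 if i < bp else -1          # replication effect flips at bp
        value = m * orientation + r * side + rng.gauss(0.0, sd)
        genes.append((orientation, value))
    return genes, bp


def estimate_breakpoint(genes):
    """Return the split index that minimises the residual sum of squares.

    Each side of the split is fitted with one mean per orientation group.
    """
    best_k, best_rss = None, float("inf")
    for k in range(1, len(genes)):
        rss = 0.0
        for part in (genes[:k], genes[k:]):
            for group in (-1, 1):
                xs = [v for o, v in part if o == group]
                if xs:
                    mean = sum(xs) / len(xs)
                    rss += sum((x - mean) ** 2 for x in xs)
        if rss < best_rss:
            best_k, best_rss = k, rss
    return best_k


genes, true_bp = simulate_skews(n=200, m=0.1, r=0.1, sd=0.01, fr=0.25, seed=1)
estimated_bp = estimate_breakpoint(genes)
# Distance between the real and the estimated breakpoints, as a fraction of n.
distance = abs(estimated_bp - true_bp) / len(genes)
print(true_bp, estimated_bp, distance)
```

Because this estimator fits a separate mean for each orientation group, it is insensitive to `m` by construction. Repeating the run over several seeds while varying `r`, `sd` or `fr` gives the kind of distance distribution plotted on the y-axis.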

Supplementary figure 2

Figure 2: Results from simulations in which the coding-sequence-related effect is constant (m=0.1), the standard deviation of the nucleotide skews is constant for both groups of genes (sd=0.01), the breakpoint location is constant (fr=0.25), and the replication effect (r) varies. The y-axis represents the distance between the real and the estimated breakpoints.

Supplementary figure 3

Figure 3: Results from simulations in which the coding-sequence-related and replication effects are constant (m=2.0 and r=0.1) and the standard deviation of the nucleotide skews is constant for both groups of genes (sd=0.01), while the position of the breakpoint (fr) varies. The y-axis represents the distance between the real and the estimated breakpoints.

Supplementary figure 4

Figure 4: Results from simulations in which the replication effect is constant (r=0.01 for the first graph and r=0.1 for the second graph), the coding-sequence-related effect is constant (m=2.0), the breakpoint location is constant (fr=0.25), and the standard deviation of the nucleotide skews (sd) varies. The y-axis represents the distance between the real and the estimated breakpoints.
