Statistics of RNA Secondary Structures
W. Fontana, D. A. M. Konings, P. F. Stadler, and P. Schuster
A statistical reference for RNA secondary structures with minimum free energies is computed by folding large ensembles of random
RNA sequences. Four nucleotide alphabets are used: two binary alphabets, AU and GC, the biophysical AUGC and the synthetic GCXK
alphabet. RNA secondary structures are made of structural elements, such as stacks, loops, joints and free ends. Statistical properties of
these elements are computed for small RNA molecules of chain lengths up to 100. The resulting structure statistics depend
strongly on the particular alphabet chosen. The statistical reference is compared with data derived from natural RNA molecules
with similar base frequencies.
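
As an illustration of what counting structural elements involves, the following is a minimal sketch that tallies a few of them from a single structure. It assumes the structure is written in dot-bracket notation, which the abstract does not specify; the function names `parse_pairs` and `element_counts` are hypothetical, and `external_unpaired` lumps together joints and free ends rather than separating them.

```python
def parse_pairs(db):
    """Return a dict mapping each paired position to its partner.

    Assumes dot-bracket notation: '(' and ')' are paired bases, '.' is unpaired.
    """
    stack, pair = [], {}
    for i, c in enumerate(db):
        if c == "(":
            stack.append(i)
        elif c == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = stack.pop()
            pair[i], pair[j] = j, i
        elif c != ".":
            raise ValueError(f"unexpected character {c!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pair


def element_counts(db):
    """Count base pairs, stacks, hairpin loops and external unpaired bases."""
    pair = parse_pairs(db)
    opens = sorted(i for i, j in pair.items() if i < j)

    # A stack begins at a pair (i, j) that is not directly enclosed by (i-1, j+1).
    stacks = sum(1 for i in opens if pair.get(i - 1) != pair[i] + 1)

    # A hairpin loop is closed by a pair with no paired base in its interior.
    hairpins = sum(
        1 for i in opens if all(k not in pair for k in range(i + 1, pair[i]))
    )

    # External unpaired bases: scan from the 5' end, jumping over paired regions.
    external, k = 0, 0
    while k < len(db):
        if k in pair:
            k = pair[k] + 1
        else:
            external += 1
            k += 1

    return {
        "pairs": len(opens),
        "stacks": stacks,
        "hairpins": hairpins,
        "external_unpaired": external,
    }


if __name__ == "__main__":
    print(element_counts("..((..((...))..))."))
```

Averaging such counts over a large ensemble of folded random sequences, one ensemble per alphabet, would give the kind of reference statistics described above; the folding step itself is not shown.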
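Secondary structures are represented as trees. Tree editing provides a quantitative measure for the distance, d_t, between two structures.
We compute a structure density surface as the conditional probability of two structures having distance t given that their sequences have
Hamming distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly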
distance h. This surface indicates that the vast majority of possible minimum free energy secondary structures occur within a fairly
small neighbourhood of any typical random sequence.
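
The conditional probability behind the structure density surface can be estimated from sampled pairs by simple counting. The sketch below assumes that h is the Hamming distance between two sequences and that each sample is already reduced to a pair (h, t); computing the tree edit distance t is not shown, and the names `hamming` and `density_surface` are hypothetical.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def density_surface(samples):
    """Estimate P(t | h) from an iterable of (h, t) observations.

    Returns a nested dict {h: {t: probability}} in which the probabilities
    for each fixed h sum to 1.
    """
    counts = defaultdict(Counter)
    for h, t in samples:
        counts[h][t] += 1
    surface = {}
    for h, row in counts.items():
        total = sum(row.values())
        surface[h] = {t: n / total for t, n in row.items()}
    return surface


if __name__ == "__main__":
    # Toy observations only; real input would come from folded sequence pairs.
    print(hamming("GCGCAU", "GCGGAA"))
    print(density_surface([(1, 0), (1, 0), (1, 2), (2, 4)]))
```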
Correlation lengths for secondary structures in their tree representations are computed from probability densities. They are appropriate
measures for the complexity of the sequence-structure relation. The correlation length also provides a quantitative estimate for the
mean sensitivity of structures to point mutations.
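
One way to make the correlation length concrete is sketched below. It assumes the autocorrelation form rho(h) = 1 - <d^2>_h / <d^2>_random, where <d^2>_h is the mean squared structure distance at sequence distance h, and an approximately exponential decay rho(h) ~ exp(-h / l). Both are assumptions made for this sketch, since the abstract does not spell out the definition; the function names are hypothetical.

```python
import math


def autocorrelation(mean_sq_by_h, mean_sq_random):
    """rho(h) = 1 - <d^2>_h / <d^2>_random (assumed form, see lead-in)."""
    return {h: 1.0 - d2 / mean_sq_random for h, d2 in mean_sq_by_h.items()}


def correlation_length(rho):
    """Fit ln rho(h) = -h / l through the origin by least squares.

    Only points with h > 0 and rho(h) > 0 are used, because the logarithm
    is undefined otherwise.
    """
    points = [(h, math.log(r)) for h, r in rho.items() if h > 0 and r > 0]
    num = sum(h * y for h, y in points)
    den = sum(h * h for h, _ in points)
    if num >= 0:
        raise ValueError("autocorrelation does not decay")
    return -den / num


if __name__ == "__main__":
    # Synthetic data with a known decay length of 5, for illustration only.
    random_level = 100.0
    data = {h: random_level * (1.0 - math.exp(-h / 5.0)) for h in range(1, 11)}
    print(correlation_length(autocorrelation(data, random_level)))
```

Under these assumptions a short correlation length means that a few point mutations already decorrelate the structure from the original, which is the sense in which it measures sensitivity.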