Extracting structured motifs
using a suffix tree - Algorithms and application to promoter
consensus identification
Laurent Marsan and Marie-France Sagot
in Proceedings of the
Fourth Annual International Conference on Computational Molecular
Biology,
Tokyo, Japan, pages 210-219, ACM Press, 2000
This paper introduces two exact algorithms for extracting conserved
structured motifs from a set of DNA sequences. Structured motifs are
composed of p >= 2 parts separated by constrained
spacers. Both algorithms use a suffix tree to perform this
task. They are efficient enough to extract site consensus sequences,
such as promoter sequences, from a whole collection of noncoding
sequences extracted from a genome. In particular, their time
complexity scales linearly with N^2 * n, where n is the
average length of the sequences and N is their number. An
application to the identification of promoter consensus sequences
in bacterial genomes is presented, with interesting results.
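
To make the notion of a structured motif concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper. It only checks whether a given motif, made of p boxes separated by spacers of constrained length, occurs in a sequence, and counts how many sequences contain it. The function names (`hamming`, `occurs`, `quorum`), the per-box mismatch limit, and the example sequences are assumptions made for illustration. The boxes TTGACA and TATAAT with a spacer of 16 to 18 bases are the familiar -35 and -10 consensus elements of E. coli promoters, used here only as a convenient example.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(seq: str, boxes: List[str], spacers: List[Tuple[int, int]],
           max_mismatches: int = 0) -> bool:
    """Return True if the structured motif occurs somewhere in seq.

    boxes          -- the p parts of the motif, in order
    spacers        -- p-1 (dmin, dmax) ranges for the gap between parts
    max_mismatches -- mismatches allowed per box (an assumed error model)
    """
    def extend(pos: int, i: int) -> bool:
        # Try to match box i starting exactly at position pos.
        box = boxes[i]
        end = pos + len(box)
        if end > len(seq) or hamming(seq[pos:end], box) > max_mismatches:
            return False
        if i == len(boxes) - 1:
            return True
        # Try every allowed spacer length before the next box.
        dmin, dmax = spacers[i]
        return any(extend(end + d, i + 1) for d in range(dmin, dmax + 1))

    return any(extend(start, 0) for start in range(len(seq)))


def quorum(sequences: List[str], boxes: List[str],
           spacers: List[Tuple[int, int]], max_mismatches: int = 0) -> int:
    """Count the sequences that contain at least one occurrence."""
    return sum(occurs(s, boxes, spacers, max_mismatches) for s in sequences)


if __name__ == "__main__":
    boxes = ["TTGACA", "TATAAT"]
    spacers = [(16, 18)]
    sequences = [
        "AA" + "TTGACA" + "G" * 17 + "TATAAT" + "CC",  # exact, spacer 17
        "TTGACA" + "C" * 10 + "TATAAT",                # spacer too short
        "TTGACT" + "G" * 16 + "TATAAT",                # one mismatch in box 1
    ]
    print(quorum(sequences, boxes, spacers, max_mismatches=0))
    print(quorum(sequences, boxes, spacers, max_mismatches=1))
```

This check scans every start position of every sequence for one candidate motif, so it is only practical for verifying a motif that is already known. The extraction problem addressed in the paper is harder: the motifs are not given in advance and must be found among all candidates, which is where the suffix tree comes in.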
key words: motif extraction, structured motif,
promoter, consensus, model, suffix tree