Searching for repeated words in a
text allowing for mismatches and gaps
Marie-France Sagot, Vincent Escalier, Alain Viari and Henri
Soldano
in Second South American Workshop on String Processing,
Viña del Mar, Chile
pages 87-100, Proceedings University of Chile,
1995
We present in this paper an algorithm that locates similar words
common to a set of strings defined over an alphabet Sigma,
where the similarity is stated in terms of a Levenshtein edit
distance. Words in the strings are compared by means of
a reference object called a model, which is a word over
Sigma. This allows us to perform a multiple comparison of the
strings as opposed to pairwise comparisons, and the algorithm is
particularly appropriate for the analysis of DNA/RNA sequences.
Key words: multiple comparison, Levenshtein edit distance,
model
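
As an illustration of the model idea described in the abstract, the following minimal sketch tests whether one candidate model has an approximate occurrence in every string of a set. It is not the algorithm of the paper, which is not reproduced here; it is a brute-force check for a single model, using the standard dynamic-programming recurrence for approximate substring matching. The function names, the `max_dist` threshold and the optional `quorum` parameter are assumptions introduced for this example.

```python
from typing import Optional, Sequence


def best_match_distance(model: str, text: str) -> int:
    """Smallest Levenshtein distance between `model` and any substring of `text`."""
    # Row 0 is all zeros, so an occurrence may start at any position of `text`.
    prev = [0] * (len(text) + 1)
    for i, mc in enumerate(model, start=1):
        curr = [i]  # matching model[:i] against the empty word costs i deletions
        for j, tc in enumerate(text, start=1):
            cost = 0 if mc == tc else 1
            curr.append(min(prev[j - 1] + cost,  # match or substitution
                            prev[j] + 1,         # model symbol left unmatched
                            curr[j - 1] + 1))    # extra symbol in the text
        prev = curr
    # The occurrence may end at any position of `text`.
    return min(prev)


def model_is_common(model: str, strings: Sequence[str], max_dist: int,
                    quorum: Optional[int] = None) -> bool:
    """True if `model` occurs within `max_dist` edits in at least `quorum` strings.

    `quorum` is a hypothetical parameter; by default every string must match.
    """
    needed = len(strings) if quorum is None else quorum
    hits = sum(1 for s in strings if best_match_distance(model, s) <= max_dist)
    return hits >= needed


if __name__ == "__main__":
    seqs = ["TTACGTTT", "TTAGTTT", "TTACCGTT"]
    print(model_is_common("ACGT", seqs, max_dist=1))
```

Because every string is compared to the same model rather than to every other string, the cost of this check grows linearly with the number of strings, which is the practical appeal of a multiple comparison over pairwise ones.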