Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
426591 | Information and Computation | 2012 | 7 Pages | |
Abstract
We reconsider the well-known problem of pattern matching under the Hamming distance. Previous approaches have shown how to count the number of mismatches efficiently, especially when a bound is known for the maximum Hamming distance. Our interest is different in that we wish to collect a random sample of mismatches of fixed size at each position in the text. Given a pattern p of length m and a text t of length n, we show how to sample with high probability up to c mismatches from every alignment of p and t in time. Further, we guarantee that the mismatches are sampled uniformly and can therefore be seen as representative of the types of mismatches that occur.
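To make the problem statement concrete, here is a naive baseline that does what the abstract asks for: at every alignment of p against t, it draws up to c mismatch positions uniformly at random. It is only an illustrative sketch, not the algorithm from the paper. It compares every character pair directly (roughly O(nm) time) and uses standard reservoir sampling. The function name `sample_mismatches` and the `rng` parameter are assumptions introduced for this example.

```python
import random


def sample_mismatches(p, t, c, rng=None):
    """Naive baseline (hypothetical helper, not the paper's method).

    For each alignment i of pattern p against text t, return a list of up
    to c mismatch offsets j (where p[j] != t[i + j]), chosen uniformly at
    random without replacement via reservoir sampling.
    """
    rng = rng or random.Random()
    m, n = len(p), len(t)
    samples = []
    for i in range(n - m + 1):
        reservoir = []  # holds at most c sampled mismatch offsets
        seen = 0        # number of mismatches seen so far at this alignment
        for j in range(m):
            if p[j] != t[i + j]:
                seen += 1
                if len(reservoir) < c:
                    # Fewer than c mismatches so far: keep every one.
                    reservoir.append(j)
                else:
                    # Keep the new mismatch with probability c / seen,
                    # evicting a uniformly chosen earlier sample.
                    k = rng.randrange(seen)
                    if k < c:
                        reservoir[k] = j
        samples.append(reservoir)
    return samples
```

For example, `sample_mismatches("abc", "abcabd", 2)` returns one list per alignment. Alignments with at most two mismatches report all of them, while alignments with more report a uniform random pair. The point of the paper is to obtain this kind of sample far faster than character-by-character comparison allows.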
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics