کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
427917 | 686576 | 2011 | 4 صفحه PDF | دانلود رایگان |
It has long been known that pattern matching in the Hamming distance metric can be done in O(min(|Σ|,m/logm)nlogm) time, where n is the length of the text, m is the length of the pattern, and Σ is the alphabet. The classic algorithm for this is due to Abrahamson and Kosaraju. This paper considers the following generalization, motivated by the situation where the entries in the text and pattern are analog, or distorted by additive noise, or imprecisely given for some other reason: in any alignment of the pattern with the text, two aligned symbols a and b contribute +1 to the similarity score if they differ by no more than a given threshold θ , otherwise they contribute zero. We give an O(min(|Σ|,m/logm)nlogm) time algorithm for this more general version of the problem; the classic Hamming distance matching problem is the special case of θ=0θ=0.
► We consider the problem of pattern matching for Hamming distance with thresholds.
► Two symbol are considered to match only if they differ by no more than a threshold.
► We compute the score vector for every alignment of the pattern with the text.
► Framework is useful when inputs are imprecisely given, e.g., from noisy sources.
Journal: Information Processing Letters - Volume 111, Issue 14, 31 July 2011, Pages 674–677