کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
431033 | 688255 | 2011 | 7 صفحه PDF | دانلود رایگان |

This paper revisits the problem of indexing a text S[1..n]S[1..n] for pattern matching with up to k errors. A naive solution either has a worst-case matching time complexity of Ω(mk)Ω(mk) or requires Ω(nk)Ω(nk) space, where m is the length of the pattern. Devising a solution with better performance has been a challenge until Cole et al. (2004) [5] showed an O(nlogkn)-space index that can support k -error matching in O(m+occ+logknloglogn) time, where occ is the number of occurrences. Motivated by the indexing of long sequences like DNA, we have investigated the feasibility of devising a linear-size index that still has a time complexity linear in pattern length. This paper in particular presents an O(n)O(n)-space index that supports k -error matching in O(m+occ+(logn)k(k+1)loglogn) worst-case time. This index can be further compressed from O(n)O(n) words into O(n)O(n) bits with a slight increase in the time complexity.
Journal: Journal of Discrete Algorithms - Volume 9, Issue 4, December 2011, Pages 358–364