Article ID Journal Published Year Pages File Type
430742 Journal of Computer and System Sciences 2010 10 Pages PDF
Abstract

We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern p of length m with don't care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give a Θ(n(k+logmlogk)logn) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic Θ(nk2log2m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further we develop an approach based on k-selectors that runs in Θ(nkpolylogm) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics