Article ID Journal Published Year Pages File Type
430311 Journal of Computer and System Sciences 2012 15 Pages PDF
Abstract

Given a set of n strings of length L and a radius d, the closest string problem (CSP for short) asks for a string tsol that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). Parameterized algorithms have been then developed to solve the problem when d is small. In this paper, with a new approach (called the 3-string approach), we first design a parameterized algorithm for binary strings that runs in O(nL+nd3⋅d6.731) time, while the previous best runs in O(nL+nd⋅d8) time. We then extend the algorithm to arbitrary alphabet sizes, obtaining an algorithm that runs in time O(nL+nd⋅(1.612d(|Σ|+β2+β−2))), where |Σ| is the alphabet size and β=α2+1−2α−1+α−2 with . This new time bound is better than the previous best for small alphabets, including the very important case where |Σ|=4 (i.e., the case of DNA strings).

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics