Article ID Journal Published Year Pages File Type
430615 Journal of Discrete Algorithms 2010 12 Pages PDF
Abstract

A gapped pattern is a sequence consisting of regular alphabet symbols and of joker symbols that match any alphabet symbol. The content of a gapped pattern is defined as the number of its non-joker symbols. A gapped motif is a gapped pattern that occurs repeatedly in a string or in a set of strings. The aim of this paper is to study the complexity of several gapped-motif-finding problems. The following three decision problems are shown NP-complete, even if the input alphabet is binary. (i) Given a string T and two integers c and q, decide whether or not there exists a gapped pattern with content c (or more) that occurs in T at q   distinct positions (or more). (ii) Given a set of strings SS and an integer c, decide whether or not there exists a gapped pattern with content c   that occurs at least once in each string of SS. (iii) Given m strings with the same length, and two integers c and q, decide whether or not there exists a gapped pattern with content c that matches at least q input strings. We also present a non-naive quadratic-time algorithm that solves the following optimization problem: given a string T   and an integer q⩾1q⩾1, compute a maximum-content gapped pattern Q such that q consecutive copies of Q occur in T.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , ,