Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
431149 | Journal of Discrete Algorithms | 2007 | 14 Pages |
In this paper we present three algorithms for the Motif Identification Problem in Biological Weighted Sequences. The first algorithm extracts repeated motifs from a biological weighted sequence. The motifs correspond to repetitive words which are approximately equal, under a Hamming distance, with probability of occurrence ⩾ 1/k, where k is a small constant. The second algorithm extracts common motifs from a set of N ⩾ 2 weighted sequences. In this case, the motifs consist of words that must occur with probability ⩾ 1/k in 1 ⩽ q ⩽ N distinct sequences of the set. The third algorithm extracts maximal pairs from a biological weighted sequence. A pair in a sequence is the occurrence of the same word twice. In addition, the algorithms presented in this paper improve previous work on these problems.
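
To make the 1/k threshold concrete, the following is a minimal Python sketch and not the paper's algorithms. It assumes the usual model of a weighted sequence, in which each position carries a probability distribution over the alphabet and the probability that a word occurs at a position is the product of its symbols' probabilities there. The abstract does not spell this model out, so treat it as an assumption. The names `weighted_seq`, `occurrence_probability`, `valid_occurrences` and `hamming`, as well as the probabilities themselves, are hypothetical and chosen only for illustration.

```python
from math import prod

# Hypothetical weighted sequence: one dict per position, symbol -> probability.
# The numbers are made up for illustration only.
weighted_seq = [
    {"A": 1.0},
    {"C": 0.5, "G": 0.5},
    {"A": 0.25, "T": 0.75},
    {"A": 1.0},
    {"C": 1.0},
    {"T": 1.0},
]


def occurrence_probability(seq, word, start):
    """Probability that `word` occurs at `start` (assumed product model)."""
    if start < 0 or start + len(word) > len(seq):
        return 0.0
    return prod(seq[start + i].get(ch, 0.0) for i, ch in enumerate(word))


def valid_occurrences(seq, word, k):
    """Start positions where `word` occurs with probability >= 1/k."""
    threshold = 1.0 / k
    return [
        s
        for s in range(len(seq) - len(word) + 1)
        if occurrence_probability(seq, word, s) >= threshold
    ]


def hamming(u, v):
    """Hamming distance between two words of equal length."""
    if len(u) != len(v):
        raise ValueError("words must have equal length")
    return sum(a != b for a, b in zip(u, v))
```

Under these assumptions, the word `ACT` occurs at position 0 with probability 1.0 × 0.5 × 0.75 = 0.375, so it counts as an occurrence for k = 4 (threshold 0.25) but not for k = 2 (threshold 0.5). This brute-force scan only illustrates the definitions and says nothing about the efficiency of the algorithms in the paper.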