Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
435392 | Theoretical Computer Science | 2009 | 9 Pages |
The problems of finding maximal and minimal equivalent representations for gapped and non-gapped motifs as well as finding motifs that characterize a fixed set of occurrence locations for a given string are studied. We apply two equivalence relations on representations. The first one is the well-known occurrence-equivalence of motifs. The second equivalence is introduced for patterns of occurrence locations, to characterize such patterns by motifs. For both equivalences, quadratic-time algorithms are given for finding a maximal representative of an equivalence class. Finding a minimal representative is shown to be NP-complete in both cases. For non-gapped motifs suffix-tree-based linear-time algorithms are given for finding maximal and minimal representatives. Maximal (minimal) gapped motifs are composed of blocks that are maximal (minimal) non-gapped motifs, maximal and minimal non-gapped motifs thus making up a small basis for all motifs. The implied bound on the number of gapped motifs that have a fixed number of non-gapped blocks is also given.