Article ID Journal Published Year Pages File Type
429260 Information Processing Letters 2006 7 Pages PDF
Abstract

Finding the longest common subsequence (LCS) of two given sequences A=a0a1…am−1 and B=b0b1…bn−1 is an important and well studied problem. We consider its generalization, transposition-invariant LCS (LCTS), which has recently arisen in the field of music information retrieval. In LCTS, we look for the LCS between the sequences A+t=(a0+t)(a1+t)…(am−1+t) and B where t is any integer. We introduce a family of algorithms (motivated by the Hunt–Szymanski scheme for LCS), improving the currently best known complexity from O(mnloglogσ) to O(Dloglogσ+mn), where σ is the alphabet size and D⩽mn is the total number of dominant matches for all transpositions. Then, we demonstrate experimentally that some of our algorithms outperform the best ones from literature.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics