Article ID Journal Published Year Pages File Type
15214 Computational Biology and Chemistry 2010 9 Pages PDF
Abstract

The problem of finding the longest common subsequence (LCS) for an arbitrary number of sequences is a very interesting and challenging problem in computer science. This problem is NP-complete, but because of its importance, many heuristic algorithms have been proposed, such as Long Run, Expansion Algorithm and THSB. However, the performance, either in result quality or in process time, of many current heuristic algorithms deteriorates fast when the number of sequences and sequence length increase. In this paper, we have proposed a post-process heuristic algorithm for the LCS problem, the Deposition and Extension Algorithm (DEA). This algorithm first generates common subsequence by “sequence deposition” based on fine tuning of search range, and then extends this common subsequence. The algorithm is proven to generate Common Subsequences (CSs) with guaranteed lengths. The experiments on different dataset showed that the results of DEA algorithm were better than those of Long Run and Expansion Algorithm, especially on many long sequences. The algorithm also had superior efficiency both in time and memory space.

Related Topics
Physical Sciences and Engineering Chemical Engineering Bioengineering
Authors
,