An efficient algorithm to detect common ancestor genes for non-overlapping inversion and applications

Article ID	Journal	Published Year	Pages	File Type
4952352	Theoretical Computer Science	2016	27 Pages	PDF

Abstract

In this paper, an algorithm is proposed that detects the existence of common ancestor gene sequences for non-overlapping inversion (reversed complement) metric given two input DNA sequences. Theoretical worst case running time complexity of the algorithm is proven to be O(n4), where n is the length of each input sequence. However, by experiment, the running time complexity is found to be O(n3) for the worst case and O(n2) for average case. Moreover, the worst case occurs when both input sequences have the similarity of around 90% which is very rare. This work is motivated by the purpose of diagnosing an unknown genetic disease that shows allelic heterogeneity, a case where a normal gene mutates in different orders resulting in two different gene sequences causing two different genetic diseases. Our algorithm can potentially save huge energy and cost of the existing diagnostic approaches. The algorithm can be useful as well in the study of breed-related hereditary conditions to determine the genetic spread of a defective gene in the population.

Keywords

Iterative algorithm Inversion Genetic disease Matrix Allelic heterogeneity