Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
429054 | Information Processing Letters | 2011 | 5 Pages |
We present an efficient algorithm for finding all approximate occurrences of a given pattern p of length m in a text t of length n allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an O(nmmax(α,β))O(nmmax(α,β))-time complexity in the worst case and O(max(α,β,σ))O(max(α,β,σ))-space complexity, where α and β are respectively the maximum length of the factors involved in any translocation and inversion, and σ is the alphabet size. Moreover we show that our algorithm has an O(n)O(n) average time complexity, whenever σ=Ω(logm/loglog1−εm), for ε>0ε>0. Experiments show that the proposed algorithm achieves very good results in practical cases.
► New algorithm for approximate matching with inversions and translocations. ► Application in DNA analysis, due to chromosomal rearrangements. ► Counting filter to detect pattern permutations, verified with another algorithm. ► Experimental results: about 4–25×4–25× faster than its predecessor. ► Linear time on average if only the alphabet is not very small.