Article ID Journal Published Year Pages File Type
6875600 Theoretical Computer Science 2018 25 Pages PDF
Abstract
Comparative genomics is a line of study which aims at determining dis/similarities between genomes. One way of doing this is to infer the large scale mutations (also called genome rearrangements) that are responsible for transforming one genome into another, under the parsimony principle. Numerous efforts have been made over the past years regarding the traditional approach, in which all genome rearrangements have the same unit cost. However, since long rearrangements are more likely to disturb the organism, some works considered the length-weighted approach, in which the cost of each rearrangement is a function of its length. The problem is thus modeled as follows: given two genomes G1 and G2 and a set β of allowed genome rearrangements, each with a given cost, determine their distance, i.e., the total cost of a minimum-cost sequence of rearrangements from β that transforms G1 into G2. In this paper, we study length-weighted reversals and transpositions, mainly considering their prefix variants (in which the leftmost part of the genome must always be included). We present approximation algorithms and bounds on the distances and diameters (i.e., maximum possible distance over any two genomes of same length) for eight problems when genomes are represented as permutations, and another eight problems when genomes are represented as binary strings, in each case considering two different cost functions.
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, , ,