Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6875600 | Theoretical Computer Science | 2018 | 25 Pages |
Abstract
Comparative genomics is a line of study which aims at determining dis/similarities between genomes. One way of doing this is to infer the large scale mutations (also called genome rearrangements) that are responsible for transforming one genome into another, under the parsimony principle. Numerous efforts have been made over the past years regarding the traditional approach, in which all genome rearrangements have the same unit cost. However, since long rearrangements are more likely to disturb the organism, some works considered the length-weighted approach, in which the cost of each rearrangement is a function of its length. The problem is thus modeled as follows: given two genomes G1 and G2 and a set β of allowed genome rearrangements, each with a given cost, determine their distance, i.e., the total cost of a minimum-cost sequence of rearrangements from β that transforms G1 into G2. In this paper, we study length-weighted reversals and transpositions, mainly considering their prefix variants (in which the leftmost part of the genome must always be included). We present approximation algorithms and bounds on the distances and diameters (i.e., maximum possible distance over any two genomes of same length) for eight problems when genomes are represented as permutations, and another eight problems when genomes are represented as binary strings, in each case considering two different cost functions.
Related Topics
Physical Sciences and Engineering
Computer Science
Computational Theory and Mathematics
Authors
Carla Negri Lintzmayer, Guillaume Fertin, Zanoni Dias,