ABSTRACT
During the evolutionary process, genomes are affected by various genome rearrangements, that is, events that modify large stretches of the genetic material. In the literature, a large number of models have been proposed to estimate the number of events that occurred during evolution; most of them represent a genome as an ordered sequence of genes, and, in particular, disregard the genetic material between consecutive genes. However, recent studies showed that taking into account the genetic material between consecutive genes can enhance evolutionary distance estimations. Reversal and transposition are genome rearrangements that have been widely studied in the literature. A reversal inverts a (contiguous) segment of the genome, while a transposition swaps the positions of two consecutive segments. Genomes also undergo nonconservative events (events that alter the amount of genetic material) such as insertions and deletions, in which genetic material from intergenic regions of the genome is inserted or deleted, respectively. In this article, we study a genome rearrangement model that considers both gene order and sizes of intergenic regions. We investigate the reversal distance, and also the reversal and transposition distance between two genomes in two scenarios: with and without nonconservative events. We show that these problems are NP-hard and we present constant ratio approximation algorithms for all of them. More precisely, we provide a 4-approximation algorithm for the reversal distance, both in the conservative and nonconservative versions. For the reversal and transposition distance, we provide a 4.5-approximation algorithm, both in the conservative and nonconservative versions. We also perform experimental tests to verify the behavior of our algorithms, as well as to compare the practical and theoretical results. 
Finally, we extend our study to scenarios in which events have different costs, and we present constant ratio approximation algorithms for each scenario.
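The two conservative events defined above can be illustrated on gene order alone. The following is a minimal sketch, not the model studied in the article: it assumes a genome is a plain Python list of gene labels, it omits the intergenic region sizes, and the function names and index conventions are hypothetical.

```python
def reversal(genome, i, j):
    """Invert the contiguous segment genome[i..j] (0-based, inclusive)."""
    return genome[:i] + genome[i:j + 1][::-1] + genome[j + 1:]


def transposition(genome, i, j, k):
    """Swap the adjacent segments genome[i:j] and genome[j:k] (0-based, half-open)."""
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("indices must satisfy 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


genome = [1, 2, 3, 4, 5, 6]

# Reversal of the segment 2 3 4 gives 1 4 3 2 5 6.
after_reversal = reversal(genome, 1, 3)

# Transposition of the segments 2 3 and 4 5 gives 1 4 5 2 3 6.
after_transposition = transposition(genome, 1, 3, 5)

print(after_reversal)
print(after_transposition)
```

Both operations only move genes around, so the list length is unchanged; this is the sense in which they are conservative. Insertions and deletions of intergenic material would need an explicit representation of intergenic sizes, which this sketch leaves out.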
ABSTRACT
In comparative genomics, rearrangements are mutations that affect a stretch of DNA sequence. Reversals and transpositions are well-known rearrangements, and each has a vast literature. The reversal and transposition distance, that is, the minimum number of reversals and transpositions needed to transform one genome into another, is a relevant evolutionary distance. The problem of computing this distance when genomes are represented by permutations was proposed more than 20 years ago and is known as the Sorting by Reversals and Transpositions problem. It has been the focus of a number of studies, but its computational complexity has remained open until now. We settle this question and prove that the problem is NP-hard, whether genomes are represented by signed or unsigned permutations. In addition, we prove that a common generalization of this problem, which assigns weight wρ to reversals and weight wτ to transpositions, is also NP-hard as long as wτ/wρ ≤ 1.5, for both signed and unsigned permutations.
Subject(s)
Base Sequence/genetics, Computational Biology/methods, Genomics/methods, Algorithms, Gene Rearrangement, Genome/genetics, Mutation/genetics
ABSTRACT
Genome rearrangements are global mutations that change large stretches of DNA sequence throughout genomes. They are rare but accumulate during the evolutionary process, leading to organisms with similar genetic material in different places and orientations within the genome. Sorting by Genome Rearrangements problems seek minimum-length sequences of rearrangements that transform one genome into another. These problems admit weighted variants that assign a weight to each event, in which the goal is to find a minimum-weight sequence. We study the Sorting by Weighted Reversals and Transpositions problem on signed permutations. In this study, we use weight 2 for reversals and weight 3 for transpositions, and we consider both theoretical and practical aspects in our analysis. We present two algorithms with approximation factors of 5/3 and 3/2. We also develop a generic approximation algorithm to deal with different weights for reversals and transpositions, and we show the approximation factor reached in each scenario.
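To make the minimum-weight objective concrete, the sketch below computes the exact optimum for very small signed permutations by running Dijkstra's algorithm over the space of permutations. It is an illustrative brute-force baseline, not one of the approximation algorithms from the study: the function names are hypothetical, the default weights (2 for reversals, 3 for transpositions) follow the abstract, and the search is only practical for a handful of elements because the state space has 2^n · n! permutations.

```python
import heapq


def signed_reversal(p, i, j):
    """Reverse p[i..j] (0-based, inclusive) and flip the sign of each element."""
    return p[:i] + tuple(-x for x in reversed(p[i:j + 1])) + p[j + 1:]


def transposition(p, i, j, k):
    """Swap the adjacent segments p[i:j] and p[j:k] (0-based, half-open)."""
    return p[:i] + p[j:k] + p[i:j] + p[k:]


def neighbors(p, w_rev, w_trans):
    """Yield (permutation, weight) for every single reversal or transposition."""
    n = len(p)
    for i in range(n):
        for j in range(i, n):
            yield signed_reversal(p, i, j), w_rev
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n + 1):
                yield transposition(p, i, j, k), w_trans


def min_weight_sort(perm, w_rev=2, w_trans=3):
    """Minimum total weight needed to turn perm into (+1, +2, ..., +n)."""
    start = tuple(perm)
    target = tuple(range(1, len(perm) + 1))
    best = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, p = heapq.heappop(heap)
        if p == target:
            return cost
        if cost > best[p]:
            continue  # stale heap entry
        for q, w in neighbors(p, w_rev, w_trans):
            new_cost = cost + w
            if new_cost < best.get(q, float("inf")):
                best[q] = new_cost
                heapq.heappush(heap, (new_cost, q))
    return None


print(min_weight_sort([-2, -1]))   # one reversal of the whole permutation: 2
print(min_weight_sort([3, 1, 2]))  # one transposition: 3
```

On inputs this small, an exact value like this can serve as a reference point when checking how far an approximation algorithm's answer is from the optimum.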