Approximation algorithm for rearrangement distances considering repeated genes and intergenic regions.
Algorithms Mol Biol; 16(1): 21, 2021 Oct 13.
Article in En | MEDLINE | ID: mdl-34645469
The rearrangement distance is a measure used to compare genomes of different species: it is the minimum number of rearrangement events needed to transform one genome into another. Two commonly studied events are the transposition, which exchanges two consecutive blocks of the genome, and the reversal, which reverses a block of the genome. Seminal works on these problems represented genomes as sequences of genes without repetition. More realistic models later considered gene repetition or the presence of intergenic regions, which are sequences of nucleotides between genes and at the extremities of the genome. This work explores transposition and reversal events applied to a genome representation that considers both gene repetition and intergenic regions. We define two problems, Minimum Common Intergenic String Partition and Reverse Minimum Common Intergenic String Partition. Using a relation with these two problems, we show a [Formula: see text]-approximation for the Intergenic Transposition Distance, the Intergenic Reversal Distance, and the Intergenic Reversal and Transposition Distance problems, where k is the maximum number of copies of a gene in the genomes. Our practical experiments on simulated genomes show that the use of partitions improves the estimates for the distances.
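The two events described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: it assumes a genome is a plain list of gene labels (repeats allowed), it ignores intergenic regions and gene orientation entirely, and the function names `transposition` and `reversal` and the index conventions are hypothetical choices made for illustration.

```python
def transposition(genome, i, j, k):
    """Exchange the two consecutive blocks genome[i:j] and genome[j:k].

    Simplified model (assumption): genes only, no intergenic regions.
    """
    if not 0 <= i < j < k <= len(genome):
        raise ValueError("need 0 <= i < j < k <= len(genome)")
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


def reversal(genome, i, j):
    """Reverse the order of the genes in the block genome[i:j].

    Simplified model (assumption): unsigned genes, no intergenic regions.
    """
    if not 0 <= i < j <= len(genome):
        raise ValueError("need 0 <= i < j <= len(genome)")
    return genome[:i] + genome[i:j][::-1] + genome[j:]


# Toy genome in which gene "A" appears twice (a repeated gene).
genome = ["A", "B", "A", "C", "D"]

# Swap the adjacent blocks [B, A] and [C, D].
after_transposition = transposition(genome, 1, 3, 5)

# Reverse the block [B, A, C].
after_reversal = reversal(genome, 1, 4)

print(after_transposition)
print(after_reversal)
```

Both operations only reorder genes, so the multiset of gene labels is unchanged. A model with intergenic regions would additionally have to track how the nucleotides around each cut point are redistributed, which this sketch leaves out.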
Type of study:
Prognostic_studies
Affiliation country:
Brazil
Country of publication:
United Kingdom