Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 2 de 2
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
J Comput Biol ; 29(9): 961-973, 2022 09.
Artigo em Inglês | MEDLINE | ID: mdl-35638936

RESUMO

Reference-based scaffolding is an important process used in genomic sequencing to order and orient the contigs in a draft genome based on a reference genome. In this study, we utilize the concept of genome rearrangement to formulate this process as an exemplar breakpoint distance (EBD)-based scaffolding problem, whose aim is to scaffold the contigs of two given draft genomes, both containing duplicate genes (or sequence markers) and acting with each other as a reference, such that the EBD between the scaffolded genomes is minimized. The EBD-based scaffolding problem is difficult to solve because it is non-deterministic polynomial-time (NP)-hard. In this work, we design an integer linear programming (ILP)-based algorithm to exactly solve the EBD-based scaffolding problem. Our experimental results on both simulated and biological data sets show that our ILP-based scaffolding algorithm can accurately and efficiently use a reference genome to scaffold the contigs of a draft genome. Moreover, our ILP-based scaffolding algorithm with considering duplicate genes indeed has better accuracy performance than that without considering duplicate genes, suggesting that duplicate genes and their exemplars are helpful for the application of genome rearrangement in the study of the reference-based scaffolding problem. When compared with RaGOO, a current state-of-the-art alignment-based scaffolder, our ILP-based scaffolding algorithm still has better accuracy performance on the biological data sets.


Assuntos
Genoma , Programação Linear , Algoritmos , Sequência de Bases , Genes Duplicados , Genoma/genética , Análise de Sequência de DNA/métodos
2.
Bioinformatics ; 34(1): 109-111, 2018 01 01.
Artigo em Inglês | MEDLINE | ID: mdl-28968788

RESUMO

Summary: Advances in next generation sequencing have generated massive amounts of short reads. However, assembling genome sequences from short reads still remains a challenging task. Due to errors in reads and large repeats in the genome, many of current assembly tools usually produce just collections of contigs whose relative positions and orientations along the genome being sequenced are still unknown. To address this issue, a scaffolding process to order and orient the contigs of a draft genome is needed for completing the genome sequence. In this work, we propose a new scaffolding tool called CSAR that can efficiently and more accurately order and orient the contigs of a given draft genome based on a reference genome of a related organism. In particular, the reference genome required by CSAR is not necessary to be complete in sequence. Our experimental results on real datasets have shown that CSAR outperforms other similar tools such as Projector2, OSLay and Mauve Aligner in terms of average sensitivity, precision, F-score, genome coverage, NGA50 and running time. Availability and implementation: The program of CSAR can be downloaded from https://github.com/ablab-nthu/CSAR. Contact: hchiu@mail.ncku.edu.tw or cllu@cs.nthu.edu.tw. Supplementary information: Supplementary data are available at Bioinformatics online.


Assuntos
Mapeamento de Sequências Contíguas/métodos , Sequenciamento de Nucleotídeos em Larga Escala/métodos , Análise de Sequência de DNA/métodos , Software , Algoritmos , Bactérias/genética , Genoma , Genômica/métodos , Humanos
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...