ABSTRACT
DNA sequencing has evolved rapidly with the development of new technologies and equipment capable of producing large amounts of sequencing data. Among these methods, PacBio stands out: it uses single-molecule real-time sequencing and generates sequence files composed of long reads. Storing and analyzing the resulting data became a challenge, ushering in the development of bioinformatic tools. One part of this challenge is the alignment of these sequences. This article describes techniques and processes developed for long DNA sequence alignment on manycore architectures.
Subject(s)
Computational Biology/trends; DNA/genetics; Sequence Alignment/methods; Software; Algorithms; Base Sequence; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
ABSTRACT
The development of next-generation sequencing platforms has substantially increased the capacity for data generation. In addition, the cost of whole-genome sequencing has fallen in recent years, making the technology more accessible. As a result, storing and analyzing the generated data became a challenge, ushering in the development of bioinformatic tools, such as programs and programming languages, able to store, process, and analyze this huge amount of information. In this article, we present MELC genomics, a framework for genome assembly with a simple and fast workflow.