ABSTRACT
Summary: AC-DIAMOND (v1) is a DNA-protein alignment tool designed to tackle the efficiency challenge of aligning large numbers of reads or contigs to protein databases. Compared with DIAMOND, previously the most efficient method, AC-DIAMOND achieves a 6- to 7-fold speed-up while retaining a similar degree of sensitivity. The improvement is rooted in two aspects: first, using a compressed index of adaptive-length seeds to speed up the matching between query and reference sequences; second, adopting a compact form of dynamic programming to fully utilize SIMD parallelism. Availability and implementation: Source code and binaries are available at https://github.com/Maihj/AC-DIAMOND/. Supplementary information: Supplementary data are available at Bioinformatics online.
Subject(s)
Software; DNA; Databases, Protein; Proteins; Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Recent advances in whole-genome alignment software have made it possible to align two genomes very efficiently with only a small sacrifice in sensitivity. However, such software becomes very slow when extra sensitivity is needed. This paper proposes a simple but effective method to improve the sensitivity of existing whole-genome alignment software without much extra running time. RESULTS AND CONCLUSIONS: We applied our method to the popular whole-genome alignment tool LAST and called the resulting tool LASTM. Experimental results showed that LASTM could find more high-quality alignments with only a little extra running time. For example, when comparing the human and mouse genomes, LASTM was about three times faster than LAST at producing a similar number of alignments with similar average length and similarity. We conclude that our method improves sensitivity at a small cost in running time and is therefore worth implementing in existing tools.