Search | VHL Regional Portal

Sensitive and error-tolerant annotation of protein-coding DNA with BATH.

Krause, Genevieve R; Shands, Walt; Wheeler, Travis J.

Bioinform Adv ; 4(1): vbae088, 2024.

Article in English | MEDLINE | ID: mdl-38966592

ABSTRACT

Summary: We present BATH, a tool for highly sensitive annotation of protein-coding DNA based on direct alignment of that DNA to a database of protein sequences or profile hidden Markov models (pHMMs). BATH is built on top of the HMMER3 code base, and simplifies the annotation workflow for pHMM-based translated sequence annotation by providing a straightforward input interface and easy-to-interpret output. BATH also introduces novel frameshift-aware algorithms to detect frameshift-inducing nucleotide insertions and deletions (indels). BATH matches the accuracy of HMMER3 for annotation of sequences containing no errors, and produces superior accuracy to all tested tools for annotation of sequences containing nucleotide indels. These results suggest that BATH should be used when high annotation sensitivity is required, particularly when frameshift errors are expected to interrupt protein-coding regions, as is true with long-read sequencing data and in the context of pseudogenes. Availability and implementation: The software is available at https://github.com/TravisWheelerLab/BATH.

Sensitive and error-tolerant annotation of protein-coding DNA with BATH.

Krause, Genevieve R; Shands, Walt; Wheeler, Travis J.

bioRxiv ; 2024 Jan 01.

Article in English | MEDLINE | ID: mdl-38260252

ABSTRACT

We present BATH, a tool for highly sensitive annotation of protein-coding DNA based on direct alignment of that DNA to a database of protein sequences or profile hidden Markov models (pHMMs). BATH is built on top of the HMMER3 code base, and simplifies the annotation workflow for pHMM-based annotation by providing a straightforward input interface and easy-to-interpret output. BATH also introduces novel frameshift-aware algorithms to detect frameshift-inducing nucleotide insertions and deletions (indels). BATH matches the accuracy of HMMER3 for annotation of sequences containing no errors, and produces superior accuracy to all tested tools for annotation of sequences containing nucleotide indels. These results suggest that BATH should be used when high annotation sensitivity is required, particularly when frameshift errors are expected to interrupt protein-coding regions, as is true with long-read sequencing data and in the context of pseudogenes.

The Dockstore: enhancing a community platform for sharing reproducible and accessible computational protocols.

Yuen, Denis; Cabansay, Louise; Duncan, Andrew; Luu, Gary; Hogue, Gregory; Overbeck, Charles; Perez, Natalie; Shands, Walt; Steinberg, David; Reid, Chaz; Olunwa, Nneka; Hansen, Richard; Sheets, Elizabeth; O'Farrell, Ash; Cullion, Kim; O'Connor, Brian D; Paten, Benedict; Stein, Lincoln.

Nucleic Acids Res ; 49(W1): W624-W632, 2021 07 02.

Article in English | MEDLINE | ID: mdl-33978761

ABSTRACT

Dockstore (https://dockstore.org/) is an open-source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute's GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.

Subject(s)

Computational Biology/methods; Information Dissemination; Internet; Software; Workflow; Cloud Computing; Computational Biology/education; Data Visualization; Humans; National Heart, Lung, and Blood Institute (U.S.); National Human Genome Research Institute (U.S.); Reproducibility of Results; United States
