Proc Natl Acad Sci U S A; 119(35): e2122636119, 2022 Aug 30.
Article in English | MEDLINE | ID: mdl-36018838

ABSTRACT

Taxonomic classification, that is, the assignment to biological clades with shared ancestry, is a common task in genetics, mainly based on a genome similarity search of large genome databases. The classification quality depends heavily on the database, since representative relatives must be present. Many genomic sequences cannot be classified at all or only with a high misclassification rate. Here we present BERTax, a deep neural network program based on natural language processing to precisely classify the superkingdom and phylum of DNA sequences taxonomically without the need for a known representative relative from a database. We show BERTax to be at least on par with the state-of-the-art approaches when taxonomically similar species are part of the training data. For novel organisms, however, BERTax clearly outperforms any existing approach. Finally, we show that BERTax can also be combined with database approaches to further increase the prediction quality in almost all cases. Since BERTax is not based on similar entries in databases, it allows precise taxonomic classification of a broader range of genomic sequences, thus increasing the overall information gain.
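
The abstract describes BERTax as applying natural language processing to DNA sequences. One common way to make a nucleotide sequence look like text to such a model is to split it into fixed-length k-mer "words" and map each word to an integer ID. The sketch below illustrates that general idea only. The k-mer length, the vocabulary layout, the special-token names, and the function names `build_vocab`, `tokenize`, and `encode` are assumptions made for illustration; none of them is taken from the abstract, and this should not be read as BERTax's actual preprocessing.

```python
from itertools import product

ALPHABET = "ACGT"
# Hypothetical special tokens, chosen for illustration only.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[CLS]"]


def build_vocab(k: int = 3) -> dict[str, int]:
    """Map every special token and every possible k-mer to an integer ID."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return {tok: i for i, tok in enumerate(SPECIAL_TOKENS + kmers)}


def tokenize(seq: str, k: int = 3) -> list[str]:
    """Split a sequence into non-overlapping k-mers, dropping a short tail."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]


def encode(seq: str, vocab: dict[str, int], k: int = 3) -> list[int]:
    """Convert a sequence to token IDs, prefixed with [CLS].

    k-mers containing symbols outside ACGT (for example N) map to [UNK].
    """
    unk = vocab["[UNK]"]
    return [vocab["[CLS]"]] + [vocab.get(tok, unk) for tok in tokenize(seq, k)]


if __name__ == "__main__":
    vocab = build_vocab(k=3)
    print(tokenize("ACGTNNGATTA"))
    print(encode("ACGTNNGATTA", vocab))
```

The resulting ID list is the kind of input a transformer-style classifier would consume; the model, its training, and the taxonomic output layers are outside the scope of this sketch.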


Subjects
DNA Barcoding, Taxonomic; DNA; Deep Learning; Software; Algorithms; Base Sequence; DNA/classification; DNA/genetics; DNA Barcoding, Taxonomic/methods; Genome; Genomics