Entropy Based Clustering of Viral Sequences
Bioinformatics Research and Applications, Isbra 2022; 13760:369-380, 2022.
Article in English | Web of Science | ID: covidwho-2309148
ABSTRACT
Clustering viral sequences allows us to characterize the composition and structure of intrahost and interhost viral populations, which play a crucial role in disease progression and epidemic spread. In this paper we propose and validate a new entropy based method for clustering aligned viral sequences considered as categorical data. The method finds a homogeneous clustering by minimizing information entropy rather than distance between sequences in the same cluster. We have applied our entropy based clustering method to SARS-CoV-2 viral sequencing data. We report the information content extracted from the sequences by entropy based clustering. Our method converges to similar minimum-entropy clusterings across different runs and limited permutations of data. We also show that a parallelized version of our tool is scalable to very large SARS-CoV-2 datasets.
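The idea of minimizing entropy rather than pairwise distance can be illustrated with a minimal sketch. The abstract does not give the exact objective or optimization procedure, so everything here is an assumption: the score is taken to be the size-weighted sum of per-position Shannon entropies within each cluster, and the optimizer is a simple greedy local search swapped in for illustration. All function names are hypothetical and do not come from the authors' tool.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the symbols observed at one alignment position."""
    n = len(column)
    return -sum((c / n) * log2(c / n) for c in Counter(column).values())


def cluster_entropy(seqs):
    """Sum of per-position entropies for one cluster of aligned sequences."""
    return sum(column_entropy(col) for col in zip(*seqs))


def clustering_entropy(clusters):
    """Assumed objective: cluster entropies weighted by relative cluster size."""
    total = sum(len(c) for c in clusters)
    return sum(len(c) / total * cluster_entropy(c) for c in clusters if c)


def greedy_entropy_clustering(seqs, k, max_passes=50):
    """Illustrative local search: move one sequence at a time to another
    cluster whenever the move strictly lowers the assumed objective."""
    labels = [i % k for i in range(len(seqs))]  # arbitrary round-robin start

    def score(lab):
        clusters = [[s for s, l in zip(seqs, lab) if l == c] for c in range(k)]
        return clustering_entropy(clusters)

    best = score(labels)
    for _ in range(max_passes):
        moved = False
        for i in range(len(seqs)):
            for c in range(k):
                if c == labels[i]:
                    continue
                old = labels[i]
                labels[i] = c
                s = score(labels)
                if s < best - 1e-12:  # keep only strict improvements
                    best, moved = s, True
                else:
                    labels[i] = old  # revert a non-improving move
        if not moved:
            break
    return labels, best


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "CCCC", "CCCC"]  # toy aligned sequences, not real data
    print(greedy_entropy_clustering(toy, k=2))
```

Because the sequences are treated as categorical data, the score depends only on symbol frequencies at each aligned position, so no distance between sequences is ever computed. A local search of this kind can stop in a local minimum, which is one reason to compare results across runs and data permutations, as the abstract does.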
Full text:
Available
Collection:
Databases of international organizations
Database:
Web of Science
Language:
English
Journal:
Bioinformatics Research and Applications, Isbra 2022
Year:
2022
Document type:
Article