ACS Omega ; 6(31): 20158-20165, 2021 Aug 10.
Article in English | MEDLINE | ID: mdl-34395967

ABSTRACT

A simple, fast, and efficient machine-learning approach to analyzing and identifying respiratory-related virus sequences is proposed. Such schemes are important for identifying viruses, especially in view of spreading pandemics. The method is based on genetic code rules and the open reading frame (ORF). Data from respiratory-related coronaviruses are collected, and features are extracted based on recurring nucleobase 3-tuples in the RNA. The methodology consists of counting nucleobase triplets, normalizing the counts to the length of the sequence, and applying principal component analysis (PCA). The triplet counts can further be used for classification. DNA sequences from the herpes virus family can be considered a first step toward a complete and accurate classification that includes more complex factors, such as mutations. The proposed classification scheme is based simply on "counting" biological information. It can serve as a first, fast detection method that is widely accessible and portable to a variety of architectures for on-the-fly detection. The approach can be further optimized and combined with supervised techniques to allow more accurate detection and readout of the exact virus type or sequence. We discuss the relevance of this scheme in identifying differences between similar viruses and their impact on biochemical analysis.
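
The pipeline described in the abstract (count triplets, normalize to sequence length, apply PCA) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes non-overlapping triplets read in a single frame starting at position 0, maps T to U so that DNA and RNA input share one alphabet, and computes PCA through a plain SVD in NumPy. The names `triplet_features`, `pca_project`, and `toy_sequences` are hypothetical, and the sequences are invented strings, not real viral genomes.

```python
from itertools import product

import numpy as np

# All 64 possible nucleobase triplets over the RNA alphabet.
BASES = "ACGU"
TRIPLETS = ["".join(p) for p in product(BASES, repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def triplet_features(seq, frame=0):
    """Count in-frame triplets and normalize by sequence length.

    Assumption: triplets are non-overlapping and read from `frame`;
    triplets containing other symbols (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    counts = np.zeros(len(TRIPLETS))
    for i in range(frame, len(seq) - 2, 3):
        j = INDEX.get(seq[i:i + 3])
        if j is not None:
            counts[j] += 1
    return counts / max(len(seq), 1)


def pca_project(X, n_components=2):
    """Project rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # PCA scores


# Invented toy sequences, for illustration only.
toy_sequences = [
    "AUGGCUGCUGCUGCUUAA",
    "AUGGCUGCAGCUGCUUAA",
    "AUGAAAUUUAAAUUUUGA",
    "AUGAAAUUUAAAUUCUGA",
]

X = np.vstack([triplet_features(s) for s in toy_sequences])
scores = pca_project(X, n_components=2)

if __name__ == "__main__":
    for s, xy in zip(toy_sequences, scores):
        print(s, xy)
```

Each sequence becomes a fixed-length vector of 64 normalized counts regardless of its length, which is what makes sequences of different sizes comparable before projection. The PCA scores could then be passed to any classifier, in line with the abstract's remark that the triplet counts can be used for classification.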
