This article is a Preprint
A new method to study genome mutations using the information entropy (preprint)
bioRxiv; 2021.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2021.05.27.445958
ABSTRACT
We report a non-clinical, mathematical method of studying genetic sequences based on information theory. The method calculates the information entropy spectrum of a genome by splitting it into windows, each containing a fixed number of nucleotides. The information entropy value of each window is computed using the m-block information entropy formula. We show that the information entropy spectrum of a genome contains sufficient information to allow detection of genetic mutations, and possibly prediction of future ones. Our study indicates that the best m-block size is 2 and that the optimal window size is more than 9 and fewer than 33 nucleotides. To implement the proposed technique, we created specialized software, which is freely available. Here we report the successful test of this method on the reference RNA sequence of the SARS-CoV-2 virus collected in Wuhan in Dec. 2019 (MN908947) and one of its randomly selected variants, collected in Taiwan in Feb. 2020 (MT370518), which displays 7 mutations.
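The windowing procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' software, and the abstract does not state the exact m-block formula, so the sketch assumes the standard Shannon entropy over the m-blocks found in each window. The function names (`block_entropy`, `entropy_spectrum`, `differing_windows`), the use of non-overlapping blocks and windows, and the default window size of 16 (chosen only because it lies between 9 and 33) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2
from typing import List


def block_entropy(window: str, m: int = 2) -> float:
    """Shannon entropy, in bits, of the m-blocks in one window.

    Assumption: blocks are non-overlapping, and trailing nucleotides
    that do not fill a whole block are dropped.
    """
    blocks = [window[i:i + m] for i in range(0, len(window) - m + 1, m)]
    if not blocks:
        return 0.0
    total = len(blocks)
    counts = Counter(blocks)
    # H = sum over block types of p * log2(1 / p)
    return sum((c / total) * log2(total / c) for c in counts.values())


def entropy_spectrum(sequence: str, window_size: int = 16, m: int = 2) -> List[float]:
    """Entropy value of each consecutive window of the sequence.

    Assumption: windows are non-overlapping, and an incomplete
    final window is ignored.
    """
    return [
        block_entropy(sequence[i:i + window_size], m)
        for i in range(0, len(sequence) - window_size + 1, window_size)
    ]


def differing_windows(reference: str, variant: str,
                      window_size: int = 16, m: int = 2,
                      tol: float = 1e-12) -> List[int]:
    """Indices of windows whose entropy differs between two sequences."""
    ref = entropy_spectrum(reference, window_size, m)
    var = entropy_spectrum(variant, window_size, m)
    return [i for i, (a, b) in enumerate(zip(ref, var)) if abs(a - b) > tol]
```

Under these assumptions, a call such as `differing_windows(reference, variant)` on two aligned sequences of equal length returns the indices of windows that may contain a mutation. A substitution that leaves the block frequencies of a window unchanged produces the same entropy value, so this simple comparison can miss some mutations.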
Full text: Available
Collection: Preprints
Database: bioRxiv
Language: English
Year: 2021
Document Type: Preprint