Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences.
Rout, Ranjeet Kumar; Hassan, Sk Sarif; Sheikh, Sabha; Umer, Saiyed; Sahoo, Kshira Sagar; Gandomi, Amir H.
  • Rout RK; Department of Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India. Electronic address: ranjeetkumarrout@nitsri.net.
  • Hassan SS; Department of Mathematics, Pingla Thana Mahavidyalaya, Maligram, Paschim Medinipur, 721140, India. Electronic address: sksarifhassan@pinglacollege.ac.in.
  • Sheikh S; Department of Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, Jammu and Kashmir, India. Electronic address: sabha99sheikh@gmail.com.
  • Umer S; Department of Computer Science and Engineering, Aliah University, Kolkata, India. Electronic address: saiyedumer@gmail.com.
  • Sahoo KS; Department of Computer Science and Engineering, SRM University, Amaravati, AP, 522240, India. Electronic address: kshirasagar12@gmail.com.
  • Gandomi AH; Faculty of Engineering and Information Technology, University of Technology Sydney, NSW, Australia. Electronic address: gandomi@uts.edu.au.
Comput Biol Med ; 141: 105024, 2022 02.
Article in English | MEDLINE | ID: covidwho-1509702
ABSTRACT
BACKGROUND AND OBJECTIVE:

The world is currently facing a global emergency due to COVID-19, which requires immediate strategies to strengthen healthcare facilities and prevent further deaths. To achieve effective remedies and solutions, research on different aspects, including the genomic and proteomic characterization of SARS-CoV-2, is critical. In this work, the spatial representation/composition and distribution frequency of the 20 amino acids across the primary protein sequences of SARS-CoV-2 were examined according to different parameters.

METHOD:

To identify the spatial distribution of amino acids over the primary protein sequences of SARS-CoV-2, the Hurst exponent and Shannon entropy were applied as parameters to capture the autocorrelation and the amount of information in the spatial representations. The frequency distribution of each amino acid over the protein sequences was also evaluated. For a one-dimensional sequence, the Hurst exponent (HE) was used to characterize fractality because of its linear relationship with the fractal dimension (D), i.e. D+HE=2. Moreover, binary Shannon entropy was used to measure the uncertainty in a binary sequence and was then applied to calculate amino acid conservation in the primary protein sequences.

RESULTS AND CONCLUSION:

Fourteen (14) SARS-CoV protein sequences were evaluated and compared with 105 SARS-CoV-2 proteins. The simulation results demonstrate differences in the information collected about the spatial distribution of amino acids in the SARS-CoV-2 and SARS-CoV proteins, enabling researchers to distinguish between the two types of CoV. The spatial arrangement of amino acids also reveals similarities and dissimilarities among the important structural proteins E, M, N and S, which is pivotal for establishing an evolutionary tree with other CoV strains.
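
The parameters named in the METHOD paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the Hurst exponent was estimated, so rescaled-range (R/S) analysis is assumed here as one common estimator, and all function names and the toy sequence are hypothetical. Each amino acid is mapped to a binary indicator sequence (1 where it occurs, 0 elsewhere), on which the binary Shannon entropy and the Hurst exponent are computed; the fractal dimension then follows from D+HE=2.

```python
import math


def indicator_sequence(protein, amino_acid):
    """Binary spatial representation: 1 where amino_acid occurs, else 0."""
    return [1 if residue == amino_acid else 0 for residue in protein]


def binary_shannon_entropy(bits):
    """Shannon entropy (in bits) of a 0/1 sequence; 0.0 if it is constant."""
    if not bits:
        raise ValueError("empty sequence")
    p = sum(bits) / len(bits)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def rescaled_range(window):
    """R/S statistic of one window, or None if the window is constant."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return None
    cumulative = lowest = highest = 0.0
    for x in window:
        cumulative += x - mean
        lowest = min(lowest, cumulative)
        highest = max(highest, cumulative)
    return (highest - lowest) / std


def hurst_exponent(series, min_window=8):
    """Estimate HE as the least-squares slope of log(R/S) against log(window size).

    Assumption: window sizes double from min_window up to half the series length.
    """
    n = len(series)
    xs, ys = [], []
    size = min_window
    while size <= n // 2:
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(size))
            ys.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(xs) < 2:
        raise ValueError("sequence too short for an R/S estimate")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def fractal_dimension(hurst):
    """Fractal dimension from the relation D + HE = 2 stated in the abstract."""
    return 2 - hurst


if __name__ == "__main__":
    toy_protein = "ACDLLKLMLAQLLSTLVLLG" * 8  # hypothetical toy sequence, not real data
    bits = indicator_sequence(toy_protein, "L")
    he = hurst_exponent(bits)
    print("entropy:", binary_shannon_entropy(bits))
    print("HE:", he, "D:", fractal_dimension(he))
```

R/S estimates are known to be biased for short sequences, so values obtained this way on individual proteins should be read as rough indicators rather than precise exponents.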

Full text: Available Collection: International databases Database: MEDLINE Main subject: SARS-CoV-2 / COVID-19 Type of study: Experimental Studies Limits: Humans Language: English Journal: Comput Biol Med Year: 2022 Document Type: Article
