Cross-attention PHV: Prediction of human and virus protein-protein interactions using cross-attention-based neural networks.
Comput Struct Biotechnol J; 20: 5564-5573, 2022.
Article in English | MEDLINE | ID: covidwho-2061048
ABSTRACT
Viral infections represent a major health concern worldwide. The alarming rate at which SARS-CoV-2 spreads, for example, led to a worldwide pandemic. Viruses incorporate genetic material into the host genome to hijack host cell functions such as the cell cycle and apoptosis. In these viral processes, protein-protein interactions (PPIs) play critical roles. Therefore, the identification of PPIs between humans and viruses is crucial for understanding the infection mechanism and host immune responses to viral infections and for discovering effective drugs. Experimental methods, including mass spectrometry-based proteomics and yeast two-hybrid assays, are widely used to identify human-virus PPIs, but they are time-consuming, expensive, and laborious. To overcome this problem, we developed a novel computational predictor, named cross-attention PHV, by implementing two key technologies: the cross-attention mechanism and a one-dimensional convolutional neural network (1D-CNN). The cross-attention mechanism was very effective in enhancing prediction and generalization abilities. Applying the 1D-CNN to the word2vec-generated feature matrices reduced computational costs, thus extending the allowable length of protein sequences to 9000 amino acid residues. Cross-attention PHV outperformed existing state-of-the-art models on a benchmark dataset and accurately predicted PPIs for unknown viruses. Cross-attention PHV also predicted human-SARS-CoV-2 PPIs with area under the curve values >0.95. The Cross-attention PHV web server and source code are freely available at https://kurata35.bio.kyutech.ac.jp/Cross-attention_PHV/ and https://github.com/kuratahiroyuki/Cross-Attention_PHV, respectively.
Keywords: Convolutional neural network; Human; Protein-protein interaction; SARS-CoV-2; Virus; Word2vec
Abbreviations: 1D-CNN, One-dimensional-CNN; AC, Accuracy; AUC, Area under the curve; CNN, Convolutional neural network; DT, Decision tree; F1, F1-score; HV-PPIs, Human-virus PPIs; HuV-PPI, Human-unknown virus PPI; LR, Linear regression; MCC, Matthews correlation coefficient; PPIs, Protein-protein interactions; RF, Random forest; SARS-CoV-2, Severe acute respiratory syndrome coronavirus 2; SN, Sensitivity; SP, Specificity; SVM, Support vector machine; T-SNE, T-distributed stochastic neighbor embedding; W2V, Word2vec
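The abstract names cross-attention between the human and virus protein representations as the key mechanism. The following is a minimal sketch of scaled dot-product cross-attention in plain Python, not the authors' implementation: it assumes no learned projection weights, and the tiny `human` and `virus` feature matrices are hypothetical stand-ins for the word2vec/1D-CNN features described above.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Each query row attends over all rows of keys_values.

    Simplification (assumption): keys and values are the same matrix and
    no learned query/key/value projections are applied.
    """
    dim = len(queries[0])
    scores = matmul(queries, transpose(keys_values))
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    return matmul(weights, keys_values), weights

# Hypothetical feature matrices: one row per sequence position, 2 features each.
human = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
virus = [[1.0, 0.0], [0.0, 2.0]]

# Human positions attend over virus positions, and vice versa.
human_ctx, w_hv = cross_attention(human, virus)
virus_ctx, w_vh = cross_attention(virus, human)
```

Here `human_ctx` has one row per human position, each a weighted mix of virus features, so either protein's representation is conditioned on its partner. A trained model would add learned projections and further layers on top of this step.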
Full text:
Available
Collection:
International databases
Database:
MEDLINE
Type of study:
Prognostic study / Randomized controlled trials
Language:
English
Journal:
Comput Struct Biotechnol J
Year:
2022
Document Type:
Article
DOI:
10.1016/j.csbj.2022.10.012