Search | BVS Regional Portal

1.

Accurate prediction of drug combination risk levels based on relational graph convolutional network and multi-head attention.

He, Shi-Hui; Yun, Lijun; Yi, Hai-Cheng.

J Transl Med ; 22(1): 572, 2024 Jun 16.

Article in English | MEDLINE | ID: mdl-38880914

ABSTRACT

BACKGROUND: Accurately identifying the risk level of drug combinations is of great significance in investigating the mechanisms of combination medication and adverse reactions. Most existing methods can only predict whether there is an interaction between two drugs, but cannot directly determine their accurate risk level. METHODS: In this study, we propose a multi-class drug combination risk prediction model named AERGCN-DDI, utilizing a relational graph convolutional network with a multi-head attention mechanism. Drug-drug interaction events with varying risk levels are modeled as a heterogeneous information graph. Attribute features of drug nodes and links are learned based on compound chemical structure information. Finally, the AERGCN-DDI model predicts the drug combination risk level based on a heterogeneous graph neural network and multi-head attention modules. RESULTS: To evaluate the effectiveness of the proposed method, five-fold cross-validation and an ablation study were conducted. Furthermore, we compared its predictive performance with baseline models and other state-of-the-art methods on two benchmark datasets. Empirical studies demonstrated the superior performance of AERGCN-DDI. CONCLUSIONS: AERGCN-DDI emerges as a valuable tool for predicting the risk levels of drug combinations, thereby aiding in clinical medication decision-making, mitigating severe drug side effects, and enhancing patient clinical prognosis.

Subject(s)

Neural Networks, Computer; Humans; Drug Interactions; Drug Combinations; Risk Assessment; Drug-Related Side Effects and Adverse Reactions; Reproducibility of Results; Computer Graphics

2.

MathEagle: Accurate prediction of drug-drug interaction events via multi-head attention and heterogeneous attribute graph learning.

Hou, Lin-Xuan; Yi, Hai-Cheng; You, Zhu-Hong; Chen, Shi-Hong; Zheng, Jia; Kwoh, Chee Keong.

Comput Biol Med ; 177: 108642, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38820777

ABSTRACT

BACKGROUND: Drug-drug interaction events influence the effectiveness of drug combinations and can lead to unexpected side effects or exacerbate underlying diseases, jeopardizing patient prognosis. Most existing methods are restricted to predicting whether two drugs interact or the type of drug-drug interactions, while very few studies endeavor to predict the specific risk levels of side effects of drug combinations. METHODS: In this study, we propose MathEagle, a novel approach to predict accurate risk levels of drug combinations based on multi-head attention and heterogeneous attribute graph learning. Initially, we model drugs and three distinct risk levels between drugs as a heterogeneous information graph. Subsequently, behavioral and chemical structure features of drugs are utilized by message passing neural networks and graph embedding algorithms, respectively. Ultimately, MathEagle employs heterogeneous graph convolution and multi-head attention mechanisms to learn efficient latent representations of drug nodes and estimates the risk levels of pairwise drugs in an end-to-end manner. RESULTS: To assess the effectiveness and robustness of the model, five-fold cross-validation, ablation experiments, and case studies were conducted. MathEagle achieved an accuracy of 85.85% and an AUC of 0.9701 on the drug risk level prediction task and is superior to all comparative models. The MathEagle predictor is freely accessible at http://120.77.11.78/MathEagle/. CONCLUSIONS: The experimental results indicate that MathEagle can function as an effective tool for predicting accurate risk of drug combinations, aiding in guiding clinical medication, and enhancing patient outcomes.

Subject(s)

Drug Interactions; Humans; Algorithms; Neural Networks, Computer; Drug-Related Side Effects and Adverse Reactions; Machine Learning

3.

A learning-based method to predict LncRNA-disease associations by combining CNN and ELM.

Guo, Zhen-Hao; Chen, Zhan-Heng; You, Zhu-Hong; Wang, Yan-Bin; Yi, Hai-Cheng; Wang, Mei-Neng.

BMC Bioinformatics ; 22(Suppl 5): 622, 2022 Mar 22.

Article in English | MEDLINE | ID: mdl-35317723

ABSTRACT

BACKGROUND: lncRNAs play a critical role in numerous biological processes and life activities, especially diseases. Because traditional wet-lab experiments for identifying undiscovered lncRNA-disease associations are limited by time consumption and labor cost, it is imperative to construct reliable and efficient computational models to complement them. Deep learning technologies have been proven to make impressive contributions in many areas, but their feasibility in bioinformatics has not been adequately verified. RESULTS: In this paper, a machine learning-based model called LDACE was proposed to predict potential lncRNA-disease associations by combining Extreme Learning Machine (ELM) and Convolutional Neural Network (CNN). Specifically, the representation vectors are constructed by integrating multiple types of biological information, including functional similarity and semantic similarity. Then, CNN is applied to mine both local and global features. Finally, ELM is chosen to carry out the prediction task to detect the potential lncRNA-disease associations. The proposed method achieved an area under the receiver operating characteristic curve of 0.9086 in leave-one-out cross-validation and 0.8994 in fivefold cross-validation. In addition, two kinds of case studies based on lung cancer and endometrial cancer indicate the robustness and efficiency of LDACE even in a real environment. CONCLUSIONS: Substantial results demonstrated that the proposed model is expected to be an auxiliary tool to guide and assist biomedical research, and the close integration of deep learning and biological big data will provide life sciences with novel insights.

Subject(s)

RNA, Long Noncoding; Computational Biology/methods; Machine Learning; Neural Networks, Computer; RNA, Long Noncoding/genetics; ROC Curve

4.

GBDR: a Bayesian model for precise prediction of pathogenic microorganisms using 16S rRNA gene sequences.

Huang, Yu-An; Huang, Zhi-An; Li, Jian-Qiang; You, Zhu-Hong; Wang, Lei; Yi, Hai-Cheng; Yu, Chang-Qing.

BMC Genomics ; 22(Suppl 1): 916, 2022 Mar 16.

Article in English | MEDLINE | ID: mdl-35296232

ABSTRACT

BACKGROUND: Recent evidence suggests that human microorganisms participate in important biological activities in the human body. The dysfunction of host-microbiota interactions could lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify the disorder-specific microbes from the biological "haystack" merely by routine wet-lab experiments. With the developments in next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale. RESULTS: Based on the known microbe-disease associations derived from the Human Microbe-Disease Association Database (HMDAD), the proposed model shows reliable performance, with area under the ROC curve (AUC) values of 0.9456 and 0.8866 in leave-one-out cross-validation and five-fold cross-validation, respectively. In case studies of colorectal carcinoma, 80% of the top-20 predicted microbes have been experimentally confirmed in the published literature. CONCLUSION: Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we here propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the most likely microbes associated with various human diseases. Based on the sequence information of genes, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure the microbe-microbe similarity from different perspectives. The disease-disease similarity is calculated by capturing the hierarchy information from the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.

Subject(s)

Algorithms; Bacteria/classification; Computational Biology; RNA, Ribosomal, 16S; Bayes Theorem; Computational Biology/methods; Genes, rRNA; Humans; RNA, Ribosomal, 16S/genetics

5.

DeepWalk based method to predict lncRNA-miRNA associations via lncRNA-miRNA-disease-protein-drug graph.

Yang, Long; Li, Li-Ping; Yi, Hai-Cheng.

BMC Bioinformatics ; 22(Suppl 12): 621, 2022 Feb 25.

Article in English | MEDLINE | ID: mdl-35216549

ABSTRACT

BACKGROUND: Long non-coding RNAs (lncRNAs) play a crucial role in diverse biological processes and have been confirmed to be associated with various diseases. The physiological roles and functions of lncRNAs remain largely uncharacterized. MicroRNAs (miRNAs), which are usually 20-24 nucleotides long, have several critical regulatory roles in cells. An lncRNA can act as a sponge that adsorbs miRNA and indirectly regulates transcription and translation. Thus, the identification of lncRNA-miRNA associations is essential and valuable. RESULTS: In our work, we present DWLMI to infer the potential associations between lncRNAs and miRNAs by representing them as vectors via a lncRNA-miRNA-disease-protein-drug graph. Specifically, DeepWalk is used to learn the behavior representation of vertices. Fingerprint, k-mer and MeSH descriptor methods were mainly used to learn the attribute representation of vertices. By combining these two kinds of information, unknown lncRNA-miRNA associations can be predicted by a random forest classifier. Under five-fold cross-validation, the proposed DWLMI model obtained an average prediction accuracy of 95.22%, a sensitivity of 94.35%, and an AUC of 98.56%. CONCLUSIONS: The experimental results demonstrated that DWLMI can effectively predict potential lncRNA-miRNA associated pairs, and the results can provide new insight for non-coding RNA researchers working on combining biological big data with deep learning.
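The behavior representation described above relies on DeepWalk, which first samples truncated random walks over the graph and then feeds them to a skip-gram model. The following is a minimal sketch of the walk-sampling step only, assuming an adjacency-list graph; the node names, `walks_per_node`, and `walk_length` are hypothetical illustration values and are not taken from the DWLMI paper.

```python
import random


def random_walks(adjacency, walks_per_node=2, walk_length=5, seed=0):
    """Sample truncated uniform random walks, the first stage of DeepWalk."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # start nodes are visited in a new random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical toy graph mixing lncRNA, miRNA, disease, protein and drug nodes.
toy_graph = {
    "lncRNA:A": ["miRNA:x", "disease:d1"],
    "miRNA:x": ["lncRNA:A", "protein:p1"],
    "disease:d1": ["lncRNA:A", "drug:q"],
    "protein:p1": ["miRNA:x"],
    "drug:q": ["disease:d1"],
}
walks = random_walks(toy_graph)
```

Each walk is then treated like a sentence of node "words"; training a skip-gram model on these sentences would yield the node embeddings, a step omitted here.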

Subject(s)

MicroRNAs; Pharmaceutical Preparations; RNA, Long Noncoding; Computational Biology/methods; MicroRNAs/genetics; RNA, Long Noncoding/genetics

6.

Graph representation learning in bioinformatics: trends, methods and applications.

Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Kwoh, Chee Keong.

Brief Bioinform ; 23(1), 2022 Jan 17.

Article in English | MEDLINE | ID: mdl-34471921

ABSTRACT

A graph, which contains a set of objects and their relationships, is a natural data structure for describing complex systems. Many real-life biomedical problems can be modeled as graph analytics tasks. Machine learning, especially deep learning, succeeds in a wide range of bioinformatics scenarios where data are represented in the Euclidean domain. However, rich relational information between biological elements is retained in non-Euclidean biomedical graphs, which are not well suited to classic machine learning methods. Graph representation learning aims to embed a graph into a low-dimensional space while preserving graph topology and node properties. It bridges biomedical graphs and modern machine learning methods and has recently attracted widespread interest in both the machine learning and bioinformatics communities. In this work, we summarize the advances of graph representation learning and its representative applications in bioinformatics. To provide a comprehensive and structured analysis and perspective, we first categorize and analyze both graph embedding methods (homogeneous graph embedding, heterogeneous graph embedding, attribute graph embedding) and graph neural networks. Furthermore, we summarize their representative applications from the molecular level to the genomics, pharmaceutical and healthcare systems levels. Moreover, we provide open resource platforms and libraries for implementing these graph representation learning methods and discuss the challenges and opportunities of graph representation learning in bioinformatics. This work provides a comprehensive survey of emerging graph representation learning algorithms and their applications in bioinformatics. It is anticipated that it could bring valuable insights for researchers to contribute their knowledge to graph representation learning and future-oriented bioinformatics studies.

Subject(s)

Computational Biology; Neural Networks, Computer; Algorithms; Computational Biology/methods; Knowledge; Machine Learning

7.

In silico drug repositioning using deep learning and comprehensive similarity measures.

Yi, Hai-Cheng; You, Zhu-Hong; Wang, Lei; Su, Xiao-Rui; Zhou, Xi; Jiang, Tong-Hai.

BMC Bioinformatics ; 22(Suppl 3): 293, 2021 Jun 01.

Article in English | MEDLINE | ID: mdl-34074242

ABSTRACT

BACKGROUND: Drug repositioning, meaning finding new uses for existing drugs, can accelerate the research and development of new drugs. Various computational methods have been presented to predict novel drug-disease associations for drug repositioning based on similarity measures among drugs and diseases. However, there are some known associations between drugs and diseases that previous studies have not utilized. METHODS: In this work, we develop a deep gated recurrent units model to predict potential drug-disease interactions using comprehensive similarity measures and a Gaussian interaction profile kernel. More specifically, the similarity measure is used to extract discriminative features for drugs based on their chemical fingerprints. Meanwhile, the Gaussian interaction profile kernel is employed to obtain efficient features of diseases based on known disease-disease associations. Then, a deep gated recurrent units model is developed to predict potential drug-disease interactions. RESULTS: The performance of the proposed model is evaluated on two benchmark datasets under tenfold cross-validation. To further verify the predictive ability, case studies for predicting new potential indications of drugs were carried out. CONCLUSION: The experimental results indicate that the proposed model is a useful tool for predicting new indications for drugs or new treatments for diseases, and can accelerate drug repositioning and related drug research and discovery.
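As an illustration of the Gaussian interaction profile kernel mentioned above, here is a minimal sketch of its commonly used form, K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where gamma is a bandwidth parameter divided by the mean squared norm of the profiles. The binary profile matrix, the function name `gip_kernel`, and the default `gamma_prime=1.0` are assumptions for illustration; the abstract does not state the exact settings used by the authors.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over rows of a binary profile matrix."""
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    # Bandwidth normalized by the average squared profile norm.
    gamma = gamma_prime / mean_sq
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Hypothetical profiles: each row marks which known associations an entity has.
profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(profiles)
```

Entities with identical profiles get a similarity of 1, and the similarity decays toward 0 as the profiles diverge.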

Subject(s)

Deep Learning; Drug Repositioning; Algorithms; Computational Biology; Computer Simulation

8.

An Efficient Computational Model for Large-Scale Prediction of Protein-Protein Interactions Based on Accurate and Scalable Graph Embedding.

Su, Xiao-Rui; You, Zhu-Hong; Hu, Lun; Huang, Yu-An; Wang, Yi; Yi, Hai-Cheng.

Front Genet ; 12: 635451, 2021.

Article in English | MEDLINE | ID: mdl-33719344

ABSTRACT

Protein-protein interaction (PPI) is the basis of the molecular mechanisms of living cells. Although traditional experiments are able to detect PPIs accurately, they are often costly and time-consuming. As a result, computational methods have been used to predict PPIs to avoid these problems. Graph structures, as important and pervasive data carriers, are considered the most suitable structures for representing biomedical entities and relationships. Although graph embedding is the most popular approach for graph representation learning, it usually suffers from high computational and space cost, especially on large-scale graphs. Therefore, developing a framework that can accelerate graph embedding and improve the accuracy of embedding results is important for large-scale PPI prediction. In this paper, we propose a multi-level model, LPPI, to improve both the quality and speed of large-scale PPI prediction. Firstly, basic protein information is collected as its attributes, including positional gene sets, motif gene sets, and immunological signatures. Secondly, we construct a weighted graph by using protein attributes to calculate node similarity. Then GraphZoom is used to accelerate the embedding process by reducing the size of the weighted graph. Next, graph embedding methods are used to learn graph topology features from the reconstructed graph. Finally, a linear Logistic Regression (LR) model is used to predict the probability of interaction between two proteins. LPPI achieved a high accuracy of 0.99997 and 0.9979 on the PPI network dataset and GraphSAGE-PPI dataset, respectively. Our further results show that LPPI is promising for large-scale PPI prediction in both accuracy and efficiency, which is beneficial to detecting interactions among other large-scale sets of biomedical molecules.

9.

MeSHHeading2vec: a new method for representing MeSH headings as vectors based on graph embedding algorithm.

Guo, Zhen-Hao; You, Zhu-Hong; Huang, De-Shuang; Yi, Hai-Cheng; Zheng, Kai; Chen, Zhan-Heng; Wang, Yan-Bin.

Brief Bioinform ; 22(2): 2085-2095, 2021 Mar 22.

Article in English | MEDLINE | ID: mdl-32232320

ABSTRACT

Effectively representing Medical Subject Headings (MeSH) headings (terms) such as disease and drug as discriminative vectors could greatly improve the performance of downstream computational prediction models. However, these terms are often abstract and difficult to quantify. In this paper, we converted the MeSH tree structure into a relationship network and applied several graph embedding algorithms to it to represent these terms. Specifically, the relationship network consists of nodes (MeSH headings) and edges (relationships) and can be constructed from the tree numbers. Then, five graph embedding algorithms, including DeepWalk, LINE, SDNE, LAP and HOPE, were implemented on the relationship network to represent MeSH headings as vectors. In order to evaluate the performance of the proposed methods, we carried out node classification and relationship prediction tasks. The results show that the MeSH headings characterized by graph embedding algorithms can not only be treated as an independent carrier for representation, but can also be utilized as additional information to enhance the representation ability of vectors. Thus, they can serve as an input and play a significant role in any computational model related to disease, drug, microbe, etc. In addition, our method may inspire relevant researchers to study the representation of terms from this network perspective.
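One plausible way to derive such a relationship network from tree numbers is sketched below. It relies on the MeSH convention that a tree number's parent is obtained by dropping its last dot-separated segment; whether the authors used exactly this rule is an assumption, and the function name `tree_number_edges` and the sample tree numbers are for illustration only.

```python
def tree_number_edges(tree_numbers):
    """Return (parent, child) pairs implied by dot-separated tree numbers."""
    known = set(tree_numbers)
    edges = []
    for number in sorted(known):
        if "." not in number:
            continue  # a top-level tree number has no parent in the list
        parent = number.rsplit(".", 1)[0]  # drop the last segment
        if parent in known:
            edges.append((parent, number))
    return edges


# Illustrative tree numbers; a real run would read them from the MeSH files.
sample = ["C04", "C04.557", "C04.588", "C04.588.180"]
edges = tree_number_edges(sample)
```

Because one heading can carry several tree numbers, a full pipeline would also need to map tree numbers back to headings before embedding; that step is left out of this sketch.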

Subject(s)

Algorithms; Medical Subject Headings; Computer Simulation; Drug Delivery Systems; Genetic Predisposition to Disease; Humans; MicroRNAs/genetics; Semantics

10.

Learning Representation of Molecules in Association Network for Predicting Intermolecular Associations.

Yi, Hai-Cheng; You, Zhu-Hong; Guo, Zhen-Hao; Huang, De-Shuang; Chan, Keith C C.

IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2546-2554, 2021.

Article in English | MEDLINE | ID: mdl-32070992

ABSTRACT

A key aim of post-genomic biomedical research is to systematically understand molecules and their interactions in human cells. Multiple biomolecules coordinate to sustain life activities, and interactions between various biomolecules are interconnected. However, existing studies usually focus only on associations between two or a very limited number of types of molecules. In this study, we propose a network representation learning-based computational framework, MAN-SDNE, to predict any intermolecular associations. More specifically, we constructed a large-scale molecular association network of multiple biomolecules in humans by integrating associations among long non-coding RNA, microRNA, protein, drug, and disease, containing 6,528 molecular nodes and 105,546 associations of 9 kinds. Then, each node is represented by its network proximity and attribute features. Furthermore, these features are used to train a random forest classifier to predict intermolecular associations. MAN-SDNE achieves a remarkable performance with an AUC of 0.9552 and an AUPR of 0.9338 under five-fold cross-validation. To indicate the ability to predict specific types of interactions, a case study for predicting lncRNA-protein interactions using MAN-SDNE is also carried out. Experimental results demonstrate that this work offers a systematic insight for understanding the synergistic associations between molecules and complex diseases and provides a network-based computational tool to systematically explore intermolecular interactions.

Subject(s)

Models, Biological; Systems Biology/methods; Computer Simulation; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Pharmaceutical Preparations/metabolism; RNA, Long Noncoding/genetics; RNA, Long Noncoding/metabolism

11.

NEMPD: a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information.

Ji, Bo-Ya; You, Zhu-Hong; Chen, Zhan-Heng; Wong, Leon; Yi, Hai-Cheng.

BMC Bioinformatics ; 21(1): 401, 2020 Sep 10.

Article in English | MEDLINE | ID: mdl-32912137

ABSTRACT

BACKGROUND: As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of human diseases. Hence, identification of potential miRNA-disease associations can make great contributions to the research and treatment of human diseases. However, to our knowledge, many existing computational methods utilize only a single type of known association information between miRNAs and diseases to predict their potential associations, without considering their interactions or associations with other types of molecules. RESULTS: In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. Firstly, a heterogeneous network is constructed by integrating known associations among miRNA, protein and disease, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is implemented to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with their attribute information to represent miRNA-disease association pairs. Finally, the prediction model is established based on the Random Forest algorithm. Under five-fold cross-validation, the proposed NEMPD model obtained an average prediction accuracy of 85.41%, a sensitivity of 80.96%, and an AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), and 47 (lung neoplasms) were confirmed by two other databases. CONCLUSIONS: The proposed NEMPD model performs well in predicting the potential associations between miRNAs and diseases, and has great potential in the field of miRNA-disease association prediction.
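For readers unfamiliar with the metrics reported above, the sketch below shows how accuracy and sensitivity follow from binary predictions: accuracy is the share of all pairs classified correctly, and sensitivity is the share of true associations that were recovered. The function name `accuracy_and_sensitivity` and the label vectors are made up for illustration and are unrelated to the NEMPD data.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Compute accuracy and sensitivity (recall on the positive class)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive samples")
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


# Hypothetical labels: 1 = associated pair, 0 = non-associated pair.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, sensitivity = accuracy_and_sensitivity(y_true, y_pred)
```

Under five-fold cross-validation, these values would be computed on each held-out fold and then averaged.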

Subject(s)

Breast Neoplasms/diagnosis; Colonic Neoplasms/diagnosis; Computational Biology/methods; Lung Neoplasms/diagnosis; MicroRNAs/metabolism; Algorithms; Area Under Curve; Breast Neoplasms/genetics; Colonic Neoplasms/genetics; Female; Humans; Lung Neoplasms/genetics; MicroRNAs/genetics; ROC Curve

12.

MIPDH: A Novel Computational Model for Predicting microRNA-mRNA Interactions by DeepWalk on a Heterogeneous Network.

Wong, Leon; You, Zhu-Hong; Guo, Zhen-Hao; Yi, Hai-Cheng; Chen, Zhan-Heng; Cao, Mei-Yuan.

ACS Omega ; 5(28): 17022-17032, 2020 Jul 21.

Article in English | MEDLINE | ID: mdl-32715187

ABSTRACT

Analysis of miRNA-target mRNA interaction (MTI) is of crucial significance in discovering new target candidates for miRNAs. However, the biological experiments for identifying MTIs have a high false positive rate and are expensive, time-consuming, and arduous. It is an urgent task to develop effective computational approaches to enhance the investigation of miRNA-target mRNA relationships. In this study, a novel method called MIPDH is developed for miRNA-mRNA interaction prediction by using DeepWalk on a heterogeneous network. More specifically, MIPDH extracts two kinds of features: a biological behavior feature, learned using a network embedding algorithm on a heterogeneous network constructed from 17 kinds of associations among drug, disease, and 6 kinds of biomolecules, and an attribute feature, learned using the k-mer method on sequences of miRNAs and target mRNAs. Then, a random forest classifier is trained on the combination of the biological behavior feature and the attribute feature. In a 5-fold cross-validation experiment, MIPDH achieved an average accuracy, sensitivity, specificity and AUC of 75.85%, 74.37%, 77.33%, and 0.8044, respectively. To further evaluate its performance, MIPDH was compared with other classifiers and feature descriptors and achieved better performance. Additionally, case studies on hsa-miR-106b-5p, hsa-let-7d-5p, and hsa-let-7e-5p were also carried out. As a result, 14, 9, and 9 out of the top 15 targets that interacted with these miRNAs were verified using the experimental literature or other databases. All these prediction results indicate that MIPDH is an effective method for predicting miRNA-target mRNA interactions.
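The k-mer attribute feature mentioned above can be sketched as a normalized count of every length-k substring of a sequence. This is a minimal illustration, not the authors' code: the choice of `k=3`, the RNA alphabet "ACGU", and the conversion of T to U are assumptions, and the sequence shown is made up.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return the k-mer frequency vector of a sequence, in lexicographic k-mer order."""
    sequence = sequence.upper().replace("T", "U")  # treat DNA-style input as RNA
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)  # windows with unknown symbols are ignored
    return [counts[m] / total if total else 0.0 for m in kmers]


# Hypothetical short sequence; k=2 keeps the vector small (4**2 = 16 entries).
vector = kmer_frequencies("ACGUACGU", k=2)
```

With k=3 each sequence becomes a 64-dimensional vector, which can then be concatenated with a network embedding before being passed to a classifier.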

13.

Bioentity2vec: Attribute- and behavior-driven representation for predicting multi-type relationships between bioentities.

Guo, Zhen-Hao; You, Zhu-Hong; Wang, Yan-Bin; Huang, De-Shuang; Yi, Hai-Cheng; Chen, Zhan-Heng.

Gigascience ; 9(6), 2020 Jun 01.

Article in English | MEDLINE | ID: mdl-32533701

ABSTRACT

BACKGROUND: The explosive growth of genomic, chemical, and pathological data provides new opportunities and challenges for humans to thoroughly understand life activities in cells. However, there exist few computational models that aggregate various bioentities to comprehensively reveal the physical and functional landscape of biological systems. RESULTS: We constructed a molecular association network, which contains 18 edges (relationships) between 8 nodes (bioentities). Based on this, we propose Bioentity2vec, a new method for representing bioentities, which integrates information about the attributes and behaviors of a bioentity. Applying the random forest classifier, we achieved promising performance on 18 relationships, with an area under the curve of 0.9608 and an area under the precision-recall curve of 0.9572. CONCLUSIONS: Our study shows that constructing a network with rich topological and biological information is important for systematic understanding of the biological landscape at the molecular level. Our results show that Bioentity2vec can effectively represent biological entities and provides easily distinguishable information about classification tasks. Our method is also able to simultaneously predict relationships between single types and multiple types, which will accelerate progress in biological experimental research and industrial product development.

Subject(s)

Algorithms; Computational Biology/methods; Software; Systems Biology/methods; Gene Expression Profiling/methods; ROC Curve

14.

Prediction of Drug-Target Interactions From Multi-Molecular Network Based on Deep Walk Embedding Model.

Chen, Zhan-Heng; You, Zhu-Hong; Guo, Zhen-Hao; Yi, Hai-Cheng; Luo, Gong-Xu; Wang, Yan-Bin.

Front Bioeng Biotechnol ; 8: 338, 2020.

Article in English | MEDLINE | ID: mdl-32582646

ABSTRACT

Predicting drug-target interactions (DTIs) is crucial in innovative drug discovery, drug repositioning and other fields. However, predicting DTIs using traditional biological experimental methods has many shortcomings, such as high cost, long time consumption, and low efficiency, which make these methods difficult to apply widely. As a supplement, in silico methods can provide helpful information for predictions of DTIs in a timely manner. In this work, a deep walk embedding method is developed for predicting DTIs from a multi-molecular network. More specifically, a multi-molecular network, also called a molecular associations network, is constructed by integrating the associations among drug, protein, disease, lncRNA, and miRNA. Then, each node can be represented as a behavior feature vector by using a deep walk embedding method. Finally, we compared behavior features with traditional attribute features on an integrated dataset by using various classifiers. The experimental results revealed that the behavior features performed better across different classifiers, especially the random forest classifier. It is also demonstrated that the use of behavior information is very helpful for addressing the problem of sequences containing both self-interacting and non-interacting pairs of proteins. This work is not only well suited to predicting DTIs, but also provides a new perspective for the prediction of associations among other biomolecules.

15.

Learning Representations to Predict Intermolecular Interactions on Large-Scale Heterogeneous Molecular Association Network.

Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Guo, Zhen-Hao; Chan, Keith C C; Li, Yangming.

iScience ; 23(7): 101261, 2020 Jul 24.

Article in English | MEDLINE | ID: mdl-32580123

ABSTRACT

Molecular components that are functionally interdependent in human cells constitute molecular association networks. Disease can be caused by disturbance of multiple molecular interactions. New biomolecular regulatory mechanisms can be revealed by discovering new biomolecular interactions. To this end, a heterogeneous molecular association network is formed by systematically integrating comprehensive associations between miRNAs, lncRNAs, circRNAs, mRNAs, proteins, drugs, microbes, and complex diseases. We propose a machine learning method for predicting intermolecular interactions, named MMI-Pred. More specifically, a network embedding model is developed to fully exploit the network behavior of biomolecules, and attribute features are also calculated. Then, these discriminative features are combined to train a random forest classifier to predict intermolecular interactions. MMI-Pred achieves an outstanding performance of 93.50% accuracy in hybrid association prediction under 5-fold cross-validation. This work provides a systematic landscape and a machine learning method to model and infer complex associations between various biological components.

16.

A deep learning-based method for drug-target interaction prediction based on long short-term memory neural network.

Wang, Yan-Bin; You, Zhu-Hong; Yang, Shan; Yi, Hai-Cheng; Chen, Zhan-Heng; Zheng, Kai.

BMC Med Inform Decis Mak ; 20(Suppl 2): 49, 2020 Mar 18.

Article in English | MEDLINE | ID: mdl-32183788

ABSTRACT

BACKGROUND: The key to modern drug discovery is to find, identify and prepare drug molecular targets. However, due to limitations in throughput, precision and cost, traditional experimental methods are difficult to apply widely to infer these potential Drug-Target Interactions (DTIs). Therefore, it is urgent to develop effective computational methods to validate the interaction between drugs and targets. METHODS: We developed a deep learning-based model for DTI prediction. Protein evolutionary features are extracted via Position Specific Scoring Matrix (PSSM) and Legendre Moment (LM) and combined with drug molecular substructure fingerprints to form feature vectors of drug-target pairs. Then we utilized Sparse Principal Component Analysis (SPCA) to compress the features of drugs and proteins into a uniform vector space. Lastly, a deep long short-term memory network (DeepLSTM) was constructed to carry out prediction. RESULTS: A significant improvement in DTI prediction performance can be observed in the experimental results, with AUCs of 0.9951, 0.9705, 0.9951, and 0.9206, respectively, on four important classes of drug-target datasets. Further experiments preliminarily indicate that the proposed characterization scheme has a great advantage in feature expression and recognition. We have also shown that the proposed method can work well with small datasets. CONCLUSION: The results demonstrate that the proposed approach has a great advantage over state-of-the-art drug-target predictors. To the best of our knowledge, this study is the first to test the potential of a deep learning method with memory and Turing completeness in DTI prediction.

Asunto(s)

Aprendizaje Profundo , Memoria a Corto Plazo/efectos de los fármacos , Redes Neurales de la Computación , Preparaciones Farmacéuticas , Desarrollo de Medicamentos , Humanos , Análisis de Componente Principal , Proteínas
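
The Legendre-moment feature extraction mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming the PSSM is treated as a small 2-D array sampled at cell centers on [-1, 1] x [-1, 1] and that the standard continuous definition of Legendre moments is approximated by a Riemann sum. The function names and the discretization are assumptions, not the authors' implementation.

```python
from typing import List


def legendre(n: int, x: float) -> float:
    """Legendre polynomial P_n(x) via Bonnet's recurrence:
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)."""
    if n == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def legendre_moment(matrix: List[List[float]], p: int, q: int) -> float:
    """Approximate the Legendre moment of order (p, q) of a 2-D array.

    Rows and columns are mapped to cell centers in [-1, 1]; the double
    integral is replaced by a sum weighted by the cell area dx * dy.
    """
    rows, cols = len(matrix), len(matrix[0])
    dx, dy = 2.0 / rows, 2.0 / cols
    total = 0.0
    for i, row in enumerate(matrix):
        x = -1.0 + (i + 0.5) * dx
        for j, value in enumerate(row):
            y = -1.0 + (j + 0.5) * dy
            total += legendre(p, x) * legendre(q, y) * value
    return (2 * p + 1) * (2 * q + 1) / 4.0 * total * dx * dy


# A toy 4 x 5 array standing in for a (much larger) PSSM.
toy = [[1.0] * 5 for _ in range(4)]
features = [legendre_moment(toy, p, q) for p in range(3) for q in range(3)]
```

Collecting moments up to a fixed order, as in `features`, yields a fixed-length descriptor regardless of protein length, which is presumably why such moments are attractive before a dimensionality-reduction step like SPCA.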

17.

A learning based framework for diverse biomolecule relationship prediction in molecular association network.

Guo, Zhen-Hao; You, Zhu-Hong; Huang, De-Shuang; Yi, Hai-Cheng; Chen, Zhan-Heng; Wang, Yan-Bin.

Commun Biol ; 3(1): 118, 2020 Mar 13.

Article in English | MEDLINE | ID: mdl-32170157

ABSTRACT

Abundant life activities are maintained by various biomolecule relationships in human cells. However, many previous computational models focus only on isolated objects, without considering that the cell is a complete entity with ample functions. Inspired by holism, we constructed a Molecular Associations Network (MAN) comprising 9 kinds of relationships among 5 types of biomolecules, and a prediction model called MAN-GF. More specifically, biomolecules are represented as vectors by an algorithm called biomarker2vec, which combines 2 kinds of information: the attribute, learned by k-mer, etc., and the behavior, learned by Graph Factorization (GF). A Random Forest classifier is then applied for training, validation and testing. MAN-GF achieved an AUC of 0.9647 and an AUPR of 0.9521 under 5-fold cross-validation. The results imply that MAN-GF, with its overall perspective, can serve as an aid to practice. It may also provide new insight into the regulatory mechanisms.

Subject(s)

Colonic Neoplasms/metabolism; Computational Biology/methods; MicroRNAs/metabolism; Models, Biological; Protein Interaction Maps; Proteins/metabolism; RNA, Long Noncoding/metabolism; Algorithms; Area Under Curve; Data Accuracy; Data Mining/methods; Humans; ROC Curve; Sensitivity and Specificity
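
The k-mer attribute representation mentioned in the abstract above can be sketched as a normalized count vector over all length-k substrings. This is a minimal illustration, assuming an RNA alphabet and simple frequency normalization; the function name, the alphabet and the handling of unknown symbols are assumptions, since the abstract does not give these details.

```python
from itertools import product
from typing import List


def kmer_frequencies(sequence: str, k: int, alphabet: str = "ACGU") -> List[float]:
    """Return the frequency of every possible k-mer, in lexicographic
    order of the given alphabet. Windows containing symbols outside the
    alphabet are skipped (an assumption made for this sketch)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(sequence) - k + 1, 0)):
        window = sequence[i:i + k]
        if window in counts:
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


# A 3-mer profile of a short toy RNA sequence has 4**3 = 64 entries.
profile = kmer_frequencies("ACGUACGU", 3)
```

The resulting vector has a fixed length of `len(alphabet) ** k`, so sequences of different lengths map to comparable attribute features.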

18.

RPI-SE: a stacking ensemble learning framework for ncRNA-protein interactions prediction using sequence information.

Yi, Hai-Cheng; You, Zhu-Hong; Wang, Mei-Neng; Guo, Zhen-Hao; Wang, Yan-Bin; Zhou, Ji-Ren.

BMC Bioinformatics ; 21(1): 60, 2020 Feb 18.

Article in English | MEDLINE | ID: mdl-32070279

ABSTRACT

BACKGROUND: The interactions between non-coding RNAs (ncRNAs) and proteins play an essential role in many biological processes. Several high-throughput experimental methods have been applied to detect ncRNA-protein interactions, but these methods are time-consuming and expensive. Accurate and efficient computational methods can assist and accelerate the study of ncRNA-protein interactions. RESULTS: In this work, we develop a stacking ensemble computational framework, RPI-SE, for effectively predicting ncRNA-protein interactions. More specifically, to fully exploit protein and RNA sequence features, the Position Weight Matrix combined with Legendre Moments is applied to obtain protein evolutionary information. Meanwhile, a k-mer sparse matrix is employed to extract efficient features from ncRNA sequences. Finally, an ensemble learning framework integrating different types of base classifiers is developed to predict ncRNA-protein interactions using these discriminative features. The accuracy and robustness of RPI-SE were evaluated on three benchmark data sets under five-fold cross-validation and compared with other state-of-the-art methods. CONCLUSIONS: The results demonstrate that RPI-SE is competent for the ncRNA-protein interaction prediction task, with high accuracy and robustness. It is anticipated that this work can provide a computational prediction tool to advance biomedical research related to ncRNA-protein interactions.

Subject(s)

RNA, Untranslated/metabolism; RNA-Binding Proteins/metabolism; Sequence Analysis, Protein/methods; Sequence Analysis, RNA/methods; Position-Specific Scoring Matrices; RNA, Untranslated/chemistry; RNA-Binding Proteins/chemistry
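
Several abstracts in this listing, including the one above, report results under five-fold cross-validation. A minimal sketch of how such a split can be produced is shown below. The function name, the shuffling seed and the round-robin fold assignment are assumptions for illustration; the abstracts do not say how the folds were drawn (for example, whether they were stratified by class).

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle sample indices and split them into k folds.

    Returns a list of (train_indices, test_indices) pairs in which each
    sample appears in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Five train/test splits over 23 hypothetical samples.
splits = k_fold_indices(23, k=5)
```

A model would be trained on each `train` list and scored on the matching `test` list, with the reported figure typically being the mean over the five folds.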

19.

Using Weighted Extreme Learning Machine Combined With Scale-Invariant Feature Transform to Predict Protein-Protein Interactions From Protein Evolutionary Information.

Li, Jianqiang; Shi, Xiaofeng; You, Zhu-Hong; Yi, Hai-Cheng; Chen, Zhuangzhuang; Lin, Qiuzhen; Fang, Min.

IEEE/ACM Trans Comput Biol Bioinform ; 17(5): 1546-1554, 2020.

Article in English | MEDLINE | ID: mdl-31940546

ABSTRACT

Protein-Protein Interactions (PPIs) play an irreplaceable role in the biological activities of organisms. Although many high-throughput methods are used to identify PPIs in different kinds of organisms, they have shortcomings such as high cost and long run times. To address these problems, computational methods have been developed to predict PPIs. In this paper, we present a method to predict PPIs using protein sequences. First, protein sequences are transformed into a Position Weight Matrix (PWM), from which the Scale-Invariant Feature Transform (SIFT) algorithm extracts features. Then Principal Component Analysis (PCA) is applied to reduce the dimension of the features. Finally, a Weighted Extreme Learning Machine (WELM) classifier is employed to predict PPIs, and a series of evaluation results is obtained. Because SIFT and WELM are used for feature extraction and classification, respectively, we call the proposed method SIFT-WELM. When the proposed method is applied to three well-known PPI datasets of Yeast, Human and Helicobacter pylori, it achieves average accuracies of 94.83, 97.60 and 83.64 percent, respectively, under five-fold cross-validation. To evaluate the proposed approach properly, we compare it with a Support Vector Machine (SVM) classifier and other recently developed methods in different aspects. Moreover, the training time of our method is much shorter than that of previous methods such as SVM, ACC and PCVMZM.

Subject(s)

Computational Biology/methods; Machine Learning; Protein Interaction Mapping/methods; Protein Interaction Maps/genetics; Amino Acid Sequence/genetics; Animals; Databases, Protein; Evolution, Molecular; Helicobacter pylori; Humans; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Saccharomyces cerevisiae Proteins; Sequence Analysis, Protein
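
The "weighted" part of a Weighted Extreme Learning Machine refers to per-sample weights applied in the least-squares fit of the output layer. The abstract above does not state which weighting scheme SIFT-WELM uses, so the sketch below shows only one common choice, inverse class frequency, as an assumption. The function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def inverse_class_frequency_weights(labels: Sequence[Hashable]) -> List[float]:
    """Give each sample the weight 1 / (number of samples in its class),
    so that every class contributes the same total weight to the fit."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Three interacting pairs (1) and one non-interacting pair (0).
weights = inverse_class_frequency_weights([1, 1, 1, 0])
```

With these weights the minority class is not swamped by the majority class, which is the usual motivation for weighting an ELM.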

20.

Integrative Construction and Analysis of Molecular Association Network in Human Cells by Fusing Node Attribute and Behavior Information.

Guo, Zhen-Hao; You, Zhu-Hong; Yi, Hai-Cheng.

Mol Ther Nucleic Acids ; 19: 498-506, 2020 Mar 06.

Article in English | MEDLINE | ID: mdl-31923739

ABSTRACT

Detecting whether a pair of biomolecules associate is of great significance in the study of molecular biology, and computational methods are urgently needed to guide practice. However, most previous prediction models, influenced by reductionism, focused on isolated research objects and therefore have inherent defects. Inspired by holism, a machine-learning-based framework called MAN-node2vec is proposed to predict multi-type relationships in the molecular associations network (MAN). Specifically, we constructed a large-scale MAN composed of 1,023 miRNAs, 1,649 proteins, 769 long non-coding RNAs (lncRNAs), 1,025 drugs, and 2,062 diseases. Each biomolecule in the MAN is then represented as a vector by its attribute, learned by k-mer, etc., and its behavior, learned by node2vec. Finally, a random forest classifier is applied to carry out the relationship prediction task. The proposed model achieved an area under the curve (AUC) of 0.9677 and an area under the precision-recall curve (AUPR) of 0.9562 under 5-fold cross-validation. Additional experiments also showed that the proposed global model is more competitive than the traditional local method. Together, these results provide systematic insight into the synergistic interactions between various molecules and diseases. It is anticipated that this work can inspire and advance related systems biology and biomedical research.
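
The AUC figure reported in this abstract can be read as the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve from its pairwise definition: the
    fraction of (positive, negative) pairs ranked correctly, with
    ties counted as half."""
    positives: List[float] = [s for y, s in zip(labels, scores) if y == 1]
    negatives: List[float] = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: one of the four positive/negative pairs is misordered.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
```

Under cross-validation, this value would be computed on each held-out fold and averaged.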
