Search | BVS (Virtual Health Library) Regional Portal (test)

A Deep Neural Network-Based Co-Coding Method to Predict Drug-Protein Interactions by Analyzing the Feature Consistency Between Drugs and Proteins.

Sun, Chang; Tang, Rong; Huang, Jipeng; Wei, Jin-Mao; Liu, Jian.

IEEE/ACM Trans Comput Biol Bioinform ; 20(3): 2200-2209, 2023.

Article in English | MEDLINE | ID: mdl-37021862

ABSTRACT

Exploring drug-protein interactions (DPIs) through computational methods can effectively reduce the workload and cost of DPI identification. Previous works predict DPIs by integrating and analyzing the unique features of drugs and proteins, but they cannot adequately analyze the consistency between drug features and protein features because the two have different semantics. However, this consistency, such as the correlation arising from the diseases that drugs and proteins share, may reveal potential DPIs. Here we propose a deep neural network-based co-coding method (DNNCC for short) to predict novel DPIs. DNNCC projects the original features of drugs and proteins into a common embedding space through a co-coding strategy, so that the embedding features of drugs and proteins have the same semantics. The prediction module can therefore discover unknown DPIs by exploring the feature consistency between drugs and proteins. The experimental results indicate that DNNCC significantly outperforms five state-of-the-art DPI prediction methods under several evaluation metrics. Ablation experiments demonstrate the benefit of integrating and analyzing the common features of drugs and proteins. The novel DPIs predicted by DNNCC indicate that DNNCC is a powerful tool that can effectively discover potential DPIs.

Subjects

Neural Networks, Computer; Proteins; Proteins/genetics

Drug-Protein interaction prediction by correcting the effect of incomplete information in heterogeneous information.

Li, Yanfei; Sun, Chang; Wei, Jin-Mao; Liu, Jian.

Bioinformatics ; 38(22): 5073-5080, 2022 11 15.

Article in English | MEDLINE | ID: mdl-36111859

ABSTRACT

MOTIVATION: Large-scale heterogeneous data provide diverse perspectives for predicting drug-protein interactions (DPIs). However, the available information on molecular interactions and clinical associations related to drugs or proteins is incomplete because there may be unproven interactions and associations. This incomplete information appears in the available data as non-interaction and non-correlation, which may mislead the prediction model. Existing methods fuse incomplete and complete information without considering their integrity, so the negative effects of incomplete information persist. RESULTS: We develop a network-based DPI prediction method named BRWCP, which uses the complete information network to correct the prediction results acquired from the incomplete information network. By integrating relevant heterogeneous information that may be incomplete, the feature similarities of drugs and proteins are obtained. Combining the feature similarities and known DPIs, an incomplete information-based drug-protein heterogeneous network is constructed. Then, a bidirectional random walk with pruning algorithm is applied to this heterogeneous network to predict potential DPIs. Next, the predicted DPIs are combined with the chemical fingerprint similarity of drugs and the amino acid sequence similarity of proteins to construct the complete information network. The bidirectional random walk with pruning algorithm is then run on the new network until it converges, yielding the final prediction results. Experimental results show that BRWCP is superior to several state-of-the-art DPI prediction methods, and case studies further confirm its ability to uncover potential DPIs. AVAILABILITY AND IMPLEMENTATION: The code and data used in BRWCP are available at https://github.com/lyfdomain/BRWCP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Algorithms; Computational Biology; Computational Biology/methods; Proteins; Drug Interactions

CPMCP: a database of Chinese patent medicine and compound prescription.

Sun, Chang; Huang, Jipeng; Tang, Rong; Li, Minglei; Yuan, Haili; Wang, Yuxiang; Wei, Jin-Mao; Liu, Jian.

Database (Oxford) ; 2022, 2022 08 25.

Article in English | MEDLINE | ID: mdl-36006844

ABSTRACT

Although several traditional Chinese medicine (TCM)-related databases have emerged, they focus on single medicinal materials, which is far from sufficient for clinical research and application. In comparison, compound prescriptions are more informative and meaningful in TCM, for they embody information on the compatibility of TCM in addition to the relatively isolated information about single medicinal materials. The compatibility information is essential in TCM because it conveys not only which components are involved in treating specific diseases but also how to combine these single medicinal materials. We established a database of Chinese patent medicine and compound prescription (CPMCP). It presents the prescription information of Chinese patent medicines (CPMs) and ancient Chinese medicine prescriptions (CMPs). CPMCP reports comprehensive and standardized information on them, such as components, indications and contraindications. Notably, we organized relevant experts and invested considerable time in manually mapping the functions of compound prescriptions written in ancient Chinese to standardized TCM symptom vocabularies, obtaining a total of 71 414 associations between compound prescriptions and TCM symptoms. In this way, CPMCP established associations between TCM and modern medicine (MM) based on the associations between TCM symptoms and MM symptoms. In addition, to further exhibit the compatibility mechanism of compound prescriptions, CPMCP summarizes a set of common drug combination principles by analyzing the existing prescriptions. We believe that CPMCP can promote the modernization of TCM and make greater contributions to MM. Database URL http://cpmcp.top.

Subjects

Drugs, Chinese Herbal; China; Drugs, Chinese Herbal/therapeutic use; Medicine, Chinese Traditional; Nonprescription Drugs/therapeutic use; Prescriptions

Multi-variable AUC for sifting complementary features and its biomedical application.

Su, Yue; Du, Keyu; Wang, Jun; Wei, Jin-Mao; Liu, Jian.

Brief Bioinform ; 23(2), 2022 03 10.

Article in English | MEDLINE | ID: mdl-35212712

ABSTRACT

Although sifting functional genes has been discussed for years, traditional selection methods tend to be ineffective in capturing potential specific genes. First, typical methods focus on finding features (genes) that are relevant to the class but irrelevant to each other. However, the features that offer rich discriminative information are more likely to be the complementary ones. Second, almost all existing methods assess feature relations in pairs, yielding an inaccurate local estimation and lacking a global exploration. In this paper, we introduce multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing Area Above the receiver operating characteristic Curve (AAC). With AAC, the class-relevant information newly provided by a candidate feature and that preserved by the selected features can be obtained beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. In addition, we provide a gene set related to prostate cancer and discuss its potential biological significance from the machine learning perspective and based on existing biomedical findings on some individual genes.
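
For readers unfamiliar with the two quantities named in the abstract above, the following is a minimal sketch of the standard single-feature AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one) and of AAC taken as its complement, 1 - AUC. It is not the multi-variable definition proposed in the paper, which the abstract does not spell out; the function names and the toy data are assumptions made for illustration.

```python
def auc(scores, labels):
    """Single-feature AUC: fraction of (positive, negative) pairs in which
    the positive sample has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def aac(scores, labels):
    """Area above the ROC curve inside the unit square (assumed here to be 1 - AUC)."""
    return 1.0 - auc(scores, labels)


# Toy data (hypothetical): one feature value per sample, binary class labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

print(auc(scores, labels))  # 8 of the 9 positive/negative pairs are ordered correctly
print(aac(scores, labels))
```

The pairwise form is quadratic in the number of samples; it is used here only because it makes the definition explicit.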

Subjects

Algorithms; Machine Learning; Area Under Curve; ROC Curve

Autoencoder-based drug-target interaction prediction by preserving the consistency of chemical properties and functions of drugs.

Sun, Chang; Cao, Yangkun; Wei, Jin-Mao; Liu, Jian.

Bioinformatics ; 37(20): 3618-3625, 2021 Oct 25.

Article in English | MEDLINE | ID: mdl-34019069

ABSTRACT

MOTIVATION: Exploring potential drug-target interactions (DTIs) is a key step in drug discovery and repurposing. In recent years, predicting probable DTIs through computational methods has gradually become a research hot spot. However, most previous studies failed to adequately take into account the consistency between the chemical properties of a drug and its functions. Changes in these relationships may have a severely negative effect on the prediction of DTIs. RESULTS: We propose an autoencoder-based method, AEFS, under spatial consistency constraints to predict DTIs. A heterogeneous network is established to integrate the information of drugs, proteins and diseases. The original drug features are projected into an embedding (protein) space by a multi-layer encoder, and further projected into the label (disease) space by a decoder. In this process, the clinical information of drugs is introduced to assist the DTI prediction. By maintaining the distribution of drug correlation in the original feature, embedding and label spaces, AEFS keeps the consistency between the chemical properties and functions of drugs. Experimental comparisons indicate that AEFS is more robust to imbalanced data and achieves significantly better performance in DTI prediction. Case studies further confirm its ability to mine latent DTIs. AVAILABILITY AND IMPLEMENTATION: The code of AEFS is available at https://github.com/JackieSun818/AEFS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Minimum Bayesian error probability-based gene subset selection.

Li, Jian; Yu, Tian; Wei, Jin-Mao.

Int J Data Min Bioinform ; 12(4): 434-50, 2015.

Article in English | MEDLINE | ID: mdl-26510296

ABSTRACT

Sifting functional genes is crucial to new strategies for drug discovery and prospective patient-tailored therapy. Generally, simply generating a gene subset by selecting the top k individually superior genes may yield an inferior gene combination, for some selected genes may be redundant with respect to others. In this paper, we propose to select a gene subset based on the criterion of minimum Bayesian error probability. The method dynamically evaluates all available genes and sifts only one gene at a time. A gene is selected if its combination with the other selected genes provides better classification information. Within the generated gene subset, each individual gene is the most discriminative one in comparison with those that classify cancers in the same way as this gene does, and different genes are more discriminative in combination than individually. The genes selected in this way are likely to be functional ones from the systems biology perspective, for genes tend to co-regulate rather than regulate individually. Experimental results show that the classifiers induced based on this method are capable of classifying cancers with high accuracy, while only a small number of genes are involved.
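
The one-gene-at-a-time selection described in the abstract above can be sketched as a greedy forward search. This is only an illustration under stated assumptions, not the authors' implementation: expression values are assumed to be already discretized, the Bayesian error of a gene subset is estimated empirically as the share of samples that fall outside the majority class of their cell (the combination of values on the chosen genes), and the names `empirical_bayes_error` and `greedy_select` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def empirical_bayes_error(rows, labels, subset):
    """Estimated Bayes error of a gene subset on discrete data: within each
    cell (combination of values on the subset) predict the majority class."""
    cells = defaultdict(Counter)
    for row, y in zip(rows, labels):
        cells[tuple(row[j] for j in subset)][y] += 1
    correct = sum(max(counts.values()) for counts in cells.values())
    return 1.0 - correct / len(labels)


def greedy_select(rows, labels, max_genes):
    """Add one gene at a time, keeping it only if the combined subset
    lowers the estimated error; stop when no candidate improves it."""
    selected = []
    best_err = empirical_bayes_error(rows, labels, selected)
    n_genes = len(rows[0])
    while len(selected) < max_genes:
        candidates = [j for j in range(n_genes) if j not in selected]
        if not candidates:
            break
        err, gene = min(
            (empirical_bayes_error(rows, labels, selected + [j]), j)
            for j in candidates
        )
        if err >= best_err:
            break
        selected.append(gene)
        best_err = err
    return selected, best_err


# Toy data (hypothetical): gene 1 duplicates gene 0, gene 2 complements gene 0.
rows = [
    (1, 1, 0), (1, 1, 0), (1, 1, 0), (0, 0, 1),
    (0, 0, 0), (0, 0, 0), (0, 0, 0), (1, 1, 1),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

selected, err = greedy_select(rows, labels, max_genes=3)
print(selected, err)
```

In this toy data, genes 0 and 1 are the two best genes individually, yet the search skips the redundant gene 1 and picks gene 2, which is uninformative on its own but separates the classes when combined with gene 0.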

Subjects

Databases, Nucleic Acid; Genes; Sequence Analysis, DNA/methods; Bayes Theorem
