1.
Signal Transduct Target Ther ; 9(1): 183, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38972904

ABSTRACT

Helicobacter pylori (H. pylori) is currently recognized as the primary carcinogenic pathogen associated with gastric tumorigenesis, and its high prevalence and resistance make it difficult to tackle. A graph neural network-based deep learning model, employing different training sets of 13,638 molecules for pre-training and fine-tuning, was used to help predict and explore novel molecules against H. pylori. A positively predicted novel berberine derivative, 8, bearing a 3,13-disubstituted alkene, was potent against all tested drug-susceptible and drug-resistant H. pylori strains, with minimum inhibitory concentrations (MICs) of 0.25-0.5 µg/mL. Pharmacokinetic studies demonstrated ideal gastric retention of 8, with the stomach concentration significantly higher than its MIC at 24 h post dose. Oral administration of 8 with omeprazole (OPZ) produced a gastric bacterial reduction (2.2-log reduction) comparable to that of triple therapy, namely OPZ + amoxicillin (AMX) + clarithromycin (CLA), without obvious disturbance of the intestinal flora. A combination of OPZ, AMX, CLA, and 8 further decreased the bacterial load (2.8-log reduction). More importantly, monotherapy with 8 achieved eradication comparable to that of both the triple-therapy (OPZ + AMX + CLA) and quadruple-therapy (OPZ + AMX + CLA + bismuth citrate) groups. SecA and BamD, which play a major role in outer membrane protein (OMP) transport and assembly, were identified and verified as the direct targets of 8 using a chemoproteomics technique. In summary, by targeting the relatively conserved OMP transport and assembly system, 8 has the potential to be developed as a novel anti-H. pylori candidate, especially for the eradication of drug-resistant strains.


Subject(s)
Anti-Bacterial Agents , Berberine , Deep Learning , Helicobacter pylori , Helicobacter pylori/drug effects , Berberine/pharmacology , Berberine/chemistry , Berberine/pharmacokinetics , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/chemistry , Humans , Helicobacter Infections/drug therapy , Helicobacter Infections/microbiology , Microbial Sensitivity Tests , Drug Resistance, Multiple, Bacterial/drug effects , Drug Resistance, Multiple, Bacterial/genetics , Animals , Omeprazole/pharmacology , Clarithromycin/pharmacology , Amoxicillin/pharmacology
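The "2.2-log" and "2.8-log" reductions quoted in the abstract above can be read as base-10 logarithms of the ratio between bacterial counts. The sketch below shows that arithmetic, assuming the reduction is measured against an untreated control; the function names and the colony counts are hypothetical and are not taken from the study.

```python
import math


def log_reduction(control_cfu: float, treated_cfu: float) -> float:
    """Return log10(control / treated); both counts must be positive."""
    if control_cfu <= 0 or treated_cfu <= 0:
        raise ValueError("bacterial counts must be positive")
    return math.log10(control_cfu / treated_cfu)


def fold_reduction(log_red: float) -> float:
    """Convert a log10 reduction into a fold change in bacterial load."""
    return 10.0 ** log_red


# Hypothetical counts (colony-forming units), chosen for illustration only.
control = 1.0e6
treated = control / fold_reduction(2.2)

print(round(log_reduction(control, treated), 2))  # recovers the 2.2-log figure
print(round(fold_reduction(2.2)))  # roughly a 158-fold drop
print(round(fold_reduction(2.8)))  # roughly a 631-fold drop
```

Under this reading, the step from 2.2 to 2.8 log units corresponds to about a fourfold further drop in bacterial load.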
2.
Brief Bioinform ; 24(1), 2023 01 19.
Article in English | MEDLINE | ID: mdl-36528809

ABSTRACT

MOTIVATION: Exploring potential long noncoding RNA (lncRNA)-disease associations (LDAs) plays a critical role in understanding disease etiology and pathogenesis. Given the high cost of biological experiments, a computational method is a practical necessity for accelerating the experimental screening of candidate LDAs. However, because LDA datasets are highly sparse, many computational models cannot exploit enough knowledge to learn comprehensive node representations. Moreover, although metapath-based GNNs have recently been introduced into LDA prediction, they discard intermediate nodes along the meta-path, which results in information loss. RESULTS: This paper presents a new multi-view contrastive heterogeneous graph attention network (GAT) for lncRNA-disease association prediction, MCHNLDA for brevity. Specifically, MCHNLDA first leverages rich biological data sources on lncRNAs, genes and diseases to construct graphs for two views: a feature structural graph for the feature schema view and a lncRNA-gene-disease heterogeneous graph for the network topology view. Then, we design a cross-contrastive learning task to collaboratively guide the graph embeddings of the two views without relying on any labels. In this way, we can pull nodes with similar features and network topology closer together and push other nodes away. Furthermore, we propose a heterogeneous contextual GAT, in which a long short-term memory network is incorporated into the attention mechanism to effectively capture sequential structure information along the meta-path. Extensive experimental comparisons against several state-of-the-art methods show the effectiveness of the proposed framework. The code and data of the proposed framework are freely available at https://github.com/zhaoxs686/MCHNLDA.


Subject(s)
RNA, Long Noncoding , RNA, Long Noncoding/genetics , Learning
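The label-free cross-view contrastive task described in the abstract above can be illustrated with a generic InfoNCE-style loss, in which the same node in the two views forms the positive pair and all other nodes act as negatives. This is a minimal sketch of that general technique, not the MCHNLDA objective itself; the function names, the temperature value and the toy embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss; node i in view_a is positive with node i in view_b."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy embeddings for three nodes (hypothetical values).
feature_view = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
topology_view = [[0.9, 0.1], [0.1, 0.9], [-0.9, -0.1]]
shuffled_view = topology_view[1:] + topology_view[:1]

print(cross_view_info_nce(feature_view, topology_view))  # agreeing views
print(cross_view_info_nce(feature_view, shuffled_view))  # mismatched views
```

Minimizing this quantity pulls each node's two embeddings together and pushes it away from the other nodes, so the agreeing views score lower than the shuffled ones.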
3.
Brief Bioinform ; 23(1), 2022 01 17.
Article in English | MEDLINE | ID: mdl-34585231

ABSTRACT

MOTIVATION: Discovering long noncoding RNA (lncRNA)-disease associations is a fundamental and critical part of understanding disease etiology and pathogenesis. However, only a few lncRNA-disease associations have been identified because biological experiments are time-consuming and expensive. As a result, an efficient computational method for identifying potential lncRNA-disease associations is of great importance and urgently needed. Because they can exploit node features and relationships in a network, graph-based learning models have been commonly used for such biomolecular association predictions. However, the ability of these methods to comprehensively fuse node features, heterogeneous topological structures and semantic information is far from optimal or even satisfactory. Moreover, there are still limitations in modeling complex associations between lncRNAs and diseases. RESULTS: In this paper, we develop a novel heterogeneous graph attention network framework based on meta-paths for predicting lncRNA-disease associations, denoted as HGATLDA. First, we construct a heterogeneous network by incorporating lncRNA and disease feature structural graphs and a lncRNA-disease topological structural graph. Then, for the heterogeneous graph, we construct multiple metapath-based subgraphs and use a graph attention network to learn node embeddings from the neighbors in these homogeneous and heterogeneous subgraphs. Next, we apply an attention mechanism to adaptively assign weights to the metapath-based subgraphs and obtain more semantic information. In addition, we incorporate neural inductive matrix completion to reconstruct lncRNA-disease associations, which captures complicated associations between lncRNAs and diseases. Moreover, we incorporate a cost-sensitive neural network into the loss function to tackle the common imbalance problem in lncRNA-disease association prediction. Finally, extensive experimental results demonstrate the effectiveness of our proposed framework.


Subject(s)
RNA, Long Noncoding , Computational Biology/methods , Neural Networks, Computer , RNA, Long Noncoding/genetics
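The abstract above mentions a cost-sensitive loss for the imbalance between known and unknown associations. One common way to express that idea is a class-weighted binary cross-entropy, sketched below; this is an assumed stand-in rather than the HGATLDA loss, and the function name, the weight choice and the toy predictions are hypothetical.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight, eps=1e-12):
    """Mean binary cross-entropy with errors on positives scaled by pos_weight."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from zero
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


# Toy data: one known association among five candidate pairs (hypothetical).
labels = [1, 0, 0, 0, 0]
scores = [0.2, 0.1, 0.1, 0.1, 0.1]

# A common heuristic sets the weight to the negative-to-positive ratio.
ratio = labels.count(0) / labels.count(1)

print(weighted_bce(labels, scores, pos_weight=1.0))    # unweighted baseline
print(weighted_bce(labels, scores, pos_weight=ratio))  # missed positive costs more
```

With the weight above 1, a poorly scored positive pair contributes more to the loss than with the unweighted baseline, which counteracts the pull toward predicting "no association" everywhere.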
4.
Math Biosci Eng ; 18(5): 5347-5363, 2021 06 16.
Article in English | MEDLINE | ID: mdl-34517491

ABSTRACT

With the development of online medical service platforms, patients can find more medical information resources and obtain better medical treatment. However, it is difficult for patients to discover the most suitable doctors among these complex information resources. Therefore, the analysis and mining of electronic health records (EHRs) is very important for the timely and accurate treatment of patients. Discovering the most suitable doctor amounts to predicting the doctor's performance for a specific disease. We believe that "a curative/bad treatment is likely to be caused by a good/bad doctor, and a good/bad doctor has a higher/lower evaluation by the patient(s)". In this paper, we propose a novel approach named SeekDoc, which seeks the most effective doctor for a specific disease. Specifically, we build a doctor-disease heterogeneous information network and collect patient reviews and rating records for doctors. Then, we embed the comprehensive comment data for doctors and the constructed heterogeneous information network. Next, we use an autoencoder, an effective algorithm for constructing latent feature representations in an unsupervised manner, to learn from the embedded features. After this learning, the latent features are input into the extreme gradient boosting (XGBoost) algorithm to improve predictive performance. Finally, extensive experiments show that our method can effectively and efficiently predict a doctor's experience score for specific diseases and performs well compared with other algorithms.


Subject(s)
Electronic Health Records , Physicians , Algorithms , Humans , Research Design
5.
BMC Bioinformatics ; 19(1): 237, 2018 06 25.
Article in English | MEDLINE | ID: mdl-29940836

ABSTRACT

BACKGROUND: Lysine succinylation is a new kind of post-translational modification that plays a key role in protein conformation regulation and cellular function control. To understand the mechanism of succinylation in depth, it is necessary to identify succinylation sites in proteins accurately. However, traditional experimental approaches are labor-intensive and time-consuming. Computational prediction methods have been proposed in recent years, and they are popular because of their convenience and high speed. In this study, we developed a new method to predict succinylation sites in proteins by combining multiple features, including amino acid composition, binary encoding, physicochemical properties and grey pseudo amino acid composition, with a feature selection scheme (information gain). The model was then trained using an SVM (support vector machine) and an ensemble learning algorithm. RESULTS: The method achieved an accuracy of 89.14% and an MCC (Matthews correlation coefficient) of 0.79 using 10-fold cross-validation on the training dataset, and an accuracy of 84.5% and an MCC of 0.2 on an independent dataset. CONCLUSIONS: The conclusions of this study can help to improve understanding of the succinylation mechanism. These results suggest that our method is very promising for predicting succinylation sites. The source code and data of this paper are freely available at https://github.com/ningq669/PSuccE.


Subject(s)
Computational Biology/methods , Support Vector Machine/standards , Algorithms
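The feature selection scheme named in the abstract above, information gain, scores a feature by how much knowing its value reduces the entropy of the class label. The sketch below computes it for discrete features; treating features as discrete is an assumption made here for brevity, and the function names and toy data are hypothetical rather than taken from the PSuccE code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the expected entropy of labels given the feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: 1 = succinylation site, 0 = non-site (hypothetical).
site_labels = [1, 1, 0, 0]
informative_feature = [1, 1, 0, 0]  # splits the classes perfectly
useless_feature = [1, 0, 1, 0]      # independent of the class

print(information_gain(informative_feature, site_labels))
print(information_gain(useless_feature, site_labels))
```

Ranking features by this score and keeping the top-ranked ones is the usual way such a scheme is applied before training a classifier.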
6.
Molecules ; 22(11), 2017 Nov 03.
Article in English | MEDLINE | ID: mdl-29099805

ABSTRACT

Glycation is a non-enzymatic process, occurring inside or outside the host body, in which a sugar molecule attaches to a protein or lipid molecule. It is an important form of post-translational modification (PTM) that impairs the function and changes the characteristics of proteins, so identifying glycation sites may provide useful guidance for understanding various biological functions of proteins. In this study, we proposed an accurate prediction tool, named Glypre, for lysine glycation. Firstly, we used multiple informative features to encode the peptides. These features included the position scoring function, secondary structure, AAindex, and the composition of k-spaced amino acid pairs. Secondly, the distribution of distinctive features of the residues surrounding the glycation and non-glycation sites was statistically analysed. Thirdly, based on the distribution of these features, we developed a new predictor using different optimal window sizes for different properties and a two-step feature selection method, which applied the maximum relevance minimum redundancy method followed by a greedy feature selection procedure. In 10-fold cross-validation, Glypre achieved a sensitivity of 57.47%, a specificity of 90.78%, an accuracy of 79.68%, an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.86, and a Matthews correlation coefficient (MCC) of 0.52. The detailed analysis showed that our predictor may play a complementary role to other existing methods for identifying protein lysine glycation. The source code and datasets of Glypre are available in the Supplementary File.


Subject(s)
Amino Acids/chemistry , Computer Simulation , Proteins/chemistry , Support Vector Machine , Algorithms , Area Under Curve , Binding Sites , Glycosylation , Lysine/chemistry , ROC Curve , Sensitivity and Specificity
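The sensitivity, specificity, accuracy and MCC figures reported in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the Glypre results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true sites recovered
    specificity = tn / (tn + fp)  # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column is empty; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 positive and 100 negative sites.
metrics = binary_metrics(tp=50, fp=10, tn=90, fn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells, which is why it is often reported alongside accuracy when positive and negative sites are imbalanced.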
7.
Molecules ; 22(9), 2017 Sep 05.
Article in English | MEDLINE | ID: mdl-28872627

ABSTRACT

Protein pupylation is a type of post-translational modification that plays a crucial role in the cellular function of bacteria. To gain better insight into the mechanisms underlying pupylation, an initial but important step is to identify pupylation sites. To date, several computational methods have been established for the prediction of pupylation sites; these usually construct negative samples artificially from verified pupylation proteins to train the classifiers. However, if this process is not done properly, it can dramatically affect the performance of the final predictor. In this work, unlike previous computational methods, we proposed an enhanced positive-unlabeled learning algorithm (EPuL) for the pupylation site prediction problem, which uses only positive and unlabeled samples. Firstly, we separate the training dataset into a positive dataset and an unlabeled dataset that contains the remaining non-annotated lysine residues. Then, the EPuL algorithm is used to select an initial set of reliable negatives and then iteratively pick out the non-pupylation sites. In 10-fold cross-validation, the proposed method achieved an accuracy of 90.24%, an area under the curve (AUC) of 0.93 and an MCC of 0.81. A user-friendly web server for predicting pupylation sites was developed and made freely available at http://59.73.198.144:8080/EPuL.


Subject(s)
Algorithms , Computational Biology/methods , Machine Learning , Protein Processing, Post-Translational , Proteins/chemistry , Databases, Protein , Protein Binding , Software
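The first step described in the abstract above, selecting reliable negatives from unlabeled samples, can be illustrated with a simple distance heuristic: unlabeled points farthest from the centroid of the positives are treated as likely negatives. This is a generic positive-unlabeled learning sketch under that assumption, not the EPuL algorithm; the function names, the selection fraction and the toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def reliable_negatives(positives, unlabeled, fraction=0.3):
    """Return sorted indices of the unlabeled points farthest from the positives."""
    center = centroid(positives)
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: math.dist(unlabeled[i], center),
        reverse=True,
    )
    keep = max(1, int(len(unlabeled) * fraction))
    return sorted(ranked[:keep])


# Toy feature vectors (hypothetical): two annotated sites, four unlabeled lysines.
annotated_sites = [[1.0, 1.0], [1.2, 0.9]]
unlabeled_lysines = [[1.1, 1.0], [5.0, 5.0], [0.9, 1.1], [-4.0, -3.0]]

print(reliable_negatives(annotated_sites, unlabeled_lysines, fraction=0.5))
```

In a full two-step scheme, a classifier trained on the positives and these initial negatives would then be applied to the remaining unlabeled points, and the negative set would be enlarged over several iterations.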