Results 1 - 20 of 43
1.
BMC Genomics ; 24(1): 613, 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37828501

ABSTRACT

BACKGROUND: The domestic dog, Canis lupus familiaris, is a companion animal for humans as well as an animal model in cancer research due to similar spontaneous occurrence of cancers as humans. Despite the social and biological importance of dogs, the catalogue of genomic variations and transcripts for dogs is relatively incomplete. RESULTS: We developed CanISO, a new database to hold a large collection of transcriptome profiles and genomic variations for domestic dogs. CanISO provides 87,692 novel transcript isoforms and 60,992 known isoforms from whole transcriptome sequencing of canine tumors (N = 157) and their matched normal tissues (N = 64). CanISO also provides genomic variation information for 210,444 unique germline single nucleotide polymorphisms (SNPs) from the whole exome sequencing of 183 dogs, with a query system that searches gene- and transcript-level information as well as covered SNPs. Transcriptome profiles can be compared with corresponding human transcript isoforms at a tissue level, or between sample groups to identify tumor-specific gene expression and alternative splicing patterns. CONCLUSIONS: CanISO is expected to increase understanding of the dog genome and transcriptome, as well as its functional associations with humans, such as shared/distinct mechanisms of cancer. CanISO is publicly available at https://www.kobic.re.kr/caniso/ .


Subject(s)
Neoplasms , Wolves , Dogs , Animals , Humans , Transcriptome , Wolves/genetics , Genome , Genomics , Neoplasms/genetics , Neoplasms/veterinary , Protein Isoforms/genetics
2.
J Cheminform ; 15(1): 77, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37674239

ABSTRACT

In recent years, the field of computational drug design has made significant strides in the development of artificial intelligence (AI) models for the generation of de novo chemical compounds with desired properties and biological activities, such as enhanced binding affinity to target proteins. These high-affinity compounds have the potential to be developed into more potent therapeutics for a broad spectrum of diseases. Due to the lack of data required for the training of deep generative models, however, some of these approaches have fine-tuned their molecular generators using data obtained from a separate predictor. While these studies show that generative models can produce structures with the desired target properties, it remains unclear whether the diversity of the generated structures and the span of their chemical space align with the distribution of the intended target molecules. In this study, we present a novel generative framework, LOGICS, a framework for Learning Optimal Generative distribution Iteratively for designing target-focused Chemical Structures. We address the exploration-exploitation dilemma, which weighs the choice between exploring new options and exploiting current knowledge. To tackle this issue, we incorporate experience memory and employ a layered tournament selection approach to refine the fine-tuning process. The proposed method was applied to the binding affinity optimization of two target proteins of different protein classes, κ-opioid receptors, and PIK3CA, and the quality and the distribution of the generative molecules were evaluated. The results showed that LOGICS outperforms competing state-of-the-art models and generates more diverse de novo chemical structures with optimized properties. The source code is available at the GitHub repository ( https://github.com/GIST-CSBL/LOGICS ).
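The exploration-exploitation trade-off mentioned in the abstract above can be illustrated with plain tournament selection. The sketch below is a generic version written for illustration only: it is not the layered tournament selection or the experience memory used by LOGICS, and the function name, parameters, and toy scores are all assumptions. A small tournament size `k` lets weaker candidates win sometimes (more exploration), while a large `k` almost always returns the top scorers (more exploitation).

```python
import random

def tournament_select(population, scores, k=3, n_winners=2, seed=0):
    """Pick n_winners items; each is the best-scoring of k random contenders.

    Hypothetical helper for illustration; not taken from the LOGICS code base.
    """
    rng = random.Random(seed)
    winners = []
    for _ in range(n_winners):
        # Draw k distinct candidate indices for this tournament.
        contenders = rng.sample(range(len(population)), k)
        # The contender with the highest score wins the tournament.
        best = max(contenders, key=lambda i: scores[i])
        winners.append(population[best])
    return winners

# Toy usage: three placeholder molecules with made-up affinity scores.
molecules = ["mol_a", "mol_b", "mol_c"]
affinity = [0.1, 0.9, 0.5]
print(tournament_select(molecules, affinity, k=2, n_winners=4))
```

When `k` equals the population size, every tournament contains all candidates, so the single best-scoring item is returned each time.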

3.
Protein Sci ; 32(1): e4529, 2023 01.
Article in English | MEDLINE | ID: mdl-36461699

ABSTRACT

Antimicrobial resistance is a growing health concern. Antimicrobial peptides (AMPs) disrupt harmful microorganisms by nonspecific mechanisms, making it difficult for microbes to develop resistance. Accordingly, they are promising alternatives to traditional antimicrobial drugs. In this study, we developed an improved AMP classification model, called AMP-BERT. We propose a deep learning model with a fine-tuned bidirectional encoder representations from transformers (BERT) architecture designed to extract structural/functional information from input peptides and identify each input as AMP or non-AMP. We compared the performance of our proposed model and other machine/deep learning-based methods. Our model, AMP-BERT, yielded the best prediction results among all models evaluated with our curated external dataset. In addition, we utilized the attention mechanism in BERT to implement an interpretable feature analysis and determine the specific residues in known AMPs that contribute to peptide structure and antimicrobial function. The results show that AMP-BERT can capture the structural properties of peptides for model learning, enabling the prediction of AMPs or non-AMPs from input sequences. AMP-BERT is expected to contribute to the identification of candidate AMPs for functional validation and drug development. The code and dataset for the fine-tuning of AMP-BERT are publicly available at https://github.com/GIST-CSBL/AMP-BERT.


Subject(s)
Antimicrobial Peptides , Machine Learning
4.
Eur J Med Chem ; 240: 114556, 2022 Oct 05.
Article in English | MEDLINE | ID: mdl-35849939

ABSTRACT

Artificial intelligence (AI) has been recognized as a powerful technique that can accelerate drug discovery during the hit compound identification step. However, most simple deep learning models have been used for naive pre-filtering as the prediction result cannot be interpreted. Recently, our group developed a new deep learning model (Highlight on Target Sequence; HoTS) that can predict binding regions in a target protein sequence based on patterns learned from interactions between a target protein sequence and a ligand. In this study, we searched for new binding regions of the P2X3 receptor (P2X3R) using HoTS, and suggested a novel putative binding site of P2X3R by a cavity search on the predicted binding regions. The novel putative binding site was employed to generate pharmacophore features, and combinations of pharmacophore features were validated as queries. Two separate virtual screenings using the optimized pharmacophore query Q12 with docking-based scoring and HoTS-based prediction of ligand interactions enabled the initial selection of the compound library for in vitro screening. The screening of each set of 500 compounds from the two approaches (HoTS interaction prediction and Pharmacophore-LibDock cascade) resulted in the identification of 10 (HoTS-1 - 10) and 6 compounds (PD-1 - 6) with low micromolar IC50 values. Remarkably, the hit rate was 10-fold higher than that from the previous random screening of an 8364-compound library, and the chemical structures of all identified hit compounds were distinct from those of known P2X3R antagonists, indicating that novel chemical entities could be developed for P2X3R antagonists by targeting the binding site. Overall, this study suggests the discovery of a novel putative binding site for P2X3R using the AI deep learning protocol along with in silico MD simulation and experimental screening of targeted library compounds to successfully identify 16 unique and novel hit compounds. These results may accelerate the discovery of novel chemical-class drugs for P2X3R antagonists.


Subject(s)
Artificial Intelligence , Purinergic P2X Receptor Antagonists , Binding Sites , Drug Discovery , Ligands , Molecular Docking Simulation , Molecular Dynamics Simulation , Protein Binding
5.
Brief Bioinform ; 23(4)2022 07 18.
Article in English | MEDLINE | ID: mdl-35709752

ABSTRACT

Unintended inhibition of the human ether-à-go-go-related gene (hERG) ion channel by small molecules leads to severe cardiotoxicity. Thus, hERG channel blockage is a significant concern in the development of new drugs. Several computational models have been developed to predict hERG channel blockage, including deep learning models; however, they lack robustness, reliability and interpretability. Here, we developed a graph-based Bayesian deep learning model for hERG channel blocker prediction, named BayeshERG, which has robust predictive power, high reliability and high resolution of interpretability. First, we applied transfer learning with a large dataset of 300 000 data points in initial pre-training to increase the predictive performance. Second, we implemented a Bayesian neural network with Monte Carlo dropout to calibrate the uncertainty of the prediction. Third, we utilized global multihead attentive pooling to augment the high resolution of structural interpretability for the hERG channel blockers and nonblockers. We conducted both internal and external validations for stringent evaluation; in particular, we benchmarked most of the publicly available hERG channel blocker prediction models. We showed that our proposed model outperformed them in both predictive performance and uncertainty calibration. Furthermore, we found that our model learned to focus on the essential substructures of hERG channel blockers via an attention mechanism. Finally, we validated the prediction results of our model by conducting in vitro experiments and confirmed its high validity. In summary, BayeshERG could serve as a versatile tool for discovering hERG channel blockers and helping maximize the possibility of successful drug discovery. The data and source code are available at our GitHub repository (https://github.com/GIST-CSBL/BayeshERG).
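Monte Carlo dropout, as named in the abstract above, keeps dropout switched on at prediction time and treats the spread of repeated stochastic forward passes as an uncertainty estimate. The following is a minimal sketch of that general idea on a toy one-hidden-layer network in pure Python. The weights, layer sizes, dropout rate, and function names are all made up for illustration, and nothing here reflects the actual graph-based BayeshERG architecture.

```python
import math
import random
import statistics

def forward(x, w_hidden, w_out, drop_p, rng):
    """One stochastic forward pass with dropout kept active at inference."""
    hidden = []
    for weights in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(weights, x)))  # ReLU unit
        # Inverted dropout: zero the unit with probability drop_p, else rescale.
        keep = rng.random() >= drop_p
        hidden.append(act / (1.0 - drop_p) if keep else 0.0)
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid gives a probability

def mc_dropout_predict(x, w_hidden, w_out, drop_p=0.2, n_samples=200, seed=0):
    """Return (mean probability, standard deviation) over n_samples passes."""
    rng = random.Random(seed)
    probs = [forward(x, w_hidden, w_out, drop_p, rng) for _ in range(n_samples)]
    return statistics.mean(probs), statistics.pstdev(probs)

# Hypothetical toy weights: 2 input features, 3 hidden units, 1 output.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
W_OUT = [1.0, -0.5, 0.7]

mean_prob, spread = mc_dropout_predict([1.0, 2.0], W_HIDDEN, W_OUT)
print(f"predicted probability {mean_prob:.3f}, uncertainty {spread:.3f}")
```

With `drop_p=0.0` every pass is identical and the reported spread collapses to zero, which is a quick sanity check that the uncertainty comes only from the dropout masks.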


Subject(s)
Deep Learning , Ether-A-Go-Go Potassium Channels , Bayes Theorem , Ether-A-Go-Go Potassium Channels/chemistry , Ether-A-Go-Go Potassium Channels/genetics , Humans , Potassium Channel Blockers/chemistry , Potassium Channel Blockers/pharmacology , Reproducibility of Results
6.
Sci Data ; 9(1): 132, 2022 03 31.
Article in English | MEDLINE | ID: mdl-35361774

ABSTRACT

The identification of efficient and sensitive biomarkers for non-invasive tests is one of the major challenges in cancer diagnosis. To address this challenge, metabolomics is widely applied for identifying biomarkers that detect abnormal changes in cancer patients. Canine mammary tumors exhibit physiological characteristics identical to those in human breast cancer and serve as a useful animal model to conduct breast cancer research. Here, we aimed to provide a reliable large-scale metabolite dataset collected from dogs with mammary tumors, using proton nuclear magnetic resonance spectroscopy. We identified 55 metabolites in urine samples from 20 benign, 87 malignant, and 49 healthy control subjects. This dataset provides details of mammary tumor-specific metabolites in dogs and insights into cancer-specific metabolic alterations that share similar molecular characteristics.


Subject(s)
Dogs , Mammary Neoplasms, Animal , Animals , Female , Mammary Neoplasms, Animal/urine , Metabolomics , Proton Magnetic Resonance Spectroscopy
7.
J Cheminform ; 14(1): 9, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-35246258

ABSTRACT

Adverse drug-drug interaction (DDI) is a major concern in polypharmacy due to its unexpected adverse side effects and must be identified at an early stage of drug discovery and development. Many computational methods have been proposed for this purpose, but most require specific types of information, or they pay little attention to interpretation in terms of the underlying genes. We propose a deep learning-based framework for DDI prediction with drug-induced gene expression signatures so that the model can provide expression-level interpretability for DDIs. The model engineers dynamic drug features using a gating mechanism that mimics the co-administration effects by imposing attention to genes. Also, each side-effect is projected into a latent space through translating embedding. As a result, the model achieved an AUC of 0.889 and an AUPR of 0.915 in unseen interaction prediction, which is highly accurate and outperforms other state-of-the-art methods. Furthermore, it can predict potential DDIs with new compounds not used in training. In conclusion, using drug-induced gene expression signatures followed by gating and translating embedding can increase DDI prediction accuracy while providing model interpretability. The source code is available on GitHub ( https://github.com/GIST-CSBL/DeSIDE-DDI ).
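A translating embedding, in its generic TransE-style form, scores a triple (head, relation, tail) by how close head + relation lands to tail in the latent space. The sketch below shows only that generic scoring idea. The abstract does not spell out how DeSIDE-DDI maps drugs and side-effects into this scheme, so the vectors, their dimension, and the function name are assumptions chosen for illustration.

```python
import math

def translation_distance(head, relation, tail):
    """L2 distance ||head + relation - tail||; smaller means more plausible.

    Generic TransE-style score; hypothetical helper, not the DeSIDE-DDI code.
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

# Toy 2-dimensional embeddings (made-up numbers).
drug_a = [1.0, 0.0]
side_effect = [0.0, 1.0]     # plays the role of the relation vector
drug_b_close = [1.0, 1.0]    # exactly drug_a + side_effect
drug_b_far = [4.0, 5.0]

print(translation_distance(drug_a, side_effect, drug_b_close))
print(translation_distance(drug_a, side_effect, drug_b_far))
```

In such a scheme, a pair whose translated head falls near the tail receives a low distance and would be ranked as a more likely interaction for that side-effect.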

8.
J Cheminform ; 14(1): 5, 2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35135622

ABSTRACT

Identifying drug-target interactions (DTIs) is important for drug discovery. However, searching all drug-target spaces poses a major bottleneck. Therefore, recently many deep learning models have been proposed to address this problem. However, the developers of these deep learning models have neglected interpretability in model construction, which is closely related to a model's performance. We hypothesized that training a model to predict important regions on a protein sequence would increase DTI prediction performance and provide a more interpretable model. Consequently, we constructed a deep learning model, named Highlights on Target Sequences (HoTS), which predicts binding regions (BRs) between a protein sequence and a drug ligand, as well as DTIs between them. To train the model, we collected complexes of protein-ligand interactions and protein sequences of binding sites and pretrained the model to predict BRs for a given protein sequence-ligand pair via object detection employing transformers. After pretraining the BR prediction, we trained the model to predict DTIs from a compound token designed to assign attention to BRs. We confirmed that training the BRs prediction model indeed improved the DTI prediction performance. The proposed HoTS model showed good performance in BR prediction on independent test datasets even though it does not use 3D structure information in its prediction. Furthermore, the HoTS model achieved the best performance in DTI prediction on test datasets. Additional analysis confirmed the appropriate attention for BRs and the importance of transformers in BR and DTI prediction. The source code is available on GitHub ( https://github.com/GIST-CSBL/HoTS ).

9.
Biomedicines ; 11(1)2022 Dec 27.
Article in English | MEDLINE | ID: mdl-36672575

ABSTRACT

Drug-target binding affinity (DTA) prediction is an essential step in drug discovery. Drug-target protein binding occurs at specific regions between the protein and drug, rather than the entire protein and drug. However, existing deep-learning DTA prediction methods do not consider the interactions between drug substructures and protein sub-sequences. This work proposes GraphATT-DTA, a DTA prediction model that constructs the essential regions for determining interaction affinity between compounds and proteins, modeled with an attention mechanism for interpretability. We make the model consider the local-to-global interactions with the attention mechanism between compound and protein. As a result, GraphATT-DTA shows an improved prediction of DTA performance and interpretability compared with state-of-the-art models. The model is trained and evaluated with the Davis dataset, the human kinase dataset; an external evaluation is achieved with the independently proposed human kinase dataset from the BindingDB dataset.

10.
J Chem Inf Model ; 61(8): 3858-3867, 2021 08 23.
Article in English | MEDLINE | ID: mdl-34342985

ABSTRACT

Understanding differences in drug responses between patients is crucial for delivering effective cancer treatment. We describe an interpretable AI model for use in predicting drug responses in cancer cells at the gene, molecular pathway, and drug level, which we have called the hierarchical network for drug response prediction with attention. We found that the model shows better accuracy in predicting drugs having efficacy against a given cell line than other state-of-the-art methods, with a root mean squared error of 1.0064, a Pearson's correlation coefficient of 0.9307, and an R² value of 0.8647. We also confirmed that the model gives high attention to drug-target genes and cancer-related pathways when predicting a response. The validity of predicted results was proven by in vitro cytotoxicity assay. Overall, we propose that our hierarchical and interpretable AI-based model is capable of interpreting intrinsic characteristics of cancer cells and drugs for accurate prediction of cancer-drug responses.
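The three figures quoted in the abstract above (root mean squared error, Pearson's correlation coefficient, and R²) can be computed from paired observed and predicted values as sketched below. This uses the standard textbook definitions, with R² taken as 1 - SS_res/SS_tot. The abstract does not state which exact definitions the authors used, and the function name and toy data are assumptions.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (rmse, pearson_r, r2) for two equal-length sequences."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Residual, total, and prediction sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    rmse = math.sqrt(ss_res / n)
    pearson_r = cov / math.sqrt(ss_tot * ss_pred)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, pearson_r, r2

# Toy data: predictions are perfectly correlated but shifted by +1.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(regression_metrics(observed, predicted))
```

The toy case shows why the three numbers are reported separately: a constant offset leaves the Pearson correlation at 1 while the RMSE is 1 and R² drops to about 0.2, so R² under this definition need not equal the square of the correlation coefficient.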


Subject(s)
Antineoplastic Agents , Neoplasms , Pharmaceutical Preparations , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Humans , Neoplasms/drug therapy
11.
Nat Commun ; 11(1): 3616, 2020 07 17.
Article in English | MEDLINE | ID: mdl-32680987

ABSTRACT

Genomic and precision medicine research has afforded notable advances in human cancer treatment, yet applicability to other species remains uncertain. Through whole-exome and transcriptome analyses of 191 spontaneous canine mammary tumors (CMTs) that exhibit the archetypal features of human breast cancers, we found a striking resemblance of genomic characteristics including frequent PIK3CA mutations (43.1%), aberrations of the PI3K-Akt pathway (61.7%), and key genes involved in cancer initiation and progression. We also identified three gene expression-based CMT subtypes, one of which segregated with basal-like human breast cancer subtypes with activated epithelial-to-mesenchymal transition, low claudin expression, and unfavorable disease prognosis. A relative lack of ERBB2 amplification and Her2-enrichment subtype in CMT denoted species-specific molecular mechanisms. Taken together, our results elucidate cross-species oncogenic signatures for a better understanding of universal and context-dependent mechanisms in breast cancer development and provide a basis for precision diagnostics and therapeutics for domestic dogs.


Subject(s)
Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Carcinogenesis/genetics , Gene Expression Regulation, Neoplastic , Mammary Neoplasms, Animal/genetics , Animals , Breast/pathology , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Class I Phosphatidylinositol 3-Kinases/genetics , Cohort Studies , DNA Copy Number Variations , DNA Mutational Analysis , Datasets as Topic , Dogs , Epithelial-Mesenchymal Transition , Female , Humans , Mammary Glands, Animal/pathology , Mammary Glands, Animal/surgery , Mammary Neoplasms, Animal/mortality , Mammary Neoplasms, Animal/pathology , Mammary Neoplasms, Animal/surgery , Mutation , Prognosis , RNA-Seq , Species Specificity , Exome Sequencing
12.
Comput Biol Chem ; 87: 107286, 2020 May 19.
Article in English | MEDLINE | ID: mdl-32531518

ABSTRACT

A voltage-gated potassium channel encoded by the human ether-à-go-go-related gene (hERG) regulates cardiac action potential, and it is involved in cardiotoxicity with compounds that inhibit its activity. Therefore, the screening of hERG channel blockers is a mandatory step in the drug discovery process. The screening of hERG blockers by using conventional methods is inefficient in terms of cost and effort. This has led to the development of many in silico hERG blocker prediction models. However, constructing a high-performance predictive model with interpretability on hERG blockage by certain compounds is a major obstacle. In this study, we developed the first attention-based, interpretable model that predicts hERG blockers and captures important hERG-related compound substructures. To do that, we first collected various datasets, ranging from public databases to publicly available private datasets, to train and test the model. Then, we developed a precise and interpretable hERG blocker prediction model by using deep learning with a self-attention approach that has an appropriate molecular descriptor, Morgan fingerprint. The proposed prediction model was validated, and the validation result showed that the model was well-optimized and had high performance. The test set performance of the proposed model was significantly higher than that of previous fingerprint-based conventional machine learning models. In particular, the proposed model generally had high accuracy and F1 score, indicating the model's predictive reliability. Furthermore, we interpreted the calculated attention score vectors obtained from the proposed prediction model and demonstrated the important structural patterns that are represented in hERG blockers. In summary, we have proposed a powerful and interpretable hERG blocker prediction model that can reduce the overall cost of drug discovery by accurately screening for hERG blockers and suggesting hERG-related substructures.

13.
BMC Bioinformatics ; 21(1): 175, 2020 May 04.
Article in English | MEDLINE | ID: mdl-32366211

ABSTRACT

BACKGROUND: Genome-wide studies of DNA methylation across the epigenetic landscape provide insights into the heterogeneity of pluripotent embryonic stem cells (ESCs). Differentiating into embryonic somatic and germ cells, ESCs exhibit varying degrees of pluripotency, and epigenetic changes occurring in this process have emerged as important factors explaining stem cell pluripotency. RESULTS: Here, using paired scBS-seq and scRNA-seq data of mice, we constructed a machine learning model that predicts degrees of pluripotency for mouse ESCs. Since the biological activities of non-CpG markers have yet to be clarified, we tested the predictive power of CpG and non-CpG markers, as well as a combination thereof, in the model. Through rigorous performance evaluation with both internal and external validation, we discovered that a model using both CpG and non-CpG markers predicted the pluripotency of ESCs with the highest prediction performance (0.956 AUC, external test). The prediction model consisted of 16 CpG and 33 non-CpG markers. The CpG and most of the non-CpG markers targeted depletions of methylation and were indicative of cell pluripotency, whereas only a few non-CpG markers reflected accumulations of methylation. Additionally, we confirmed that pluripotency differs between individual developmental stages, such as E3.5 and E6.5, as well as between induced mouse pluripotent stem cells (iPSCs) and somatic cells. CONCLUSIONS: In this study, we investigated CpG and non-CpG methylation in relation to mouse stem cell pluripotency and developed a model thereon that successfully predicts the pluripotency of mouse ESCs.


Subject(s)
CpG Islands , DNA Methylation , Pluripotent Stem Cells/metabolism , Animals , Epigenesis, Genetic , Epigenomics , Mice , Mouse Embryonic Stem Cells/metabolism
14.
Biotechnol Bioprocess Eng ; 25(6): 895-930, 2020.
Article in English | MEDLINE | ID: mdl-33437151

ABSTRACT

As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered as a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, we provide an in-depth analysis of the remaining challenges and limitations and propose promising future directions in each of the aforementioned areas.

15.
Genes (Basel) ; 10(11)2019 11 04.
Article in English | MEDLINE | ID: mdl-31690030

ABSTRACT

DNA methylation patterns have been shown to change throughout the normal aging process. Several studies have found epigenetic aging markers using age predictors, but these studies only focused on blood-specific or tissue-common methylation patterns. Here, we constructed nine tissue-specific age prediction models using methylation array data from normal samples. The constructed models predict the chronological age with good performance (mean absolute error of 5.11 years on average) and show better performance in the independent test than previous multi-tissue age predictors. We also compared tissue-common and tissue-specific aging markers and found that they had different characteristics. Firstly, the tissue-common group tended to contain more positive aging markers with methylation values that increased during the aging process, whereas the tissue-specific group tended to contain more negative aging markers. Secondly, many of the tissue-common markers were located in Cytosine-phosphate-Guanine (CpG) island regions, whereas the tissue-specific markers were located in CpG shore regions. Lastly, the tissue-common CpG markers tended to be located in more evolutionarily conserved regions. In conclusion, our prediction models identified CpG markers that capture both tissue-common and tissue-specific characteristics during the aging process.


Subject(s)
Age Factors , DNA Methylation/genetics , Forecasting/methods , Adult , Aged , Biomarkers , CpG Islands/genetics , Databases, Genetic , Epigenesis, Genetic/genetics , Epigenomics , Female , Humans , Male , Middle Aged , Organ Specificity/genetics
16.
Sci Data ; 6(1): 147, 2019 08 14.
Article in English | MEDLINE | ID: mdl-31413331

ABSTRACT

Studies of naturally occurring cancers in dogs, which share many genetic and environmental factors with humans, provide valuable information as a comparative model for studying the mechanisms of human cancer pathogenesis. While individual and small-scale studies of canine cancers are underway, more generalized multi-omics studies have not been attempted due to the lack of large-scale and well-controlled genomic data. Here, we produced reliable whole-exome and whole-transcriptome sequencing data of 197 canine mammary cancers and their matched controls, annotated with rich clinical and biological features. Our dataset provides useful reference points for comparative analysis with human cancers and for developing novel diagnostic and therapeutic technologies for cancers in pet dogs.


Subject(s)
Dogs/genetics , Exome , Mammary Neoplasms, Animal/genetics , Transcriptome , Animals , Female , Exome Sequencing
17.
PLoS Comput Biol ; 15(6): e1007129, 2019 06.
Article in English | MEDLINE | ID: mdl-31199797

ABSTRACT

Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors have been shown to not be sufficiently informative to predict accurate DTIs. Thus, in this study, we propose a deep learning based DTI prediction model capturing local residue patterns of proteins participating in DTIs. When we employ a convolutional neural network (CNN) on raw protein sequences, we perform convolution on various lengths of amino acids subsequences to capture local residue patterns of generalized protein classes. We train our model with large-scale DTI information and demonstrate the performance of the proposed model using an independent dataset that is not seen during the training phase. As a result, our model performs better than previous protein descriptor-based models. Also, our model performs better than the recently developed deep learning models for massive prediction of DTIs. By examining pooled convolution results, we confirmed that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. Our code is available at https://github.com/GIST-CSBL/DeepConv-DTI.
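The idea of convolving over amino-acid subsequences of several lengths and then pooling, as described in the abstract above, can be sketched in pure Python as follows. Each kernel slides over the embedded sequence and global max pooling keeps its strongest response, so one feature per kernel reports whether a matching local residue pattern occurs anywhere in the protein. The two-letter alphabet, the embedding table, the kernels, and the function names are toy assumptions and do not come from the DeepConv-DTI code, which learns its embeddings and filters during training.

```python
# Hypothetical 2-dimensional embeddings for a toy two-residue alphabet.
EMBED = {"A": [1.0, 0.0], "G": [0.0, 1.0]}

def conv_max_pool(embedded, kernel):
    """Slide one kernel over the sequence and keep its maximum response."""
    window = len(kernel)
    responses = []
    for start in range(len(embedded) - window + 1):
        total = 0.0
        for offset in range(window):
            # Dot product between one kernel row and one residue embedding.
            total += sum(k * e for k, e in zip(kernel[offset], embedded[start + offset]))
        responses.append(total)
    return max(responses)  # global max pooling over all positions

def sequence_features(sequence, kernels):
    """One pooled feature per kernel; kernels may have different window lengths."""
    embedded = [EMBED[residue] for residue in sequence]
    return [conv_max_pool(embedded, kernel) for kernel in kernels]

# A length-2 kernel tuned to "AG" and a length-3 kernel tuned to "AGA".
KERNELS = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
]

print(sequence_features("AGA", KERNELS))
```

Because only the maximum response per kernel is kept, the position that produced it can be traced back, which is the intuition behind inspecting pooled convolution results for candidate binding sites. The sketch assumes the sequence is at least as long as the longest kernel.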


Subject(s)
Deep Learning , Drug Discovery/methods , Proteins , Sequence Analysis, Protein/methods , Amino Acid Sequence , Binding Sites , Computational Biology , Computer Simulation , Ligands , Models, Molecular , Proteins/chemistry , Proteins/metabolism
18.
BMC Bioinformatics ; 20(Suppl 10): 247, 2019 May 29.
Article in English | MEDLINE | ID: mdl-31138103

ABSTRACT

BACKGROUND: Drug repositioning, also known as drug repurposing, defines new indications for existing drugs and can be used as an alternative to drug development. In recent years, the accumulation of large volumes of information related to drugs and diseases has led to the development of various computational approaches for drug repositioning. Although herbal medicines have had a great impact on current drug discovery, there are still a large number of herbal compounds that have no definite indications. RESULTS: In the present study, we constructed a computational model to predict the unknown pharmacological effects of herbal compounds using machine learning techniques. Based on the assumption that similar diseases can be treated with similar drugs, we used four categories of drug-drug similarity (e.g., chemical structure, side-effects, gene ontology, and targets) and three categories of disease-disease similarity (e.g., phenotypes, human phenotype ontology, and gene ontology). Then, associations between drug and disease were predicted using the employed similarity features. The prediction models were constructed using classification algorithms, including logistic regression, random forest and support vector machine algorithms. Upon cross-validation, the random forest approach showed the best performance (AUC = 0.948) and also performed well in an external validation assessment using an unseen independent dataset (AUC = 0.828). Finally, the constructed model was applied to predict potential indications for existing drugs and herbal compounds. As a result, new indications for 20 existing drugs and 31 herbal compounds were predicted and validated using clinical trial data. CONCLUSIONS: The predicted results were validated manually, confirming the performance and underlying mechanisms - for example, irinotecan as a treatment for neuroblastoma. From the prediction, herbal compounds were considered to be drug candidates for related diseases, which warrants further development. The proposed prediction model can contribute to drug discovery by suggesting drug candidates from herbal compounds that have potential but have been little studied.


Subject(s)
Drug Repositioning , Machine Learning , Phytochemicals/pharmacology , Algorithms , Gene Ontology , Humans , Logistic Models , Models, Biological , Pharmaceutical Preparations , Phenotype , Reproducibility of Results
19.
Nat Commun ; 10(1): 1047, 2019 03 05.
Article in English | MEDLINE | ID: mdl-30837471

ABSTRACT

Accurate genome-wide detection of somatic mutations with low variant allele frequency (VAF, <1%) has proven difficult, for which generalized, scalable methods are lacking. Herein, we describe a new computational method, called RePlow, that we developed to detect low-VAF somatic mutations based on simple, library-level replicates for next-generation sequencing on any platform. Through joint analysis of replicates, RePlow is able to remove prevailing background errors in next-generation sequencing analysis, facilitating remarkable improvement in the detection accuracy for low-VAF somatic mutations (up to ~99% reduction in false positives). The method is validated in independent cancer panel and brain tissue sequencing data. Our study suggests a new paradigm with which to exploit an overwhelming abundance of sequencing data for accurate variant detection.


Subject(s)
Computational Biology/methods , DNA Mutational Analysis/methods , Models, Statistical , Whole Genome Sequencing/methods , Algorithms , Brain/pathology , Gene Frequency/genetics , Genome, Human/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/genetics , Neoplasms/pathology , Polymorphism, Single Nucleotide/genetics
20.
Nature ; 566(7743): 254-258, 2019 02.
Article in English | MEDLINE | ID: mdl-30728500

ABSTRACT

Osteoarthritis, the most common form of age-related degenerative whole-joint disease, is primarily characterized by cartilage destruction, as well as by synovial inflammation, osteophyte formation and subchondral bone remodelling. However, the molecular mechanisms that underlie the pathogenesis of osteoarthritis are largely unknown. Although osteoarthritis is currently considered to be associated with metabolic disorders, direct evidence for this is lacking, and the role of cholesterol metabolism in the pathogenesis of osteoarthritis has not been fully investigated. Various types of cholesterol hydroxylases contribute to cholesterol metabolism in extrahepatic tissues by converting cellular cholesterol to circulating oxysterols, which regulate diverse biological processes. Here we show that the CH25H-CYP7B1-RORα axis of cholesterol metabolism in chondrocytes is a crucial catabolic regulator of the pathogenesis of osteoarthritis. Osteoarthritic chondrocytes had increased levels of cholesterol because of enhanced uptake, upregulation of cholesterol hydroxylases (CH25H and CYP7B1) and increased production of oxysterol metabolites. Adenoviral overexpression of CH25H or CYP7B1 in mouse joint tissues caused experimental osteoarthritis, whereas knockout or knockdown of these hydroxylases abrogated the pathogenesis of osteoarthritis. Moreover, retinoic acid-related orphan receptor alpha (RORα) was found to mediate the induction of osteoarthritis by alterations in cholesterol metabolism. These results indicate that osteoarthritis is a disease associated with metabolic disorders and suggest that targeting the CH25H-CYP7B1-RORα axis of cholesterol metabolism may provide a therapeutic avenue for treating osteoarthritis.


Subject(s)
Cholesterol/metabolism , Cytochrome P450 Family 7/metabolism , Nuclear Receptor Subfamily 1, Group F, Member 1/metabolism , Osteoarthritis/metabolism , Steroid Hydroxylases/metabolism , Animals , Biological Transport , Chondrocytes/enzymology , Chondrocytes/metabolism , Male , Mice , Nuclear Receptor Subfamily 1, Group F, Member 1/genetics , Osteoarthritis/enzymology , Osteoarthritis/pathology , Oxysterols/metabolism , Steroid Hydroxylases/deficiency , Up-Regulation