Results 1 - 4 of 4
1.
Bioinformatics ; 28(5): 745-6, 2012 Mar 01.
Article in English | MEDLINE | ID: mdl-22257668

ABSTRACT

Since tens of millions of chemical compounds have accumulated in public chemical databases, fast and comprehensive computational methods to predict interactions between chemical compounds and proteins are needed for virtual screening of lead compounds. Previously, we proposed a novel method for predicting protein-chemical interactions using two-layer Support Vector Machine classifiers that require only readily available biochemical data, i.e. amino acid sequences of proteins and structure formulas of chemical compounds. In this article, the method has been implemented as the COPICAT web service, with an easy-to-use front-end interface. Users can simply submit a protein-chemical interaction prediction job using a pre-trained classifier, or can even train their own classification model by uploading training data. COPICAT's fast and accurate computational prediction has enhanced lead compound discovery against a database of tens of millions of chemical compounds, implying that the search space for drug discovery is extended by >1000 times compared with currently widely used high-throughput screening methodologies. AVAILABILITY: The COPICAT server is available at http://copicat.dna.bio.keio.ac.jp. All functions, including the prediction function, are freely available via anonymous login without registration. Registered users, however, can use the system more intensively.


Subject(s)
Databases, Factual , Ligands , Proteins/metabolism , Software , Support Vector Machine , Protein Binding , Proteins/chemistry
2.
PLoS Comput Biol ; 5(6): e1000397, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19503826

ABSTRACT

Predictions of interactions between target proteins and potential leads are of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds, which requires only protein sequence data and chemical structure data and utilizes the statistical learning method of support vector machines (SVMs). Because comprehensive predictions can involve many false positives, we propose two approaches for reducing them: (i) efficient use of multiple statistical prediction models in the framework of a two-layer SVM and (ii) reasonable design of the negative data used to construct statistical prediction models. In the two-layer SVM, outputs produced by the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of classification, are utilized as inputs to the second-layer SVM. In order to design negative data that produce fewer false positive predictions, we iteratively construct SVM models or classification boundaries from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, in order to fully utilize the advantages of statistical learning methods, we propose a strategy to effectively feed experimental results back into computational predictions with consideration of the biological effects of interest. We show the usefulness of our approach in predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. Moreover, we utilize this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate these predictions again. This efficient procedure, which iterates in silico prediction and in vitro or in vivo experimental verification with sufficient feedback, enabled us to identify novel ligand candidates that were distant from known ligands in chemical space.
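
The two-layer arrangement described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear SVM trained by subgradient descent, the 2-D toy feature vectors standing in for protein-compound pairs, and every function name and hyperparameter below are assumptions chosen only to keep the example self-contained.

```python
# Toy two-layer linear SVM sketch (pure Python, no external libraries).
# All data, names, and hyperparameters are illustrative assumptions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lr=0.05, lam=0.01, epochs=300):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * (dot(w, xi) + b) < 1.0   # inside the margin?
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization shrink
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def decision(model, x):
    """Signed score of one linear model for feature vector x."""
    w, b = model
    return dot(w, x) + b

def first_layer_scores(first_layer, x):
    """Scores of all first-layer models; these become second-layer inputs."""
    return [decision(m, x) for m in first_layer]

def train_two_layer(positives, negative_sets):
    # First layer: one model per negative set, all sharing the same positives.
    first_layer = []
    for negatives in negative_sets:
        X = positives + negatives
        y = [1] * len(positives) + [-1] * len(negatives)
        first_layer.append(train_linear_svm(X, y))
    # Second layer: trained on the first-layer scores of every sample.
    all_X = positives + [x for negs in negative_sets for x in negs]
    all_y = [1] * len(positives) + [-1] * (len(all_X) - len(positives))
    Z = [first_layer_scores(first_layer, x) for x in all_X]
    second_layer = train_linear_svm(Z, all_y)
    return first_layer, second_layer

def predict(two_layer_model, x):
    first_layer, second_layer = two_layer_model
    z = first_layer_scores(first_layer, x)
    return 1 if decision(second_layer, z) >= 0.0 else -1

# Hypothetical feature vectors for interacting pairs and two negative sets.
positives = [(2.0, 2.0), (2.2, 1.8), (1.8, 2.2)]
negatives_a = [(-2.0, 2.0), (-2.2, 1.8), (-1.8, 2.2)]
negatives_b = [(2.0, -2.0), (2.2, -1.8), (1.8, -2.2)]
model = train_two_layer(positives, [negatives_a, negatives_b])
```

In this toy setting each first-layer model separates the positives from only one kind of negative, and the second layer combines the two scores. Calling `predict(model, x)` returns `1` or `-1` for a new feature vector `x`.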


Subject(s)
Artificial Intelligence , Computer Simulation , Drug Discovery/methods , Models, Statistical , Proteins , Algorithms , Amino Acid Sequence , Area Under Curve , Binding Sites , Databases, Protein , Feedback , Ligands , Models, Chemical , Protein Binding , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Proteins/chemistry , Proteins/metabolism , Receptors, Androgen/chemistry , Receptors, Androgen/metabolism , Reproducibility of Results , Sequence Analysis, Protein
3.
Bioinformatics ; 23(15): 2004-12, 2007 Aug 01.
Article in English | MEDLINE | ID: mdl-17510168

ABSTRACT

MOTIVATION: Prediction of interactions between proteins and chemical compounds is of great benefit in drug discovery processes. In this field, 3D structure-based methods such as docking analysis have been developed. However, genome-wide application of these methods is hardly feasible because 3D structural information is of limited availability. RESULTS: We describe a novel method for predicting protein-chemical interactions using SVM. We utilize very general protein data, i.e. amino acid sequences, and combine these with chemical structures and mass spectrometry (MS) data. MS data can be of great use in finding new chemical compounds in the future. We assessed the validity of our method on a dataset of binding data for existing drugs and found that more than 80% accuracy could be obtained. Furthermore, we conducted comprehensive target protein predictions for MDMA, and validated the biological significance of our method by successfully finding proteins relevant to its known functions. AVAILABILITY: Available on request from the authors.
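
The abstract does not say how sequences and chemical data are encoded for the SVM. As a hedged illustration of combining the two kinds of input into one feature vector, the sketch below uses amino acid composition for the protein and a caller-supplied numeric descriptor for the compound. Both encodings and all names are assumptions, not the authors' feature design.

```python
# Sketch: build one feature vector for a protein-compound pair.
# The encodings here are assumptions made for illustration only.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:          # ignore non-standard symbols
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

def pair_features(sequence, chemical_descriptor):
    """Concatenate protein features with a compound descriptor vector."""
    return amino_acid_composition(sequence) + list(chemical_descriptor)

# Hypothetical inputs: a short peptide and a 3-bit substructure indicator.
example = pair_features("ACDA", [1.0, 0.0, 1.0])
```

Here `example` has 23 entries: 20 composition values followed by the three descriptor values. A vector of this kind could then be passed to any SVM implementation as one training or test sample.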


Subject(s)
Mass Spectrometry/methods , Models, Chemical , Peptide Mapping/methods , Protein Interaction Mapping/methods , Proteins/chemistry , Sequence Analysis, Protein/methods , Algorithms , Binding Sites , Computer Simulation , Data Interpretation, Statistical , Protein Binding , Structure-Activity Relationship
4.
Nucleic Acids Res ; 33(15): 4828-37, 2005.
Article in English | MEDLINE | ID: mdl-16126847

ABSTRACT

Cooperative transcriptional activation among multiple transcription factors (TFs) is important for understanding the mechanisms of complex transcriptional regulation in eukaryotes. Previous studies have attempted to find cooperative TFs based on gene expression data, with gene expression profiles as a measure of similarity of gene regulation. In this paper, we use protein-protein interaction data to infer synergistic binding of cooperative TFs. Our fundamental idea is based on the assumption that genes contributing to a similar biological process are regulated under the same control mechanism. First, the protein-protein interaction networks are used to calculate the similarity of biological processes among genes. Second, we integrate this similarity and the chromatin immunoprecipitation data to identify cooperative TFs. Our computational experiments in yeast show that our method successfully identified eight pairs of cooperative TFs that have literature evidence but could not be identified by the previous method. Further, 12 new possible pairs have been inferred, and we have examined their biological relevance. However, since a typical problem with protein-protein interaction data is that they contain many false positives, we propose a method combining various biological data to increase the prediction accuracy.
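
The two steps named in this abstract (a network-based similarity among genes, then integration with TF binding data) can be sketched as follows. The abstract gives neither the similarity measure nor the scoring rule, so the inverse shortest-path similarity, the mean-over-common-targets score, and the gene and TF names below are all hypothetical.

```python
# Sketch: score a TF pair by how close their shared target genes are
# in a protein-protein interaction (PPI) network. Measures are assumed.
from collections import deque
from itertools import combinations

def shortest_path_length(graph, source, target):
    """Breadth-first search; returns None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def gene_similarity(graph, a, b):
    """Assumed similarity: 1 / (1 + network distance), 0 if disconnected."""
    d = shortest_path_length(graph, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

def pair_score(graph, chip_targets, tf_a, tf_b):
    """Mean similarity over all pairs of genes bound by both TFs."""
    common = sorted(chip_targets[tf_a] & chip_targets[tf_b])
    pairs = list(combinations(common, 2))
    if not pairs:
        return 0.0
    return sum(gene_similarity(graph, g, h) for g, h in pairs) / len(pairs)

# Hypothetical undirected PPI network and ChIP-derived target sets.
ppi = {"g1": {"g2"}, "g2": {"g1", "g3"}, "g3": {"g2"}, "g4": set()}
chip = {
    "TF_A": {"g1", "g2", "g3"},
    "TF_B": {"g1", "g2", "g3", "g4"},
    "TF_C": {"g1", "g4", "g5"},
}
score_ab = pair_score(ppi, chip, "TF_A", "TF_B")
score_bc = pair_score(ppi, chip, "TF_B", "TF_C")
```

In this toy data the targets shared by `TF_A` and `TF_B` are connected in the network, so `score_ab` is higher than `score_bc`, whose shared targets are not connected. A real analysis would also need a significance test, which is omitted here.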


Subject(s)
Protein Interaction Mapping , Transcription Factors/metabolism , Transcriptional Activation , Cell Cycle , Chromatin Immunoprecipitation , Computational Biology , Gene Expression , Models, Genetic , Transcription Factors/analysis , Transcription Factors/genetics