1.
PLoS One ; 7(9): e45589, 2012.
Article in English | MEDLINE | ID: mdl-23029121

ABSTRACT

The low complexity of minimotif patterns results in a high false-positive prediction rate, hampering protein function prediction. A multi-filter algorithm, trained and tested on a linear regression model, support vector machine model, and neural network model, using a large dataset of verified minimotifs, vastly improves minimotif prediction accuracy while generating few false positives. An optimal threshold for the best accuracy reaches an overall accuracy above 90%, while a stringent threshold for the best specificity generates less than 1% false positives or even no false positives and still produces more than 90% true positives for the linear regression and neural network models. The minimotif multi-filter with its excellent accuracy represents the state-of-the-art in minimotif prediction and is expected to be very useful to biologists investigating protein function and how missense mutations cause disease.


Subject(s)
Amino Acid Motifs; Pattern Recognition, Automated/methods; Proteins/chemistry; Algorithms; Computational Biology/methods; Internet; Models, Theoretical; Protein Binding; ROC Curve
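
The two operating points described in this abstract (a threshold chosen for best overall accuracy, and a stricter one chosen to cap false positives) can be illustrated with a minimal sketch. It assumes each prediction already has a numeric score from some model; the function names, the toy scores, and the 1% false-positive cap used as a default are illustrative assumptions, not the authors' implementation or data.

```python
from typing import List, Optional, Tuple


def confusion(scores: List[float], labels: List[int], threshold: float) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_accuracy_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, accuracy) for the lowest threshold with maximal accuracy."""
    best_t, best_acc = scores[0], -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        acc = (tp + tn) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def stringent_threshold(scores: List[float], labels: List[int],
                        max_fpr: float = 0.01) -> Optional[Tuple[float, float]]:
    """Return (threshold, true-positive rate) for the lowest threshold
    whose false-positive rate does not exceed max_fpr, or None."""
    negatives = sum(1 for y in labels if not y)
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        fpr = fp / negatives if negatives else 0.0
        if fpr <= max_fpr:
            tpr = tp / (tp + fn) if (tp + fn) else 0.0
            return t, tpr
    return None


# Toy data (hypothetical): 1 = verified minimotif, 0 = false positive.
toy_scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40, 0.30, 0.20]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]

print(best_accuracy_threshold(toy_scores, toy_labels))
print(stringent_threshold(toy_scores, toy_labels))
```

On the toy data the stricter threshold removes every false positive at the cost of some true positives, which is the trade-off the abstract describes between its two thresholds.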
2.
Nucleic Acids Res ; 40(Database issue): D252-60, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22146221

ABSTRACT

Minimotif Miner (MnM, available at http://minimotifminer.org or http://mnm.engr.uconn.edu) is an online database for identifying new minimotifs in protein queries. Minimotifs are short contiguous peptide sequences that have a known function in at least one protein. Here we report the third release of the MnM database, which has now grown 60-fold to approximately 300,000 minimotifs. Since short minimotifs are by their nature not very complex, we also summarize a new set of false-positive filters and linear regression scoring that vastly enhance minimotif prediction accuracy on a test data set. This online database can be used to predict new functions in proteins and causes of disease.


Subject(s)
Amino Acid Motifs; Databases, Protein; Amino Acid Sequence; Consensus Sequence; Models, Biological; Protein Interaction Maps; Proteins/genetics; Sequence Analysis, Protein
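
Why short, low-complexity patterns produce many chance matches can be seen in a small sketch. It scans a protein sequence for a consensus pattern using a regular expression and estimates how many matches would be expected by chance. The PxxP consensus (a proline-rich pattern associated with SH3 domain binding) is used only as a familiar example; the sequence, the function names, and the uniform amino acid frequency are assumptions made for illustration and do not reflect how MnM itself is implemented.

```python
import re
from typing import List, Tuple


def scan_motif(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched peptide) for every match,
    including overlapping ones, via a zero-width lookahead."""
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


def expected_random_hits(protein_length: int, motif_length: int, fixed_positions: int) -> float:
    """Expected chance matches, assuming each of the 20 amino acids is
    equally likely at every position (a deliberate simplification)."""
    windows = protein_length - motif_length + 1
    return windows * (1 / 20) ** fixed_positions


# Hypothetical query sequence; "P..P" encodes PxxP with two wildcard residues.
query = "MAPPLPPRSTPKKP"
hits = scan_motif(query, "P..P")
print(hits)

# A 4-residue pattern with only 2 fixed positions in a 500-residue protein
# is expected to match by chance more than once under this simple model.
print(expected_random_hits(500, 4, 2))
```

Under this simplified model a typical protein contains about one chance match to such a pattern, which is why additional filters and scoring are needed before a match is treated as functional.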
3.
Proteins ; 79(1): 153-64, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20938975

ABSTRACT

Protein-protein interactions are important to understanding cell functions; however, our theoretical understanding is limited. There is a general discontinuity between the well-accepted physical and chemical forces that drive protein-protein interactions and the large collections of identified protein-protein interactions in various databases. Minimotifs are short functional peptide sequences that provide a basis to bridge this gap in knowledge. However, there is no systematic way to study minimotifs in the context of protein-protein interactions or vice versa. Here we have engineered a set of algorithms that can be used to identify minimotifs in known protein-protein interactions and implemented this for use by scientists in Minimotif Miner. By globally testing these algorithms on verified data and on 100 individual proteins as test cases, we demonstrate the utility of these new computational tools. This tool can also be used to reduce false-positive predictions in the discovery of novel minimotifs. The statistical significance of these algorithms is demonstrated by an ROC analysis (P = 0.001).


Subject(s)
Databases, Protein; Models, Molecular; Proteins/chemistry; Algorithms; Amino Acid Sequence; Animals; Computer Simulation; GRB2 Adaptor Protein/chemistry; Humans; Insect Proteins/chemistry; Mice; Protein Binding; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Rats; Software
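
One way to read the false-positive reduction mentioned in this abstract is as a filter that keeps a predicted minimotif only when the protein containing it and the proposed target protein are already known to interact. The sketch below shows that idea only; the abstract does not give the algorithms themselves, so the `Prediction` type, the `interaction_filter` function, and the protein names are all hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Prediction(NamedTuple):
    source: str  # protein containing the predicted minimotif
    motif: str   # matched consensus pattern
    target: str  # protein proposed to recognize the minimotif


def interaction_filter(predictions: List[Prediction],
                       known_interactions: Set[FrozenSet[str]]) -> List[Prediction]:
    """Keep predictions whose (source, target) pair is a known interaction.
    Pairs are stored as frozensets so the lookup ignores order."""
    return [p for p in predictions
            if frozenset((p.source, p.target)) in known_interactions]


# Hypothetical interaction set and predictions.
known = {frozenset(("P1", "P2")), frozenset(("P3", "P4"))}
predictions = [
    Prediction("P1", "PxxP", "P2"),  # supported by a known interaction
    Prediction("P1", "RGD", "P4"),   # no known interaction, dropped
    Prediction("P4", "YxxL", "P3"),  # supported, order reversed
]

kept = interaction_filter(predictions, known)
print(kept)
```

Storing each pair as a frozenset is a design choice for undirected interaction data; a directed resource would need ordered tuples instead.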
4.
PLoS One ; 5(8): e12276, 2010 Aug 19.
Article in English | MEDLINE | ID: mdl-20808856

ABSTRACT

BACKGROUND: Minimotifs are short contiguous peptide sequences in proteins that are known to have a function in at least one other protein. One of the principal limitations in minimotif prediction is that false positives limit the usefulness of this approach. As a step toward resolving this problem we have built, implemented, and tested a new data-driven algorithm that reduces false-positive predictions.

METHODOLOGY/PRINCIPAL FINDINGS: Certain domains and minimotifs are known to be strongly associated with a known cellular process or molecular function. Therefore, we hypothesized that by restricting minimotif predictions to those where the minimotif-containing protein and target protein have a related cellular or molecular function, the prediction is more likely to be accurate. This filter was implemented in Minimotif Miner using function annotations from the Gene Ontology. We have also combined two filters that are based on entirely different principles, and this combined filter has better predictive performance than the individual components.

CONCLUSIONS/SIGNIFICANCE: Testing these functional filters on known and random minimotifs has revealed that they are capable of separating true motifs from false positives. In particular, for the cellular function filter, the percentage of known minimotifs that are not removed by the filter is approximately 4.6 times that of random minimotifs. For the molecular function filter this ratio is approximately 2.9. These results, together with the comparison with the published frequency score filter, strongly suggest that the new filters differentiate true motifs from random background with good confidence. A combination of the function filters and the frequency score filter performs better than these two individual filters.


Subject(s)
Amino Acid Motifs; Computational Biology/methods; Proteins/chemistry; Proteins/metabolism; Algorithms; ROC Curve
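
The retention ratios reported in this abstract (about 4.6 and 2.9) compare the fraction of known minimotifs that pass a function filter with the fraction of random minimotifs that pass it. The sketch below shows how such a ratio could be computed. It simplifies "related function" to "shares at least one annotation term", which is an assumption and not necessarily the paper's definition of relatedness; the annotation labels, protein names, and pair lists are made up and do not reproduce the paper's data or results.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (minimotif-containing protein, target protein)


def passes_function_filter(pair: Pair, annotations: Dict[str, Set[str]]) -> bool:
    """True when both proteins share at least one annotation term.
    Proteins without annotations never pass."""
    source, target = pair
    return bool(annotations.get(source, set()) & annotations.get(target, set()))


def retention(pairs: List[Pair], annotations: Dict[str, Set[str]]) -> float:
    """Fraction of pairs that the filter does not remove."""
    kept = sum(1 for pair in pairs if passes_function_filter(pair, annotations))
    return kept / len(pairs)


# Hypothetical annotations standing in for Gene Ontology terms.
annotations = {
    "A": {"T1", "T2"},
    "B": {"T2"},
    "C": {"T3"},
    "D": {"T1"},
    "E": {"T4"},
}
known_pairs = [("A", "B"), ("A", "D"), ("B", "A"), ("C", "D")]
random_pairs = [("A", "C"), ("B", "E"), ("C", "E"), ("A", "B")]

known_kept = retention(known_pairs, annotations)
random_kept = retention(random_pairs, annotations)
enrichment = known_kept / random_kept
print(known_kept, random_kept, enrichment)
```

A ratio above 1 means the filter removes random pairs more often than known ones; the larger the ratio, the better the filter separates true motifs from background.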