1.
Nucleic Acids Res ; 39(Web Server issue): W406-11, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21486753

ABSTRACT

The University of Minnesota Pathway Prediction System (UM-PPS, http://umbbd.msi.umn.edu/predict/) is a rule-based system that predicts microbial catabolism of organic compounds. Currently, its knowledge base contains 250 biotransformation rules and five types of metabolic logic entities. The original UM-PPS predicted up to two prediction levels at a time. Users had to choose a predicted product to continue the prediction. This approach provided a limited view of prediction results and heavily relied on manual intervention. The new UM-PPS produces a multi-level prediction within an acceptable time frame, and allows users to view prediction alternatives much more easily as a directed acyclic graph.
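The multi-level prediction described in this abstract can be pictured as a breadth-first expansion in which products reached by different routes are merged into one node. The following is a minimal sketch under stated assumptions, not the UM-PPS implementation: the rule table, the rule identifiers, and the function name `predict_dag` are all hypothetical, and compounds are plain string labels rather than chemical structures.

```python
from collections import deque

# Hypothetical toy rules: each rule maps a compound label to predicted products.
# Real biotransformation rules match functional groups; these are placeholders.
TOY_RULES = {
    "rule_1": {"A": ["B"], "B": ["D"]},
    "rule_2": {"A": ["C"], "C": ["D"]},
}

def predict_dag(start, rules, max_levels):
    """Expand predictions level by level; return edges (substrate, rule, product)."""
    edges = set()
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        compound, level = frontier.popleft()
        if level == max_levels:
            continue  # depth limit reached; do not expand this compound
        for rule_id, mapping in sorted(rules.items()):
            for product in mapping.get(compound, []):
                edges.add((compound, rule_id, product))
                if product not in seen:
                    # A product reached by several routes is queued only once,
                    # so alternative routes converge on a single shared node.
                    seen.add(product)
                    frontier.append((product, level + 1))
    return edges

edges = predict_dag("A", TOY_RULES, max_levels=3)
for substrate, rule_id, product in sorted(edges):
    print(f"{substrate} --{rule_id}--> {product}")
```

The toy rules never regenerate an earlier compound, so the result is acyclic; this sketch does not itself check for cycles.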


Subject(s)
Biotransformation , Databases, Factual , Computer Graphics , Environmental Microbiology , Knowledge Bases , Organic Chemicals/metabolism , Software
2.
Nucleic Acids Res ; 38(Database issue): D488-91, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19767608

ABSTRACT

The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.msi.umn.edu/) began in 1995 and now contains information on almost 1200 compounds, over 800 enzymes, almost 1300 reactions and almost 500 microorganism entries. Besides these data, it includes a Biochemical Periodic Table (UM-BPT) and a rule-based Pathway Prediction System (UM-PPS) (http://umbbd.msi.umn.edu/predict/) that predicts plausible pathways for microbial degradation of organic compounds. Currently, the UM-PPS contains 260 biotransformation rules derived from reactions found in the UM-BBD and scientific literature. Public access to UM-BBD data is increasing. UM-BBD compound data are now contributed to PubChem and ChemSpider, the public chemical databases. A new mirror website of the UM-BBD, UM-BPT and UM-PPS is being developed at ETH Zürich to improve speed and reliability of online access from anywhere in the world.


Subject(s)
Biodegradation, Environmental , Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Access to Information , Biochemistry/methods , Biotransformation , Computational Biology/trends , Dictionaries, Chemical as Topic , Environmental Pollutants , Genome, Bacterial , Information Storage and Retrieval/methods , Internet , Microbiology , Software , User-Computer Interface
3.
AMIA Annu Symp Proc ; : 951, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18998954

ABSTRACT

The UM-BBD Pathway Prediction System (UM-PPS, http://umbbd.msi.umn.edu/predict/) predicts microbial catabolism of organic compounds. We improved UM-PPS infrastructure to improve pathway prediction results. We added the ability to allow relative reasoning and variable aerobic likelihood. One relative reasoning entry decreased choices 75% with no loss of sensitivity. Variable aerobic likelihood gives more accurate likelihood for rules triggered by substrates with certain chemical structures. Predictions are improved.


Subject(s)
Algorithms , Bacteria, Aerobic/metabolism , Models, Biological , Organic Chemicals/metabolism , Signal Transduction/physiology , Computer Simulation , Metabolism , Minnesota
4.
Nucleic Acids Res ; 36(Web Server issue): W427-32, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18524801

ABSTRACT

The University of Minnesota pathway prediction system (UM-PPS, http://umbbd.msi.umn.edu/predict/) recognizes functional groups in organic compounds that are potential targets of microbial catabolic reactions, and predicts transformations of these groups based on biotransformation rules. Rules are based on the University of Minnesota biocatalysis/biodegradation database (http://umbbd.msi.umn.edu/) and the scientific literature. As rules were added to the UM-PPS, more of them were triggered at each prediction step. The resulting combinatorial explosion is being addressed in four ways. Biodegradation experts give each rule an aerobic likelihood value of Very Likely, Likely, Neutral, Unlikely or Very Unlikely. Users now can choose whether they view all, or only the more aerobically likely, predicted transformations. Relative reasoning, allowing triggering of some rules to inhibit triggering of others, was implemented. Rules were initially assigned to individual chemical reactions. In selected cases, these have been replaced by super rules, which include two or more contiguous reactions that form a small pathway of their own. Rules are continually modified to improve the prediction accuracy; increasing rule stringency can improve predictions and reduce extraneous choices. The UM-PPS is freely available to all without registration. Its value to the scientific community, for academic, industrial and government use, is good and will only increase.
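Two of the measures named in this abstract, relative reasoning and filtering by aerobic likelihood, can be illustrated with a short sketch. Everything beyond the five likelihood labels is an assumption made for illustration: the rule identifiers, the `priority` table format, and the default cut-off are hypothetical and are not taken from the UM-PPS.

```python
from dataclasses import dataclass

# The five aerobic likelihood labels named in the abstract, lowest to highest.
LIKELIHOOD_ORDER = ["Very Unlikely", "Unlikely", "Neutral", "Likely", "Very Likely"]

@dataclass(frozen=True)
class TriggeredRule:
    rule_id: str      # hypothetical identifier
    likelihood: str   # one of LIKELIHOOD_ORDER

def apply_relative_reasoning(triggered, priority):
    """Drop rules that are inhibited by another rule triggered at the same step.

    priority maps a rule_id to the set of rule_ids it suppresses (assumed format).
    """
    triggered_ids = {rule.rule_id for rule in triggered}
    suppressed = set()
    for winner, losers in priority.items():
        if winner in triggered_ids:
            suppressed |= set(losers)
    return [rule for rule in triggered if rule.rule_id not in suppressed]

def filter_by_likelihood(triggered, minimum="Neutral"):
    """Keep only rules whose likelihood is at least `minimum`."""
    floor = LIKELIHOOD_ORDER.index(minimum)
    return [r for r in triggered if LIKELIHOOD_ORDER.index(r.likelihood) >= floor]

triggered = [
    TriggeredRule("r1", "Likely"),
    TriggeredRule("r2", "Neutral"),
    TriggeredRule("r3", "Very Unlikely"),
    TriggeredRule("r4", "Likely"),
]
remaining = apply_relative_reasoning(triggered, {"r1": {"r2"}})
shown = filter_by_likelihood(remaining, minimum="Likely")
print([rule.rule_id for rule in shown])
```

Both steps only shrink the list of choices presented at a prediction step, which is how they counter the combinatorial growth the abstract describes.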


Subject(s)
Biotransformation , Software , Databases, Factual , Environmental Pollutants/chemistry , Environmental Pollutants/metabolism , Internet , User-Computer Interface
5.
Nucleic Acids Res ; 36(4): e22, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18234718

ABSTRACT

Meta-predictors make predictions by organizing and processing the predictions produced by several other predictors in a defined problem domain. A proficient meta-predictor not only offers better predicting performance than the individual predictors from which it is constructed, but it also relieves experimental researchers from making difficult judgments when faced with conflicting results made by multiple prediction programs. As increasing numbers of predicting programs are being developed in a large number of fields of life sciences, there is an urgent need for effective meta-prediction strategies to be investigated. We compiled four unbiased phosphorylation site datasets, each for one of the four major serine/threonine (S/T) protein kinase families: CDK, CK2, PKA and PKC. Using these datasets, we examined several meta-predicting strategies with 15 phosphorylation site predictors from six predicting programs: GPS, KinasePhos, NetPhosK, PPSP, PredPhospho and Scansite. Meta-predictors constructed with a generalized weighted voting meta-predicting strategy with parameters determined by restricted grid search possess the best performance, exceeding that of all individual predictors in predicting phosphorylation sites of all four kinase families. Our results demonstrate a useful decision-making tool for analysing the predictions of the various S/T phosphorylation site predictors. An implementation of these meta-predictors is available on the web at: http://MetaPred.umn.edu/MetaPredPS/.
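Weighted voting with parameters chosen by a grid search can be sketched as follows. This is an illustration of the general technique only: the predictor names, the toy samples, the integer weight grid, and the threshold grid are all assumptions, and the paper's actual restricted grid and scoring are not reproduced here.

```python
from itertools import product

def weighted_vote(votes, weights, threshold):
    """Return True when the weighted sum of 0/1 votes reaches the threshold."""
    score = sum(weights[name] * vote for name, vote in votes.items())
    return score >= threshold

def accuracy(samples, weights, threshold):
    """Fraction of samples whose weighted vote matches the 0/1 label."""
    hits = sum(weighted_vote(v, weights, threshold) == bool(label)
               for v, label in samples)
    return hits / len(samples)

def grid_search(samples, names, weight_grid=(0, 1, 2), threshold_grid=(1, 2, 3)):
    """Try every weight/threshold combination on a small grid; keep the best."""
    best = (-1.0, None, None)
    for combo in product(weight_grid, repeat=len(names)):
        weights = dict(zip(names, combo))
        for threshold in threshold_grid:
            acc = accuracy(samples, weights, threshold)
            if acc > best[0]:
                best = (acc, weights, threshold)
    return best

# Toy data (hypothetical): each sample is (votes per predictor, true label).
SAMPLES = [
    ({"p1": 1, "p2": 0, "p3": 1}, 1),
    ({"p1": 0, "p2": 1, "p3": 1}, 0),
    ({"p1": 1, "p2": 1, "p3": 0}, 1),
    ({"p1": 0, "p2": 0, "p3": 0}, 0),
]
best_acc, best_weights, best_threshold = grid_search(SAMPLES, ["p1", "p2", "p3"])
print(best_acc, best_weights, best_threshold)
```

In practice the parameters would be tuned on one dataset and evaluated on held-out data, since accuracy measured on the tuning set is optimistic.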


Subject(s)
Protein Serine-Threonine Kinases/metabolism , Software , Internet , Phosphopeptides/chemistry , Phosphorylation , Phosphoserine/analysis , Phosphothreonine/analysis , Sequence Analysis, Protein
6.
Nucleic Acids Res ; 35(15): e96, 2007.
Article in English | MEDLINE | ID: mdl-17670799

ABSTRACT

Meta-prediction seeks to harness the combined strengths of multiple predicting programs with the hope of achieving predicting performance surpassing that of all existing predictors in a defined problem domain. We investigated meta-prediction for the four-compartment eukaryotic subcellular localization problem. We compiled an unbiased subcellular localization dataset of 1693 nuclear, cytoplasmic, mitochondrial and extracellular animal proteins from Swiss-Prot 50.2. Using this dataset, we assessed the predicting performance of 12 predictors from eight independent subcellular localization predicting programs: ELSPred, LOCtree, PLOC, Proteome Analyst, PSORT, PSORT II, SubLoc and WoLF PSORT. Gorodkin correlation coefficient (GCC) was one of the performance measures. Proteome Analyst is the best individual subcellular localization predictor tested in this four-compartment prediction problem, with GCC = 0.811. A reduced voting strategy eliminating six of the 12 predictors yields a meta-predictor (RAW-RAG-6) with GCC = 0.856, substantially better than all tested individual subcellular localization predictors (P = 8.2 × 10^-6, Fisher's Z-transformation test). The improvement in performance persists when the meta-predictor is tested with data not used in its development. This and similar voting strategies, when properly applied, are expected to produce meta-predictors with outstanding performance in other life sciences problem domains.
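The Gorodkin correlation coefficient generalizes the Matthews correlation coefficient to K classes and can be computed from a K x K confusion matrix. The sketch below uses the standard confusion-matrix form of that statistic; the function name and the example matrices are illustrative, and the layout (rows are true classes, columns are predicted classes) is an assumption of this example.

```python
import math

def gorodkin_cc(confusion):
    """K-class correlation coefficient from a square confusion matrix.

    confusion[i][j] counts items of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    true_counts = [sum(confusion[i]) for i in range(k)]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    cov_xy = correct * total - sum(t * p for t, p in zip(true_counts, pred_counts))
    cov_xx = total ** 2 - sum(p * p for p in pred_counts)
    cov_yy = total ** 2 - sum(t * t for t in true_counts)
    denom = math.sqrt(cov_xx * cov_yy)
    # When all items fall in one true or one predicted class the coefficient
    # is undefined; returning 0.0 is a convention chosen for this sketch.
    return cov_xy / denom if denom else 0.0

perfect = gorodkin_cc([[5, 0], [0, 5]])
partial = gorodkin_cc([[3, 1], [1, 3]])
print(perfect, partial)
```

For two classes the value coincides with the Matthews correlation coefficient, which gives a convenient sanity check.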


Subject(s)
Proteins/analysis , Software , Cell Compartmentation , Databases, Protein , Eukaryotic Cells/chemistry
7.
Nucleic Acids Res ; 34(Database issue): D517-21, 2006 Jan 01.
Article in English | MEDLINE | ID: mdl-16381924

ABSTRACT

As the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.ahc.umn.edu/) starts its second decade, it includes information on over 900 compounds, over 600 enzymes, nearly 1000 reactions and about 350 microorganism entries. Its Biochemical Periodic Tables have grown to include biological information for almost all stable, non-noble-gas elements (http://umbbd.ahc.umn.edu/periodic/). Its Pathway Prediction System (PPS) (http://umbbd.ahc.umn.edu/predict/) is now an internationally recognized, open system for predicting microbial catabolism of organic compounds. Graphical display of PPS rules, a stand-alone version of the PPS and guidance for PPS users are being developed. The next decade should see the PPS, and the UM-BBD on which it is based, find increasing use by national and international government agencies, commercial organizations and educational institutions.


Subject(s)
Biodegradation, Environmental , Databases, Factual , Microbiology , Catalysis , Enzymes/chemistry , Enzymes/metabolism , Internet , Minnesota , Organic Chemicals/chemistry , Organic Chemicals/metabolism , User-Computer Interface
8.
PLoS One ; 1: e104, 2006 Dec 20.
Article in English | MEDLINE | ID: mdl-17218990

ABSTRACT

BACKGROUND: Understanding the functional role(s) of the more than 20,000 proteins of the vertebrate genome is a major next step in the post-genome era. The approximately 4,000 co-translationally translocated (CTT) proteins - representing the vertebrate secretome - are important for such vertebrate-critical processes as organogenesis. However, the role(s) for most of these genes is currently unknown. RESULTS: We identified 585 putative full-length zebrafish CTT proteins using cross-species genomic and EST-based comparative sequence analyses. We further investigated 150 of these genes (Figure 1) for unique function using morpholino-based analysis in zebrafish embryos. 12% of the CTT protein-deficient embryos resulted in specific developmental defects, a notably higher rate of gene function annotation than the 2%-3% estimate from random gene mutagenesis studies. CONCLUSION: This initial collection includes novel genes required for the development of vascular, hematopoietic, pigmentation, and craniofacial tissues, as well as lipid metabolism, and organogenesis. This study provides a framework utilizing zebrafish for the systematic assignment of biological function in a vertebrate genome.


Subject(s)
Vertebrates/genetics , Amino Acid Sequence , Animals , Animals, Genetically Modified , Antisense Elements (Genetics)/genetics , Base Sequence , Blood Vessels/embryology , Computational Biology , Genome , Genomics , Hematopoiesis , Lipid Metabolism/genetics , Molecular Sequence Data , Proteome , Proteomics , Sequence Alignment , Vertebrates/growth & development , Vertebrates/physiology , Zebrafish/embryology , Zebrafish/genetics , Zebrafish/physiology , Zebrafish Proteins/genetics , Zebrafish Proteins/physiology
9.
BMC Bioinformatics ; 6: 256, 2005 Oct 14.
Article in English | MEDLINE | ID: mdl-16225690

ABSTRACT

BACKGROUND: Improvements in protein sequence annotation and an increase in the number of annotated protein databases have fueled development of an increasing number of software tools to predict secreted proteins. Six software programs capable of high throughput and employing a wide range of prediction methods, SignalP 3.0, SignalP 2.0, TargetP 1.01, PrediSi, Phobius, and ProtComp 6.0, are evaluated. RESULTS: Prediction accuracies were evaluated using 372 unbiased, eukaryotic, SwissProt protein sequences. TargetP, SignalP 3.0 maximum S-score and SignalP 3.0 D-score were the most accurate single scores (90-91% accurate). The combination of a positive TargetP prediction, SignalP 2.0 maximum Y-score, and SignalP 3.0 maximum S-score increased accuracy by six percent. CONCLUSION: Single predictive scores could be highly accurate, but almost all accuracies were slightly less than those reported by program authors. Predictive accuracy could be substantially improved by combining scores from multiple methods into a single composite prediction.


Subject(s)
Sequence Analysis, Protein/instrumentation , Software , Animals , Humans , Models, Statistical , Neural Networks, Computer , Predictive Value of Tests , Sensitivity and Specificity , Sequence Analysis, Protein/methods , Software Design , Terminology as Topic
10.
Nucleic Acids Res ; 33(Web Server issue): W506-11, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980523

ABSTRACT

AMOD is a web-based program that aids in the functional evaluation of nucleotide sequences through sequence characterization and antisense morpholino oligonucleotide (target site) selection. Submitted sequences are analyzed by translation initiation site prediction algorithms and sequence-to-sequence comparisons; results are used to characterize sequence features required for morpholino design. Within a defined subsequence, base composition and homodimerization values are computed for all putative morpholino oligonucleotides. Using these properties, morpholino candidates are selected and compared with genomic and transcriptome databases with the goal to identify target-specific enriched morpholinos. AMOD has been used at the University of Minnesota to design approximately 200 morpholinos for a functional genomics screen in zebrafish. The AMOD web server and a tutorial are freely available to both academic and commercial users at http://www.secretomes.umn.edu/AMOD/.
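The base-composition step for putative oligonucleotides can be sketched as a sliding window over a target subsequence. This is an illustration only and not AMOD's code: the function names are hypothetical, the default length of 25 bases is an assumption of this example, and the homodimerization calculation mentioned in the abstract is not attempted because its formula is not given.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def candidate_oligos(target, length=25):
    """Yield (start, antisense_oligo, gc_fraction) for every window in target."""
    target = target.upper()
    for start in range(len(target) - length + 1):
        site = target[start:start + length]
        oligo = reverse_complement(site)  # antisense to the target site
        gc_fraction = (oligo.count("G") + oligo.count("C")) / length
        yield start, oligo, gc_fraction

# Short toy target with a 4-base window so the output is easy to inspect.
candidates = list(candidate_oligos("ATGCCG", length=4))
for start, oligo, gc_fraction in candidates:
    print(start, oligo, gc_fraction)
```

Candidates could then be ranked or filtered on the computed fraction before any comparison against sequence databases.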


Subject(s)
Genomics/methods , Morpholines/chemistry , Oligonucleotides, Antisense/chemistry , Software , DNA Primers/chemistry , Internet , Sequence Alignment , User-Computer Interface
11.
J Ind Microbiol Biotechnol ; 31(6): 261-72, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15248088

ABSTRACT

Prediction of microbial metabolism is important for annotating genome sequences and for understanding the fate of chemicals in the environment. A metabolic pathway prediction system (PPS) has been developed that is freely available on the world wide web (http://umbbd.ahc.umn.edu/predict/), recognizes the organic functional groups found in a compound, and predicts transformations based on metabolic rules. These rules are designed largely by examining reactions catalogued in the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD) and are generalized based on metabolic logic. The predictive accuracy of the PPS was tested: (1) using a 113-member set of compounds found in the database, (2) against a set of compounds whose metabolism was predicted by human experts, and (3) for consistency with experimental microbial growth studies. First, the system correctly predicted known metabolism for 111 of the 113 compounds containing C and H, O, N, S, P and/or halides that initiate existing pathways in the database, and also correctly predicted 410 of the 569 known pathway branches for these compounds. Second, computer predictions were compared to predictions by human experts for biodegradation of six compounds whose metabolism was not described in the literature. Third, the system predicted reactions liberating ammonia from three organonitrogen compounds, consistent with laboratory experiments showing that each compound served as the sole nitrogen source supporting microbial growth. The rule-based nature of the PPS makes it transparent, expandable, and adaptable.


Subject(s)
Bacteria/metabolism , Biodegradation, Environmental , Organic Chemicals/metabolism , Bacteria/genetics , Biotransformation , Catalysis , Databases, Factual/trends , Genes, Bacterial/physiology , Genomics , Organic Chemicals/chemistry
12.
BMC Bioinformatics ; 5: 14, 2004 Feb 16.
Article in English | MEDLINE | ID: mdl-15053846

ABSTRACT

BACKGROUND: Expressed Sequence Tag (EST) sequences are generally single-strand, single-pass sequences, only 200-600 nucleotides long, contain errors resulting in frame shifts, and represent different parts of their parent cDNA. If the cDNAs contain translation initiation sites, they may be suitable for functional genomics studies. We have compared five methods to predict translation initiation sites in EST data: first-ATG, ESTScan, Diogenes, Netstart, and ATGpr. RESULTS: A dataset of 100 EST sequences, 50 with and 50 without, translation initiation sites, was created. Based on analysis of this dataset, ATGpr is found to be the most accurate for predicting the presence versus absence of translation initiation sites. With a maximum accuracy of 76%, ATGpr more accurately predicts the position or absence of translation initiation sites than NetStart (57%) or Diogenes (50%). ATGpr similarly excels when start sites are known to be present (90%), whereas NetStart achieves only 60% overall accuracy. As a baseline for comparison, choosing the first ATG correctly identifies the translation initiation site in 74% of the sequences. ESTScan and Diogenes, consistent with their intended use, are able to identify open reading frames, but are unable to determine the precise position of translation initiation sites. CONCLUSIONS: ATGpr demonstrates high sensitivity, specificity, and overall accuracy in identifying start sites while also rejecting incomplete sequences. A database of EST sequences suitable for validating programs for translation initiation site prediction is now available. These tools and materials may open an avenue for future improvements in start site prediction and EST analysis.
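The first-ATG baseline mentioned in this abstract is simple enough to state as code. The sketch below is a plain reading of that baseline, with a hypothetical function name; it reports the position of the first ATG in the sequence as given and does not model strand choice, reading frame, or sequence context.

```python
def first_atg(sequence):
    """Return the 0-based index of the first ATG in the sequence, or None."""
    index = sequence.upper().find("ATG")
    return None if index == -1 else index

print(first_atg("ccgATGaaatga"))
print(first_atg("CCGTTT"))
```

A baseline like this always predicts a start site whenever any ATG is present, so it cannot reject N-terminally incomplete sequences that happen to contain an internal ATG.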


Subject(s)
Codon, Initiator , Computational Biology/methods , Expressed Sequence Tags , Protein Biosynthesis , Methionine/genetics , Predictive Value of Tests
13.
Nucleic Acids Res ; 32(4): 1414-21, 2004.
Article in English | MEDLINE | ID: mdl-14990746

ABSTRACT

The proteins processed by the secretory pathway (secretome) are critical players in the development of multi-cellular eukaryotic organisms but have yet to be comprehensively studied at the genomic level. In this study, we use the Target P algorithm to predict human (13-20% of proteins found in individual datasets) and Fugu (14%) secretomes based on analysis of their nearly complete proteomes. We combine internal processing with prediction software to automate secreted protein identification and overcome one of the major challenges associated with EST data: identification of the minority of clones that encode N-terminally-complete proteins. We discuss the use of these methods to predict secreted proteins in EST-based consensus sequence sets, and we validate these predictions using an assay for cell-free cotranslational translocation. Analysis of TIGR Porcine Gene Index 4.0 as a test dataset resulted in the identification of 352 N-terminally-complete, putative secreted proteins. In functional agreement with our predictions, 34 of 40 (85%) of these cDNAs were verified to be cotranslationally translocated in an in vitro translation system. The methods developed here are specifically designed to accept partial open reading frames and improve secreted protein predictions in eukaryotic transcriptomes, and are valuable for the analysis and annotation of eukaryotic EST databases.


Subject(s)
Eukaryotic Cells/metabolism , Sequence Analysis, Protein/methods , Swine/genetics , Tetraodontiformes/genetics , Algorithms , Amino Acid Sequence , Animals , Databases, Genetic , Expressed Sequence Tags , Humans , Molecular Sequence Data , Protein Biosynthesis , Protein Sorting Signals , Protein Transport , Proteome/chemistry , Sequence Alignment , Swine/metabolism , Tetraodontiformes/metabolism
15.
J Chem Inf Comput Sci ; 43(3): 1051-7, 2003.
Article in English | MEDLINE | ID: mdl-12767164

ABSTRACT

We have developed a system to predict microbial catabolism, using the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.ahc.umn.edu/) as a knowledge base. The present system, available on the Web (http://umbbd.ahc.umn.edu/predict/), can predict biodegradation of most of the major aliphatic and aromatic organic functional groups containing C, H, N, O, and halogens. It can duplicate at least one known biodegradation pathway for 60% of the compounds in an 84-member validation set; most pathways that did not completely duplicate known metabolism could plausibly occur in nature. Users are encouraged, and have begun, to submit additional biotransformation rules and comment on existing rules; the system will further develop under the direction of the scientific community.


Subject(s)
Databases, Factual , Microbiology , Organic Chemicals/chemistry , Organic Chemicals/metabolism , Biodegradation, Environmental , Biotransformation , Catalysis , Internet
16.
Nucleic Acids Res ; 31(1): 262-5, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12519997

ABSTRACT

The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.ahc.umn.edu/) provides curated information on microbial catabolism and related biotransformations, primarily for environmental pollutants. Currently, it contains information on over 130 metabolic pathways, 800 reactions, 750 compounds and 500 enzymes. In the past two years, it has increased its breadth to include more examples of microbial metabolism of metals and metalloids, and expanded the types of information it includes to contain microbial biotransformations of, and binding interactions with, many chemical elements. It has also increased the ways in which this data can be accessed (mined). Structure-based searching was added, for exact matches, similarity, or substructures. Analysis of UM-BBD reactions has led to a prototype, guided, pathway prediction system. Guided prediction means that the user is shown all possible biotransformations at each step and guides the process to its conclusion. Mining the UM-BBD's data provides a unique view into how the microbial world recycles organic functional groups. UM-BBD users are encouraged to comment on all aspects of the database, including the information it contains and the tools by which it can be mined. The database and prediction system develop under the direction of the scientific community.


Subject(s)
Databases, Factual , Environmental Pollutants/metabolism , Biodegradation, Environmental , Biotransformation , Catalysis , Genomics , Metals/metabolism , Minnesota , Prokaryotic Cells/metabolism