1.
Mol Clin Oncol ; 7(5): 767-770, 2017 Nov.
Article in English | MEDLINE | ID: mdl-29142749

ABSTRACT

Colorectal cancer is the third leading cause of cancer-associated mortality in the western world. The ability to predict a patient's response to chemotherapy may be of great value for clinicians and patients when planning cancer treatment. The aim of the current study was to develop a urine metabolomics-based biomarker panel to predict adverse events and response to chemotherapy in patients with colorectal cancer. A retrospective chart review of patients diagnosed with stage III or IV colorectal cancer between 2008 and 2012 was performed. The exclusion criteria included chemotherapy for palliation and patients living outside of Alberta. Data were collected concerning the chemotherapy regimen, adverse events associated with chemotherapy, disease progression and recurrence, and 5-year survival. Adverse events were subdivided as follows: delays in treatment, dose reductions, hospitalizations and chemotherapy regimen changes. Patients provided urine samples for analysis prior to any intervention. Nuclear magnetic resonance (NMR) spectra of urine samples were acquired. The 1H NMR spectrum of each urine sample was analyzed using Chenomx NMRSuite v7.0. Using machine learning, predictors were generated and evaluated using 10-fold cross-validation. Urine spectra were obtained for 62 patients. The best predictors resulted in area under the receiver operating characteristic curve values of: 0.542 for chemotherapy dose reduction, 0.612 for 5-year survival, 0.650 for cancer recurrence and 0.750 for treatment delay. Therefore, predictors were developed for response to and adverse events from chemotherapy for patients with colorectal cancer. The predictor for treatment delay has the most promise, and further studies will aid its refinement and improvement of its accuracy.
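
The abstract reports area-under-the-ROC-curve (AUC) values estimated with 10-fold cross-validation. The sketch below illustrates both ideas in plain Python. It is not the study's pipeline: the function names `auc` and `k_fold_indices` are hypothetical, the random shuffling and its seed are assumptions, and the toy labels and scores are invented for illustration.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]

# Toy example (invented data): three of the four positive/negative pairs
# are ranked correctly, so the AUC is 0.75.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Ten folds over 62 samples, matching the cohort size given in the abstract.
folds = k_fold_indices(62, k=10)
```

In a full cross-validation loop each fold would serve once as the held-out set while a predictor is trained on the other nine, and the held-out scores would be passed to `auc`.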

2.
Metabolites ; 7(3)2017 Jun 22.
Article in English | MEDLINE | ID: mdl-28640228

ABSTRACT

Background: Colorectal cancer is one of the leading causes of cancer deaths worldwide. The detection and removal of the precursors to colorectal cancer, adenomatous polyps, is the key for screening. The aim of this study was to develop a clinically scalable (high throughput, low cost, and high sensitivity) mass spectrometry (MS)-based urine metabolomic test for the detection of adenomatous polyps. Methods: Prospective urine and stool samples were collected from 685 participants enrolled in a colorectal cancer screening program to undergo colonoscopy examination. Statistical analysis was performed on 69 urine metabolites measured by one-dimensional nuclear magnetic resonance spectroscopy to identify key metabolites. A targeted MS assay was then developed to quantify the key metabolites in urine. An MS-based urine metabolomic diagnostic test for adenomatous polyps was established using 67% of the samples (un-blinded training set) and validated using the remaining 33% of the samples (blinded testing set). Results: The MS-based urine metabolomic test identifies patients with colonic adenomatous polyps with an AUC of 0.692, outperforming the NMR-based predictor with an AUC of 0.670. Conclusion: Here we describe a clinically scalable MS-based urine metabolomic test that identifies patients with adenomatous polyps at a higher level of sensitivity (86%) than current fecal-based tests (<18%).
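
The 67%/33% training/testing partition described in the Methods can be sketched as follows. This is an illustration only: the abstract does not say whether the split was stratified by outcome, so the per-class stratification here is an assumption, as are the function name `stratified_split`, the seed, and the toy label counts.

```python
import random

def stratified_split(labels, train_fraction=0.67, seed=0):
    """Split sample indices into train/test sets, class by class (assumed design)."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])   # used to build the predictor
        test.extend(members[cut:])    # held back for blinded validation
    return sorted(train), sorted(test)

# Toy labels (invented): 30 polyp-positive and 70 polyp-negative participants.
toy_labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_split(toy_labels)
```

Keeping the test indices untouched until the predictor is final is what makes the validation set "blinded" in the sense the abstract uses.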

3.
J Cheminform ; 8: 61, 2016.
Article in English | MEDLINE | ID: mdl-27867422

ABSTRACT

BACKGROUND: Scientists have long been driven by the desire to describe, organize, classify, and compare objects using taxonomies and/or ontologies. In contrast to biology, geology, and many other scientific disciplines, the world of chemistry still lacks a standardized chemical ontology or taxonomy. Several attempts at chemical classification have been made, but they have mostly been limited to either manual or semi-automated proof-of-principle applications. This is regrettable as comprehensive chemical classification and description tools could not only improve our understanding of chemistry but also improve the linkage between chemistry and many other fields. For instance, the chemical classification of a compound could help predict its metabolic fate in humans, its druggability or potential hazards associated with it, among others. However, the sheer number (tens of millions of compounds) and complexity of chemical structures is such that any manual classification effort would prove to be near impossible. RESULTS: We have developed a comprehensive, flexible, and computable, purely structure-based chemical taxonomy (ChemOnt), along with a computer program (ClassyFire) that uses only chemical structures and structural features to automatically assign all known chemical compounds to a taxonomy consisting of >4800 different categories. This new chemical taxonomy consists of up to 11 different levels (Kingdom, SuperClass, Class, SubClass, etc.) with each of the categories defined by unambiguous, computable structural rules. Furthermore, each category is named using a consensus-based nomenclature and described (in English) based on the characteristic common structural properties of the compounds it contains. The ClassyFire webserver is freely accessible at http://classyfire.wishartlab.com/. Moreover, a Ruby API version is available at https://bitbucket.org/wishartlab/classyfire_api, which provides programmatic access to the ClassyFire server and database. 
ClassyFire has been used to annotate over 77 million compounds and has already been integrated into other software packages to automatically generate textual descriptions for, and/or infer biological properties of over 100,000 compounds. Additional examples and applications are provided in this paper. CONCLUSION: ClassyFire, in combination with ChemOnt (ClassyFire's comprehensive chemical taxonomy), now allows chemists and cheminformaticians to perform large-scale, rapid and automated chemical classification. Moreover, a freely accessible API allows easy access to more than 77 million "ClassyFire" classified compounds. The results can be used to help annotate well studied, as well as lesser-known compounds. In addition, these chemical classifications can be used as input for data integration, and many other cheminformatics-related tasks.

5.
PLoS One ; 10(5): e0124219, 2015.
Article in English | MEDLINE | ID: mdl-26017271

ABSTRACT

Many diseases cause significant changes to the concentrations of small molecules (a.k.a. metabolites) that appear in a person's biofluids, which means such diseases can often be readily detected from a person's "metabolic profile", i.e., the list of concentrations of those metabolites. This information can be extracted from a biofluid's Nuclear Magnetic Resonance (NMR) spectrum. However, due to its complexity, NMR spectral profiling has remained manual, resulting in slow, expensive and error-prone procedures that have hindered clinical and industrial adoption of metabolomics via NMR. This paper presents a system, BAYESIL, which can quickly, accurately, and autonomously produce a person's metabolic profile. Given a 1D 1H NMR spectrum of a complex biofluid (specifically serum or cerebrospinal fluid), BAYESIL can automatically determine the metabolic profile. This requires first performing several spectral processing steps, then matching the resulting spectrum against a reference compound library, which contains the "signatures" of each relevant metabolite. BAYESIL views spectral matching as an inference problem within a probabilistic graphical model that rapidly approximates the most probable metabolic profile. Our extensive studies on a diverse set of complex mixtures, including real biological samples (serum and CSF), defined mixtures and realistic computer-generated spectra involving >50 compounds, show that BAYESIL can autonomously find the concentration of NMR-detectable metabolites accurately (~90% correct identification and ~10% quantification error), in less than 5 minutes on a single CPU. These results demonstrate that BAYESIL is the first fully automatic, publicly accessible system that provides quantitative NMR spectral profiling effectively, with an accuracy on these biofluids that meets or exceeds the performance of trained experts. 
We anticipate this tool will usher in high-throughput metabolomics and enable a wealth of new applications of NMR in clinical settings. BAYESIL is accessible at http://www.bayesil.ca.
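
To give a feel for what "matching a spectrum against reference signatures" means, the sketch below fits a mixture spectrum as a weighted sum of two reference signatures. This is deliberately not BAYESIL's method: it swaps the probabilistic graphical model for ordinary least squares on two components, and the function name `fit_two_component_mixture` and all numbers are invented for illustration.

```python
def fit_two_component_mixture(spectrum, reference_a, reference_b):
    """Least-squares estimate of (c_a, c_b) with spectrum ~ c_a*ref_a + c_b*ref_b."""
    # Entries of the 2x2 normal equations  A c = b.
    a11 = sum(x * x for x in reference_a)
    a12 = sum(x * y for x, y in zip(reference_a, reference_b))
    a22 = sum(y * y for y in reference_b)
    b1 = sum(s * x for s, x in zip(spectrum, reference_a))
    b2 = sum(s * y for s, y in zip(spectrum, reference_b))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("reference signatures are linearly dependent")
    # Cramer's rule for the two unknown concentrations.
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Toy signatures with one overlapping peak (invented intensities).
ref_a = [1.0, 1.0, 0.0, 0.0]
ref_b = [0.0, 1.0, 1.0, 0.0]
# Mixture built as 2.0 * ref_a + 0.5 * ref_b.
mixture = [2.0, 2.5, 0.5, 0.0]
conc_a, conc_b = fit_two_component_mixture(mixture, ref_a, ref_b)
```

Real spectra add peak shifts, baseline distortion and dozens of overlapping compounds, which is why the abstract frames the task as probabilistic inference instead of a direct linear solve.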


Subject(s)
Magnetic Resonance Imaging , Metabolomics/methods , Algorithms
6.
Biomed Res Int ; 2013: 303982, 2013.
Article in English | MEDLINE | ID: mdl-24307992

ABSTRACT

We report an automated diagnostic test that uses the NMR spectrum of a single spot urine sample to accurately distinguish patients who require a colonoscopy from those who do not. Moreover, our approach can be adjusted to trade off sensitivity against specificity. We developed our system using a group of 988 patients (633 normal and 355 who required colonoscopy) who were all at average or above-average risk for developing colorectal cancer. We obtained a metabolic profile of each subject, based on the urine samples collected from these subjects, analyzed via (1)H-NMR and quantified using targeted profiling. Each subject then underwent a colonoscopy, the gold standard to determine whether he/she actually had an adenomatous polyp, a precursor to colorectal cancer. The metabolic profiles, colonoscopy outcomes, and medical histories were then analysed using machine learning to create a classifier that could predict whether a future patient requires a colonoscopy. Our empirical studies show that this classifier has a sensitivity of 64% and a specificity of 65% and, unlike the current fecal tests, allows the administrators of the test to adjust the tradeoff between the two.
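
The adjustable tradeoff between sensitivity and specificity can be illustrated with a score threshold. The sketch below assumes the classifier emits a numeric score per patient and that a patient is flagged for colonoscopy when the score reaches the threshold; the abstract does not describe the mechanism, so this is one plausible reading. The function name and the toy data are hypothetical.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when flagging scores >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Toy cohort (invented): 1 = requires colonoscopy, 0 = does not.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.2, 0.1]

balanced = sensitivity_specificity(toy_labels, toy_scores, 0.5)     # (0.75, 0.75)
sensitive = sensitivity_specificity(toy_labels, toy_scores, 0.25)   # (1.0, 0.5)
specific = sensitivity_specificity(toy_labels, toy_scores, 0.85)    # (0.25, 1.0)
```

Lowering the threshold catches more true cases at the cost of more false alarms, and raising it does the reverse; a test with a fixed cut-off offers no such dial.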


Subject(s)
Artificial Intelligence , Colonic Neoplasms/diagnosis , Colonic Polyps/urine , Metabolomics , Adult , Aged , Colonic Neoplasms/pathology , Colonic Polyps/pathology , Colonoscopy , Female , Humans , Magnetic Resonance Spectroscopy , Male , Middle Aged , Prognosis
7.
Database (Oxford) ; 2013: bat070, 2013.
Article in English | MEDLINE | ID: mdl-24103452

ABSTRACT

Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys.
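
A retention factor of the kind described above can be sketched as a single ratio. The formula used here (processed content multiplied by a yield factor, divided by raw content) is an assumption about how the adjustment for water content is made; the abstract states only that the proportion retained is adjusted for change in water content. The function name, parameter names and numbers are hypothetical.

```python
def retention_factor(content_raw, content_processed, yield_factor):
    """Proportion of a polyphenol retained after processing.

    content_raw, content_processed: polyphenol content per unit mass of food
        before and after processing (same units).
    yield_factor: mass of processed food obtained per unit mass of raw food,
        which accounts for water lost or gained (assumed definition).
    """
    if content_raw <= 0 or yield_factor <= 0:
        raise ValueError("content_raw and yield_factor must be positive")
    return content_processed * yield_factor / content_raw

# Invented example: content drops from 100 to 60 mg/100 g while the food
# also loses 20% of its mass, so 48% of the original polyphenol remains.
rf = retention_factor(content_raw=100.0, content_processed=60.0, yield_factor=0.8)
```

Without the yield factor, water loss during cooking would concentrate the remaining polyphenol and make retention look higher than it is.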


Subject(s)
Databases as Topic , Food Handling , Polyphenols/analysis , Statistics as Topic , User-Computer Interface
8.
PLoS One ; 8(6): e65380, 2013.
Article in English | MEDLINE | ID: mdl-23755224

ABSTRACT

Top differentially expressed gene lists are often inconsistent between studies and it has been suggested that small sample sizes contribute to lack of reproducibility and poor prediction accuracy in discriminative models. We considered sex differences (69 ♂, 65 ♀) in 134 human skeletal muscle biopsies using DNA microarray. The full dataset and subsamples (n = 10 (5 ♂, 5 ♀) to n = 120 (60 ♂, 60 ♀)) thereof were used to assess the effect of sample size on the differential expression of single genes, gene rank order and prediction accuracy. Using our full dataset (n = 134), we identified 717 differentially expressed transcripts (p<0.0001) and we were able to predict sex with ~90% accuracy, both within our dataset and on external datasets. Both p-values and rank order of top differentially expressed genes became more variable using smaller subsamples. For example, at n = 10 (5 ♂, 5 ♀), no gene was considered differentially expressed at p<0.0001 and prediction accuracy was ~50% (no better than chance). We found that sample size clearly affects microarray analysis results; small sample sizes result in unstable gene lists and poor prediction accuracy. We anticipate this will apply to other phenotypes, in addition to sex.
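
The subsampling design (balanced draws from n = 10 up to n = 120) can be sketched for a single gene as follows. This is an illustration under assumptions: the study's actual statistics are not given in the abstract, so the spread of the between-group mean difference is used here as a simple stand-in for "variability", and all function names, the seed and the repeat count are hypothetical.

```python
import random
from statistics import mean, stdev

def balanced_subsample(group_a, group_b, n_total, rng):
    """Draw n_total/2 values without replacement from each group."""
    half = n_total // 2
    return rng.sample(group_a, half), rng.sample(group_b, half)

def mean_difference_spread(group_a, group_b, n_total, repeats=100, seed=0):
    """Standard deviation of the group mean difference across repeated subsamples."""
    rng = random.Random(seed)
    differences = []
    for _ in range(repeats):
        sub_a, sub_b = balanced_subsample(group_a, group_b, n_total, rng)
        differences.append(mean(sub_a) - mean(sub_b))
    return stdev(differences)

# Simulated expression values for one gene (invented), using the group
# sizes from the abstract: 69 male and 65 female biopsies.
sim = random.Random(1)
males = [sim.gauss(1.0, 1.0) for _ in range(69)]
females = [sim.gauss(0.0, 1.0) for _ in range(65)]
spread_by_n = {n: mean_difference_spread(males, females, n) for n in (10, 40, 120)}
```

Comparing the entries of `spread_by_n` shows how much an estimate wobbles from one subsample to the next at each sample size, which is the instability the abstract attributes to small studies.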


Subject(s)
Oligonucleotide Array Sequence Analysis/statistics & numerical data , RNA, Messenger/analysis , Rectus Abdominis/chemistry , Transcriptome , Aged , Female , Genetic Variation , Humans , Male , Middle Aged , Neoplasms/genetics , Neoplasms/pathology , Neoplasms/surgery , Oligonucleotide Array Sequence Analysis/standards , Predictive Value of Tests , RNA, Messenger/genetics , Rectus Abdominis/metabolism , Reproducibility of Results , Sample Size
9.
Nucleic Acids Res ; 41(Database issue): D801-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23161693

ABSTRACT

The Human Metabolome Database (HMDB) (www.hmdb.ca) is a resource dedicated to providing scientists with the most current and comprehensive coverage of the human metabolome. Since its first release in 2007, the HMDB has been used to facilitate research for nearly 1000 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 3.0) has been significantly expanded and enhanced over the 2009 release (version 2.0). In particular, the number of annotated metabolite entries has grown from 6500 to more than 40,000 (a 600% increase). This enormous expansion is a result of the inclusion of both 'detected' metabolites (those with measured concentrations or experimental confirmation of their existence) and 'expected' metabolites (those for which biochemical pathways are known or human intake/exposure is frequent but the compound has yet to be detected in the body). The latest release also has greatly increased the number of metabolites with biofluid or tissue concentration data, the number of compounds with reference spectra and the number of data fields per entry. In addition to this expansion in data quantity, new database visualization tools and new data content have been added or enhanced. These include better spectral viewing tools, more powerful chemical substructure searches, an improved chemical taxonomy and better, more interactive pathway maps. This article describes these enhancements to the HMDB, which was previously featured in the 2009 NAR Database Issue. (Note to referees, HMDB 3.0 will go live on 18 September 2012.).


Subject(s)
Databases, Chemical , Metabolome , Metabolomics , Humans , Internet , Mass Spectrometry , Nuclear Magnetic Resonance, Biomolecular , User-Computer Interface
10.
Database (Oxford) ; 2012: bas031, 2012.
Article in English | MEDLINE | ID: mdl-22879444

ABSTRACT

Phenol-Explorer, launched in 2009, is the only comprehensive web-based database on the content in foods of polyphenols, a major class of food bioactives that receive considerable attention due to their role in the prevention of diseases. Polyphenols are rarely absorbed and excreted in their ingested forms, but extensively metabolized in the body, and until now, no database has allowed the recall of identities and concentrations of polyphenol metabolites in biofluids after the consumption of polyphenol-rich sources. Knowledge of these metabolites is essential in the planning of experiments whose aim is to elucidate the effects of polyphenols on health. Release 2.0 is the first major update of the database, allowing the rapid retrieval of data on the biotransformations and pharmacokinetics of dietary polyphenols. Data on 375 polyphenol metabolites identified in urine and plasma were collected from 236 peer-reviewed publications on polyphenol metabolism in humans and experimental animals and added to the database by means of an extended relational design. Pharmacokinetic parameters have been collected and can be retrieved in both tabular and graphical form. The web interface has been enhanced and now allows the filtering of information according to various criteria. Phenol-Explorer 2.0, which will be periodically updated, should prove to be an even more useful and capable resource for polyphenol scientists because bioactivities and health effects of polyphenols are dependent on the nature and concentrations of metabolites reaching the target tissues. The Phenol-Explorer database is publicly available and can be found online at http://www.phenol-explorer.eu. Database URL: http://www.phenol-explorer.eu.


Subject(s)
Databases, Chemical , Polyphenols/metabolism , Polyphenols/pharmacokinetics , Animals , Food Analysis , Humans , Internet , Software
11.
J Nutr ; 142(1): 14-21, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22157537

ABSTRACT

Urine and plasma metabolites originate from endogenous metabolic pathways in different organs and exogenous sources (diet). Urine and plasma were obtained from advanced cancer patients and investigated to determine if variations in lean and fat mass, dietary intake, and energy metabolism relate to variation in metabolite profiles. Patients (n = 55) recorded their diets for 3 d and after an overnight fast they were evaluated by DXA and indirect calorimetry. Metabolites were measured by NMR and direct injection MS. Three algorithms were used [partial least squares discriminant analysis, support vector machines (SVM), and least absolute shrinkage and selection operator] to relate patients' plasma/urine metabolic profile with their dietary/physiological assessments. Leave-one-out cross-validation and permutation testing were conducted to determine statistical validity. None of the algorithms, using 63 urine metabolites, could learn to predict variations in an individual's resting energy expenditure, respiratory quotient, or their intake of total energy, fat, sugar, or carbohydrate. Urine metabolites predicted appendicular lean tissue (skeletal muscle) with excellent cross-validation accuracy (98% using SVM). Total lean tissue correlated highly with appendicular muscle (Pearson r = 0.98; P < 0.0001) and gave similar cross-validation accuracies. Fat mass was effectively predicted using the 63 urine metabolites or the 143 plasma metabolites, exclusively. In conclusion, in this population, lean and fat mass variation could be effectively predicted using urinary metabolites, suggesting a potential role for metabolomics in body composition research. Furthermore, variation in lean and fat mass potentially confounds metabolomic studies attempting to characterize diet or disease conditions. Future studies should account or correct for such variation.
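
Leave-one-out cross-validation and permutation testing, the two validity checks named above, can be sketched together. To keep the example short, a one-feature nearest-centroid classifier stands in for the PLS-DA, SVM and LASSO models used in the study; that substitution, the function names, the seed, the number of permutations and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean

def loo_accuracy(values, labels):
    """Leave-one-out accuracy of a one-feature nearest-centroid classifier."""
    correct = 0
    for i in range(len(values)):
        # Train on every sample except sample i.
        train = [(v, y) for j, (v, y) in enumerate(zip(values, labels)) if j != i]
        centroids = {
            cls: mean(v for v, y in train if y == cls)
            for cls in set(y for _, y in train)
        }
        predicted = min(centroids, key=lambda c: abs(values[i] - centroids[c]))
        correct += predicted == labels[i]
    return correct / len(values)

def permutation_p_value(values, labels, n_permutations=200, seed=0):
    """Share of label shufflings that score at least as well as the real labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(values, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if loo_accuracy(values, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_good + 1) / (n_permutations + 1)

# Toy data (invented): one metabolite level, two well-separated groups.
toy_values = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
accuracy = loo_accuracy(toy_values, toy_labels)
p_value = permutation_p_value(toy_values, toy_labels)
```

The permutation step guards against a model that only appears accurate: if shuffled labels often reach the same accuracy, the observed result carries little evidence.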


Subject(s)
Adipose Tissue/physiopathology , Metabolome , Muscle, Skeletal/physiopathology , Neoplasms/physiopathology , Absorptiometry, Photon , Algorithms , Calorimetry, Indirect , Energy Metabolism , Humans , Magnetic Resonance Spectroscopy , Mass Spectrometry , Neoplasms/metabolism , Neoplasms/urine
12.
PLoS One ; 6(2): e16957, 2011 Feb 16.
Article in English | MEDLINE | ID: mdl-21359215

ABSTRACT

Continuing improvements in analytical technology, along with an increased interest in performing comprehensive, quantitative metabolic profiling, are leading to increased pressure within the metabolomics community to develop centralized metabolite reference resources for certain clinically important biofluids, such as cerebrospinal fluid, urine and blood. As part of an ongoing effort to systematically characterize the human metabolome through the Human Metabolome Project, we have undertaken the task of characterizing the human serum metabolome. In doing so, we have combined targeted and non-targeted NMR, GC-MS and LC-MS methods with computer-aided literature mining to identify and quantify a comprehensive, if not absolutely complete, set of metabolites commonly detected and quantified (with today's technology) in the human serum metabolome. Our use of multiple metabolomics platforms and technologies allowed us to substantially enhance the level of metabolome coverage while critically assessing the relative strengths and weaknesses of these platforms or technologies. Tables containing the complete set of 4229 confirmed and highly probable human serum compounds, their concentrations, related literature references and links to their known disease associations are freely available at http://www.serummetabolome.ca.


Subject(s)
Metabolome/physiology , Serum/metabolism , Adult , Aged , Blood Chemical Analysis/methods , Blood Proteins/analysis , Blood Proteins/metabolism , Case-Control Studies , Databases, Protein , Female , Gas Chromatography-Mass Spectrometry , Health , Humans , Lipids/analysis , Lipids/blood , Male , Metabolomics/methods , Middle Aged , Nuclear Magnetic Resonance, Biomolecular , Osmolar Concentration , Review Literature as Topic , Serum/chemistry , Spectrometry, Mass, Electrospray Ionization
13.
Nucleic Acids Res ; 39(Database issue): D1035-41, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21059682

ABSTRACT

DrugBank (http://www.drugbank.ca) is a richly annotated database of drug and drug target information. It contains extensive data on the nomenclature, ontology, chemistry, structure, function, action, pharmacology, pharmacokinetics, metabolism and pharmaceutical properties of both small molecule and large molecule (biotech) drugs. It also contains comprehensive information on the target diseases, proteins, genes and organisms on which these drugs act. First released in 2006, DrugBank has become widely used by pharmacists, medicinal chemists, pharmaceutical researchers, clinicians, educators and the general public. Since its last update in 2008, DrugBank has been greatly expanded through the addition of new drugs, new targets and the inclusion of more than 40 new data fields per drug entry (a 40% increase in data 'depth'). These data field additions include illustrated drug-action pathways, drug transporter data, drug metabolite data, pharmacogenomic data, adverse drug response data, ADMET data, pharmacokinetic data, computed property data and chemical classification data. DrugBank 3.0 also offers expanded database links, improved search tools for drug-drug and food-drug interaction, new resources for querying and viewing drug pathways and hundreds of new drug entries with detailed patent, pricing and manufacturer data. These additions have been complemented by enhancements to the quality and quantity of existing data, particularly with regard to drug target, drug description and drug action data. DrugBank 3.0 represents the result of 2 years of manual annotation work aimed at making the database much more useful for a wide range of 'omics' (i.e. pharmacogenomic, pharmacoproteomic, pharmacometabolomic and even pharmacoeconomic) applications.


Subject(s)
Databases, Factual , Pharmacological Phenomena , Metabolomics , Pharmaceutical Preparations/chemistry , Pharmacogenetics , Proteomics , User-Computer Interface
14.
Nucleic Acids Res ; 37(Database issue): D603-10, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18953024

ABSTRACT

The Human Metabolome Database (HMDB, http://www.hmdb.ca) is a richly annotated resource that is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. Since its first release in 2007, the HMDB has been used to facilitate the research for nearly 100 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 2.0) has been significantly expanded and enhanced over the previous release (version 1.0). In particular, the number of fully annotated metabolite entries has grown from 2180 to more than 6800 (a 300% increase), while the number of metabolites with biofluid or tissue concentration data has grown by a factor of five (from 883 to 4413). Similarly, the number of purified compounds with reference NMR, LC-MS and GC-MS spectra has more than doubled (from 380 to more than 790 compounds). In addition to this significant expansion in database size, many new database searching tools and new data content have been added or enhanced. These include better algorithms for spectral searching and matching, more powerful chemical substructure searches, faster text searching software, as well as dedicated pathway searching tools and customized, clickable metabolic maps. Changes to the user interface have also been implemented to accommodate future expansion and to make database navigation much easier. These improvements should make the HMDB much more useful to a much wider community of users.


Subject(s)
Databases, Factual , Metabolome , Humans , Magnetic Resonance Spectroscopy , Mass Spectrometry , Metabolic Networks and Pathways , User-Computer Interface
15.
Article in English | MEDLINE | ID: mdl-18502700

ABSTRACT

With continuing improvements in analytical technology and an increased interest in comprehensive metabolic profiling of biofluids and tissues, there is a growing need to develop comprehensive reference resources for certain clinically important biofluids, such as blood, urine and cerebrospinal fluid (CSF). As part of our effort to systematically characterize the human metabolome we have chosen to characterize CSF as the first biofluid to be intensively scrutinized. In doing so, we combined comprehensive NMR, gas chromatography-mass spectrometry (GC-MS) and liquid chromatography (LC) Fourier transform-mass spectrometry (FTMS) methods with computer-aided literature mining to identify and quantify essentially all of the metabolites that can be commonly detected (with today's technology) in the human CSF metabolome. Tables containing the compounds, concentrations, spectra, protocols and links to disease associations that we have found for the human CSF metabolome are freely available at http://www.csfmetabolome.ca.


Subject(s)
Cerebrospinal Fluid Proteins , Computational Biology/methods , Mass Spectrometry/methods , Cerebrospinal Fluid Proteins/analysis , Chromatography, Liquid/methods , Fourier Analysis , Gas Chromatography-Mass Spectrometry/methods , Humans , Nuclear Magnetic Resonance, Biomolecular
16.
Pac Symp Biocomput ; : 145-56, 2007.
Article in English | MEDLINE | ID: mdl-17990488

ABSTRACT

One of the growing challenges in life science research lies in finding useful, descriptive or quantitative data about newly reported biomolecules (genes, proteins, metabolites and drugs). An even greater challenge is finding information that connects these genes, proteins, drugs or metabolites to each other. Much of this information is scattered through hundreds of different databases, abstracts or books and almost none of it is particularly well integrated. While some efforts are being undertaken at the NCBI and EBI to integrate many different databases together, this still falls short of the goal of having some kind of human-readable synopsis that summarizes the state of knowledge about a given biomolecule - especially small molecules. To address this shortfall, we have developed BioSpider. BioSpider is essentially an automated report generator designed specifically to tabulate and summarize data on biomolecules - both large and small. Specifically, BioSpider allows users to type in almost any kind of biological or chemical identifier (protein/gene name, sequence, accession number, chemical name, brand name, SMILES string, InChI string, CAS number, etc.) and it returns an in-depth synoptic report (approximately 3-30 pages in length) about that biomolecule and any other biomolecule it may target. This summary includes physico-chemical parameters, images, models, data files, descriptions and predictions concerning the query molecule. BioSpider uses a web-crawler to scan through dozens of public databases and employs a variety of specially developed text mining tools and locally developed prediction tools to find, extract and assemble data for its reports. Because of its breadth, depth and comprehensiveness, we believe BioSpider will prove to be a particularly valuable tool for researchers in metabolomics. BioSpider is available at: www.biospider.ca


Subject(s)
Internet , Metabolism , Software , Computational Biology , Databases, Factual , Drug Design , Enzymes/metabolism
17.
Nucleic Acids Res ; 35(Database issue): D521-6, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17202168

ABSTRACT

The Human Metabolome Database (HMDB) is currently the most complete and comprehensive curated collection of human metabolite and human metabolism data in the world. It contains records for more than 2180 endogenous metabolites with information gathered from thousands of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the HMDB also contains an extensive collection of experimental metabolite concentration data compiled from hundreds of mass spectrometry (MS) and nuclear magnetic resonance (NMR) metabolomic analyses performed on urine, blood and cerebrospinal fluid samples. This is further supplemented with thousands of NMR and MS spectra collected on purified, reference metabolites. Each metabolite entry in the HMDB contains an average of 90 separate data fields including a comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, biofluid concentrations, disease associations, pathway information, enzyme data, gene sequence data, SNP and mutation data as well as extensive links to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided. The HMDB is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. The HMDB is available at: www.hmdb.ca.


Subject(s)
Databases, Factual , Metabolism , Databases, Factual/standards , Humans , Internet , Mass Spectrometry , Metabolic Diseases/genetics , Metabolic Diseases/metabolism , Metabolic Networks and Pathways , Nuclear Magnetic Resonance, Biomolecular , Quality Control , User-Computer Interface
18.
Nucleic Acids Res ; 33(Database issue): D147-53, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608166

ABSTRACT

PA-GOSUB (Proteome Analyst: Gene Ontology Molecular Function and Subcellular Localization) is a publicly available, web-based, searchable and downloadable database that contains the sequences, predicted GO molecular functions and predicted subcellular localizations of more than 107,000 proteins from 10 model organisms (and growing), covering the major kingdoms and phyla for which annotated proteomes exist (http://www.cs.ualberta.ca/~bioinfo/PA/GOSUB). The PA-GOSUB database effectively expands the coverage of subcellular localization and GO function annotations by a significant factor (already over five for subcellular localization, compared with Swiss-Prot v42.7), and more model organisms are being added to PA-GOSUB as their sequenced proteomes become available. PA-GOSUB can be used in three main ways. First, a researcher can browse the pre-computed PA-GOSUB annotations on a per-organism and per-protein basis using annotation-based and text-based filters. Second, a user can perform BLAST searches against the PA-GOSUB database and use the annotations from the homologs as simple predictors for the new sequences. Third, the whole of PA-GOSUB can be downloaded in either FASTA or comma-separated values (CSV) formats.


Subject(s)
Databases, Protein , Proteins/chemistry , Proteomics , Amino Acid Sequence , Animals , Artificial Intelligence , Humans , Mice , Models, Animal , Proteins/analysis , Proteins/genetics , Proteins/physiology , Sequence Homology, Amino Acid
19.
Nucleic Acids Res ; 32(Web Server issue): W365-71, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15215412

ABSTRACT

Proteome Analyst (PA) (http://www.cs.ualberta.ca/~bioinfo/PA/) is a publicly available, high-throughput, web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. In addition, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e. a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for potassium-ion channel proteins and other general function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naïve Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user's confidence in, and understanding of, PA.
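
A Naïve Bayes classifier lends itself to explanation because its score is a sum of per-feature terms, which is the property the abstract highlights. The sketch below is a generic multinomial Naïve Bayes with add-one smoothing, not Proteome Analyst's implementation: the class name, the use of annotation keywords as features, the localization labels and the training examples are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with an explain() method."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()
        for tokens, label in zip(documents, labels):
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def _log_likelihood(self, token, label):
        # Laplace (add-one) smoothing keeps unseen tokens from zeroing a class.
        count = self.token_counts[label][token]
        total = sum(self.token_counts[label].values())
        return math.log((count + 1) / (total + len(self.vocabulary)))

    def explain(self, tokens):
        """Map each class to (total log score, per-token contributions)."""
        n_examples = sum(self.class_counts.values())
        report = {}
        for label, n in self.class_counts.items():
            contributions = [(t, self._log_likelihood(t, label)) for t in tokens]
            score = math.log(n / n_examples) + sum(v for _, v in contributions)
            report[label] = (score, contributions)
        return report

    def predict(self, tokens):
        report = self.explain(tokens)
        return max(report, key=lambda label: report[label][0])

# Toy training set (invented keywords and labels).
training_tokens = [
    ["membrane", "transport"],
    ["membrane", "channel"],
    ["nucleus", "dna"],
    ["nucleus", "transcription"],
]
training_labels = ["plasma membrane", "plasma membrane", "nucleus", "nucleus"]
model = NaiveBayes().fit(training_tokens, training_labels)
prediction = model.predict(["channel", "membrane"])
evidence = model.explain(["channel", "membrane"])
```

Reading `evidence` side by side for two classes shows which tokens pushed the prediction one way or the other, which is the kind of transparency a graphical explanation view can build on.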


Subject(s)
Proteome/chemistry , Proteomics , Software , Internet , Proteins/classification , Proteins/physiology , Reproducibility of Results , Sequence Analysis, Protein