1.
Metabolites ; 6(2)2016 May 31.
Article in English | MEDLINE | ID: mdl-27258318

ABSTRACT

Metabolite structure identification remains a significant challenge in nontargeted metabolomics research. One commonly used strategy relies on searching biochemical databases using exact mass. However, this approach fails when the database does not contain the unknown metabolite (i.e., for unknown-unknowns). For these cases, constrained structure generation with combinatorial structure generators provides a potential option. Here we evaluated structure generation constraints based on the specification of: (1) substructures required (i.e., seed structures); (2) substructures not allowed; and (3) filters to remove incorrect structures. Our approach (database assisted structure identification, DASI) used predictive models in MolFind to find candidate structures with chemical and physical properties similar to the unknown. These candidates were then used for seed structure generation using eight different structure generation algorithms. One algorithm was able to generate correct seed structures for 21/39 test compounds. Eleven of these seed structures were large enough to constrain the combinatorial structure generator to fewer than 100,000 structures. In 35/39 cases, at least one algorithm was able to generate a correct seed structure. The DASI method has several limitations and will require further experimental validation and optimization. At present, it seems most useful for identifying the structure of unknown-unknowns with molecular weights <200 Da.

2.
Bioanalysis ; 7(8): 939-55, 2015.
Article in English | MEDLINE | ID: mdl-25966007

ABSTRACT

BACKGROUND: Artificial Neural Networks (ANN) are extensively used to model 'omics' data. Different modeling methodologies and combinations of adjustable parameters influence model performance and complicate model optimization. METHODOLOGY: We evaluated optimization of four ANN modeling parameters (learning rate annealing, stopping criteria, data split method, network architecture) using retention index (RI) data for 390 compounds. Models were assessed by independent validation (I-Val) using newly measured RI values for 1492 compounds. CONCLUSION: The best model demonstrated an I-Val standard error of 55 RI units and was built using a Ward's clustering data split and a minimally nonlinear network architecture. Use of validation statistics for stopping and final model selection resulted in better independent validation performance than the use of test set statistics.


Subject(s)
Artificial Intelligence/standards , Chromatography, High Pressure Liquid/methods , Metabolomics , Neural Networks, Computer , Systems Biology/standards , Cluster Analysis , Databases, Factual , Tandem Mass Spectrometry/methods
3.
J Chem Inf Model ; 53(9): 2483-92, 2013 Sep 23.
Article in English | MEDLINE | ID: mdl-23991755

ABSTRACT

Current methods of structure identification in mass-spectrometry-based nontargeted metabolomics rely on matching experimentally determined features of an unknown compound to those of candidate compounds contained in biochemical databases. A major limitation of this approach is the relatively small number of compounds currently included in these databases. If the correct structure is not present in a database, it cannot be identified, and if it cannot be identified, it cannot be included in a database. Thus, there is an urgent need to augment metabolomics databases with rationally designed biochemical structures using alternative means. Here we present the In Vivo/In Silico Metabolites Database (IIMDB), a database of in silico enzymatically synthesized metabolites, to partially address this problem. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/, includes ~23,000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400,000 computationally generated human phase-I and phase-II metabolites of these known compounds. IIMDB features a user-friendly web interface and a programmer-friendly RESTful web service. Ninety-five percent of the computationally generated metabolites in IIMDB were not found in any existing database. However, 21,640 were identical to compounds already listed in PubChem, HMDB, KEGG, or HumanCyc. Furthermore, the vast majority of these in silico metabolites were scored as biological using BioSM, a software program that identifies biochemical structures in chemical structure space. These results suggest that in silico biochemical synthesis represents a viable approach for significantly augmenting biochemical databases for nontargeted metabolomics applications.


Subject(s)
Databases, Factual , Enzymes/metabolism , Metabolomics/methods , Animals , Glycerophospholipids/metabolism , Humans , Internet , Pharmaceutical Preparations/metabolism , Plants/metabolism , User-Computer Interface
4.
Comput Struct Biotechnol J ; 5: e201302005, 2013.
Article in English | MEDLINE | ID: mdl-24688698

ABSTRACT

The identification of compounds in complex mixtures remains challenging despite recent advances in analytical techniques. At present, no single method can detect and quantify the vast array of compounds that might be of potential interest in metabolomics studies. High performance liquid chromatography/mass spectrometry (HPLC/MS) is often considered the analytical method of choice for analysis of biofluids. The positive identification of an unknown involves matching at least two orthogonal HPLC/MS measurements (exact mass, retention index, drift time etc.) against an authentic standard. However, due to the limited availability of authentic standards, an alternative approach involves matching known and measured features of the unknown compound with computationally predicted features for a set of candidate compounds downloaded from a chemical database. Computationally predicted features include retention index, ECOM50 (energy required to decompose 50% of a selected precursor ion in a collision induced dissociation cell), drift time, whether the unknown compound is biological or synthetic and a collision induced dissociation (CID) spectrum. Computational predictions are used to filter the initial "bin" of candidate compounds. The final output is a ranked list of candidates that best match the known and measured features. In this mini review, we discuss cheminformatics methods underlying this database search-filter identification approach.

5.
Anal Chem ; 84(21): 9388-94, 2012 Nov 06.
Article in English | MEDLINE | ID: mdl-23039714

ABSTRACT

In this paper, we present MolFind, a highly multithreaded pipeline type software package for use as an aid in identifying chemical structures in complex biofluids and mixtures. MolFind is specifically designed for high-performance liquid chromatography/mass spectrometry (HPLC/MS) data inputs typical of metabolomics studies where structure identification is the ultimate goal. MolFind enables compound identification by matching HPLC/MS-based experimental data obtained for an unknown compound with computationally derived HPLC/MS values for candidate compounds downloaded from chemical databases such as PubChem. The downloaded "bins" consist of all compounds matching the monoisotopic molecular weight of the unknown. The computational HPLC/MS values predicted include retention index (RI), ECOM(50) (energy required to fragment 50% of a selected precursor ion), drift time, and collision induced dissociation (CID) spectrum. RI, ECOM(50), and drift-time models are used for filtering compounds downloaded from PubChem. The remaining candidates are then ranked based on CID spectra matching. Current RI and ECOM(50) models allow for the removal of about 28% of compounds from PubChem bins. Our estimates suggest that this could be improved to as much as 87% with additional chemical structures included in the computational models. Quantitative structure property relationship-based modeling of drift times showed a better correlation with experimentally determined drift times than did Mobcal cross-sectional areas. In 23 of 35 example cases, filtering PubChem bins with RI and ECOM(50) predictive models resulted in improved ranking of the unknown compounds compared to previous studies using CID spectra matching alone. In 19 of 35 examples, the correct candidate was ranked within the top 20 compounds in bins containing an average of 1635 compounds.


Subject(s)
Chromatography, High Pressure Liquid/methods , Mass Spectrometry/methods , Software
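
The filter-then-rank workflow described in the MolFind abstract above can be illustrated with a minimal Python sketch. This is not MolFind's actual code or API: the `Candidate` fields, the `filter_and_rank` function, and all tolerance values are hypothetical, chosen only to show the idea of binning by monoisotopic mass, filtering on predicted retention index (RI) and ECOM50, and ranking the survivors by a CID spectrum match score.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate structure with computationally predicted properties (hypothetical fields)."""
    name: str
    mono_mass: float    # monoisotopic mass in Da
    pred_ri: float      # predicted retention index
    pred_ecom50: float  # predicted ECOM50
    cid_score: float    # CID spectrum match score, higher means a better match


def filter_and_rank(candidates, obs_mass, obs_ri, obs_ecom50,
                    mass_tol=0.005, ri_tol=100.0, ecom50_tol=0.5):
    """Keep candidates whose mass, RI, and ECOM50 fall within the given
    tolerances of the observed values, then rank them by CID match score.

    The tolerance defaults are illustrative assumptions, not published values.
    """
    kept = [
        c for c in candidates
        if abs(c.mono_mass - obs_mass) <= mass_tol       # mass "bin"
        and abs(c.pred_ri - obs_ri) <= ri_tol            # RI filter
        and abs(c.pred_ecom50 - obs_ecom50) <= ecom50_tol  # ECOM50 filter
    ]
    # Best CID spectrum match first
    return sorted(kept, key=lambda c: c.cid_score, reverse=True)
```

Calling `filter_and_rank(candidates, obs_mass, obs_ri, obs_ecom50)` returns the surviving candidates in ranked order; in practice the tolerances would be derived from the error of the underlying predictive models rather than fixed constants.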
6.
J Mol Graph Model ; 30: 38-45, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21715202

ABSTRACT

Modeling chemical events inside proteins often requires the incorporation of solvent effects via continuum polarizable models. One of these approaches is based on the assumption that the interface between solute and solvent acts as a conductor. Image charges are added on the molecular surface to satisfy the appropriate conductor boundary conditions in the presence of solute charges. As in the case of other polarizable continuum models that are based on surface tessellation, the simplest implementation of this approach is often limited to several hundred atoms due to a matrix inversion, which scales as the cube of the number of tesserae. For larger systems, approaches that use iterative matrix solvers coupled to fast summation methods must be used. In the present work, we develop a self-consistent approach to obtain conductor-like screening charges suitable for applications in proteins. The approach is based on a density fragmentation of a graphical surface tessellation. This method, although approximate, provides a straightforward scheme of parallelization, which can in principle be added to existing linear scaling implementations of conductor-like models. We implement this method in conjunction with a fixed charge model for the protein, as well as with a moving domain QM/MM description of the protein. In the latter case, the overall result leads to a charge distribution within the protein determined by self-polarization and polarization due to solvent.


Subject(s)
Computer Simulation , Models, Molecular , Proteins/chemistry , Algorithms , Amino Acids/chemistry , Electrochemistry , Linear Models , Protein Structure, Tertiary , Solvents , Surface Properties , Thermodynamics
7.
For Immunopathol Dis Therap ; 2(1): 47-58, 2011.
Article in English | MEDLINE | ID: mdl-21709760

ABSTRACT

Raf kinase inhibitor protein (RKIP) interacts with a number of different proteins and regulates multiple signaling pathways. Here, we show that locostatin, a small molecule that covalently binds RKIP, not only disrupts interactions of RKIP with Raf-1 kinase, but also with G protein-coupled receptor kinase 2. In contrast, we found that locostatin does not disrupt binding of RKIP to two other proteins: inhibitor of κB kinase α and transforming growth factor β-activated kinase 1. These results thus imply that different proteins interact with different regions of RKIP. Locostatin's mechanism of action involves modification of a nucleophilic residue on RKIP. We observed that after binding RKIP, part of locostatin is slowly hydrolyzed, leaving a smaller RKIP-butyrate adduct. We identified the residue alkylated by locostatin as His86, a highly conserved residue in RKIP's ligand-binding pocket. Computational modeling of the binding of locostatin to RKIP suggested that the recognition interaction between small molecule and protein ensures that locostatin's electrophilic site is poised to react with His86. Furthermore, binding of locostatin would sterically hinder binding of other ligands in the pocket. These data provide a basis for understanding how locostatin disrupts particular interactions of RKIP with RKIP-binding proteins and demonstrate its utility as a probe of specific RKIP interactions and functions.

8.
Curr Top Med Chem ; 10(1): 46-54, 2010.
Article in English | MEDLINE | ID: mdl-19929827

ABSTRACT

One of the goals of medicinal chemistry concerns the ability to compute protein-ligand interactions based on the structural knowledge of the receptor. To this end, the majority of current approaches incorporate classical force field potentials to describe receptor-ligand interactions. One of the most critical problems of standard molecular mechanics (MM) force fields is their fixed-charge treatment of electrostatic interactions. Two problems derive from this approximation: polarization and charge transfer. As an immediate step in computational complexity, it seems natural to incorporate Quantum Mechanics (QM) within a hybrid QM/MM approach, which has been shown to be a useful tool to describe structural and mechanistic aspects of chromophores and prosthetic residues in proteins. In this review, we describe specifically the role of QM/MM methods and their various applications to computational drug design and medicinal chemistry research in general.


Subject(s)
Models, Chemical , Pharmaceutical Preparations/chemistry , Quantum Theory , Chemistry, Pharmaceutical , Computer-Aided Design , Drug Design , Ligands
9.
J Mol Model ; 14(6): 479-87, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18427844

ABSTRACT

This work presents new developments of the moving-domain QM/MM (MoD-QM/MM) method for modeling protein electrostatic potentials. The underlying goal of the method is to map the electronic density of a specific protein configuration into a point-charge distribution. Important modifications of the general strategy of the MoD-QM/MM method involve new partitioning and fitting schemes and the incorporation of dynamic effects via a single-step free energy perturbation approach (FEP). Selection of moderately sized QM domains partitioned between C (alpha) and C (from C=O), with incorporation of delocalization of electrons over neighboring domains, results in a marked improvement of the calculated molecular electrostatic potential (MEP). More importantly, we show that the evaluation of the electrostatic potential can be carried out on a dynamic framework by evaluating the free energy difference between a non-polarized MEP and a polarized MEP. A simplified form of the potassium ion channel protein Gramicidin-A from Bacillus brevis is used as the model system for the calculation of MEP.


Subject(s)
Computer Simulation , Models, Chemical , Protein Structure, Tertiary , Proteins/chemistry , Bacillus subtilis , Bacterial Proteins/chemistry , Gramicidin/chemistry , Static Electricity , Thermodynamics