Results 1 - 20 of 35
1.
Metab Eng ; 84: 34-47, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38825177

ABSTRACT

Understanding diverse bacterial nutritional requirements and responses is foundational in microbial research and biotechnology. In this study, we employed knowledge-enriched transcriptomic analytics to decipher complex stress responses of Vibrio natriegens to supplied nutrients, aiming to enhance microbial engineering efforts. We computed 64 independently modulated gene sets that comprise a quantitative basis for transcriptome dynamics across a comprehensive transcriptomics dataset containing a broad array of nutrient conditions. Our approach led to i) the identification of novel transporter systems for diverse substrates, ii) a detailed understanding of how trace elements affect metabolism and growth, and iii) an extensive characterization of nutrient-induced stress responses, including osmotic stress, low glycolytic flux, proteostasis, and altered protein expression. By clarifying the relationship between the acetate-associated regulon and glycolytic flux status of various nutrients, we have showcased its vital role in directing optimal carbon source selection. Our findings offer deep insights into the transcriptional landscape of bacterial nutrition and underscore its significance in tailoring strain engineering strategies, thereby facilitating the development of more efficient and robust microbial systems for biotechnological applications.

2.
mSystems ; : e0030524, 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38829048

ABSTRACT

Fast growth phenotypes are achieved through optimal transcriptomic allocation, in which cells must balance tradeoffs in resource allocation between diverse functions. One such balance between stress readiness and unbridled growth in E. coli has been termed the fear versus greed (f/g) tradeoff. Two specific RNA polymerase (RNAP) mutations observed in adaptation to fast growth have been previously shown to affect the f/g tradeoff, suggesting that genetic adaptations may be primed to control f/g resource allocation. Here, we conduct a greatly expanded study of the genetic control of the f/g tradeoff across diverse conditions. We introduced 12 RNA polymerase (RNAP) mutations commonly acquired during adaptive laboratory evolution (ALE) and obtained expression profiles of each. We found that these single RNAP mutation strains resulted in large shifts in the f/g tradeoff primarily in the RpoS regulon and ribosomal genes, likely through modifying RNAP-DNA interactions. Two of these mutations additionally caused condition-specific transcriptional adaptations. While this tradeoff was previously characterized by the RpoS regulon and ribosomal expression, we find that the GAD regulon plays an important role in stress readiness and ppGpp in translation activity, expanding the scope of the tradeoff. A phylogenetic analysis found the greed-related genes of the tradeoff present in numerous bacterial species. The results suggest that the f/g tradeoff represents a general principle of transcriptome allocation in bacteria where small genetic changes can result in large phenotypic adaptations to growth conditions. IMPORTANCE: To increase growth, E. coli must raise ribosomal content at the expense of non-growth functions. Previous studies have linked RNAP mutations to this transcriptional shift and increased growth but were focused on only two mutations found in the protein's central region. RNAP mutations, however, commonly occur over a large structural range. To explore RNAP mutations' impact, we have introduced 12 RNAP mutations found in laboratory evolution experiments and obtained expression profiles of each. The mutations nearly universally increased growth rates by adjusting said tradeoff away from non-growth functions. In addition to this shift, a few caused condition-specific adaptations. We explored the prevalence of this tradeoff across phylogeny and found it to be a widespread and conserved trend among bacteria.

3.
NAR Genom Bioinform ; 6(2): lqae041, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38774514

ABSTRACT

Microbial genome sequences are rapidly accumulating, enabling large-scale studies of sequence variation. Existing studies primarily focus on coding regions to study amino acid substitution patterns in proteins. However, non-coding regulatory regions also play a distinct role in determining physiologic responses. To investigate intergenic sequence variation on a large-scale, we identified non-coding regulatory region alleles across 2350 Escherichia coli strains. This 'alleleome' consists of 117 781 unique alleles for 1169 reference regulatory regions (transcribing 1975 genes) at single base-pair resolution. We find that 64% of nucleotide positions are invariant, and variant positions vary in a median of just 0.6% of strains. Additionally, non-coding alleles are sufficient to recover E. coli phylogroups. We find that core promoter elements and transcription factor binding sites are significantly conserved, especially those located upstream of essential or highly-expressed genes. However, variability in conservation of transcription factor binding sites is significant both within and across regulons. Finally, we contrast mutations acquired during adaptive laboratory evolution with wild-type variation, finding that the former preferentially alter positions that the latter conserves. Overall, this analysis elucidates the wealth of information found in E. coli non-coding sequence variation and expands pangenomic studies to non-coding regulatory regions at single-nucleotide resolution.

4.
Metab Eng Commun ; 18: e00234, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38711578

ABSTRACT

Kinetic models of metabolism are promising platforms for studying complex metabolic systems and designing production strains. Given the availability of enzyme kinetic data from historical experiments and machine learning estimation tools, a straightforward modeling approach is to assemble kinetic data enzyme by enzyme until a desired scale is reached. However, this type of 'bottom up' parameterization of kinetic models has been difficult due to a number of issues including gaps in kinetic parameters, the complexity of enzyme mechanisms, inconsistencies between parameters obtained from different sources, and in vitro-in vivo differences. Here, we present a computational workflow for the robust estimation of kinetic parameters for detailed mass action enzyme models while taking into account parameter uncertainty. The resulting software package, termed MASSef (the Mass Action Stoichiometry Simulation Enzyme Fitting package), can handle standard 'macroscopic' kinetic parameters, including Km, kcat, Ki, Keq, and nh, as well as diverse reaction mechanisms defined in terms of mass action reactions and 'microscopic' rate constants. We provide three enzyme case studies demonstrating that this approach can identify and reconcile inconsistent data either within in vitro experiments or between in vitro and in vivo enzyme function. We further demonstrate how parameterized enzyme modules can be used to assemble pathway-scale kinetic models consistent with in vivo behavior. This work builds on the legacy of knowledge on kinetic behavior of enzymes by enabling robust parameterization of enzyme kinetic models at scale utilizing the abundance of historical literature data and machine learning parameter estimates.

5.
mSystems ; 9(3): e0125723, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38349131

ABSTRACT

Limosilactobacillus reuteri, a probiotic microbe instrumental to human health and sustainable food production, adapts to diverse environmental shifts via dynamic gene expression. We applied the independent component analysis (ICA) to 117 RNA-seq data sets to decode its transcriptional regulatory network (TRN), identifying 35 distinct signals that modulate specific gene sets. Our findings indicate that the ICA provides a qualitative advancement and captures nuanced relationships within gene clusters that other methods may miss. This study uncovers the fundamental properties of L. reuteri's TRN and deepens our understanding of its arginine metabolism and the co-regulation of riboflavin metabolism and fatty acid conversion. It also sheds light on conditions that regulate genes within a specific biosynthetic gene cluster and allows for the speculation of the potential role of isoprenoid biosynthesis in L. reuteri's adaptive response to environmental changes. By integrating transcriptomics and machine learning, we provide a system-level understanding of L. reuteri's response mechanism to environmental fluctuations, thus setting the stage for modeling the probiotic transcriptome for applications in microbial food production. IMPORTANCE: We have studied Limosilactobacillus reuteri, a beneficial probiotic microbe that plays a significant role in our health and production of sustainable foods, a type of foods that are nutritionally dense and healthier and have low-carbon emissions compared to traditional foods. Similar to how humans adapt their lifestyles to different environments, this microbe adjusts its behavior by modulating the expression of genes. We applied machine learning to analyze large-scale data sets on how these genes behave across diverse conditions. From this, we identified 35 unique patterns demonstrating how L. reuteri adjusts its genes based on 50 unique environmental conditions (such as various sugars, salts, microbial cocultures, human milk, and fruit juice). This research helps us understand better how L. reuteri functions, especially in processes like breaking down certain nutrients and adapting to stressful changes. More importantly, with our findings, we become closer to using this knowledge to improve how we produce more sustainable and healthier foods with the help of microbes.


Subject(s)
Limosilactobacillus reuteri , Probiotics , Humans , Limosilactobacillus reuteri/genetics , Gene Expression Profiling , Transcriptome/genetics , Machine Learning
6.
PLoS Comput Biol ; 20(1): e1011824, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38252668

ABSTRACT

The transcriptional regulatory network (TRN) of E. coli consists of thousands of interactions between regulators and DNA sequences. Regulons are typically determined either from resource-intensive experimental measurement of functional binding sites, or inferred from analysis of high-throughput gene expression datasets. Recently, independent component analysis (ICA) of RNA-seq compendia has been shown to be a powerful method for inferring bacterial regulons. However, it remains unclear to what extent regulons predicted by ICA structure have a biochemical basis in promoter sequences. Here, we address this question by developing machine learning models that predict inferred regulon structures in E. coli based on promoter sequence features. Models were constructed successfully (cross-validation AUROC ≥ 0.8) for 85% (40/47) of ICA-inferred E. coli regulons. We found that: 1) the presence of a high scoring regulator motif in the promoter region was sufficient to specify regulatory activity in 40% (19/47) of the regulons; 2) additional features, such as DNA shape and extended motifs that can account for regulator multimeric binding, helped to specify regulon structure for the remaining 60% of regulons (28/47); and 3) investigating regulons where initial machine learning models failed revealed new regulator-specific sequence features that improved model accuracy. Finally, we found that strong regulatory binding sequences underlie both the genes shared between ICA-inferred and experimental regulons as well as genes in the E. coli core pan-regulon of Fur. This work demonstrates that the structure of ICA-inferred regulons largely can be understood through the strength of regulator binding sites in promoter regions, reinforcing the utility of top-down inference for regulon discovery.


Subject(s)
Escherichia coli , Regulon , Regulon/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Bacteria/genetics , Binding Sites/genetics , Promoter Regions, Genetic/genetics , Gene Expression Regulation, Bacterial/genetics , Bacterial Proteins/metabolism
7.
Metabolites ; 13(11), 2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37999223

ABSTRACT

Pathway analysis is ubiquitous in biological data analysis due to the ability to integrate small simultaneous changes in functionally related components. While pathways are often defined based on either manual curation or network topological properties, an attractive alternative is to generate pathways around specific functions, in which metabolism can be defined as the production and consumption of specific metabolites. In this work, we present an algorithm, termed MetPath, that calculates pathways for condition-specific production and consumption of specific metabolites. We demonstrate that these pathways have several useful properties. Pathways calculated in this manner (1) take into account the condition-specific metabolic role of a gene product, (2) are localized around defined metabolic functions, and (3) quantitatively weigh the importance of expression to a function based on the flux contribution of the gene product. We demonstrate how these pathways elucidate network interactions between genes across different growth conditions and between cell types. Furthermore, the calculated pathways compare favorably to manually curated pathways in predicting the expression correlation between genes. To facilitate the use of these pathways, we have generated a large compendium of pathways under different growth conditions for E. coli. The MetPath algorithm provides a useful tool for metabolic network-based statistical analyses of high-throughput data.

8.
Nucleic Acids Res ; 51(19): 10176-10193, 2023 10 27.
Article in English | MEDLINE | ID: mdl-37713610

ABSTRACT

Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated global expression patterns. We used machine learning to extract 201 modules that account for 86% of known regulatory interactions, creating the regulatory component. With these modules, we identified two novel regulons and quantified systems-level regulatory responses. We also integrated 1675 curated, publicly-available transcriptomes into the resource. We demonstrated workflows for analyzing new data against this knowledge base via deconstruction of regulation during aerobic transition. This resource illuminates the E. coli transcriptome at scale and provides a blueprint for top-down transcriptomic analysis of non-model organisms.


Subject(s)
Escherichia coli , Knowledge Bases , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Gene Expression Profiling , Gene Expression Regulation, Bacterial , Transcriptome
9.
mSystems ; 8(3): e0024723, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37278526

ABSTRACT

Streptococcus pyogenes can cause a wide variety of acute infections throughout the body of its human host. An underlying transcriptional regulatory network (TRN) is responsible for altering the physiological state of the bacterium to adapt to each unique host environment. Consequently, an in-depth understanding of the comprehensive dynamics of the S. pyogenes TRN could inform new therapeutic strategies. Here, we compiled 116 existing high-quality RNA sequencing data sets of invasive S. pyogenes serotype M1 and estimated the TRN structure in a top-down fashion by performing independent component analysis (ICA). The algorithm computed 42 independently modulated sets of genes (iModulons). Four iModulons contained the nga-ifs-slo virulence-related operon, which allowed us to identify carbon sources that control its expression. In particular, dextrin utilization upregulated the nga-ifs-slo operon by activation of two-component regulatory system CovRS-related iModulons, altering bacterial hemolytic activity compared to glucose or maltose utilization. Finally, we show that the iModulon-based TRN structure can be used to simplify the interpretation of noisy bacterial transcriptome data at the infection site. IMPORTANCE: S. pyogenes is a pre-eminent human bacterial pathogen that causes a wide variety of acute infections throughout the body of its host. Understanding the comprehensive dynamics of its TRN could inform new therapeutic strategies. Since at least 43 S. pyogenes transcriptional regulators are known, it is often difficult to interpret transcriptomic data from regulon annotations. This study shows that the novel ICA-based framework for elucidating the underlying regulatory structure of S. pyogenes allows us to interpret the transcriptome profile using data-driven regulons (iModulons). Additionally, the observations of the iModulon architecture lead us to identify the multiple regulatory inputs governing the expression of a virulence-related operon. The iModulons identified in this study serve as a powerful guidepost to further our understanding of S. pyogenes TRN structure and dynamics.


Subject(s)
Streptococcus pyogenes , Toxins, Biological , Humans , Streptococcus pyogenes/genetics , Bacterial Proteins/genetics , Virulence/genetics , Toxins, Biological/metabolism , Transcriptome
10.
Nat Commun ; 14(1): 3390, 2023 06 09.
Article in English | MEDLINE | ID: mdl-37296102

ABSTRACT

Elucidating intracellular drug targets is a difficult problem. While machine learning analysis of omics data has been a promising approach, going from large-scale trends to specific targets remains a challenge. Here, we develop a hierarchic workflow to focus on specific targets based on analysis of metabolomics data and growth rescue experiments. We deploy this framework to understand the intracellular molecular interactions of the multi-valent dihydrofolate reductase-targeting antibiotic compound CD15-3. We analyse global metabolomics data utilizing machine learning, metabolic modelling, and protein structural similarity to prioritize candidate drug targets. Overexpression and in vitro activity assays confirm one of the predicted candidates, HPPK (folK), as a CD15-3 off-target. This study demonstrates how established machine learning methods can be combined with mechanistic analyses to improve the resolution of drug target finding workflows for discovering off-targets of a metabolic inhibitor.


Subject(s)
Anti-Bacterial Agents , Proteins , Proteins/chemistry , Metabolomics , Tetrahydrofolate Dehydrogenase/genetics , Power, Psychological
11.
Res Sq ; 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37090546

ABSTRACT

Fit phenotypes are achieved through optimal transcriptomic allocation. Here, we performed a high-resolution, multi-scale study of the transcriptomic tradeoff between two key fitness phenotypes, stress response (fear) and growth (greed), in Escherichia coli. We introduced twelve RNA polymerase (RNAP) mutations commonly acquired during adaptive laboratory evolution (ALE) and found that single mutations resulted in large shifts in the fear vs. greed tradeoff, likely through destabilizing the rpoB-rpoC interface. RpoS and GAD regulons drive the fear response while ribosomal proteins and the ppGpp regulon underlie greed. Growth rate selection pressure during ALE results in endpoint strains that often have RNAP mutations, with synergistic mutations reflective of particular conditions. A phylogenetic analysis found the tradeoff in numerous bacterial species. The results suggest that the fear vs. greed tradeoff represents a general principle of transcriptome allocation in bacteria where small genetic changes can result in large phenotypic adaptations to growth conditions.

12.
ACS Synth Biol ; 10(12): 3379-3395, 2021 12 17.
Article in English | MEDLINE | ID: mdl-34762392

ABSTRACT

Microbes are being engineered for an increasingly large and diverse set of applications. However, the designing of microbial genomes remains challenging due to the general complexity of biological systems. Adaptive Laboratory Evolution (ALE) leverages nature's problem-solving processes to generate optimized genotypes currently inaccessible to rational methods. The large amount of public ALE data now represents a new opportunity for data-driven strain design. This study describes how novel strain designs, or genome sequences not yet observed in ALE experiments or published designs, can be extracted from aggregated ALE data and demonstrates this by designing, building, and testing three novel Escherichia coli strains with fitnesses comparable to ALE mutants. These designs were achieved through a meta-analysis of aggregated ALE mutations data (63 Escherichia coli K-12 MG1655 based ALE experiments, described by 93 unique environmental conditions, 357 independent evolutions, and 13 957 observed mutations), which additionally revealed global ALE mutation trends that inform on ALE-derived strain design principles. Such informative trends anticipate ALE-derived strain designs as largely gene-centric, as opposed to noncoding, and composed of a relatively small number of beneficial variants (approximately 6). These results demonstrate how strain design efforts can be enhanced by the meta-analysis of aggregated ALE data.


Subject(s)
Escherichia coli K12 , Escherichia coli Proteins , Escherichia coli/genetics , Escherichia coli K12/genetics , Escherichia coli Proteins/genetics , Laboratories , Mutation/genetics
13.
Nat Commun ; 12(1): 3178, 2021 05 26.
Article in English | MEDLINE | ID: mdl-34039963

ABSTRACT

Living systems formed and evolved under constraints that govern their interactions with the inorganic world. These interactions are definable using basic physico-chemical principles. Here, we formulate a comprehensive set of ten governing abiotic constraints that define possible quantitative metabolomes. We apply these constraints to a metabolic network of Escherichia coli that represents 90% of its metabolome. We show that the quantitative metabolomes allowed by the abiotic constraints are consistent with metabolomic and isotope-labeling data. We find that: (i) abiotic constraints drive the evolution of high-affinity phosphate transporters; (ii) charge-, hydrogen- and magnesium-related constraints underlie transcriptional regulatory responses to osmotic stress; and (iii) hydrogen-ion and charge imbalance underlie transcriptional regulatory responses to acid stress. Thus, quantifying the constraints that the inorganic world imposes on living systems provides insights into their key characteristics, helps understand the outcomes of evolutionary adaptation, and should be considered as a fundamental part of theoretical biology and for understanding the constraints on evolution.


Subject(s)
Adaptation, Physiological , Escherichia coli/physiology , Metabolome/physiology , Stress, Physiological , Acids/metabolism , Biological Evolution , Escherichia coli/chemistry , Escherichia coli Proteins/analysis , Escherichia coli Proteins/metabolism , Gene Expression Regulation/physiology , Hydrogen/metabolism , Magnesium/metabolism , Metabolic Networks and Pathways/physiology , Metabolomics , Osmosis , Phosphate Transport Proteins/metabolism , Phosphates/metabolism
14.
PLoS Comput Biol ; 17(1): e1008208, 2021 01.
Article in English | MEDLINE | ID: mdl-33507922

ABSTRACT

Mathematical models of metabolic networks utilize simulation to study system-level mechanisms and functions. Various approaches have been used to model the steady state behavior of metabolic networks using genome-scale reconstructions, but formulating dynamic models from such reconstructions continues to be a key challenge. Here, we present the Mass Action Stoichiometric Simulation Python (MASSpy) package, an open-source computational framework for dynamic modeling of metabolism. MASSpy utilizes mass action kinetics and detailed chemical mechanisms to build dynamic models of complex biological processes. MASSpy adds dynamic modeling tools to the COnstraint-Based Reconstruction and Analysis Python (COBRApy) package to provide a unified framework for constraint-based and kinetic modeling of metabolic networks. MASSpy supports high-performance dynamic simulation through its implementation of libRoadRunner: the Systems Biology Markup Language (SBML) simulation engine. Three examples are provided to demonstrate how to use MASSpy: (1) a validation of the MASSpy modeling tool through dynamic simulation of detailed mechanisms of enzyme regulation; (2) a feature demonstration using a workflow for generating an ensemble of kinetic models using Monte Carlo sampling to approximate missing numerical values of parameters and to quantify biological uncertainty; and (3) a case study in which MASSpy is utilized to overcome issues that arise when integrating experimental data with the computation of functional states of detailed biological mechanisms. MASSpy represents a powerful tool to address challenges that arise in dynamic modeling of metabolic networks, both at small and large scales.
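As a rough illustration of what mass action kinetics means in the abstract above, the sketch below integrates a single reversible reaction A <-> B in plain Python. It does not use MASSpy, COBRApy, or libRoadRunner, and it is not taken from the paper: the function names `mass_action_rates` and `simulate`, the rate constants, and the fixed-step Runge-Kutta integrator are all assumptions chosen for illustration.

```python
def mass_action_rates(conc, kf, kr):
    """Return (dA/dt, dB/dt) for the reversible reaction A <-> B.

    Hypothetical helper: kf and kr are illustrative forward and reverse
    rate constants, not values from the paper.
    """
    a, b = conc
    net_flux = kf * a - kr * b  # forward minus reverse mass-action flux
    return (-net_flux, net_flux)


def simulate(conc0, kf, kr, t_end, dt=0.001):
    """Integrate the rate equations with a fixed-step 4th-order Runge-Kutta scheme."""
    a, b = conc0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = mass_action_rates((a, b), kf, kr)
        k2 = mass_action_rates((a + 0.5 * dt * k1[0], b + 0.5 * dt * k1[1]), kf, kr)
        k3 = mass_action_rates((a + 0.5 * dt * k2[0], b + 0.5 * dt * k2[1]), kf, kr)
        k4 = mass_action_rates((a + dt * k3[0], b + dt * k3[1]), kf, kr)
        a += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        b += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return a, b


# Start with all mass in A; at long times the ratio B/A should approach kf/kr.
a_end, b_end = simulate((1.0, 0.0), kf=2.0, kr=1.0, t_end=20.0)
print(a_end, b_end, b_end / a_end)
```

For this one-reaction system the long-time ratio of B to A approaches kf/kr, which is the equilibrium constant implied by the two rate constants. A dedicated simulation engine would be the practical choice for models of realistic size.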


Subject(s)
Computer Simulation , Metabolic Networks and Pathways , Models, Biological , Software , Systems Biology/methods , Kinetics
15.
Nat Commun ; 9(1): 5252, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30531987

ABSTRACT

Knowing the catalytic turnover numbers of enzymes is essential for understanding the growth rate, proteome composition, and physiology of organisms, but experimental data on enzyme turnover numbers is sparse and noisy. Here, we demonstrate that machine learning can successfully predict catalytic turnover numbers in Escherichia coli based on integrated data on enzyme biochemistry, protein structure, and network context. We identify a diverse set of features that are consistently predictive for both in vivo and in vitro enzyme turnover rates, revealing novel protein structural correlates of catalytic turnover. We use our predictions to parameterize two mechanistic genome-scale modelling frameworks for proteome-limited metabolism, leading to significantly higher accuracy in the prediction of quantitative proteome data than previous approaches. The presented machine learning models thus provide a valuable tool for understanding metabolism and the proteome at the genome scale, and elucidate structural, biochemical, and network properties that underlie enzyme kinetics.


Subject(s)
Escherichia coli Proteins/metabolism , Escherichia coli/enzymology , Machine Learning , Metabolic Networks and Pathways , Algorithms , Biocatalysis , Escherichia coli/genetics , Escherichia coli Proteins/genetics , Kinetics , Models, Biological , Proteome/genetics , Proteome/metabolism
16.
Nat Commun ; 9(1): 5270, 2018 12 10.
Article in English | MEDLINE | ID: mdl-30532008

ABSTRACT

Systems biology describes cellular phenotypes as properties that emerge from the complex interactions of individual system components. Little is known about how these interactions have affected the evolution of metabolic enzymes. Here, we combine genome-scale metabolic modeling with population genetics models to simulate the evolution of enzyme turnover numbers (kcats) from a theoretical ancestor with inefficient enzymes. This systems view of biochemical evolution reveals strong epistatic interactions between metabolic genes that shape evolutionary trajectories and influence the magnitude of evolved kcats. Diminishing returns epistasis prevents enzymes from developing higher kcats in all reactions and keeps the organism far from the potential fitness optimum. Multifunctional enzymes cause synergistic epistasis that slows down adaptation. The resulting fitness landscape allows kcat evolution to be convergent. Predicted kcat parameters show a significant correlation with experimental data, validating our modeling approach. Our analysis reveals how evolutionary forces shape modern kcats and the whole of metabolism.


Subject(s)
Enzymes/genetics , Epistasis, Genetic , Escherichia coli Proteins/genetics , Evolution, Molecular , Genome, Bacterial/genetics , Algorithms , Biocatalysis , Enzymes/metabolism , Escherichia coli K12/enzymology , Escherichia coli K12/genetics , Escherichia coli K12/metabolism , Escherichia coli Proteins/metabolism , Kinetics , Models, Genetic
17.
Trends Biochem Sci ; 43(12): 960-969, 2018 12.
Article in English | MEDLINE | ID: mdl-30472988

ABSTRACT

Reaction equilibrium constants (Keqs) are key parameters that impose thermodynamic constraints on the function of a metabolic network. An important approach for Keq estimation is the group contribution method, which utilizes chemical moiety-based estimates of compound formation energies. In this Opinion, we delineate a number of current challenges with the group contribution method, specifically: (i) problems related to the completeness and quality of data necessary for reliable estimation; and (ii) inadequacies of the method to represent the physical properties of compounds. We then highlight a number of promising approaches to deal with the limitations of group contribution methods. Further advancements should lead to more accurate prediction of equilibrium constants and a better representation of cellular function under biophysical constraints.


Subject(s)
Models, Chemical , Kinetics , Thermodynamics
18.
Proc Natl Acad Sci U S A ; 115(44): 11339-11344, 2018 10 30.
Article in English | MEDLINE | ID: mdl-30309961

ABSTRACT

The structure of the metabolic network contains myriad organism-specific variations across the tree of life, but the selection basis for pathway choices in different organisms is not well understood. Here, we examined the metabolic capabilities with respect to cofactor use and pathway thermodynamics of all sequenced organisms in the Kyoto Encyclopedia of Genes and Genomes Database. We found that (i) many biomass precursors have alternate synthesis routes that vary substantially in thermodynamic favorability and energy cost, creating tradeoffs that may be subject to selection pressure; (ii) alternative pathways in amino acid synthesis are characteristically distinguished by the use of biosynthetically unnecessary acyl-CoA cleavage; (iii) distinct choices preferring thermodynamic-favorable or cofactor-use-efficient pathways exist widely among organisms; (iv) cofactor-use-efficient pathways tend to have a greater yield advantage under anaerobic conditions specifically; and (v) lysine biosynthesis in particular exhibits temperature-dependent thermodynamics and corresponding differential pathway choice by thermophiles. These findings present a view on the evolution of metabolic network structure that highlights a key role of pathway thermodynamics and cofactor use in determining organism pathway choices.


Subject(s)
Biosynthetic Pathways/genetics , Biological Evolution , Biomass , Databases, Genetic , Genome/genetics , Metabolic Networks and Pathways/genetics , Phylogeny , Thermodynamics
19.
Biophys J ; 114(11): 2691-2702, 2018 06 05.
Article in English | MEDLINE | ID: mdl-29874618

ABSTRACT

Reaction-equilibrium constants determine the metabolite concentrations necessary to drive flux through metabolic pathways. Group-contribution methods offer a way to estimate reaction-equilibrium constants at wide coverage across the metabolic network. Here, we present an updated group-contribution method with 1) additional curated thermodynamic data used in fitting and 2) capabilities to calculate equilibrium constants as a function of temperature. We first collected and curated aqueous thermodynamic data, including reaction-equilibrium constants, enthalpies of reaction, Gibbs free energies of formation, enthalpies of formation, entropy changes of formation of compounds, and proton- and metal-ion-binding constants. Next, we formulated the calculation of equilibrium constants as a function of temperature and calculated the standard entropy change of formation (ΔfS∘) using a model based on molecular properties. The median absolute error in estimating ΔfS∘ was 0.013 kJ/K/mol. We also estimated magnesium binding constants for 618 compounds using a linear regression model validated against measured data. We demonstrate the improved performance of the current method (8.17 kJ/mol in median absolute residual) over the current state-of-the-art method (11.47 kJ/mol) in estimating the 185 new reactions added in this work. The efforts here fill in gaps for thermodynamic calculations under various conditions, specifically different temperatures and metal-ion concentrations. These, to our knowledge, new capabilities empower the study of thermodynamic driving forces underlying the metabolic function of organisms living under diverse conditions.
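The temperature dependence mentioned above can be sketched with standard thermodynamic relations. This is a textbook outline, not the paper's exact formulation: it assumes the standard reaction enthalpy and entropy are constant over the temperature range of interest, and the symbols (stoichiometric coefficients ν_i, gas constant R) are introduced here for illustration.

```latex
% Reaction entropy from the formation entropies of the compounds
\Delta_r S^\circ = \sum_i \nu_i \, \Delta_f S^\circ_i

% Standard reaction Gibbs energy at temperature T
% (assuming \Delta_r H^\circ and \Delta_r S^\circ do not vary with T)
\Delta_r G^\circ(T) = \Delta_r H^\circ - T \, \Delta_r S^\circ

% Equilibrium constant at temperature T
K_{eq}(T) = \exp\!\left( -\frac{\Delta_r G^\circ(T)}{R\,T} \right)

% Equivalent van 't Hoff form relating two temperatures
\ln \frac{K_{eq}(T_2)}{K_{eq}(T_1)}
  = -\frac{\Delta_r H^\circ}{R} \left( \frac{1}{T_2} - \frac{1}{T_1} \right)
```

Under these assumptions, an estimate of ΔfS∘ for each compound is what allows an equilibrium constant known at one temperature to be carried over to another.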


Subject(s)
Metabolic Networks and Pathways , Models, Biological , Temperature , Entropy , Linear Models , Magnesium/metabolism
20.
Nat Biotechnol ; 36(3): 272-281, 2018 03.
Article in English | MEDLINE | ID: mdl-29457794

ABSTRACT

Genome-scale network reconstructions have helped uncover the molecular basis of metabolism. Here we present Recon3D, a computational resource that includes three-dimensional (3D) metabolite and protein structure data and enables integrated analyses of metabolic functions in humans. We use Recon3D to functionally characterize mutations associated with disease, and identify metabolic response signatures that are caused by exposure to certain drugs. Recon3D represents the most comprehensive human metabolic network model to date, accounting for 3,288 open reading frames (representing 17% of functionally annotated human genes), 13,543 metabolic reactions involving 4,140 unique metabolites, and 12,890 protein structures. These data provide a unique resource for investigating molecular mechanisms of human metabolism. Recon3D is available at http://vmh.life.


Subject(s)
Computational Biology/methods , Databases, Protein , Metabolic Networks and Pathways/genetics , Databases, Genetic , Humans , Internet , Molecular Sequence Annotation , Open Reading Frames/genetics