ABSTRACT
Constraint-based metabolic models have been used for decades to predict the phenotype of microorganisms in different environments. However, quantitative predictions are limited unless labor-intensive measurements of media uptake fluxes are performed. We show how hybrid neural-mechanistic models can serve as a machine learning architecture that improves phenotype predictions. We illustrate our hybrid models with growth rate predictions of Escherichia coli and Pseudomonas putida grown in different media and with phenotype predictions of gene knock-out Escherichia coli mutants. Our neural-mechanistic models systematically outperform constraint-based models and require training set sizes orders of magnitude smaller than classical machine learning methods. Our hybrid approach opens a doorway to enhancing constraint-based modeling: instead of constraining mechanistic models with additional experimental measurements, our hybrid models harness the power of machine learning while fulfilling mechanistic constraints, thus saving time and resources in typical systems biology or biological engineering projects.
Subject(s)
Biochemical Phenomena , Phenotype , Escherichia coli/genetics , Escherichia coli/metabolism , Models, Biological
ABSTRACT
Spatially resolved transcriptomics (SrT) can investigate organ or tissue architecture from the angle of gene programs that define their molecular complexity. However, computational methods to analyze SrT data underexploit their spatial signature. Inspired by contextual pixel classification strategies applied to image analysis, we developed MULTILAYER to stratify maps into functionally relevant molecular substructures. MULTILAYER applies agglomerative clustering within contiguous locally defined transcriptomes (gene expression elements or "gexels") combined with community detection methods for graphical partitioning. MULTILAYER resolves molecular tissue substructures within a variety of SrT data with superior performance to commonly used dimensionality reduction strategies and still detects differentially expressed genes on par with existing methods. MULTILAYER can process high-resolution as well as multiple SrT data in a comparative mode, anticipating future needs in the field. MULTILAYER provides a digital image perspective for SrT analysis and opens the door to contextual gexel classification strategies for developing self-supervised molecular diagnosis solutions. A record of this paper's transparent peer review process is included in the supplemental information.
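The idea of agglomerative clustering restricted to contiguous gexels can be illustrated with a minimal sketch. This is not MULTILAYER's implementation: the function name `contiguous_clusters`, the Euclidean distance, the single-linkage merge rule, the 4-neighbour adjacency and the `threshold` parameter are all assumptions chosen for illustration, and the community detection step mentioned in the abstract is omitted.

```python
from itertools import product


def contiguous_clusters(grid, threshold):
    """Group neighbouring gexels whose expression vectors are similar.

    grid: 2D list; each cell holds a list of expression values (one gexel)
          or None for an empty spot.
    threshold: largest Euclidean distance at which two adjacent gexels merge
               (hypothetical parameter, not taken from the paper).
    Returns a dict mapping (row, col) -> integer cluster label.
    """
    rows, cols = len(grid), len(grid[0])
    # Union-find forest: every occupied gexel starts as its own cluster.
    parent = {
        (r, c): (r, c)
        for r, c in product(range(rows), range(cols))
        if grid[r][c] is not None
    }

    def find(cell):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Only spatially adjacent gexels (right and down neighbours) may merge.
    edges = []
    for (r, c) in list(parent):
        for nb in ((r, c + 1), (r + 1, c)):
            if nb in parent:
                d = distance(grid[r][c], grid[nb[0]][nb[1]])
                edges.append((d, (r, c), nb))

    # Merge the most similar adjacent pairs first and stop at the threshold.
    for d, a, b in sorted(edges):
        if d > threshold:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Relabel cluster roots as consecutive integers.
    roots = {}
    return {cell: roots.setdefault(find(cell), len(roots)) for cell in list(parent)}


# Toy 2 x 3 map with two genes per gexel: one region high in gene 1,
# the other high in gene 2.
grid = [
    [[5.0, 0.1], [5.1, 0.0], [0.2, 4.9]],
    [[4.9, 0.2], [0.1, 5.0], [0.0, 5.1]],
]
labels = contiguous_clusters(grid, threshold=1.0)
```

On this toy map the gexels fall into two spatially connected groups. Because merging is restricted to adjacent gexels, two distant regions with similar expression would stay separate, which is the property that distinguishes this contextual approach from clustering on expression alone.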