1.
Nat Commun ; 14(1): 7690, 2023 Nov 24.
Article in English | MEDLINE | ID: mdl-38001096

ABSTRACT

Surveillance programs for managing antimicrobial resistance (AMR) have yielded thousands of genomes suited for data-driven mechanism discovery. We present a workflow integrating pangenomics, gene annotation, and machine learning to identify AMR genes at scale. When applied to 12 species, 27,155 genomes, and 69 drugs, we 1) find AMR gene transfer mostly confined within related species, with 925 genes in multiple species but just eight in multiple phylogenetic classes, 2) demonstrate that discovery-oriented support vector machines outperform contemporary methods at recovering known AMR genes, recovering 263 genes compared to 145 by Pyseer, and 3) identify 142 AMR gene candidates. Validation of two candidates in E. coli BW25113 reveals cases of conditional resistance: ΔcycA confers ciprofloxacin resistance in minimal media with D-serine, and frdD V111D confers ampicillin resistance in the presence of ampC by modifying the overlapping promoter. We expect this approach to be adaptable to other species and phenotypes.


Subject(s)
Anti-Bacterial Agents , Escherichia coli , Anti-Bacterial Agents/pharmacology , Escherichia coli/genetics , Drug Resistance, Bacterial/genetics , Phylogeny , Ciprofloxacin/pharmacology
2.
Food Microbiol ; 115: 104334, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37567624

ABSTRACT

Lactobacillaceae represent a large family of important microbes that are foundational to the food industry. Many genome sequences of Lactobacillaceae strains are now available, enabling us to conduct a comprehensive pangenome analysis of this family. We collected 3591 high-quality genomes from public sources and found that: 1) they contained enough genomes for 26 species to perform a pangenomic analysis, 2) the normalized Heaps' coefficient λ (a measure of pangenome openness) was found to have an average value of 0.27 (ranging from 0.07 to 0.37), 3) the pangenome openness was correlated with the abundance and genomic location of transposons and mobilomes, 4) the pangenome for each species was divided into core, accessory, and rare genomes, which highlight species-specific properties (such as motility and restriction-modification systems), 5) the pangenome of Lactiplantibacillus plantarum (which contained the highest number of genomes found amongst the 26 species studied) contained nine distinct phylogroups, and 6) genome mining revealed a richness of detected biosynthetic gene clusters, with functions ranging from antimicrobial and probiotic to food preservation, but ∼93% were of unknown function. This study provides the first in-depth comparative pangenomics analysis of the Lactobacillaceae family.


Subject(s)
Genomics , Lactobacillaceae , Phylogeny
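
Note on entry 2: the abstract reports a normalized Heaps' coefficient λ as its measure of pangenome openness but does not say how it is computed or normalized. As a rough illustration only, the sketch below fits the common power-law form of Heaps' law, size(N) = kappa * N^gamma, by least squares in log-log space. The function name, the synthetic data, and the exponent 0.30 are all assumptions made for this example; they are not the authors' method or results, and gamma here should not be read as the paper's normalized λ.

```python
import math


def fit_heaps_power_law(genome_counts, pangenome_sizes):
    """Fit size = kappa * N**gamma by ordinary least squares on log-log data.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(s) for s in pangenome_sizes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    # Slope of the regression line is the Heaps exponent gamma.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    # Intercept gives log(kappa).
    kappa = math.exp(mean_y - gamma * mean_x)
    return kappa, gamma


# Synthetic, noise-free pangenome growth curve (assumed values, not real data).
genome_counts = list(range(1, 51))
pangenome_sizes = [2000.0 * n ** 0.30 for n in genome_counts]

kappa, gamma = fit_heaps_power_law(genome_counts, pangenome_sizes)
print(f"kappa = {kappa:.1f}, gamma = {gamma:.3f}")
```

Under this convention a larger exponent means the gene count keeps growing as genomes are added (a more open pangenome), while an exponent near zero indicates a closed one.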
3.
Genome Biol ; 24(1): 183, 2023 Aug 08.
Article in English | MEDLINE | ID: mdl-37553643

ABSTRACT

BACKGROUND: Cumulative sequencing efforts have yielded enough genomes to construct pangenomes for dozens of bacterial species and elucidate intraspecies gene conservation. Given the diversity of organisms for which this is achievable, similar analyses for ancestral species are feasible through the integration of pangenomics and phylogenetics, promising deeper insights into the nature of ancient life. RESULTS: We construct pangenomes for 183 bacterial species from 54,085 genomes and identify their core genomes using a novel statistical model to estimate genome-specific error rates and underlying gene frequencies. The core genomes are then integrated into a phylogenetic tree to reconstruct the core genome of the last bacterial common ancestor (LBCA), yielding three main results: First, the gene content of modern and ancestral core genomes are diverse at the level of individual genes but are similarly distributed by functional category and share several poorly characterized genes. Second, the LBCA core genome is distinct from any individual modern core genome but has many fundamental biological systems intact, especially those involving translation machinery and biosynthetic pathways to all major nucleotides and amino acids. Third, despite this metabolic versatility, the LBCA core genome likely requires additional non-core genes for viability, based on comparisons with the minimal organism, JCVI-Syn3A. CONCLUSIONS: These results suggest that many cellular systems commonly conserved in modern bacteria were not just present in ancient bacteria but were nearly immutable with respect to short-term intraspecies variation. Extending this analysis to other domains of life will likely provide similar insights into more distant ancestral species.


Subject(s)
Evolution, Molecular , Genome , Phylogeny , Gene Frequency , Bacteria/genetics , Genome, Bacterial
4.
BMC Genomics ; 23(1): 7, 2022 Jan 04.
Article in English | MEDLINE | ID: mdl-34983386

ABSTRACT

BACKGROUND: With the exponential growth of publicly available genome sequences, pangenome analyses have provided increasingly complete pictures of genetic diversity for many microbial species. However, relatively few studies have scaled beyond single pangenomes to compare global genetic diversity both within and across different species. We present here several methods for "comparative pangenomics" that can be used to contextualize multi-pangenome scale genetic diversity with gene function for multiple species at multiple resolutions: pangenome shape, genes, sequence variants, and positions within variants. RESULTS: Applied to 12,676 genomes across 12 microbial pathogenic species, we observed several shared resolution-specific patterns of genetic diversity: First, pangenome openness is associated with species' phylogenetic placement. Second, relationships between gene function and frequency are conserved across species, with core genomes enriched for metabolic and ribosomal genes and accessory genomes for trafficking, secretion, and defense-associated genes. Third, genes in core genomes with the highest sequence diversity are functionally diverse. Finally, certain protein domains are consistently mutation enriched across multiple species, especially among aminoacyl-tRNA synthetases where the extent of a domain's mutation enrichment is strongly function-dependent. CONCLUSIONS: These results illustrate the value of each resolution at uncovering distinct aspects in the relationship between genetic and functional diversity across multiple species. With the continued growth of the number of sequenced genomes, these methods will reveal additional universal patterns of genetic diversity at the pangenome scale.


Subject(s)
Phylogeny
5.
PLoS Comput Biol ; 16(3): e1007608, 2020 Mar.
Article in English | MEDLINE | ID: mdl-32119670

ABSTRACT

The evolution of antimicrobial resistance (AMR) poses a persistent threat to global public health. Sequencing efforts have already yielded genome sequences for thousands of resistant microbial isolates and require robust computational tools to systematically elucidate the genetic basis for AMR. Here, we present a generalizable machine learning workflow for identifying genetic features driving AMR based on constructing reference strain-agnostic pan-genomes and training random subspace ensembles (RSEs). This workflow was applied to the resistance profiles of 14 antimicrobials across three urgent threat pathogens encompassing 288 Staphylococcus aureus, 456 Pseudomonas aeruginosa, and 1588 Escherichia coli genomes. We find that feature selection by RSE detects known AMR associations more reliably than common statistical tests and previous ensemble approaches, identifying a total of 45 known AMR-conferring genes and alleles across the three organisms, as well as 25 candidate associations backed by domain-level annotations. Furthermore, we find that results from the RSE approach are consistent with existing understanding of fluoroquinolone (FQ) resistance due to mutations in the main drug targets, gyrA and parC, in all three organisms, and suggest the mutational landscape of those genes with respect to FQ resistance is simple. As larger datasets become available, we expect this approach to more reliably predict AMR determinants for a wider range of microbial pathogens.


Subject(s)
Computational Biology/methods , Drug Resistance, Bacterial/genetics , Genome, Bacterial/genetics , Anti-Bacterial Agents/pharmacology , Anti-Infective Agents , Drug Resistance, Multiple, Bacterial/drug effects , Escherichia coli/genetics , Fluoroquinolones/pharmacology , Humans , Machine Learning , Microbial Sensitivity Tests , Pseudomonas aeruginosa/genetics , Staphylococcus aureus/genetics , Whole Genome Sequencing/methods
6.
Microb Cell Fact ; 15: 61, 2016 Apr 11.
Article in English | MEDLINE | ID: mdl-27067813

ABSTRACT

BACKGROUND: Vanillin is an industrially valuable molecule that can be produced from simple carbon sources in engineered microorganisms such as Saccharomyces cerevisiae and Escherichia coli. In E. coli, de novo production of vanillin was demonstrated previously as a proof of concept. In this study, a series of data-driven experiments were performed in order to better understand limitations associated with biosynthesis of vanillate, which is the immediate precursor to vanillin. RESULTS: Time-course experiments monitoring production of heterologous metabolites in the E. coli de novo vanillin pathway revealed a bottleneck in conversion of protocatechuate to vanillate. Perturbations in central metabolism intended to increase flux into the heterologous pathway increased average vanillate titers from 132 to 205 mg/L, but protocatechuate remained the dominant heterologous product on a molar basis. SDS-PAGE, in vitro activity measurements, and L-methionine supplementation experiments suggested that the decline in conversion rate was influenced more by limited availability of the co-substrate S-adenosyl-L-methionine (AdoMet or SAM) than by loss of activity of the heterologous O-methyltransferase. The combination of metJ deletion and overexpression of feedback-resistant variants of metA and cysE, which encode enzymes involved in SAM biosynthesis, increased average de novo vanillate titers by an additional 33% (from 205 to 272 mg/L). An orthogonal strategy intended to improve SAM regeneration through overexpression of native mtn and luxS genes resulted in a 25% increase in average de novo vanillate titers (from 205 to 256 mg/L). Vanillate production improved further upon supplementation with methionine (as high as 419 ± 58 mg/L), suggesting potential for additional enhancement by increasing SAM availability. CONCLUSIONS: Results from this study demonstrate context dependency of engineered pathways and highlight the limited methylation capacity of E. coli. Unlike in previous efforts to improve SAM or methionine biosynthesis, we pursued two orthogonal strategies that are each aimed at deregulating multiple reactions. Our results increase the working knowledge of SAM biosynthesis engineering and provide a framework for improving titers of metabolic products dependent upon methylation reactions.


Subject(s)
Benzaldehydes/metabolism , Escherichia coli , Metabolic Networks and Pathways/genetics , Methyltransferases/genetics , Methyltransferases/metabolism , S-Adenosylmethionine/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Gene Expression Regulation, Bacterial , Metabolic Engineering , Methylation , Organisms, Genetically Modified
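
Note on entry 6: the percentage gains quoted in the abstract can be checked directly from the titers it reports. The short sketch below recomputes them; the helper name `percent_increase` and the step labels are made up for this example, and only the mg/L values come from the abstract.

```python
def percent_increase(before, after):
    """Relative change from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Average vanillate titers (mg/L) as reported in the abstract.
steps = [
    ("central metabolism perturbations", 132.0, 205.0),
    ("metJ deletion + metA/cysE overexpression", 205.0, 272.0),
    ("mtn/luxS overexpression", 205.0, 256.0),
]

for label, before, after in steps:
    change = percent_increase(before, after)
    print(f"{label}: {before:.0f} -> {after:.0f} mg/L ({change:.1f}% increase)")
```

Rounded to whole percentages, the second and third steps reproduce the 33% and 25% increases stated in the abstract.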