Results 1 - 20 of 56
1.
BJS Open ; 7(3), 2023 05 05.
Article in English | MEDLINE | ID: mdl-37161675

ABSTRACT

BACKGROUND: The gold standard treatment for locally advanced rectal cancer is total mesorectal excision after preoperative chemoradiotherapy. Response to chemoradiotherapy varies, with some patients completely responding to the treatment and some failing to respond at all. Identifying biomarkers of response to chemoradiotherapy could allow patients to avoid unnecessary treatment-associated morbidity. While previous studies have attempted to identify such biomarkers, none have reached clinical utility, which may be due to heterogeneity of the cancer. In this study, potential human gene and microbial biomarkers were explored in a cohort of rectal cancer patients who underwent chemoradiotherapy. METHODS: RNA sequencing was carried out on matched tumour and adjacent normal rectum biopsies from patients with rectal cancer with varying chemoradiotherapy responses treated between 2016 and 2019 at two institutions. Enriched genes and microbes from tumours of complete responders were compared with those from tumours of others with lesser response. RESULTS: In 39 patients analysed, enriched gene sets in complete responders indicate involvement of immune responses, including immunoglobulin production, B cell activation and response to bacteria (adjusted P values <0.050). Bacteria such as Ruminococcaceae bacterium and Bacteroides thetaiotaomicron were documented to be abundant in tumours of complete responders compared with all other patients (adjusted P value <0.100). CONCLUSION: These results identify potential genetic and microbial biomarkers of response to chemoradiotherapy in rectal cancer, as well as suggesting a potential mechanism of complete response to chemoradiotherapy that may benefit further testing in the laboratory.


Subject(s)
Rectal Neoplasms , Humans , Rectal Neoplasms/genetics , Rectal Neoplasms/radiotherapy , Chemoradiotherapy
2.
Proc Natl Acad Sci U S A ; 120(5): e2206945119, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36693089

ABSTRACT

Quantifying SARS-like coronavirus (SL-CoV) evolution is critical to understanding the origins of SARS-CoV-2 and the molecular processes that could underlie future epidemic viruses. While genomic analyses suggest recombination was a factor in the emergence of SARS-CoV-2, few studies have quantified recombination rates among SL-CoVs. Here, we infer recombination rates of SL-CoVs from correlated substitutions in sequencing data using a coalescent model with recombination. Our computationally-efficient, non-phylogenetic method infers recombination parameters of both sampled sequences and the unsampled gene pools with which they recombine. We apply this approach to infer recombination parameters for a range of positive-sense RNA viruses. We then analyze a set of 191 SL-CoV sequences (including SARS-CoV-2) and find that ORF1ab and S genes frequently undergo recombination. We identify which SL-CoV sequence clusters have recombined with shared gene pools, and show that these pools have distinct structures and high recombination rates, with multiple recombination events occurring per synonymous substitution. We find that individual genes have recombined with different viral reservoirs. By decoupling contributions from mutation and recombination, we recover the phylogeny of non-recombined portions for many of these SL-CoVs, including the position of SARS-CoV-2 in this clonal phylogeny. Lastly, by analyzing >400,000 SARS-CoV-2 whole genome sequences, we show current diversity levels are insufficient to infer the within-population recombination rate of the virus since the pandemic began. Our work offers new methods for inferring recombination rates in RNA viruses with implications for understanding recombination in SARS-CoV-2 evolution and the structure of clonal relationships and gene pools shaping its origins.


Subject(s)
COVID-19 , Chiroptera , Animals , COVID-19/genetics , SARS-CoV-2/genetics , Gene Pool , Phylogeny , Genomics , Genome, Viral/genetics , Evolution, Molecular
3.
G3 (Bethesda) ; 13(2), 2023 02 09.
Article in English | MEDLINE | ID: mdl-36454087

ABSTRACT

DNA methylation in bacteria frequently serves as a simple immune system, allowing recognition of DNA from foreign sources, such as phages or selfish genetic elements. However, DNA methylation also affects other cell phenotypes in a heritable manner (i.e. epigenetically). While there are several examples of methylation affecting transcription in an epigenetic manner in highly localized contexts, it is not well-established how frequently methylation serves a more general epigenetic function over larger genomic scales. To address this question, here we use Oxford Nanopore sequencing to profile DNA modification marks in three natural isolates of Escherichia coli. We first identify the DNA sequence motifs targeted by the methyltransferases in each strain. We then quantify the frequency of methylation at each of these motifs across the entire genome in different growth conditions. We find that motifs in specific regions of the genome consistently exhibit high or low levels of methylation. Furthermore, we show that there are replicable and consistent differences in methylated regions across different growth conditions. This suggests that during growth, E. coli transiently differentiate into distinct methylation states that depend on the growth state, raising the possibility that measuring DNA methylation alone can be used to infer bacterial growth states without additional information such as transcriptome or proteome data. These results show the utility of using Oxford Nanopore sequencing as an economic means to infer DNA methylation status. They also provide new insights into the dynamics of methylation during bacterial growth and provide evidence of differentiated cell states, a transient analog to what is observed in the differentiation of cell types in multicellular organisms.


Subject(s)
DNA Methylation , Escherichia coli , DNA Methylation/genetics , Escherichia coli/genetics , Genomics , Cell Differentiation , DNA , Epigenesis, Genetic
4.
Preprint in English | bioRxiv | ID: ppbiorxiv-505425

ABSTRACT

Quantifying SARS-like coronavirus (SL-CoV) evolution is critical to understanding the origins of SARS-CoV-2 and the molecular processes that could underlie future epidemic viruses. While genomic evidence implicates recombination as a factor in the emergence of SARS-CoV-2, few studies have quantified recombination rates among SL-CoVs. Here, we infer recombination rates of SL-CoVs from correlated substitutions in sequencing data using a coalescent model with recombination. Our computationally-efficient, non-phylogenetic method infers recombination parameters of both sampled sequences and the unsampled gene pools with which they recombine. We apply this approach to infer recombination parameters for a range of positive-sense RNA viruses. We then analyze a set of 191 SL-CoV sequences (including SARS-CoV-2) and find that ORF1ab and S genes frequently undergo recombination. We identify which SL-CoV sequence clusters have recombined with shared gene pools, and show that these pools have distinct structures and high recombination rates, with multiple recombination events occurring per synonymous substitution. We find that individual genes have recombined with different viral reservoirs. By decoupling contributions from mutation and recombination, we recover the phylogeny of non-recombined portions for many of these SL-CoVs, including the position of SARS-CoV-2 in this clonal phylogeny. Lastly, by analyzing 444,145 SARS-CoV-2 whole genome sequences, we show current diversity levels are insufficient to infer the within-population recombination rate of the virus since the pandemic began. Our work offers new methods for inferring recombination rates in RNA viruses with implications for understanding recombination in SARS-CoV-2 evolution and the structure of clonal relationships and gene pools shaping its origins. 
Significance Statement: Quantifying the population genetics of SARS-like coronavirus (SL-CoV) evolution is vital to deciphering the origins of SARS-CoV-2 and pinpointing viruses with epidemic potential. While some Bayesian approaches can quantify recombination for these pathogens, the required simulations of recombination networks do not scale well with the massive amounts of sequences available in the genomics era. Our approach circumvents this by measuring correlated substitutions in sequences and fitting these data to a coalescent model with recombination. This allows us to analyze hundreds of thousands of sample sequences, and infer recombination rates for unsampled viral reservoirs. Our results provide insights into both the clonal relationships of sampled SL-CoV sequence clusters and the evolutionary dynamics of the gene pools with which they recombine.

5.
Nat Ecol Evol ; 6(8): 1165-1179, 2022 08.
Article in English | MEDLINE | ID: mdl-35726087

ABSTRACT

Bacteria often respond to dynamically changing environments by regulating gene expression. Despite this regulation being critically important for growth and survival, little is known about how selection shapes gene regulation in natural populations. To better understand the role natural selection plays in shaping bacterial gene regulation, here we compare differences in the regulatory behaviour of naturally segregating promoter variants from Escherichia coli (which have been subject to natural selection) to randomly mutated promoter variants (which have never been exposed to natural selection). We quantify gene expression phenotypes (expression level, plasticity and noise) for hundreds of promoter variants across multiple environments and show that segregating promoter variants are enriched for mutations with minimal effects on expression level. In many promoters, we infer that there is strong selection to maintain high levels of plasticity, and direct selection to decrease or increase cell-to-cell variability in expression. Taken together, these results expand our knowledge of how gene regulation is affected by natural selection and highlight the power of comparing naturally segregating polymorphisms to de novo random mutations to quantify the action of selection.


Subject(s)
Escherichia coli , Gene Expression Regulation , Escherichia coli/genetics , Phenotype , Promoter Regions, Genetic , Selection, Genetic
6.
R Soc Open Sci ; 9(1): 211550, 2022 Jan.
Article in English | MEDLINE | ID: mdl-35242350

ABSTRACT

Most animal mitochondrial genomes are small, circular and structurally conserved. However, recent work indicates that diverse taxa possess unusual mitochondrial genomes. In Isopoda, species in multiple lineages have atypical and rearranged mitochondrial genomes. However, more species of this speciose taxon need to be evaluated to understand the evolutionary origins of atypical mitochondrial genomes in this group. In this study, we report the presence of an atypical mitochondrial structure in the New Zealand endemic marine isopod, Isocladus armatus. Data from long- and short-read DNA sequencing suggest that I. armatus has two mitochondrial chromosomes. The first chromosome consists of two mitochondrial genomes that have been inverted and fused together in a circular form, and the second chromosome consists of a single mitochondrial genome in a linearized form. This atypical mitochondrial structure has been detected in other isopod lineages, and our data from an additional divergent isopod lineage (Sphaeromatidae) lends support to the hypothesis that atypical structure evolved early in the evolution of Isopoda. Additionally, we find that an asymmetrical site previously observed across many species within Isopoda is absent in I. armatus, but confirm the presence of two asymmetrical sites recently reported in two other isopod species.

7.
Microbiologyopen ; 10(4): e1232, 2021 08.
Article in English | MEDLINE | ID: mdl-34459545

ABSTRACT

The expanding knowledge of the variety of synthetic genetic elements has enabled the construction of new and more efficient genetic circuits and yielded novel insights into molecular mechanisms. However, context dependence, in which interactions between cis- or trans-genetic elements affect the behavior of these elements, can reduce their general applicability or predictability. Genetic insulators, which mitigate unintended context-dependent cis-interactions, have been used to address this issue. One of the most commonly used genetic insulators is a self-splicing ribozyme called RiboJ, which can be used to decouple upstream 5' UTR in mRNA from downstream sequences (e.g., open reading frames). Despite its general use as an insulator, there has been no systematic study quantifying the efficiency of RiboJ splicing or whether this autocatalytic activity is robust to trans- and cis-genetic context. Here, we determine the robustness of RiboJ splicing in the genetic context of six widely divergent E. coli strains. We also check for possible cis-effects by assessing two SNP versions close to the catalytic site of RiboJ. We show that mRNA molecules containing RiboJ are rapidly spliced even during rapid exponential growth and high levels of gene expression, with a mean efficiency of 98%. We also show that neither the cis- nor trans-genetic context has a significant impact on RiboJ activity, suggesting this element is robust to both cis- and trans-genetic changes.


Subject(s)
Escherichia coli/genetics , Gene Expression Regulation, Bacterial/genetics , RNA Splicing/genetics , RNA, Catalytic/genetics , 5' Untranslated Regions/genetics , Escherichia coli/growth & development , Genome, Bacterial/genetics , Lac Operon/genetics , Open Reading Frames/genetics , Plasmids/genetics , Polymorphism, Single Nucleotide/genetics , Promoter Regions, Genetic/genetics , RNA, Messenger/genetics
8.
Emerg Infect Dis ; 27(5): 1317-1322, 2021 05.
Article in English | MEDLINE | ID: mdl-33900175

ABSTRACT

Real-time genomic sequencing has played a major role in tracking the global spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), contributing greatly to disease mitigation strategies. In August 2020, after having eliminated the virus, New Zealand experienced a second outbreak. During that outbreak, New Zealand used genomic sequencing in a primary role, leading to a second elimination of the virus. We generated genomes from 78% of the laboratory-confirmed samples of SARS-CoV-2 from the second outbreak and compared them with the available global genomic data. Genomic sequencing rapidly identified that the virus causing the second outbreak in New Zealand belonged to a single cluster, thus resulting from a single introduction. However, successful identification of the origin of this outbreak was impeded by substantial biases and gaps in global sequencing data. Access to a broader and more heterogenous sample of global genomic data would strengthen efforts to locate the source of any new outbreaks.


Subject(s)
COVID-19 , SARS-CoV-2 , Disease Outbreaks , Genomics , Humans , New Zealand/epidemiology
9.
Preprint in English | medRxiv | ID: ppmedrxiv-20221853

ABSTRACT

BACKGROUND: Real-time genomic sequencing has played a major role in tracking the global spread and local transmission of SARS-CoV-2, contributing greatly to disease mitigation strategies. After effectively eliminating the virus, New Zealand experienced a second outbreak of SARS-CoV-2 in August 2020. During this August outbreak, New Zealand utilised genomic sequencing in a primary role to support its track and trace efforts for the first time, leading to a second successful elimination of the virus. METHODS: We generated the genomes of 80% of the laboratory-confirmed samples of SARS-CoV-2 from New Zealand's August 2020 outbreak and compared these genomes to the available global genomic data. FINDINGS: Genomic sequencing was able to rapidly identify that the new COVID-19 cases in New Zealand belonged to a single cluster and hence resulted from a single introduction. However, successful identification of the origin of this outbreak was impeded by substantial biases and gaps in global sequencing data. INTERPRETATION: Access to a broader and more heterogenous sample of global genomic data would strengthen efforts to locate the source of any new outbreaks. FUNDING: This work was funded by the Ministry of Health of New Zealand, New Zealand Ministry of Business, Innovation and Employment COVID-19 Innovation Acceleration Fund (CIAF-0470), ESR Strategic Innovation Fund and the New Zealand Health Research Council (20/1018 and 20/1041).

10.
Biol Methods Protoc ; 5(1): bpaa014, 2020.
Article in English | MEDLINE | ID: mdl-33029559

ABSTRACT

Rapid and cost-efficient whole-genome sequencing of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019, is critical for understanding viral transmission dynamics. Here we show that using a new multiplexed set of primers in conjunction with the Oxford Nanopore Rapid Barcode library kit allows for faster, simpler, and less expensive SARS-CoV-2 genome sequencing. This primer set results in amplicons that exhibit lower levels of variation in coverage compared to other commonly used primer sets. Using five SARS-CoV-2 patient samples with Cq values between 20 and 31, we show that high-quality genomes can be generated with as few as 10 000 reads (∼5 Mbp of sequence data). We also show that mis-classification of barcodes, which may be more likely when using the Oxford Nanopore Rapid Barcode library prep, is unlikely to cause problems in variant calling. This method reduces the time from RNA to genome sequence by more than half compared to the more standard ligation-based Oxford Nanopore library preparation method at considerably lower costs.

11.
Microbiol Resour Announc ; 9(38), 2020 Sep 17.
Article in English | MEDLINE | ID: mdl-32943554

ABSTRACT

Escherichia coli is commonly considered a host-associated bacterium. However, there is evidence that some strains occupy environmental (non-host-associated) niches. Here, we report the complete genomes of 47 Escherichia coli environmental isolates. These will be useful for understanding the dynamics of plasmids, phages, and other repetitive genetic elements.

12.
BMC Bioinformatics ; 21(1): 220, 2020 May 29.
Article in English | MEDLINE | ID: mdl-32471343

ABSTRACT

BACKGROUND: The first step in understanding ecological community diversity and dynamics is quantifying community membership. An increasingly common method for doing so is through metagenomics. Because of the rapidly increasing popularity of this approach, a large number of computational tools and pipelines are available for analysing metagenomic data. However, the majority of these tools have been designed and benchmarked using highly accurate short read data (i.e. Illumina), with few studies benchmarking classification accuracy for long error-prone reads (PacBio or Oxford Nanopore). In addition, few tools have been benchmarked for non-microbial communities. RESULTS: Here we compare simulated long reads from Oxford Nanopore and Pacific Biosciences (PacBio) with high accuracy Illumina read sets to systematically investigate the effects of sequence length and taxon type on classification accuracy for metagenomic data from both microbial and non-microbial communities. We show that very generally, classification accuracy is far lower for non-microbial communities, even at low taxonomic resolution (e.g. family rather than genus). We then show that for two popular taxonomic classifiers, long reads can significantly increase classification accuracy, and this is most pronounced for non-microbial communities. CONCLUSIONS: This work provides insight on the expected accuracy for metagenomic analyses for different taxonomic groups, and establishes the point at which read length becomes more important than error rate for assigning the correct taxon.


Subject(s)
Metagenomics/methods , Computer Simulation , Eukaryota/genetics , High-Throughput Nucleotide Sequencing , Nanopore Sequencing , Sequence Analysis, DNA
13.
Preprint in English | bioRxiv | ID: ppbiorxiv-122648

ABSTRACT

Rapid and cost-efficient whole-genome sequencing of SARS-CoV-2, the virus that causes COVID-19, is critical for understanding viral transmission dynamics. Here we show that using a new multiplexed set of primers in conjunction with the Oxford Nanopore Rapid Barcode library kit allows for faster, simpler, and less expensive SARS-CoV-2 genome sequencing. This primer set results in amplicons that exhibit lower levels of variation in coverage compared to other commonly used primer sets. Using five SARS-CoV-2 patient samples with Cq values between 20 and 31, we show that high-quality genomes can be generated with as few as 10,000 reads (approximately 5 Mbp of sequence data). We also show that mis-classification of barcodes, which may be more likely when using the Oxford Nanopore Rapid Barcode library prep, is unlikely to cause problems in variant calling. This method reduces the time from RNA to genome sequence by more than half compared to the more standard ligation-based Oxford Nanopore library preparation method at considerably lower costs.
Competing Interest Statement: The authors have declared no competing interest.

14.
Ecol Evol ; 10(24): 13624-13639, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33391668

ABSTRACT

Population genetic structure in the marine environment can be influenced by life-history traits such as developmental mode (biphasic, with distinct adult and larval morphology, and direct development, in which larvae resemble adults) or habitat specificity, as well as geography and selection. Developmental mode is thought to significantly influence dispersal, with direct developers expected to have much lower dispersal potential. However, this prediction can be complicated by the presence of geophysical barriers to dispersal. In this study, we use a panel of 8,020 SNPs to investigate population structure and biogeography over multiple spatial scales for a direct-developing species, the New Zealand endemic marine isopod Isocladus armatus. Because our sampling range is intersected by two well-known biogeographic barriers (the East Cape and the Cook Strait), our study provides an opportunity to understand how such barriers influence dispersal in direct developers. On a small spatial scale (20 km), gene flow between locations is extremely high, suggestive of an island model of migration. However, over larger spatial scales (600 km), populations exhibit a clear pattern of isolation-by-distance. Our results indicate that I. armatus exhibits significant migration across the hypothesized barriers and suggest that large-scale ocean currents associated with these locations do not present a barrier to dispersal. Interestingly, we find evidence of a north-south population genetic break occurring between Mahia and Wellington. While no known geophysical barrier is apparent in this area, it coincides with the location of a proposed border between bioregions. Analysis of loci under selection revealed that both isolation-by-distance and adaption may be contributing to the degree of population structure we have observed here. We conclude that developmental life history largely predicts dispersal in the intertidal isopod I. armatus. However, localized biogeographic processes can disrupt this expectation, and this may explain the potential meta-population detected in the Auckland region.

15.
Nat Commun ; 9(1): 212, 2018 01 15.
Article in English | MEDLINE | ID: mdl-29335514

ABSTRACT

Much is still not understood about how gene regulatory interactions control cell fate decisions in single cells, in part due to the difficulty of directly observing gene regulatory processes in vivo. We introduce here a novel integrated setup consisting of a microfluidic chip and accompanying analysis software that enable long-term quantitative tracking of growth and gene expression in single cells. The dual-input Mother Machine (DIMM) chip enables controlled and continuous variation of external conditions, allowing direct observation of gene regulatory responses to changing conditions in single cells. The Mother Machine Analyzer (MoMA) software achieves unprecedented accuracy in segmenting and tracking cells, and streamlines high-throughput curation with a novel leveraged editing procedure. We demonstrate the power of the method by uncovering several novel features of an iconic gene regulatory program: the induction of Escherichia coli's lac operon in response to a switch from glucose to lactose.


Subject(s)
Gene Expression Regulation, Bacterial , Microfluidic Analytical Techniques/methods , Single-Cell Analysis/methods , Software , Algorithms , Cell Tracking/instrumentation , Cell Tracking/methods , Escherichia coli/cytology , Escherichia coli/drug effects , Escherichia coli/genetics , Glucose/pharmacology , Lac Operon/genetics , Lactose/pharmacology , Single-Cell Analysis/instrumentation
16.
Genome Announc ; 5(43), 2017 Oct 26.
Article in English | MEDLINE | ID: mdl-29074662

ABSTRACT

Seven mycobacteriophages from distinct geographical locations were isolated, using Mycobacterium smegmatis mc²155 as the host, and then purified and sequenced. All of the genomes are related to cluster A mycobacteriophages: BobSwaget and Lokk in subcluster A2; Fred313, KADY, Stagni, and StepMih in subcluster A3; and MyraDee in subcluster A18, the first phage to be assigned to that subcluster.

17.
BMC Microbiol ; 16(1): 203, 2016 09 06.
Article in English | MEDLINE | ID: mdl-27599549

ABSTRACT

BACKGROUND: Gene essentiality - whether or not a gene is necessary for cell growth - is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. RESULTS: Here we present the results of a Tn-seq experiment designed to detect essential protein coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. Superficial analysis of this data suggested that 481 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 revealed that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, and in converse to this comparison, we found no genes that were essential in E. coli and non-essential in Shigella, implying that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. CONCLUSIONS: The data presented here highlight the small number of protein coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.


Subject(s)
Escherichia coli/growth & development , Escherichia coli/genetics , Gene Deletion , Genes, Essential/genetics , Genes, Essential/physiology , Shigella/growth & development , Shigella/genetics , Anti-Bacterial Agents/therapeutic use , Bacterial Proteins/genetics , Base Sequence , Chromosomes, Bacterial , DNA Transposable Elements , DNA, Bacterial , Escherichia coli K12/genetics , Escherichia coli K12/growth & development , Gene Expression Profiling , Genes, Bacterial/genetics , Mutagenesis , Open Reading Frames/genetics , Plasmids , Polymorphism, Single Nucleotide/physiology , Shigella flexneri/genetics , Shigella flexneri/growth & development , Species Specificity
18.
Elife ; 4, 2015 Jun 17.
Article in English | MEDLINE | ID: mdl-26080931

ABSTRACT

Although it is often tacitly assumed that gene regulatory interactions are finely tuned, how accurate gene regulation could evolve from a state without regulation is unclear. Moreover, gene expression noise would seem to impede the evolution of accurate gene regulation, and previous investigations have provided circumstantial evidence that natural selection has acted to lower noise levels. By evolving synthetic Escherichia coli promoters de novo, we here show that, contrary to expectations, promoters exhibit low noise by default. Instead, selection must have acted to increase the noise levels of highly regulated E. coli promoters. We present a general theory of the interplay between gene expression noise and gene regulation that explains these observations. The theory shows that propagation of expression noise from regulators to their targets is not an unwanted side-effect of regulation, but rather acts as a rudimentary form of regulation that facilitates the evolution of more accurate regulation.


Subject(s)
Evolution, Molecular , Gene Expression Regulation/physiology , Models, Genetic , Selection, Genetic , Directed Molecular Evolution/methods , Escherichia coli/genetics , Promoter Regions, Genetic/genetics , Stochastic Processes
19.
Mol Biol Evol ; 31(5): 1077-88, 2014 May.
Article in English | MEDLINE | ID: mdl-24600054

ABSTRACT

Studies of microbial evolutionary dynamics are being transformed by the availability of affordable high-throughput sequencing technologies, which allow whole-genome sequencing of hundreds of related taxa in a single study. Reconstructing a phylogenetic tree of these taxa is generally a crucial step in any evolutionary analysis. Instead of constructing genome assemblies for all taxa, annotating these assemblies, and aligning orthologous genes, many recent studies 1) directly map raw sequencing reads to a single reference sequence, 2) extract single nucleotide polymorphisms (SNPs), and 3) infer the phylogenetic tree using maximum likelihood methods from the aligned SNP positions. However, here we show that, when using such methods to reconstruct phylogenies from sets of simulated sequences, both the exclusion of nonpolymorphic positions and the alignment to a single reference genome, introduce systematic biases and errors in phylogeny reconstruction. To address these problems, we developed a new method that combines alignments from mappings to multiple reference sequences and show that this successfully removes biases from the reconstructed phylogenies. We implemented this method as a web server named REALPHY (Reference sequence Alignment-based Phylogeny builder), which fully automates phylogenetic reconstruction from raw sequencing reads.


Subject(s)
Genomics/methods , Phylogeny , Algorithms , Computer Simulation , Escherichia coli/genetics , Evolution, Molecular , Genome, Bacterial , High-Throughput Nucleotide Sequencing , Likelihood Functions , Models, Genetic , Polymorphism, Single Nucleotide , Pseudomonas syringae/genetics , Reproducibility of Results , Sequence Alignment , Sinorhizobium meliloti/genetics
20.
Proc Natl Acad Sci U S A ; 111(8): 3044-9, 2014 Feb 25.
Article in English | MEDLINE | ID: mdl-24516157

ABSTRACT

Determining the molecular changes that give rise to functional innovations is a major unresolved problem in biology. The paucity of examples has served as a significant hindrance in furthering our understanding of this process. Here we used experimental evolution with the bacterium Escherichia coli to quantify the molecular changes underlying functional innovation in 68 independent instances ranging over 22 different metabolic functions. Using whole-genome sequencing, we show that the relative contribution of regulatory and structural mutations depends on the cellular context of the metabolic function. In addition, we find that regulatory mutations affect genes that act in pathways relevant to the novel function, whereas structural mutations affect genes that act in unrelated pathways. Finally, we use population genetic modeling to show that the relative contributions of regulatory and structural mutations during functional innovation may be affected by population size. These results provide a predictive framework for the molecular basis of evolutionary innovation, which is essential for anticipating future evolutionary trajectories in the face of rapid environmental change.


Subject(s)
Adaptation, Biological/genetics , Escherichia coli/genetics , Evolution, Molecular , Metabolic Networks and Pathways/genetics , Models, Genetic , Phenotype , Base Sequence , DNA, Intergenic/genetics , Directed Molecular Evolution/methods , Escherichia coli/metabolism , Genetics, Population , Molecular Sequence Data , Mutation/genetics , Population Density , Sequence Analysis, DNA