ABSTRACT
We present genome engineering technologies that are capable of fundamentally reengineering genomes from the nucleotide to the megabase scale. We used multiplex automated genome engineering (MAGE) to site-specifically replace all 314 TAG stop codons with synonymous TAA codons in parallel across 32 Escherichia coli strains. This approach allowed us to measure individual recombination frequencies, confirm viability for each modification, and identify associated phenotypes. We developed hierarchical conjugative assembly genome engineering (CAGE) to merge these sets of codon modifications into genomes with 80 precise changes, which demonstrate that these synonymous codon substitutions can be combined into higher-order strains without synthetic lethal effects. Our methods treat the chromosome as both an editable and an evolvable template, permitting the exploration of vast genetic landscapes.
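The TAG-to-TAA substitution described above can be illustrated in silico. The following is a minimal sketch only: it edits sequence strings and does not model MAGE, which introduces such changes in vivo with oligonucleotides. The function names `recode_terminal_tag` and `count_tag_stops` are hypothetical and do not come from the paper; the sketch assumes each input is an in-frame coding sequence given 5' to 3'.

```python
STOP_TAG = "TAG"  # amber stop codon targeted for replacement
STOP_TAA = "TAA"  # synonymous stop codon used as the replacement


def recode_terminal_tag(cds: str) -> str:
    """Return the coding sequence with a terminal TAG stop replaced by TAA.

    Only the final codon is examined; in-frame TAG triplets elsewhere
    in the sequence are left untouched.
    """
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if seq[-3:] == STOP_TAG:
        return seq[:-3] + STOP_TAA
    return seq


def count_tag_stops(cds_list) -> int:
    """Count in-frame coding sequences that terminate with TAG."""
    return sum(
        1
        for cds in cds_list
        if len(cds) % 3 == 0 and cds.upper().endswith(STOP_TAG)
    )


if __name__ == "__main__":
    genes = ["ATGAAATAG", "ATGCCCTAA", "ATGGGGTGA"]  # toy sequences, not real genes
    print(count_tag_stops(genes))
    print([recode_terminal_tag(g) for g in genes])
```

Applied to a full set of annotated coding sequences, a count of this kind is how one would arrive at a genome-wide tally of TAG stops before recoding.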
Subject(s)
Chromosomes, Bacterial/genetics , Codon, Terminator , Conjugation, Genetic , Escherichia coli/genetics , Genetic Engineering/methods , Genome, Bacterial , Directed Molecular Evolution , Escherichia coli/growth & development , Escherichia coli/physiology , Genomic Instability , Mutagenesis, Site-Directed , Mutation , Phenotype , Recombination, Genetic , Templates, Genetic
ABSTRACT
We perform a genome-wide analysis of the transition between transcriptional initiation and elongation in Escherichia coli by determining the association of core RNA polymerase (RNAP) and the promoter-recognition factor sigma70 with respect to RNA transcripts. We identify 1286 sigma70-associated promoters, including many internal to known operons, and demonstrate that sigma70 is usually released very rapidly from elongating RNAP complexes. On average, RNAP density is higher at the promoter than in the coding sequence, although the ratio is highly variable among different transcribed regions. Strikingly, a significant fraction of RNAP-bound promoters is not associated with transcriptional activity, perhaps due to an intrinsic energetic barrier to promoter escape. Thus, the transition from transcriptional initiation to elongation is highly variable, often rate limiting, and in some cases is essentially blocked such that RNAP is effectively "poised" to transcribe only under the appropriate environmental conditions. The genomic pattern of RNAP density in E. coli differs from that in yeast and mammalian cells.
Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Transcription, Genetic , Chromatin Immunoprecipitation , DNA-Directed RNA Polymerases/genetics , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic/genetics , RNA, Messenger/genetics , Sigma Factor/genetics
ABSTRACT
Genome sequencing currently requires DNA from pools of numerous nearly identical cells (clones), leaving the genome sequences of many difficult-to-culture microorganisms unattainable. We report a sequencing strategy that eliminates culturing of microorganisms by using real-time isothermal amplification to form polymerase clones (plones) from the DNA of single cells. Two Escherichia coli plones, analyzed by Affymetrix chip hybridization, demonstrate that plonal amplification is specific and the bias is randomly distributed. Whole-genome shotgun sequencing of Prochlorococcus MIT9312 plones showed 62% coverage of the genome from one plone at a sequencing depth of 3.5x, and 66% coverage from a second plone at a depth of 4.7x. Genomic regions not revealed in the initial round of sequencing are recovered by sequencing PCR amplicons derived from plonal DNA. The mutation rate in single-cell amplification is <2 x 10(-5), better than that of current genome sequencing standards. Polymerase cloning should provide a critical tool for systematic characterization of genome diversity in the biosphere.
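To put the reported coverage figures in context, the sketch below compares them with the coverage expected under idealized uniform random sampling. It assumes the Lander-Waterman approximation, in which the expected fraction of the genome covered at depth c is 1 - exp(-c). That model is an outside assumption introduced here for illustration and is not part of the abstract; the function name `expected_coverage` is hypothetical.

```python
import math


def expected_coverage(depth: float) -> float:
    """Expected covered fraction of a genome at the given sequencing depth,
    assuming reads are sampled uniformly at random (Lander-Waterman model)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - math.exp(-depth)


if __name__ == "__main__":
    # (sequencing depth, observed coverage) pairs as reported in the abstract
    reported = [(3.5, 0.62), (4.7, 0.66)]
    for depth, observed in reported:
        ideal = expected_coverage(depth)
        print(
            f"depth {depth}x: observed {observed:.0%}, "
            f"uniform-sampling expectation {ideal:.1%}"
        )
```

Under this assumption the uniform-sampling expectation is above 95% at both depths, so the gap to the observed 62% and 66% is consistent with uneven representation of the genome in the amplified DNA, which the abstract addresses by sequencing PCR amplicons for the missing regions.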
Subject(s)
Chromosome Mapping/methods , Cloning, Molecular/methods , DNA, Bacterial/genetics , DNA, Bacterial/metabolism , DNA-Directed DNA Polymerase/metabolism , Genome, Bacterial/genetics , Nucleic Acid Amplification Techniques/methods
ABSTRACT
Genomes of eukaryotic organisms are packaged into nucleosomes that restrict the binding of transcription factors to accessible regions. Bacteria do not contain histones, but they have nucleoid-associated proteins that have been proposed to function analogously. Here, we combine chromatin immunoprecipitation and high-density oligonucleotide microarrays to define the in vivo DNA targets of the LexA transcriptional repressor in Escherichia coli. We demonstrate a near-universal relationship between the presence of a LexA sequence motif, LexA binding in vitro, and LexA binding in vivo, suggesting that a suitable recognition site for LexA is sufficient for binding in vivo. Consistent with this observation, LexA binds comparably to ectopic target sites introduced at various positions in the genome. We also identify approximately 20 novel LexA targets that lack a canonical LexA sequence motif, are not bound by LexA in vitro, and presumably require an additional factor for binding in vivo. Our results indicate that, unlike eukaryotic genomes, the E. coli genome is permissive to transcription factor binding. The permissive nature of the E. coli genome has important consequences for the nature of transcriptional regulatory proteins, biological specificity, and evolution.
Subject(s)
Bacterial Proteins/metabolism , DNA, Bacterial/metabolism , Escherichia coli/physiology , Gene Expression Regulation, Bacterial/physiology , Genome, Bacterial/physiology , Serine Endopeptidases/metabolism , Bacterial Proteins/genetics , Binding Sites/physiology , Chromatin Immunoprecipitation/methods , DNA, Bacterial/genetics , DNA-Binding Proteins/metabolism , Eukaryotic Cells/physiology , Evolution, Molecular , Nucleosomes/metabolism , Protein Binding/physiology , Serine Endopeptidases/genetics
ABSTRACT
We describe a DNA sequencing technology in which a commonly available, inexpensive epifluorescence microscope is converted to rapid nonelectrophoretic DNA sequencing automation. We apply this technology to resequence an evolved strain of Escherichia coli at less than one error per million consensus bases. A cell-free, mate-paired library provided single DNA molecules that were amplified in parallel to 1-micrometer beads by emulsion polymerase chain reaction. Millions of beads were immobilized in a polyacrylamide gel and subjected to automated cycles of sequencing by ligation and four-color imaging. Cost per base was roughly one-ninth as much as that of conventional sequencing. Our protocols were implemented with off-the-shelf instrumentation and reagents.