Results 1 - 20 of 23
1.
Infect Immun ; 71(5): 2775-86, 2003 May.
Article in English | MEDLINE | ID: mdl-12704152

ABSTRACT

We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much less horizontally transferred DNA and lacking 357 genes present in E. coli. The strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences, 12 cryptic prophages, 372 pseudogenes, and 195 S. flexneri-specific genes. The 2457T genome was also compared with that of a recently sequenced S. flexneri 2a strain, 301. Our data are consistent with Shigella being phylogenetically indistinguishable from E. coli. The S. flexneri-specific regions contain many genes that could encode proteins with roles in virulence. Analysis of these will reveal the genetic basis for aspects of this pathogenic organism's distinctive lifestyle that have yet to be explained.


Subject(s)
Genome, Bacterial , Genomics , Shigella flexneri/genetics , Base Sequence , DNA Transposable Elements , Genes, Bacterial , Molecular Sequence Data , Phylogeny , Plasmids , Shigella flexneri/classification , Shigella flexneri/pathogenicity
2.
Proc Natl Acad Sci U S A ; 99(26): 17020-4, 2002 Dec 24.
Article in English | MEDLINE | ID: mdl-12471157

ABSTRACT

We present the complete genome sequence of uropathogenic Escherichia coli, strain CFT073. A three-way genome comparison of the CFT073, enterohemorrhagic E. coli EDL933, and laboratory strain MG1655 reveals that, amazingly, only 39.2% of their combined (nonredundant) set of proteins actually are common to all three strains. The pathogen genomes are as different from each other as each pathogen is from the benign strain. The difference in disease potential between O157:H7 and CFT073 is reflected in the absence of genes for type III secretion system or phage- and plasmid-encoded toxins found in some classes of diarrheagenic E. coli. The CFT073 genome is particularly rich in genes that encode potential fimbrial adhesins, autotransporters, iron-sequestration systems, and phase-switch recombinases. Striking differences exist between the large pathogenicity islands of CFT073 and two other well-studied uropathogenic E. coli strains, J96 and 536. Comparisons indicate that extraintestinal pathogenic E. coli arose independently from multiple clonal lineages. The different E. coli pathotypes have maintained a remarkable synteny of common, vertically evolved genes, whereas many islands interrupting this common backbone have been acquired by different horizontal transfer events in each strain.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Pyelonephritis/microbiology , Acute Disease , Base Sequence , Escherichia coli/pathogenicity , Female , Genetic Structures , Humans , Molecular Sequence Data , Open Reading Frames
3.
Genome Res ; 11(9): 1584-93, 2001 Sep.
Article in English | MEDLINE | ID: mdl-11544203

ABSTRACT

We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies.


Subject(s)
Contig Mapping/methods , Escherichia coli O157/genetics , Genome, Bacterial , Restriction Mapping/methods , Sequence Analysis, DNA/methods , Software
4.
Infect Immun ; 69(5): 3271-85, 2001 May.
Article in English | MEDLINE | ID: mdl-11292750

ABSTRACT

The complete sequence analysis of the 210-kb Shigella flexneri 5a virulence plasmid was determined. Shigella spp. cause dysentery and diarrhea by invasion and spread through the colonic mucosa. Most of the known Shigella virulence determinants are encoded on a large plasmid that is unique to virulent strains of Shigella and enteroinvasive Escherichia coli; these known genes account for approximately 30 to 35% of the virulence plasmid. In the complete sequence of the virulence plasmid, 286 open reading frames (ORFs) were identified. An astonishing 153 (53%) of these were related to known and putative insertion sequence (IS) elements; no known bacterial plasmid has previously been described with such a high proportion of IS elements. Four new IS elements were identified. Fifty putative proteins show no significant homology to proteins of known function; of these, 18 have a G+C content of less than 40%, typical of known virulence genes on the plasmid. These 18 constitute potentially unknown virulence genes. Two alleles of shet2 and five alleles of ipaH were also identified on the plasmid. Thus, the plasmid sequence suggests a remarkable history of IS-mediated acquisition of DNA across bacterial species. The complete sequence will permit targeted characterization of potential new Shigella virulence determinants.
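The G+C screen mentioned in this abstract (putative proteins whose genes fall below 40% G+C) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the example ORF names, and the toy sequences are hypothetical, and only the 40% threshold comes from the abstract.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def low_gc_orfs(orfs: dict[str, str], threshold: float = 40.0) -> list[str]:
    """Return the names of ORFs whose G+C content is below the threshold.

    The 40% default follows the abstract; the input mapping of
    ORF name -> nucleotide sequence is an assumed, illustrative format.
    """
    return [name for name, seq in orfs.items() if gc_content(seq) < threshold]


if __name__ == "__main__":
    # Toy sequences, not real plasmid ORFs.
    toy = {"orf_a": "ATATATGC", "orf_b": "GGCCGGAT"}
    print(low_gc_orfs(toy))
```

On real data the sequences would come from an annotated plasmid record rather than a hand-written dictionary.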


Subject(s)
DNA, Bacterial/chemistry , Plasmids , Shigella flexneri/genetics , Shigella flexneri/pathogenicity , Amino Acid Sequence , Base Sequence , Biological Evolution , DNA Transposable Elements , Humans , Molecular Sequence Data , Open Reading Frames , Replicon , Virulence
5.
Nature ; 409(6819): 529-33, 2001 Jan 25.
Article in English | MEDLINE | ID: mdl-11206551

ABSTRACT

The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions--all of which could be targets for surveillance.


Subject(s)
Escherichia coli O157/genetics , Genome, Bacterial , Base Sequence , Chromosome Mapping , Chromosomes, Bacterial , Escherichia coli Infections/microbiology , Escherichia coli O157/pathogenicity , Genetic Variation , Humans , Molecular Sequence Data , Polymorphism, Genetic , Sequence Analysis, DNA , Species Specificity , Virulence/genetics
6.
Appl Environ Microbiol ; 66(8): 3310-29, 2000 Aug.
Article in English | MEDLINE | ID: mdl-10919786

ABSTRACT

Photorhabdus luminescens is a pathogenic bacterium that lives in the guts of insect-pathogenic nematodes. After invasion of an insect host by a nematode, bacteria are released from the nematode gut and help kill the insect, in which both the bacteria and the nematodes subsequently replicate. However, the bacterial virulence factors associated with this "symbiosis of pathogens" remain largely obscure. In order to identify genes encoding potential virulence factors, we performed approximately 2,000 random sequencing reads from a P. luminescens W14 genomic library. We then compared the sequences obtained to sequences in existing gene databases and to the Escherichia coli K-12 genome sequence. Here we describe the different classes of potential virulence factors found. These factors include genes that putatively encode Tc insecticidal toxin complexes, Rtx-like toxins, proteases and lipases, colicin and pyocins, and various antibiotics. They also include a diverse array of secretion (e.g., type III), iron uptake, and lipopolysaccharide production systems. We speculate on the potential functions of each of these gene classes in insect infection and also examine the extent to which the invertebrate pathogen P. luminescens shares potential antivertebrate virulence factors. The implications for understanding both the biology of this insect pathogen and links between the evolution of vertebrate virulence factors and the evolution of invertebrate virulence factors are discussed.


Subject(s)
Genome, Bacterial , Insecta/microbiology , Nematoda/microbiology , Photorhabdus/genetics , Photorhabdus/pathogenicity , Animals , Bacterial Adhesion/genetics , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Bacterial Toxins/genetics , Bacterial Toxins/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli/pathogenicity , Extrachromosomal Inheritance , Genomic Library , Insecta/parasitology , Iron/metabolism , Molecular Sequence Data , Polysaccharides, Bacterial/genetics , Polysaccharides, Bacterial/metabolism , Sequence Analysis, DNA , Symbiosis , Virulence/genetics
7.
Nucleic Acids Res ; 28(10): 2177-86, 2000 May 15.
Article in English | MEDLINE | ID: mdl-10773089

ABSTRACT

Salmonella typhi, the causative agent of typhoid fever, annually infects 16 million people and kills 600,000 worldwide. Plasmid-encoded multiple drug resistance in S. typhi is always encoded by plasmids of incompatibility group H (IncH). The complete DNA sequence of the large temperature-sensitive conjugative plasmid R27, the prototype for the IncHI1 family of plasmids, has been compiled and analyzed. This 180 kb plasmid contains 210 open reading frames (ORFs), of which 14 have been previously identified and 56 exhibit similarity to other plasmid and prokaryotic ORFs. A number of insertion elements were found, including the full Tn10 transposon, which carries tetracycline resistance genes. Two transfer regions, Tra1 and Tra2, are present, which are separated by a minimum of 64 kb. Homologs of the DNA-binding proteins TlpA and H-NS that act as temperature-regulated repressors in other systems have been located in R27. Sequence analysis of transfer and replication regions supports a mosaic-like structure for R27. The genes responsible for conjugation and plasmid maintenance have been identified and mechanisms responsible for thermosensitive transfer are discussed.


Subject(s)
Drug Resistance, Multiple/genetics , R Factors/chemistry , Salmonella typhi/genetics , Amino Acid Sequence , Base Sequence , Conjugation, Genetic , DNA Nucleotidyltransferases/chemistry , DNA Nucleotidyltransferases/genetics , Deoxyribonuclease I/chemistry , Deoxyribonuclease I/genetics , Molecular Sequence Data , Open Reading Frames , Sequence Alignment , Sequence Homology, Amino Acid , Temperature
8.
Plasmid ; 43(3): 235-9, 2000 May.
Article in English | MEDLINE | ID: mdl-10783303

ABSTRACT

An analysis of the complete nucleotide sequence of the composite tetracycline-resistance transposon Tn10 (9147 bp) from the Salmonella typhi conjugative plasmid R27 is presented. A comparison of the protein sequences from IS10-right and IS10-left transposases has identified four amino acid differences. These residues appear to play an important role in normal transposase function and may account for the differences in exhibited transposition activities. The tetracycline determinants encoded by this version of Tn10 share >99% identity with those of Tn10(R100), demonstrating the conservation that exists between these transposons. A previously uncharacterized approximately 3000-bp region of Tn10 contains four putative open reading frames. One of these open reading frames shares 55% identity with the glutamate permease protein sequence from Haemophilus influenzae although it was unable to complement an Escherichia coli glutamate permease mutant, with which it shares 51% identity. The three remaining putative open reading frames are arranged as a discrete genetic unit adjacent to the glutamate permease homolog and are transcribed in the opposite direction. Two of these open reading frames are homologous with Bacillus subtilis proteins of unknown functions while the other has no homologs in the database. The presence of an aminoacyl-tRNA synthetase class II motif in one of these open reading frames in combination with the glutamate permease homolog allows us to postulate that this region of Tn10 could once have played a role in amino acid metabolism.


Subject(s)
Amino Acid Transport Systems, Acidic , DNA Transposable Elements , Membrane Transport Proteins/genetics , Tetracycline Resistance/genetics , Amino Acid Sequence , Escherichia coli/genetics , Escherichia coli Proteins , Genetic Complementation Test , Membrane Transport Proteins/metabolism , Molecular Sequence Data , Mutation , Plasmids/genetics , Sequence Analysis, DNA , Sequence Homology, Amino Acid , Symporters
9.
Electrophoresis ; 20(6): 1186-94, 1999 Jun.
Article in English | MEDLINE | ID: mdl-10380758

ABSTRACT

A systematic characterization of the effects of important physical parameters on the sensitivity and specificity of methods in searching for unknown base changes (mutations or single nucleotide polymorphisms) over a relatively long DNA segment has not been previously reported. To this end, we have constructed a set of molecules of varying G+C content (40, 50, and 60% GC) having all possible base changes at a particular location - the "DNA toolbox". Exhaustive confirmatory sequencing demonstrated that there were no other base changes in any of the clones. Using this set of clones as polymerase chain reaction (PCR) templates, amplicons of various lengths with the same base mutated to all other bases were generated. The behavior of these constructs in manual and automated heteroduplex analysis was analyzed as a function of the size and overall base content of the fragment, the nature and location of the base change. Our results show that in heteroduplex analysis, the nature of the mismatched base pair is the overriding determinant for the ability to detect the mutation, regardless of fragment length, GC content, or the location of the mutation.


Subject(s)
DNA, Viral/analysis , Mutation , Nucleic Acid Heteroduplexes/analysis , Evaluation Studies as Topic
10.
Gene ; 223(1-2): 47-54, 1998 Nov 26.
Article in English | MEDLINE | ID: mdl-9858680

ABSTRACT

A transposon-based method of introducing unique restriction sites was used for subdivision of the Escherichia coli genome into a contiguous series of large non-overlapping segments spanning 2.5Mb. The segments, sizes ranging from 150 to 250kb, were isolated from the chromosome using the inserted restriction sites and shotgun cloned into an M13 vector for DNA sequencing. These shotgun sizes proved easily manageable, allowing the genomic sequence of E. coli to be completed more efficiently and rapidly than was possible by previously available methods. The 9bp duplication generated during transposition was used as a tag for accurate splicing of the segments; no further sequence redundancy at the junction sites was needed. The system is applicable to larger genomes even if they are not already well-characterized. We present the technology for segment sequencing, results of applying this method to E. coli, and the sequences of the transposon cassettes.


Subject(s)
Chromosomes, Bacterial , DNA Transposable Elements , Deoxyribonucleases, Type II Site-Specific/genetics , Escherichia coli/genetics , Sequence Analysis, DNA/methods , Gene Library , Genome, Bacterial , Saccharomyces cerevisiae Proteins
11.
Infect Immun ; 66(12): 5731-42, 1998 Dec.
Article in English | MEDLINE | ID: mdl-9826348

ABSTRACT

Yersinia pestis, the causative agent of plague, harbors at least three plasmids necessary for full virulence of the organism, two of which are species specific. One of the Y. pestis-specific plasmids, pMT1, is thought to promote deep tissue invasion, resulting in more acute onset of symptoms and death. We determined the entire nucleotide sequence of Y. pestis KIM5 pMT1 and identified potential open reading frames (ORFs) encoded by the 100,990-bp molecule. Based on codon usage for known yersinial genes, homology with known proteins in the databases, and potential ribosome binding sites, we determined that 115 of the potential ORFs which we considered could encode polypeptides in Y. pestis. Five of these ORFs were genes previously identified as being necessary for production of the classic virulence factors, murine toxin (MT), and the fraction 1 (F1) capsule antigen. The regions of pMT1 encoding MT and F1 were surrounded by remnants of multiple transposition events and bacteriophage, respectively, suggesting horizontal gene transfer of these virulence factors. We identified seven new potential virulence factors that might interact with the mammalian host or flea vector. Forty-three of the remaining 115 putative ORFs did not display any significant homology with proteins in the current databases. Furthermore, DNA sequence analysis allowed the determination of the putative replication and partitioning regions of pMT1. We identified a single 2,450-bp region within pMT1 that could function as the origin of replication, including a RepA-like protein similar to RepFIB, RepHI1B, and P1 and P7 replicons. Plasmid partitioning function was located ca. 36 kb from the putative origin of replication and was most similar to the parABS bacteriophage P1 and P7 system. Y. pestis pMT1 encoded potential genes with a high degree of similarity to a wide variety of organisms, plasmids, and bacteriophage. Accordingly, our analysis of the pMT1 DNA sequence emphasized the mosaic nature of this large bacterial virulence plasmid and provided implications as to its evolution.


Subject(s)
Antigens, Bacterial/genetics , Bacterial Capsules/genetics , Bacterial Toxins/genetics , Plasmids/genetics , Yersinia pestis/genetics , Bacteriophage lambda/genetics , Base Composition , Base Sequence , DNA Replication , DNA Transposable Elements , Evolution, Molecular , Gene Library , Gene Transfer, Horizontal , Genes, Bacterial , Molecular Sequence Data , Open Reading Frames , Proviruses/genetics , Sequence Analysis, DNA , Sequence Homology , Virulence/genetics , Yersinia pestis/pathogenicity
12.
Nucleic Acids Res ; 26(18): 4196-204, 1998 Sep 15.
Article in English | MEDLINE | ID: mdl-9722640

ABSTRACT

The complete DNA sequence of pO157, the large virulence plasmid of EHEC strain O157:H7 EDL 933, is presented. The 92 kb F-like plasmid is composed of segments of putative virulence genes in a framework of replication and maintenance regions, with seven insertion sequence elements, located mostly at the boundaries of the virulence segments. One hundred open reading frames (ORFs) were identified, of which 19 were previously sequenced potential virulence genes. Forty-two ORFs were sufficiently similar to known proteins for suggested functions to be assigned, and 22 had no convincing similarity with any known proteins. Of the newly identified genes, an unusually large ORF of 3169 amino acids has a putative cytotoxin active site shared with the large clostridial toxin (LCT) family and proteins such as ToxA and B of Clostridium difficile. A conserved motif was detected that links the large ORF and the LCT proteins with the OCH1 family of glycosyltransferases. In the complete sequence, the mosaic form can be observed at the levels of base composition, codon usage and gene organization. Insights were obtained from patterns of DNA composition as well as the pathogenic and 'housekeeping' gene segments. Evolutionary trees built from shared plasmid maintenance genes show that even these genes have heterogeneous origins.


Subject(s)
DNA, Bacterial/chemistry , Escherichia coli O157/genetics , Escherichia coli O157/pathogenicity , Plasmids/chemistry , Amino Acid Sequence , Base Sequence , Clostridioides difficile/genetics , Codon , Conserved Sequence , Cytotoxins/chemistry , Cytotoxins/genetics , DNA, Bacterial/genetics , Evolution, Molecular , Molecular Sequence Data , Open Reading Frames , Phylogeny , Plasmids/genetics , Sequence Alignment , Sequence Homology, Amino Acid , Virulence
13.
Science ; 277(5331): 1453-62, 1997 Sep 05.
Article in English | MEDLINE | ID: mdl-9278503

ABSTRACT

The 4,639,221-base pair sequence of Escherichia coli K-12 is presented. Of 4288 protein-coding genes annotated, 38 percent have no attributed function. Comparison with five other sequenced microbes reveals ubiquitous as well as narrowly distributed gene families; many families of similar genes within E. coli are also evident. The largest family of paralogous proteins contains 80 ABC transporters. The genome as a whole is strikingly organized with respect to the local direction of replication; guanines, oligonucleotides possibly related to replication and recombination, and most genes are so oriented. The genome also contains insertion sequence (IS) elements, phage remnants, and many other patches of unusual composition indicating genome plasticity through horizontal transfer.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Sequence Analysis, DNA , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Bacteriophage lambda/genetics , Base Composition , Binding Sites , Chromosome Mapping , DNA Replication , DNA Transposable Elements , DNA, Bacterial/genetics , Genes, Bacterial , Molecular Sequence Data , Mutation , Operon , RNA, Bacterial/genetics , RNA, Transfer/genetics , Recombination, Genetic , Regulatory Sequences, Nucleic Acid , Repetitive Sequences, Nucleic Acid , Sequence Homology, Amino Acid
14.
Biotechniques ; 23(6): 1070-2, 1074-5, 1997 Dec.
Article in English | MEDLINE | ID: mdl-9421638

ABSTRACT

We demonstrate a rapid cloning and sequencing strategy for kilobase-size DNA segments using DNase I and long PCR. In a single-tube protocol, deletions were formed in a plasmid insert by two enzymatic cuts, one at a fixed site and one at random. The doubly cut molecules were recircularized to generate a library of plasmids carrying deletions of various sizes and transformed into E. coli. The plasmid inserts were directly amplified from transformant colonies by long PCR and sized on a high-resolution agarose gel. A minimal tiling set, selected from the amplified material, was used directly as templates for long-read sequencing. The system is useful for inserts up to about 3.5 kb for de novo sequencing (both strands) or 6 kb for confirmatory sequencing (one strand).


Subject(s)
Cloning, Molecular/methods , Polymerase Chain Reaction/methods , Sequence Analysis, DNA , Deoxyribonuclease I/chemistry , Electrophoresis, Agar Gel , Escherichia coli/genetics , Gene Library , Plasmids/isolation & purification , Sequence Analysis, DNA/methods , Templates, Genetic
15.
Biotechniques ; 21(1): 142-4, 1996 Jul.
Article in English | MEDLINE | ID: mdl-8816249

ABSTRACT

Accurate resolution of PCR products in the range of 15-40 kb may be obtained in agarose gels without pulsed field electrophoresis. A gel of 0.3% SeaKem Gold agarose cast on GelBond support film provides good resolution and sufficient gel strength to reliably allow staining and photography. This paper describes a test system for Long PCR and demonstrates analysis of the PCR products on a gel run under standard low-voltage electrophoresis conditions.


Subject(s)
DNA, Viral/analysis , Electrophoresis, Agar Gel/methods , Polymerase Chain Reaction , Bacteriophage lambda/genetics , Electrophoresis, Agar Gel/instrumentation , Magnesium/administration & dosage , Photography , Staining and Labeling , Templates, Genetic
16.
Nucleic Acids Res ; 23(12): 2105-19, 1995 Jun 25.
Article in English | MEDLINE | ID: mdl-7610040

ABSTRACT

The 338.5 kb of the Escherichia coli genome described here together with previously described segments bring the total of contiguous finished sequence of this genome to >1 Mb. Of 319 open reading frames (ORFs) found in this 338.5 kb segment, 147 (46%) are potential new genes. The positions of several genes which had been previously located here by mapping or partial sequencing have been confirmed. Several ORFs have functions suggested by similarities to other characterised genes but cannot be assigned with certainty. Fifteen of the ORFs of unknown function had been previously sequenced. Eight transfer RNAs are encoded in the region and there are two grey holes in which no features were found. The attachment site for phage P4 and three insertion sequences were located. The region was also analysed for chi sites, bend sites, REP elements and other repeats. A computer search identified potential promoters and tentative transcription units were assigned. The occurrence of the rare tetramer CTAG was analysed in 1.6 Mb of contiguous E. coli sequence. Hypotheses addressing the rarity and distribution of CTAG are discussed.
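The CTAG analysis mentioned at the end of this abstract can be illustrated with a small sketch that counts a tetramer and compares the count with a simple expectation. The abstract does not state which statistical model the authors used, so the independent-base expectation below is an assumption for illustration, and the function names are hypothetical.

```python
from collections import Counter


def count_tetramer(seq: str, tetramer: str = "CTAG") -> int:
    """Count occurrences of the tetramer in seq, allowing overlaps (case-insensitive)."""
    seq = seq.upper()
    tetramer = tetramer.upper()
    k = len(tetramer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == tetramer)


def expected_count(seq: str, tetramer: str = "CTAG") -> float:
    """Expected count if bases were independent, using observed base frequencies.

    Assumed null model: the probability of the tetramer at any position is the
    product of the single-base frequencies of its letters.
    """
    seq = seq.upper()
    n, k = len(seq), len(tetramer)
    if n < k:
        return 0.0
    freqs = Counter(seq)
    p = 1.0
    for base in tetramer.upper():
        p *= freqs[base] / n
    return p * (n - k + 1)


if __name__ == "__main__":
    toy = "CTAGGATTACACTAGTT"  # toy sequence, not E. coli data
    print(count_tetramer(toy), expected_count(toy))
```

An observed count well below the expectation would be one way to express that a tetramer is rare. A model that accounts for dinucleotide or trinucleotide frequencies would give a stricter comparison.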


Subject(s)
DNA, Bacterial/chemistry , Escherichia coli/genetics , Genes, Bacterial , Sequence Analysis , Base Sequence , Chromosome Mapping , Molecular Sequence Data , Oligonucleotides/chemistry , Open Reading Frames , Operon , Promoter Regions, Genetic , Protein Sorting Signals , Repetitive Sequences, Nucleic Acid , Restriction Mapping , Sequence Alignment
17.
Nucleic Acids Res ; 22(13): 2576-86, 1994 Jul 11.
Article in English | MEDLINE | ID: mdl-8041620

ABSTRACT

The DNA sequence of a 225.4 kilobase segment of the Escherichia coli K-12 genome is described here, from 76.0 to 81.5 minutes on the genetic map. This brings the total of contiguous sequence from the E. coli genome project to 725.1 kb (76.0 to 92.8 minutes). We found 191 putative coding genes (ORFs) of which 72 genes were previously known, and 110 of which remain unidentified despite literature and similarity searches. Seven new genes--arsE, arsF, arsG, treF, xylR, xylG, and xylH--were identified as well as the previously mapped pit and dctA genes. The arrangement of proposed genes relative to possible promoters and terminators suggests 90 potential transcription units. Other features include 19 REP elements, 95 computer-predicted bends, 50 Chi sites, and one grey hole. Thirty-one putative signal peptides were found, including those of thirteen known membrane or periplasmic proteins. One tRNA gene (proK) and two insertion sequences (IS5 and IS150) are located in this segment. The genes in this region are organized with equal numbers oriented with or against replication.


Subject(s)
Chromosome Mapping , DNA, Bacterial/genetics , Escherichia coli/genetics , Genome, Bacterial , Alcohol Dehydrogenase/genetics , Aldehyde Oxidoreductases/genetics , Chromosomes, Bacterial , Codon , Gene Transfer Techniques , Molecular Sequence Data , Open Reading Frames , Promoter Regions, Genetic , Protein Sorting Signals , Saccharomyces cerevisiae/enzymology , Saccharomyces cerevisiae/genetics , Terminator Regions, Genetic , Transcription, Genetic , Zymomonas/enzymology , Zymomonas/genetics
18.
Nucleic Acids Res ; 21(23): 5408-17, 1993 Nov 25.
Article in English | MEDLINE | ID: mdl-8265357

ABSTRACT

We present the sequence of 176 kilobases of the Escherichia coli K-12 genome, from katG at 89.2 to an open reading frame (ORF) of unknown function at 92.8 minutes on the genetic map. This brings the total of contiguous sequence from the E. coli genome project to 500 kb (81.5 to 92.8 minutes). This segment contains 134 putative coding genes (ORFs) of which 66 genes were previously identified. Eight new genes--acs, pepE, and nrfB-G--were identified as well as the previously mapped gldA and alr genes. Still, 58 ORFs remain unidentified despite literature and similarity searches. The arrangement of proposed genes relative to possible promoters and terminators suggests 55 potential transcription units. Other features include 13 REP elements, one IRU (ERIC) repeat, 59 computer-predicted bends, 42 Chi sites and one new grey hole. Sixteen signal peptides were found, including those of lamB, btuB, and malE. Two ribosomal RNA loci, rrnB and rrnE, are located in this segment, so we have now sequenced four of the seven E. coli rRNA loci. Comparison of the rRNA loci reveals some differences in the ribosomal structural RNAs which are generally compatible with the proposed secondary structures.


Subject(s)
DNA, Bacterial/genetics , DNA, Ribosomal/genetics , Escherichia coli/genetics , Genes, Bacterial , Base Sequence , Molecular Sequence Data , Nucleic Acid Conformation , Open Reading Frames , Protein Sorting Signals/chemistry , Regulatory Sequences, Nucleic Acid , Repetitive Sequences, Nucleic Acid , Restriction Mapping
19.
Gene ; 134(1): 1-6, 1993 Nov 30.
Article in English | MEDLINE | ID: mdl-8244018

ABSTRACT

Sequences of four new heat-shock (HS) genes of Escherichia coli organized into two operons were determined. The operon at 83 min specifies two proteins of 15.8 kDa (HslT) and 16.1 kDa (HslS), which are identical to IbpA and IbpB, respectively. Expression of mRNA from a sigma 32-dependent promoter of the hslTS/ibpAB operon is stimulated 30-75-fold upon temperature upshift. The transcription start point (tsp) is located at a G, 96 bp upstream from the AUG start codon of hslT/ibpA. The deduced amino acid sequences of HslT/IbpA and HslS/IbpB are 48% identical to each other and were found to be remotely related to the chloroplast low-molecular-weight HS protein, which is highly conserved among plants. The second hs operon is much less actively stimulated by temperature upshift, although it has a hs promoter that perfectly matches the consensus of promoters recognized by sigma 32. Located at 88.9 min, the hslVU operon specifies proteins of 19.1 kDa (HslV) and 49.6 kDa (HslU). Multiple tsp were found in this operon. HslV is remotely related to the eukaryotic proteasome proteins, and HslU is very similar to a Pasteurella haemolytica protein of unknown function. Both HslU and the P. haemolytica protein share an ATP/GTP-binding motif near their N-termini. The two operons described here are transcribed counterclockwise on the standard genetic map.


Subject(s)
Bacterial Proteins/genetics , Escherichia coli/genetics , Heat-Shock Proteins/genetics , Operon , Sequence Analysis, DNA , Serine Endopeptidases , ATP-Dependent Proteases , Amino Acid Sequence , Base Sequence , DNA, Bacterial , Molecular Sequence Data , RNA, Messenger/genetics , Sequence Homology, Amino Acid
20.
Nucleic Acids Res ; 21(15): 3385-90, 1993 Jul 25.
Article in English | MEDLINE | ID: mdl-8346017

ABSTRACT

The design of large scale DNA sequencing projects such as genome analysis demands a new approach to sequencing strategy, since neither a purely random nor a purely directed method is satisfactory. We have developed a strategy that combines these two methods in a way that preserves the advantages of both while avoiding their particular limitations. Computer simulations showed that a specific balance of random and directed sequencing was required for the most efficient strategy, termed the Janus strategy, which has been used in the Escherichia coli genome sequencing project. This approach depended on obtaining sequence easily from either strand of a cloned insert, and was facilitated by inversion of the insert in the engineered M13 vector Janus, by site-specific recombination. The inversion was accomplished simply by growth on the appropriate host strain, when the DNA strand incorporated into the new single stranded phage was complementary to that in the original phage, and was sequenced by the same simple protocol as the first strand.


Subject(s)
Sequence Analysis, DNA/methods , Bacteriophage lambda/genetics , Base Sequence , Cloning, Molecular , Computer Simulation , DNA, Bacterial/chemistry , DNA, Viral/chemistry , Escherichia coli/genetics , Gene Library , Genetic Engineering , Molecular Sequence Data , Software , beta-Galactosidase/genetics