Results 1 - 20 of 30
1.
J Bacteriol ; 185(19): 5673-84, 2003 Oct.
Article in English | MEDLINE | ID: mdl-13129938

ABSTRACT

Defining the gene products that play an essential role in an organism's functional repertoire is vital to understanding the system level organization of living cells. We used a genetic footprinting technique for a genome-wide assessment of genes required for robust aerobic growth of Escherichia coli in rich media. We identified 620 genes as essential and 3,126 genes as dispensable for growth under these conditions. Functional context analysis of these data allows individual functional assignments to be refined. Evolutionary context analysis demonstrates a significant tendency of essential E. coli genes to be preserved throughout the bacterial kingdom. Projection of these data over metabolic subsystems reveals topologic modules with essential and evolutionarily preserved enzymes with reduced capacity for error tolerance.


Subject(s)
DNA Footprinting/methods , Escherichia coli Proteins/genetics , Escherichia coli/growth & development , Genome, Bacterial , Aerobiosis , Amino Acids/biosynthesis , Culture Media , DNA Transposable Elements , Escherichia coli/genetics , Escherichia coli Proteins/metabolism , Evolution, Molecular , Gene Expression Regulation, Bacterial , Genes, Essential , Mutagenesis, Insertional , Phylogeny
3.
J Bacteriol ; 183(1): 292-300, 2001 Jan.
Article in English | MEDLINE | ID: mdl-11114929

ABSTRACT

Shikimate kinase (EC 2.7.1.71) is a committed enzyme in the seven-step biosynthesis of chorismate, a major precursor of aromatic amino acids and many other aromatic compounds. Genes for all enzymes of the chorismate pathway except shikimate kinase are found in archaeal genomes by sequence homology to their bacterial counterparts. In this study, a conserved archaeal gene (gi1500322 in Methanococcus jannaschii) was identified as the best candidate for the missing shikimate kinase gene by the analysis of chromosomal clustering of chorismate biosynthetic genes. The encoded hypothetical protein, with no sequence similarity to bacterial and eukaryotic shikimate kinases, is distantly related to homoserine kinases (EC 2.7.1.39) of the GHMP-kinase superfamily. The latter functionality in M. jannaschii is assigned to another gene (gi591748), in agreement with sequence similarity and chromosomal clustering analysis. Both archaeal proteins, overexpressed in Escherichia coli and purified to homogeneity, displayed activity of the predicted type, with steady-state kinetic parameters similar to those of the corresponding bacterial kinases: K(m,shikimate) = 414 +/- 33 microM, K(m,ATP) = 48 +/- 4 microM, and k(cat) = 57 +/- 2 s(-1) for the predicted shikimate kinase and K(m,homoserine) = 188 +/- 37 microM, K(m,ATP) = 101 +/- 7 microM, and k(cat) = 28 +/- 1 s(-1) for the homoserine kinase. No overlapping activity could be detected between shikimate kinase and homoserine kinase, both revealing a >1,000-fold preference for their own specific substrates. The case of archaeal shikimate kinase illustrates the efficacy of techniques based on reconstruction of metabolism from genomic data and analysis of gene clustering on chromosomes in finding missing genes.


Subject(s)
Methanococcus/enzymology , Mevalonic Acid/analogs & derivatives , Phosphotransferases (Alcohol Group Acceptor)/genetics , Phosphotransferases (Alcohol Group Acceptor)/metabolism , Amino Acid Sequence , Chorismic Acid/metabolism , Cloning, Molecular , Escherichia coli/enzymology , Escherichia coli/genetics , Galactose/metabolism , Homoserine/metabolism , Methanococcus/genetics , Mevalonic Acid/metabolism , Molecular Sequence Data , Phosphorylation , Phosphotransferases/classification , Phosphotransferases/metabolism , Phosphotransferases (Alcohol Group Acceptor)/chemistry , Phosphotransferases (Alcohol Group Acceptor)/isolation & purification , Polymerase Chain Reaction , Sequence Alignment , Sequence Analysis, DNA , Substrate Specificity
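
As an illustration of how the steady-state parameters quoted in the abstract above can be compared, the following minimal sketch computes the catalytic efficiency k(cat)/K(m) for each kinase and the fractional rate predicted by the Michaelis-Menten equation. It assumes simple single-substrate kinetics with ATP held at saturating concentration, which the abstract does not state; the function names are hypothetical and only the numeric values come from the abstract.

```python
def catalytic_efficiency(k_cat_per_s, k_m_micromolar):
    """Return k_cat / K_m in M^-1 s^-1 (K_m is converted from microM to M)."""
    return k_cat_per_s / (k_m_micromolar * 1e-6)


def fractional_rate(substrate_micromolar, k_m_micromolar):
    """Michaelis-Menten fraction of maximal rate: v / V_max = [S] / (K_m + [S]).

    Assumption: single-substrate kinetics, i.e. ATP is treated as saturating.
    """
    return substrate_micromolar / (k_m_micromolar + substrate_micromolar)


# Values taken from the abstract (central estimates only, errors ignored).
shikimate_kinase = {"k_cat": 57.0, "k_m": 414.0}   # k_cat in s^-1, K_m in microM
homoserine_kinase = {"k_cat": 28.0, "k_m": 188.0}

sk_eff = catalytic_efficiency(shikimate_kinase["k_cat"], shikimate_kinase["k_m"])
hk_eff = catalytic_efficiency(homoserine_kinase["k_cat"], homoserine_kinase["k_m"])

print(f"shikimate kinase  k_cat/K_m = {sk_eff:.3g} M^-1 s^-1")
print(f"homoserine kinase k_cat/K_m = {hk_eff:.3g} M^-1 s^-1")

# At [S] = K_m the enzyme runs at half of its maximal rate.
print(fractional_rate(414.0, shikimate_kinase["k_m"]))
```

Under these assumptions both enzymes come out with efficiencies of the same order of magnitude, which is consistent with the abstract's statement that the parameters resemble those of the corresponding bacterial kinases.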
4.
Nucleic Acids Res ; 28(22): 4573-6, 2000 Nov 15.
Article in English | MEDLINE | ID: mdl-11071948

ABSTRACT

The proliferation of genome sequence data has led to the development of a number of tools and strategies that facilitate computational analysis. These methods include the identification of motif patterns, membership of the query sequences in family databases, metabolic pathway involvement and gene proximity. We re-examined the completely sequenced genome of Thermotoga maritima by employing the combined use of the above methods. By analyzing all 1877 proteins encoded in this genome, we identified 193 cases of conflicting annotations (10%), of which 164 are new function predictions and 29 are amendments of previously proposed assignments. These results suggest that the combined use of existing computational tools can resolve inconclusive sequence similarities and significantly improve the prediction of protein function from genome sequence.


Subject(s)
Genome, Bacterial , Sequence Alignment/methods , Thermotoga maritima/genetics , Computational Biology , Genes, Bacterial/genetics , Open Reading Frames , Sequence Analysis
5.
Proc Natl Acad Sci U S A ; 97(7): 3509-14, 2000 Mar 28.
Article in English | MEDLINE | ID: mdl-10737802

ABSTRACT

A gapped genome sequence of the biomining bacterium Thiobacillus ferrooxidans strain ATCC23270 was assembled from sheared DNA fragments (3.2-times coverage) into 1,912 contigs. A total of 2,712 potential genes (ORFs) were identified in 2.6 Mbp (megabase pairs) of Thiobacillus genomic sequence. Of these genes, 2,159 could be assigned functions by using the WIT-Pro/EMP genome analysis system, most with a high degree of certainty. Nine hundred of the genes have been assigned roles in metabolic pathways, producing an overview of cellular biosynthesis, bioenergetics, and catabolism. Sequence similarities, relative gene positions on the chromosome, and metabolic reconstruction (placement of gene products in metabolic pathways) were all used to aid gene assignments and for development of a functional overview. Amino acid biosynthesis was chosen to demonstrate the analytical capabilities of this approach. Only 10 expected enzymatic activities, of the nearly 150 involved in the biosynthesis of all 20 amino acids, are currently unassigned in the Thiobacillus genome. This result compares favorably with 10 missing genes for amino acid biosynthesis in the complete Escherichia coli genome. Gapped genome analysis can therefore give a decent picture of the central metabolism of a microorganism, equivalent to that of a complete sequence, at significantly lower cost.


Subject(s)
Amino Acids/metabolism , Genome, Bacterial , Thiobacillus/metabolism , Chromosomes, Bacterial , Cloning, Molecular , Molecular Sequence Data , Open Reading Frames , Thiobacillus/genetics
6.
Proc Natl Acad Sci U S A ; 97(7): 3304-8, 2000 Mar 28.
Article in English | MEDLINE | ID: mdl-10716711

ABSTRACT

Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).


Subject(s)
Genome, Archaeal , Archaeal Proteins/genetics , Multigene Family , Open Reading Frames , Species Specificity
8.
Nucleic Acids Res ; 28(1): 123-5, 2000 Jan 01.
Article in English | MEDLINE | ID: mdl-10592199

ABSTRACT

The WIT (What Is There) (http://wit.mcs.anl.gov/WIT2/) system has been designed to support comparative analysis of sequenced genomes and to generate metabolic reconstructions based on chromosomal sequences and metabolic modules from the EMP/MPW family of databases. This system contains data derived from about 40 completed or nearly completed genomes. Sequence homologies, various ORF-clustering algorithms, relative gene positions on the chromosome and placement of gene products in metabolic pathways (metabolic reconstruction) can be used for the assignment of gene functions and for development of overviews of genomes within WIT. The integration of a large number of phylogenetically diverse genomes in WIT facilitates the understanding of the physiology of different organisms.


Subject(s)
Databases, Factual , Genome , Systems Integration , Internet , Open Reading Frames
9.
J Mol Evol ; 49(4): 413-23, 1999 Oct.
Article in English | MEDLINE | ID: mdl-10485999

ABSTRACT

The phylogenetic distribution of Methanococcus jannaschii proteins can provide, for the first time, an estimate of the genome content of the last common ancestor of the three domains of life. Relying on annotation and comparison with reference to the species distribution of sequence similarities results in 324 proteins forming the universal family set. This set is very well characterized and relatively small and nonredundant, containing 301 biochemical functions, of which 246 are unique. This universal function set contains mostly genes coding for energy metabolism or information processing. It appears that the Last Universal Common Ancestor was an organism with metabolic networks and genetic machinery similar to those of extant unicellular organisms.


Subject(s)
Evolution, Molecular , Genome , Methanococcus/genetics , Archaeal Proteins/classification , Archaeal Proteins/genetics , Databases, Factual , Genes, Archaeal/genetics , Phylogeny
10.
Proc Natl Acad Sci U S A ; 96(6): 2896-901, 1999 Mar 16.
Article in English | MEDLINE | ID: mdl-10077608

ABSTRACT

Previously, we presented evidence that it is possible to predict functional coupling between genes based on conservation of gene clusters between genomes. With the rapid increase in the availability of prokaryotic sequence data, it has become possible to verify and apply the technique. In this paper, we extend our characterization of the parameters that determine the utility of the approach, and we generalize the approach in a way that supports detection of common classes of functionally coupled genes (e.g., transport and signal transduction clusters). Now that the analysis includes over 30 complete or nearly complete genomes, it has become clear that this approach will play a significant role in supporting efforts to assign functionality to the remaining uncharacterized genes in sequenced genomes.


Subject(s)
Gene Expression Regulation, Bacterial , Genome, Bacterial , Multigene Family , Databases, Factual , Sequence Analysis, DNA
11.
In Silico Biol ; 1(2): 93-108, 1999.
Article in English | MEDLINE | ID: mdl-11471247

ABSTRACT

The availability of a growing number of completely sequenced genomes opens new opportunities for understanding of complex biological systems. Success of genome-based biology will, to a large extent, depend on the development of new approaches and tools for efficient comparative analysis of the genomes and their organization. We have developed a technique for detecting possible functional coupling between genes based on detection of potential operons. The approach involves computation of "pairs of close bidirectional best hits", which are pairs of genes that apparently occur within operons in multiple genomes. Using these pairs, one can compose evidence (based on the number of distinct genomes and the phylogenetic distance between the orthologous pairs) that a pair of genes is potentially functionally coupled. The technique has revealed a surprisingly rich and apparently accurate set of functionally coupled genes. The approach depends on the use of a relatively large number of genomes, and the amount of detected coupling grows dramatically as the number of genomes increases.


Subject(s)
Chromosomes/genetics , Computer Simulation , Models, Genetic , Algorithms , Chromosomes, Bacterial/genetics , Diaminopimelic Acid/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Genome , Genome, Bacterial , Operon , Prokaryotic Cells , Purines/metabolism
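
The "pairs of close bidirectional best hits" idea described in the abstract above can be sketched as follows. This is a toy illustration, not the authors' implementation: the `Gene` record, the function names, and the definition of "close" (same contig, same strand, gap of at most 300 bp) are all assumptions made for the example, and the best-hit tables are assumed to have been computed beforehand by some sequence-similarity search.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Gene:
    gene_id: str
    contig: str
    strand: str
    start: int
    end: int


def are_close(g1, g2, max_gap=300):
    """Assumed closeness rule: same contig and strand, gap <= max_gap bp."""
    if g1.contig != g2.contig or g1.strand != g2.strand:
        return False
    first, second = sorted((g1, g2), key=lambda g: g.start)
    return second.start - first.end <= max_gap


def bidirectional_best_hits(best_a_to_b, best_b_to_a):
    """Keep only gene pairs that are each other's best hit."""
    return {a: b for a, b in best_a_to_b.items() if best_b_to_a.get(b) == a}


def close_bbh_pairs(genes_a, genes_b, best_a_to_b, best_b_to_a, max_gap=300):
    """Pairs of genes close in genome A whose BBH partners are close in genome B."""
    bbh = bidirectional_best_hits(best_a_to_b, best_b_to_a)
    pairs = []
    for a1, a2 in combinations(sorted(bbh), 2):
        if are_close(genes_a[a1], genes_a[a2], max_gap) and are_close(
            genes_b[bbh[a1]], genes_b[bbh[a2]], max_gap
        ):
            pairs.append((a1, a2))
    return pairs


# Hypothetical toy genomes.
genes_a = {
    "a1": Gene("a1", "c1", "+", 100, 400),
    "a2": Gene("a2", "c1", "+", 450, 900),
    "a3": Gene("a3", "c1", "+", 5000, 5600),
}
genes_b = {
    "b1": Gene("b1", "k1", "-", 1000, 1300),
    "b2": Gene("b2", "k1", "-", 1350, 1800),
    "b3": Gene("b3", "k2", "+", 10, 600),
}
best_a_to_b = {"a1": "b1", "a2": "b2", "a3": "b3"}
best_b_to_a = {"b1": "a1", "b2": "a2", "b3": "a1"}  # b3 does not point back to a3

pairs = close_bbh_pairs(genes_a, genes_b, best_a_to_b, best_b_to_a)
print(pairs)
```

In the approach the abstract describes, evidence for functional coupling is then accumulated across many genome pairs, weighted by the number of distinct genomes and their phylogenetic distance; the sketch covers only the pair-detection step for two genomes.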
12.
Nucleic Acids Res ; 27(1): 171-3, 1999 Jan 01.
Article in English | MEDLINE | ID: mdl-9847171

ABSTRACT

The Ribosomal Database Project (RDP-II), previously described by Maidak et al. [Nucleic Acids Res. (1997), 25, 109-111], is now hosted by the Center for Microbial Ecology at Michigan State University. RDP-II is a curated database that offers ribosomal RNA (rRNA) nucleotide sequence data in aligned and unaligned forms, analysis services, and associated computer programs. During the past two years, data alignments have been updated and now include >9700 small subunit rRNA sequences. The recent development of an ObjectStore database will provide more rapid updating of data, better data accuracy and increased user access. RDP-II includes phylogenetically ordered alignments of rRNA sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software programs for handling, analyzing and displaying alignments and trees. The data are available via anonymous ftp (ftp.cme.msu.edu) and WWW (http://www.cme.msu.edu/RDP). The WWW server provides ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for possible chimeric rRNA sequences, automated alignment, and a suggested placement of an unknown sequence on an existing phylogenetic tree. Additional utilities also exist at RDP-II, including distance matrix, T-RFLP, and a Java-based viewer of the phylogenetic trees that can be used to create subtrees.


Subject(s)
Databases, Factual , RNA, Ribosomal , Ribosomes/genetics , Base Sequence , Databases, Factual/trends , Information Storage and Retrieval , Internet , Michigan , Phylogeny , RNA, Ribosomal/chemistry , RNA, Ribosomal/genetics , Sequence Alignment , Universities
13.
Nature ; 392(6674): 353-8, 1998 Mar 26.
Article in English | MEDLINE | ID: mdl-9537320

ABSTRACT

Aquifex aeolicus was one of the earliest diverging, and is one of the most thermophilic, bacteria known. It can grow on hydrogen, oxygen, carbon dioxide, and mineral salts. The complex metabolic machinery needed for A. aeolicus to function as a chemolithoautotroph (an organism which uses an inorganic carbon source for biosynthesis and an inorganic chemical energy source) is encoded within a genome that is only one-third the size of the E. coli genome. Metabolic flexibility seems to be reduced as a result of the limited genome size. The use of oxygen (albeit at very low concentrations) as an electron acceptor is allowed by the presence of a complex respiratory apparatus. Although this organism grows at 95 degrees C, the extreme thermal limit of the Bacteria, only a few specific indications of thermophily are apparent from the genome. Here we describe the complete genome sequence of 1,551,335 base pairs of this evolutionarily and physiologically interesting organism.


Subject(s)
Genome, Bacterial , Gram-Negative Aerobic Rods and Cocci/genetics , Chromosome Mapping , Chromosomes, Bacterial , Citric Acid Cycle , DNA Repair , DNA, Bacterial/biosynthesis , DNA, Bacterial/genetics , Gram-Negative Aerobic Rods and Cocci/metabolism , Molecular Sequence Data , Oxidative Stress , Phylogeny , Protein Biosynthesis , Temperature , Transcription, Genetic
14.
Nature ; 390(6658): 364-70, 1997 Nov 27.
Article in English | MEDLINE | ID: mdl-9389475

ABSTRACT

Archaeoglobus fulgidus is the first sulphur-metabolizing organism to have its genome sequence determined. Its genome of 2,178,400 base pairs contains 2,436 open reading frames (ORFs). The information processing systems and the biosynthetic pathways for essential components (nucleotides, amino acids and cofactors) have extensive correlation with their counterparts in the archaeon Methanococcus jannaschii. The genomes of these two Archaea indicate dramatic differences in the way these organisms sense their environment, perform regulatory and transport functions, and gain energy. In contrast to M. jannaschii, A. fulgidus has fewer restriction-modification systems, and none of its genes appears to contain inteins. A quarter (651 ORFs) of the A. fulgidus genome encodes functionally uncharacterized yet conserved proteins, two-thirds of which are shared with M. jannaschii (428 ORFs). Another quarter of the genome encodes new proteins indicating substantial archaeal gene diversity.


Subject(s)
Archaeoglobus fulgidus/genetics , Genes, Archaeal , Genome , Archaeoglobus fulgidus/metabolism , Archaeoglobus fulgidus/physiology , Base Sequence , Cell Division , DNA, Bacterial/genetics , Energy Metabolism , Gene Expression Regulation, Bacterial , Molecular Sequence Data , Protein Biosynthesis , Transcription, Genetic
15.
Gene ; 197(1-2): GC11-26, 1997 Sep 15.
Article in English | MEDLINE | ID: mdl-9332394

ABSTRACT

The interpretation of the Methanococcus jannaschii genome will inevitably require many years of effort. This initial attempt to connect the sequence data to aspects of known biochemistry and to provide an overview of what is already apparent from the sequence data will be refined. Numerous issues remain that can be resolved only by direct biochemical analysis. Let us draw the reader's attention to just a few that might be considered central: (1) We are still missing key enzymes from the glycolytic pathway, and the conjecture is that this is due to ADP-dependency. The existence of glycolytic activity in the cell-free extract should be tested. (2) The issue of whether the Calvin cycle is present needs to be examined. (3) We need to determine whether the 2-oxoglutarate synthase (ferredoxin-dependent) (EC 1.2.7.3) activity is present. (4) The issue of whether cyclic 2,3-bisphosphate is detectable in the cell-free extracts needs to be checked. If it is, this result would confirm our assertion of the two pathways controlling synthesis and degradation of cyclic 2,3-bisphosphate.


Subject(s)
Methanococcus/genetics , Methanococcus/metabolism , Models, Chemical , Models, Genetic , Amino Acid Sequence , Amino Acids/metabolism , Carbohydrate Metabolism , Coenzymes/metabolism , Databases, Factual , Lipid Metabolism , Methane/metabolism , Methanococcus/enzymology , Nucleotides/metabolism , Polyamines/metabolism
17.
Nucleic Acids Res ; 25(1): 37-8, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016500

ABSTRACT

The Metabolic Pathway Collection from EMP is an extraction of data from the larger Enzymes and Metabolic Pathways database (EMP). This extraction has been made publicly available in the hope that others will find it useful for a variety of purposes. The original release in October 1995 contained 1814 distinct pathways. The current collection contains 2180. Metabolic reconstructions for the first completely sequenced organisms (Haemophilus influenzae, Mycoplasma genitalium, Saccharomyces cerevisiae and Methanococcus jannaschii) are all included in the current release. All of the pathways in the collections are available as ASCII files in the form generated by the main curator, Evgeni Selkov. In addition, we are offering a more structured encoding of a subset of the collection; our initial release of this subcollection includes all of the pathways in Mycoplasma genitalium, and we ultimately intend to offer the entire collection in this form as well.


Subject(s)
Databases, Factual , Metabolism
18.
Nucleic Acids Res ; 25(1): 109-11, 1997 Jan 01.
Article in English | MEDLINE | ID: mdl-9016515

ABSTRACT

The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous FTP (rdp.life.uiuc.edu), electronic mail (server@rdp.life.uiuc.edu), gopher (rdpgopher.life.uiuc.edu) and WWW (http://rdpwww.life.uiuc.edu/). The electronic mail and WWW servers provide ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for possible chimeric rRNA sequences, automated alignment, and a suggested placement of an unknown sequence on an existing phylogenetic tree.


Subject(s)
Databases, Factual , RNA, Ribosomal/genetics , Ribosomes/genetics , Animals , Base Sequence , Computer Communication Networks
19.
Trends Genet ; 13(12): 497-8, 1997 Dec.
Article in English | MEDLINE | ID: mdl-9433140
20.
Science ; 273(5278): 1058-73, 1996 Aug 23.
Article in English | MEDLINE | ID: mdl-8688087

ABSTRACT

The complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements have been determined by whole-genome random sequencing. A total of 1738 predicted protein-coding genes were identified; however, only a minority of these (38 percent) could be assigned a putative cellular role with high confidence. Although the majority of genes related to energy production, cell division, and metabolism in M. jannaschii are most similar to those found in Bacteria, most of the genes involved in transcription, translation, and replication in M. jannaschii are more similar to those found in Eukaryotes.


Subject(s)
Bacterial Proteins/genetics , DNA, Bacterial/genetics , Genome, Bacterial , Methanococcus/genetics , Amino Acid Sequence , Bacterial Proteins/chemistry , Base Composition , Base Sequence , Biological Transport/genetics , Carbon Dioxide/metabolism , Chromosome Mapping , Chromosomes, Bacterial/genetics , DNA Replication , Databases, Factual , Energy Metabolism/genetics , Genes, Bacterial , Hydrogen/metabolism , Methane/metabolism , Methanococcus/physiology , Molecular Sequence Data , Protein Biosynthesis , Sequence Analysis, DNA , Transcription, Genetic