Results 1 - 20 of 49
1.
Mol Biol Evol ; 40(11)2023 Nov 03.
Article in English | MEDLINE | ID: mdl-37948764

ABSTRACT

Performing phylogenetic analysis with genome sequences maximizes the information used to estimate phylogenies and the resolution of closely related taxa. The use of single-nucleotide polymorphisms (SNPs) permits estimating trees without genome alignments and permits the use of data sets of hundreds of microbial genomes. kSNP4 is a program that identifies SNPs without using a reference genome, estimates parsimony, maximum likelihood, and neighbor-joining trees, and is able to annotate the discovered SNPs. kSNP4 is a command-line program that does not require any additional programs or dependencies to install or use. kSNP4 does not require any programming experience or bioinformatics experience to install and use. It is suitable for use by students through senior investigators. It includes a detailed user guide that explains all of the many features of kSNP4. In this study, we provide a detailed step-by-step protocol for downloading, installing, and using kSNP4 to build phylogenetic trees from genome sequences.


Subject(s)
Computational Biology , Evolution, Molecular , Humans , Phylogeny
2.
PLoS One ; 17(10): e0276040, 2022.
Article in English | MEDLINE | ID: mdl-36228033

ABSTRACT

The spectrophotometer has been used for decades to measure the density of bacterial populations as turbidity, expressed as optical density (OD). However, OD alone is an unreliable metric and is proportional to cell titer only up to an OD of about 0.1. The relationship between OD and cell titer depends on the configuration of the spectrophotometer, the length of the light path through the culture, the size of the bacterial cells, and the cell culture density. We demonstrate the importance of plate reader calibration to identify the exact relationship between OD and cells/mL. We use four bacterial genera and two sizes of micro-titer plates (96-well and 384-well) to show that the cells/mL per unit OD depends heavily on the bacterial cell size and plate size. We applied our calibration curve to real growth curve data and conclude that cells/mL, rather than OD, is a metric that can be used to directly compare results across experiments, labs, instruments, and species.
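The calibration idea described in this abstract can be illustrated with a minimal sketch. It assumes a quadratic calibration curve forced through the origin (cells/mL = a*OD + b*OD^2); the abstract does not state the functional form the authors used, and the function names and all numbers below are hypothetical, chosen only for illustration.

```python
def fit_calibration(ods, cells_per_ml):
    """Least-squares fit of cells/mL = a*OD + b*OD**2 (assumed form, through the origin)."""
    # Sums needed for the 2x2 normal equations.
    s11 = sum(x ** 2 for x in ods)
    s12 = sum(x ** 3 for x in ods)
    s22 = sum(x ** 4 for x in ods)
    t1 = sum(x * y for x, y in zip(ods, cells_per_ml))
    t2 = sum(x ** 2 * y for x, y in zip(ods, cells_per_ml))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("need at least two distinct non-zero OD values")
    # Cramer's rule for the coefficients a and b.
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def od_to_cells(od, a, b):
    """Convert an OD reading to cells/mL using the fitted calibration curve."""
    return a * od + b * od ** 2


if __name__ == "__main__":
    # Hypothetical paired measurements: plate-reader OD and plate counts (cells/mL).
    ods = [0.05, 0.10, 0.20, 0.40, 0.80]
    counts = [5.1e7, 1.05e8, 2.2e8, 4.8e8, 1.12e9]
    a, b = fit_calibration(ods, counts)
    print(od_to_cells(0.30, a, b))
```

A separate fit would be needed for each combination of organism, plate format, and instrument, since the abstract reports that cells/mL per unit OD depends on all of these.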


Subject(s)
Bacteria , Spectrophotometry/methods
4.
Mol Biol Evol ; 34(12): 3303-3309, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-29029174

ABSTRACT

Growth rates are an important tool in microbiology because they provide high throughput fitness measurements. The release of GrowthRates, a program that uses the output of plate reader files to automatically calculate growth rates, has facilitated experimental procedures in many areas. However, many sources of variation within replicate growth rate data exist and can decrease data reliability. We have developed a new statistical package, CompareGrowthRates (CGR), to enhance the program GrowthRates and accurately measure variation in growth rate data sets. We define a metric, Variability-score (V-score), that can help determine if variation within a data set might result in false interpretations. CGR also uses the bootstrap method to determine the fraction of bootstrap replicates in which a strain will grow the fastest. We illustrate the usage of CGR with growth rate data sets similar to those in Mira, Meza, et al. (Adaptive landscapes of resistance genes change as antibiotic concentrations change. Mol Biol Evol. 32(10): 2707-2715). These statistical methods are compatible with the analytic methods described in Growth Rates Made Easy and can be used with any set of growth rate output from GrowthRates.
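The bootstrap step mentioned in this abstract can be sketched as follows. This is a generic bootstrap over replicate growth rates, not CGR's actual implementation: the abstract does not describe the resampling scheme, so resampling replicates with replacement and comparing strain means is an assumption, and the function name and data are hypothetical.

```python
import random


def fraction_fastest(rates_by_strain, n_boot=1000, seed=0):
    """Fraction of bootstrap replicates in which each strain has the highest mean rate.

    rates_by_strain maps a strain name to its list of replicate growth rates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = {strain: 0 for strain in rates_by_strain}
    for _ in range(n_boot):
        means = {}
        for strain, rates in rates_by_strain.items():
            # Resample this strain's replicates with replacement.
            sample = [rng.choice(rates) for _ in rates]
            means[strain] = sum(sample) / len(sample)
        fastest = max(means, key=means.get)
        wins[fastest] += 1
    return {strain: count / n_boot for strain, count in wins.items()}


if __name__ == "__main__":
    # Hypothetical replicate growth rates (per minute) for three strains.
    data = {
        "strain_A": [0.0210, 0.0205, 0.0221, 0.0198],
        "strain_B": [0.0215, 0.0190, 0.0225, 0.0208],
        "strain_C": [0.0150, 0.0162, 0.0148, 0.0155],
    }
    print(fraction_fastest(data))
```

When two strains have overlapping replicate distributions, the winning fraction is split between them, which gives a direct sense of how strongly the data support calling one strain the fastest.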


Subject(s)
Bacteria/growth & development , Colony Count, Microbial/methods , Colony Count, Microbial/statistics & numerical data , Biometry/methods , Microbial Viability/genetics , Reproducibility of Results , Software
5.
J Clin Microbiol ; 55(7): 2143-2152, 2017 07.
Article in English | MEDLINE | ID: mdl-28446577

ABSTRACT

Strict infection control practices have been implemented for health care visits by cystic fibrosis (CF) patients in an attempt to prevent transmission of important pathogens. This study used whole-genome sequencing (WGS) to determine strain relatedness and assess the population dynamics of Staphylococcus aureus isolates from a cohort of CF patients. A total of 311 S. aureus isolates were collected from respiratory cultures of 115 CF patients during a 22-month study period. Whole-genome sequencing was performed, and using single nucleotide polymorphism (SNP) analysis, phylogenetic trees were assembled to determine relatedness between isolates. Methicillin-resistant Staphylococcus aureus (MRSA) phenotypes were predicted using PPFS2 and compared to the observed phenotype. The accumulation of SNPs in multiple isolates obtained over time from the same patient was examined to determine if a genomic molecular clock could be calculated. Pairs of isolates with ≤71 SNP differences were considered to be the "same" strain. All of the "same" strain isolates were either from the same patient or from sibling pairs. There were 47 examples of patients being superinfected with an unrelated strain. The predicted MRSA phenotype was accurate in all but three isolates. Mutation rates could not be determined because the branching order in the phylogenetic tree was inconsistent with the order of isolation. The observation that transmissions were identified between sibling patients shows that WGS is an effective tool for determining transmission between patients. The observation that transmission only occurred between siblings suggests that Staphylococcus aureus acquisition in our CF population occurred outside the hospital environment and indicates that current infection prevention efforts appear effective.


Subject(s)
Cystic Fibrosis/complications , Genetic Variation , Staphylococcal Infections/microbiology , Staphylococcus aureus/classification , Staphylococcus aureus/genetics , Whole Genome Sequencing , Adolescent , Child , Child, Preschool , Female , Humans , Infant , Infant, Newborn , Male , Phylogeny , Polymorphism, Single Nucleotide , Population Dynamics , Staphylococcus aureus/isolation & purification , Young Adult
6.
Cladistics ; 32(1): 90-99, 2016 Feb.
Article in English | MEDLINE | ID: mdl-34732024

ABSTRACT

kSNP v2 is a powerful tool for single nucleotide polymorphism (SNP) identification from complete microbial genomes and for estimating phylogenetic trees from the identified SNPs. kSNP can analyse finished genomes, genome assemblies, raw reads or any combination of those and does not require either genome alignment or reference genomes. This study uses sequence evolution simulations to evaluate the topological accuracy of kSNP trees and to assess the effects of diversity and recombination on that accuracy. The accuracies of kSNP trees are strongly affected by increasing diversity, with parsimony accuracy > maximum-likelihood accuracy > neighbour-joining accuracy. Accuracy is also strongly influenced by recombination; as recombination increases accuracy decreases. Reliable trees are arbitrarily defined as those that have ≥ 90% topological accuracy. It is determined that the best predictor of topological accuracy is the ratio of r/m, a measure of the effect of recombination, to FCK (the fraction of core kmers), a measure of diversity. Tools are available to allow investigators to determine both r/m and FCK, and the relationship between topological accuracy and the ratio of r/m to FCK is determined. The practical implication of this study is that kSNP is an effective tool for estimating phylogenetic trees from microbial genome sequences provided that both recombination and sequence diversity are within acceptable ranges.

7.
Bioinformatics ; 31(17): 2877-8, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-25913206

ABSTRACT

We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2. AVAILABILITY AND IMPLEMENTATION: kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/. CONTACT: barryghall@gmail.com.


Subject(s)
Computational Biology/methods , Escherichia coli/genetics , Genome, Bacterial , Phylogeny , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA/methods , Software , Databases, Nucleic Acid , Escherichia coli/classification , Molecular Sequence Annotation
8.
PLoS One ; 9(2): e90490, 2014.
Article in English | MEDLINE | ID: mdl-24587377

ABSTRACT

SNP-association studies are a starting point for identifying genes that may be responsible for specific phenotypes, such as disease traits. The vast bulk of tools for SNP-association studies are directed toward SNPs in the human genome, and I am unaware of any tools designed specifically for such studies in bacterial or viral genomes. The PPFS (Predict Phenotypes From SNPs) package described here is an add-on to kSNP, a program that can identify SNPs in a data set of hundreds of microbial genomes. PPFS identifies those SNPs that are non-randomly associated with a phenotype based on the χ² probability, then uses those diagnostic SNPs for two distinct, but related, purposes: (1) to predict the phenotypes of strains whose phenotypes are unknown, and (2) to identify those diagnostic SNPs that are most likely to be causally related to the phenotype. In the example illustrated here, from a set of 68 E. coli genomes, for 67 of which the pathogenicity phenotype was known, there were 418,500 SNPs. Using the phenotypes of 36 of those strains, PPFS identified 207 diagnostic SNPs. The diagnostic SNPs predicted the phenotypes of all of the genomes with 97% accuracy. It then identified 97 SNPs whose probability of being causally related to the pathogenic phenotype was >0.999. In a second example, from a set of 116 E. coli genome sequences, using the phenotypes of 65 strains PPFS identified 101 SNPs that predicted the source host (human or non-human) with 90% accuracy.
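The χ² association test named in this abstract can be illustrated for the simplest case. The sketch below assumes a biallelic SNP and a binary phenotype, so each locus reduces to a 2x2 contingency table with one degree of freedom; how PPFS actually builds its tables and handles loci with more than two alleles is not stated in the abstract, and the function names and counts are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value for a 2x2 table.

    Layout (counts of genomes):
                    phenotype +   phenotype -
        allele 1        a             b
        allele 2        c             d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # An empty row or column carries no association information.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical locus: allele 1 seen mostly in pathogenic strains.
    chi2, p = chi2_2x2(16, 2, 3, 15)
    print(round(chi2, 2), p)
```

With hundreds of thousands of SNPs tested, as in the example data set, some correction for multiple testing or a stringent probability cutoff would be needed before calling a locus diagnostic.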


Subject(s)
Computational Biology/methods , Genome, Microbial/genetics , Polymorphism, Single Nucleotide , Software , Escherichia coli/classification , Escherichia coli/genetics , Genome, Bacterial/genetics , Phenotype , Phylogeny , Reproducibility of Results , Shigella/classification , Shigella/genetics
9.
Mol Biol Evol ; 31(1): 232-8, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24170494

ABSTRACT

In the 1960s-1980s, determination of bacterial growth rates was an important tool in microbial genetics, biochemistry, molecular biology, and microbial physiology. The exciting technical developments of the 1990s and the 2000s eclipsed that tool; as a result, many investigators today lack experience with growth rate measurements. Recently, investigators in a number of areas have started to use measurements of bacterial growth rates for a variety of purposes. Those measurements have been greatly facilitated by the availability of microwell plate readers that permit the simultaneous measurements on up to 384 different cultures. Only the exponential (logarithmic) portions of the resulting growth curves are useful for determining growth rates, and manual determination of that portion and calculation of growth rates can be tedious for high-throughput purposes. Here, we introduce the program GrowthRates that uses plate reader output files to automatically determine the exponential portion of the curve and to automatically calculate the growth rate, the maximum culture density, and the duration of the growth lag phase. GrowthRates is freely available for Macintosh, Windows, and Linux. We discuss the effects of culture volume, the classical bacterial growth curve, and the differences between determinations in rich media and minimal (mineral salts) media. This protocol covers calibration of the plate reader, growth of culture inocula for both rich and minimal media, and experimental setup. As a guide to reliability, we report typical day-to-day variation in growth rates and variation within experiments with respect to position of wells within the plates.
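The point that only the exponential portion of a growth curve is useful can be made concrete with a short sketch. It uses a common heuristic, fitting ln(OD) against time in a sliding window and keeping the window with the steepest slope; this is an assumption for illustration and not necessarily the algorithm GrowthRates uses. The function names and the window size are hypothetical.

```python
import math


def _slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def max_growth_rate(times, ods, window=5):
    """Return (rate, start_index) for the window with the steepest ln(OD) slope.

    All OD values must be positive (subtract the blank beforehand, but keep
    readings above zero) because the fit is done on ln(OD).
    """
    if len(times) != len(ods) or len(times) < window:
        raise ValueError("need at least `window` paired readings")
    ln_od = [math.log(od) for od in ods]
    best_rate, best_start = None, None
    for i in range(len(times) - window + 1):
        rate = _slope(times[i:i + window], ln_od[i:i + window])
        if best_rate is None or rate > best_rate:
            best_rate, best_start = rate, i
    return best_rate, best_start


if __name__ == "__main__":
    # Synthetic curve: lag until t = 2 h, exponential growth, plateau after t = 7 h.
    times = list(range(10))
    ods = [0.01 * math.exp(0.5 * (min(max(t, 2), 7) - 2)) for t in times]
    rate, start = max_growth_rate(times, ods)
    print(rate, math.log(2) / rate)  # growth rate and doubling time
```

Windows that overlap the lag or plateau phases give shallower slopes, which is why the steepest window serves as a rough stand-in for the exponential portion.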


Subject(s)
Bacteria/growth & development , Software , Algorithms , Bacteriological Techniques , Culture Media/chemistry , Phenotype , Reproducibility of Results
10.
PLoS One ; 8(12): e81760, 2013.
Article in English | MEDLINE | ID: mdl-24349125

ABSTRACT

Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths.
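Alignment-free, k-mer-based SNP discovery can be sketched in a few lines. The sketch assumes a SNP locus is a k-mer of odd length whose flanking bases match across genomes while the central base differs. It is a toy illustration, not kSNP's implementation: it ignores reverse complements, raw reads, and the scaling techniques a real tool needs, and the function name and sequences are hypothetical.

```python
from collections import defaultdict


def find_kmer_snps(genomes, k=5):
    """Find loci whose flanks match across genomes but whose central base varies.

    genomes maps a genome name to its sequence string.
    Returns {(left_flank, right_flank): {genome_name: central_base}}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that the k-mer has a central base")
    half = k // 2
    # (left flank, right flank) -> genome name -> set of central bases seen
    loci = defaultdict(dict)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = (kmer[:half], kmer[half + 1:])
            loci[key].setdefault(name, set()).add(kmer[half])
    snps = {}
    for key, per_genome in loci.items():
        # Skip loci that are ambiguous within a single genome.
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[key] = alleles
    return snps


if __name__ == "__main__":
    toy = {"g1": "ACGTACC", "g2": "ACGAACC"}
    print(find_kmer_snps(toy, k=5))
```

Because loci are keyed by their flanking sequence rather than by a coordinate in a reference genome, no alignment or reference is needed, which is the property the abstract emphasizes.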


Subject(s)
Computational Biology , Escherichia coli/genetics , Genome, Bacterial , Phylogeny , Polymorphism, Single Nucleotide , Shigella/genetics , Software , Algorithms , Databases, Nucleic Acid , Escherichia coli/classification , Escherichia coli/pathogenicity , Humans , Molecular Sequence Annotation , Sequence Alignment , Sequence Analysis, DNA , Shigella/classification
11.
PLoS One ; 8(7): e68901, 2013.
Article in English | MEDLINE | ID: mdl-23935901

ABSTRACT

In clinical settings it is often important to know not just the identity of a microorganism, but also the danger posed by that particular strain. For instance, Escherichia coli can range from being a harmless commensal to being a very dangerous enterohemorrhagic (EHEC) strain. Determining pathogenic phenotypes can be both time consuming and expensive. Here we propose a simple, rapid, and inexpensive method of predicting pathogenic phenotypes on the basis of the presence or absence of short homologous DNA segments in an isolate. Our method compares completely sequenced genomes without the necessity of genome alignments in order to identify the presence or absence of the segments to produce an automatic alignment of the binary string that describes each genome. Analysis of the segment alignment allows identification of those segments whose presence strongly predicts a phenotype. Clinical application of the method requires nothing more than PCR amplification of each of the set of predictive segments. Here we apply the method to identifying EHEC strains of E. coli and to distinguishing E. coli from Shigella. We show in silico that with as few as 8 predictive sequences, if even three of those predictive sequences are amplified the probability of being EHEC or Shigella is >0.99. The method is thus very robust to the occasional amplification failure for spurious reasons. Experimentally, we apply the method to screening a set of 98 isolates to distinguish E. coli from Shigella, and EHEC from non-EHEC E. coli strains, and show that all isolates are correctly identified.


Subject(s)
Escherichia coli/genetics , Genome, Bacterial/genetics , Sequence Analysis, DNA , Shigella/genetics , Base Sequence , Cluster Analysis , Computer Simulation , DNA Probes/metabolism , Databases, Genetic , Dysentery, Bacillary/microbiology , Escherichia coli/isolation & purification , Escherichia coli Infections/microbiology , Phenotype , Polymerase Chain Reaction , Reproducibility of Results
12.
Genome Biol Evol ; 5(6): 1176-84, 2013.
Article in English | MEDLINE | ID: mdl-23739739

ABSTRACT

Optical mapping is a technique that produces an ordered restriction map of a bacterial or eukaryotic chromosome. We have developed a new method, the BOP method, to compare experimental optical maps with in silico optical maps of complete genomes to infer the presence/absence of short DNA sequences (bops) in each genome. The BOP method, as implemented by the Optical Mapping suite of four programs, circumvents the necessity of whole-genome multiple alignments and permits reliable strain typing and clustering on the basis of optical maps. We have applied the Optical Mapping Suite to 125 strains of Acinetobacter sp., including 11 completely sequenced genomes and 114 Acinetobacter complex from three US military hospitals. We found that optical mapping completely resolves all 125 strains. Signal to noise analysis showed that when the 125 strains were considered together almost 1/3 of the experimental fragments were misidentified. We found that the set of 125 genomes could be divided into three clusters, two of which included sequenced genomes. Signal to noise analysis after clustering showed that only 3.5% of the experimental restriction fragments were misidentified. Minimum spanning trees of the two clusters that included sequenced genomes are presented. The programs we have developed provide a more rigorous approach for analyzing optical map data than previously existed.


Subject(s)
Acinetobacter Infections/microbiology , Acinetobacter/classification , Acinetobacter/genetics , Optical Restriction Mapping/methods , Acinetobacter/isolation & purification , Acinetobacter Infections/diagnosis , Bacterial Typing Techniques/methods , Cluster Analysis , Genome, Bacterial , Humans , Sequence Analysis, DNA
13.
Mol Biol Evol ; 30(5): 1229-35, 2013 May.
Article in English | MEDLINE | ID: mdl-23486614

ABSTRACT

Phylogenetic analysis is sometimes regarded as being an intimidating, complex process that requires expertise and years of experience. In fact, it is a fairly straightforward process that can be learned quickly and applied effectively. This Protocol describes the several steps required to produce a phylogenetic tree from molecular data for novices. In the example illustrated here, the program MEGA is used to implement all those steps, thereby eliminating the need to learn several programs, and to deal with multiple file formats from one step to another (Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 28:2731-2739). The first step, identification of a set of homologous sequences and downloading those sequences, is implemented by MEGA's own browser built on top of the Google Chrome toolkit. For the second step, alignment of those sequences, MEGA offers two different algorithms: ClustalW and MUSCLE. For the third step, construction of a phylogenetic tree from the aligned sequences, MEGA offers many different methods. Here we illustrate the maximum likelihood method, beginning with MEGA's Models feature, which permits selecting the most suitable substitution model. Finally, MEGA provides a powerful and flexible interface for the final step, actually drawing the tree for publication. Here a step-by-step protocol is presented in sufficient detail to allow a novice to start with a sequence of interest and to build a publication-quality tree illustrating the evolution of an appropriate set of homologs of that sequence. MEGA is available for use on PCs and Macs from www.megasoftware.net.


Subject(s)
Evolution, Molecular , Phylogeny , Software , Algorithms , Internet
14.
PLoS One ; 8(2): e56040, 2013.
Article in English | MEDLINE | ID: mdl-23418506

ABSTRACT

The evolution of antibiotic resistance among bacteria threatens our continued ability to treat infectious diseases. The need for sustainable strategies to cure bacterial infections has never been greater. So far, all attempts to restore susceptibility after resistance has arisen have been unsuccessful, including restrictions on prescribing [1] and antibiotic cycling [2], [3]. Part of the problem may be that those efforts have implemented different classes of unrelated antibiotics, and relied on removal of resistance by random loss of resistance genes from bacterial populations (drift). Here, we show that alternating structurally similar antibiotics can restore susceptibility to antibiotics after resistance has evolved. We found that the resistance phenotypes conferred by variant alleles of the resistance gene encoding the TEM β-lactamase (bla(TEM)) varied greatly among 15 different β-lactam antibiotics. We captured those differences by characterizing complete adaptive landscapes for the resistance alleles bla(TEM-50) and bla(TEM-85), each of which differs from its ancestor bla(TEM-1) by four mutations. We identified pathways through those landscapes where selection for increased resistance moved in a repeating cycle among a limited set of alleles as antibiotics were alternated. Our results showed that susceptibility to antibiotics can be sustainably renewed by cycling structurally similar antibiotics. We anticipate that these results may provide a conceptual framework for managing antibiotic resistance. This approach may also guide sustainable cycling of the drugs used to treat malaria and HIV.


Subject(s)
Anti-Bacterial Agents/administration & dosage , Drug Resistance, Bacterial/drug effects , Drug Administration Schedule , Escherichia coli , Microbial Sensitivity Tests , Mutagenesis, Site-Directed
15.
FEBS Lett ; 587(6): 799-803, 2013 Mar 18.
Article in English | MEDLINE | ID: mdl-23416295

ABSTRACT

The catalytic activity of the Family 4 glycosidase LplD protein, whose active site motif is CHEV, is unknown despite its crystal structure having been determined in 2008. Here we identify that activity as being an α-galacturonidase whose natural substrate is probably α-1,4-di-galacturonate (GalUA2). Phylogenetic analysis shows that LplD belongs to a monophyletic clade of CHEV Family 4 enzymes, of which four other members are also shown to be galacturonidases. Family GH 4 enzymes catalyze the cleavage of the glycosidic bond, via a non-canonical redox-assisted mechanism that contrasts with Koshland's double-displacement mechanism.


Subject(s)
Bacillus subtilis/enzymology , Bacterial Proteins/chemistry , Galactose/analogs & derivatives , Glycoside Hydrolases/chemistry , Amino Acid Motifs , Bacillus subtilis/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Biocatalysis , Catalytic Domain , Escherichia coli/genetics , Galactose/metabolism , Glycoside Hydrolases/genetics , Glycoside Hydrolases/metabolism , Phylogeny , Protein Structure, Tertiary , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism , Substrate Specificity
16.
J Bacteriol ; 194(15): 3922-37, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22609915

ABSTRACT

Gardnerella vaginalis is associated with a spectrum of clinical conditions, suggesting high degrees of genetic heterogeneity among strains. Seventeen G. vaginalis isolates were subjected to a battery of comparative genomic analyses to determine their level of relatedness. For each measure, the degree of difference among the G. vaginalis strains was the highest observed among 23 pathogenic bacterial species for which at least eight genomes are available. Genome sizes ranged from 1.491 to 1.716 Mb; GC contents ranged from 41.18% to 43.40%; and the core genome, consisting of only 746 genes, makes up only 51.6% of each strain's genome on average and accounts for only 27% of the species supragenome. Neighbor-grouping analyses, using both distributed gene possession data and core gene allelic data, each identified two major sets of strains, each of which is composed of two groups. Each of the four groups has its own characteristic genome size, GC ratio, and greatly expanded core gene content, making the genomic diversity of each group within the range for other bacterial species. To test whether these 4 groups corresponded to genetically isolated clades, we inferred the phylogeny of each distributed gene that was present in at least two strains and absent in at least two strains; this analysis identified frequent homologous recombination within groups but not between groups or sets. G. vaginalis appears to include four nonrecombining groups/clades of organisms with distinct gene pools and genomic properties, which may confer distinct ecological properties. Consequently, it may be appropriate to treat these four groups as separate species.


Subject(s)
Bacterial Infections/microbiology , DNA, Bacterial/genetics , Gardnerella vaginalis/classification , Gardnerella vaginalis/genetics , Genome, Bacterial , Polymorphism, Genetic , Base Composition , Cluster Analysis , DNA, Bacterial/chemistry , Gardnerella vaginalis/isolation & purification , Genes, Bacterial , Genotype , Humans , Molecular Sequence Data , Phylogeny , Sequence Analysis, DNA
18.
J Clin Microbiol ; 49(10): 3568-75, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21849692

ABSTRACT

Minimum spanning trees (MSTs) are frequently used in molecular epidemiology research to estimate relationships among individual strains or isolates. Nevertheless, there are significant caveats to MST algorithms that have been largely ignored in molecular epidemiology studies and that have the potential to confound or alter the interpretation of the results of those analyses. Specifically, (i) presenting a single, arbitrarily selected MST illustrates only one of potentially many equally optimal solutions, and (ii) statistical metrics are not used to assess the credibility of MST estimations. Here, we survey published MSTs previously used to infer microbial population structure in order to determine the effect of these factors. We propose a technique to estimate the number of alternative MSTs for a data set and find that multiple MSTs exist for each case in our survey. By implementing a bootstrapping metric to evaluate the reliability of alternative MST solutions, we discover that they encompass a wide range of credibility values. On the basis of these observations, we conclude that current approaches to studying population structure using MSTs are inadequate. We instead propose a systematic approach to MST estimation that bases analyses on the optimal computation of an input distance matrix, provides information about the number and configurations of alternative MSTs, and allows identification of the most credible MST or MSTs by using a bootstrapping metric. It is our hope this algorithm will become the new "gold standard" approach for analyzing MSTs for molecular epidemiology so that this generally useful computational approach can be used informatively and to its full potential.
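The first caveat, that a single MST may be only one of many equally optimal solutions, is easy to demonstrate on a tiny graph. The sketch below counts minimum spanning trees by brute force, enumerating every set of n-1 edges; this is feasible only for very small examples and is not the estimation technique proposed in the paper, which the abstract does not describe. The function name and the example graphs are hypothetical.

```python
from itertools import combinations


def count_minimum_spanning_trees(n_nodes, edges):
    """Return (minimum_weight, number_of_spanning_trees_with_that_weight).

    Nodes are numbered 0..n_nodes-1; edges is a list of (u, v, weight).
    Brute force: exponential in the number of edges, for illustration only.
    """
    best_weight, count = None, 0
    for subset in combinations(edges, n_nodes - 1):
        parent = list(range(n_nodes))  # union-find forest for cycle detection

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v, _ in subset:
            root_u, root_v = find(u), find(v)
            if root_u == root_v:
                acyclic = False
                break
            parent[root_u] = root_v
        if not acyclic:
            continue  # n-1 edges with a cycle cannot span all nodes
        weight = sum(w for _, _, w in subset)
        if best_weight is None or weight < best_weight:
            best_weight, count = weight, 1
        elif weight == best_weight:
            count += 1
    return best_weight, count


if __name__ == "__main__":
    # Three isolates that all differ pairwise by one allele: every tree is optimal.
    tied = [(0, 1, 1), (1, 2, 1), (0, 2, 1)]
    print(count_minimum_spanning_trees(3, tied))
```

Integer-valued distances, such as allele or SNP differences between isolates, produce many tied edge weights, so alternative MSTs are expected rather than exceptional in molecular epidemiology data sets.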


Subject(s)
Molecular Typing/methods , Polymorphism, Genetic , Cluster Analysis , Genotype , Humans , Molecular Epidemiology/methods
19.
Infect Genet Evol ; 11(7): 1505-13, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21708290

ABSTRACT

Mycobacterium leprae, the causative agent of leprosy, is an unusual organism that presents unique challenges to those studying the disease through molecular epidemiology. As a consequence, many basic aspects of disease transmission and biology remain unilluminated. In this review, we explore the general principles of molecular epidemiology, and the special difficulties surrounding the application of molecular epidemiology to M. leprae. We briefly discuss the computational tools commonly employed in molecular epidemiology studies. The past decade of developments in molecular strain typing approaches through VNTRs and SNP loci, and their merits and limitations, are discussed. We summarize what has been learned to date about the transmission and historical origins of leprosy through molecular epidemiology and bacterial population genetics. Lastly, we critically evaluate the strengths and shortcomings of leprosy research, and present recommendations for future work that will hopefully shed light on some of the disease's most fundamental mysteries.


Subject(s)
Leprosy/epidemiology , Leprosy/microbiology , Mycobacterium leprae/classification , Mycobacterium leprae/genetics , Bacterial Typing Techniques , Disease Reservoirs/microbiology , Environmental Microbiology , Genes, Bacterial , Humans , Leprosy/transmission , Minisatellite Repeats , Molecular Epidemiology , Phylogeny , Polymorphism, Single Nucleotide
20.
BMC Genomics ; 12: 187, 2011 Apr 13.
Article in English | MEDLINE | ID: mdl-21489287

ABSTRACT

BACKGROUND: Staphylococcus aureus is associated with a spectrum of symbiotic relationships with its human host from carriage to sepsis and is frequently associated with nosocomial and community-acquired infections, thus the differential gene content among strains is of interest. RESULTS: We sequenced three clinical strains and combined these data with 13 publicly available human isolates and one bovine strain for comparative genomic analyses. All genomes were annotated using RAST, and then their gene similarities and differences were delineated. Gene clustering yielded 3,155 orthologous gene clusters, of which 2,266 were core, 755 were distributed, and 134 were unique. Individual genomes contained between 2,524 and 2,648 genes. Gene-content comparisons among all possible S. aureus strain pairs (n = 136) revealed a mean difference of 296 genes and a maximum difference of 476 genes. We developed a revised version of our finite supragenome model to estimate the size of the S. aureus supragenome (3,221 genes, with 2,245 core genes), and compared it with those of Haemophilus influenzae and Streptococcus pneumoniae. There was excellent agreement between RAST's annotations and our CDS clustering procedure, providing for high-fidelity metabolomic subsystem analyses to extend our comparative genomic characterization of these strains. CONCLUSIONS: Using a multi-species comparative supragenomic analysis enabled by an improved version of our finite supragenome model, we provide data and an interpretation explaining the relatively larger core genome of S. aureus compared to other opportunistic nasopharyngeal pathogens. In addition, we provide independent validation for the efficiency and effectiveness of our orthologous gene clustering algorithm.
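The counts reported in this abstract are internally consistent, which a few lines of arithmetic confirm. The sketch uses only numbers stated in the abstract; the variable names are illustrative.

```python
from math import comb

# Genomes analysed: three newly sequenced, 13 public human isolates, one bovine strain.
sequenced, public_human, bovine = 3, 13, 1
n_genomes = sequenced + public_human + bovine

# Every unordered pair of strains is compared once.
n_pairs = comb(n_genomes, 2)

# Orthologous gene clusters split into core, distributed, and unique.
core, distributed, unique = 2266, 755, 134
n_clusters = core + distributed + unique

print(n_genomes, n_pairs, n_clusters)  # 17 136 3155
```

The 17 genomes give the 136 pairwise comparisons quoted as n, and the three cluster classes sum to the reported 3,155 clusters.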


Subject(s)
Genome, Bacterial , Haemophilus influenzae/genetics , Staphylococcus aureus/genetics , Streptococcus pneumoniae/genetics , Algorithms , Animals , Cattle , Gene Expression Regulation, Bacterial , Haemophilus influenzae/isolation & purification , Humans , Models, Genetic , Multigene Family , Open Reading Frames , Staphylococcal Infections/microbiology , Staphylococcus aureus/isolation & purification , Streptococcus pneumoniae/isolation & purification