Search | BVS Regional Portal (test)

1.

Routine Whole-Genome Sequencing for Outbreak Investigations of Staphylococcus aureus in a National Reference Center.

Durand, Geraldine; Javerliat, Fabien; Bes, Michèle; Veyrieras, Jean-Baptiste; Guigon, Ghislaine; Mugnier, Nathalie; Schicklin, Stéphane; Kaneko, Gaël; Santiago-Allexant, Emmanuelle; Bouchiat, Coralie; Martins-Simões, Patrícia; Laurent, Frederic; Van Belkum, Alex; Vandenesch, François; Tristan, Anne.

Front Microbiol ; 9: 511, 2018.

Article in English | MEDLINE | ID: mdl-29616014

ABSTRACT

The French National Reference Center for Staphylococci currently uses DNA arrays and spa typing for the initial epidemiological characterization of Staphylococcus aureus strains. We here describe the use of whole-genome sequencing (WGS) to investigate retrospectively four distinct and virulent S. aureus lineages [clonal complexes (CCs): CC1, CC5, CC8, CC30] involved in hospital and community outbreaks or sporadic infections in France. We used a WGS bioinformatics pipeline based on de novo assembly (reference-free approach), single nucleotide polymorphism analysis, and on the inclusion of epidemiological markers. We examined the phylogeographic diversity of the French dominant hospital-acquired CC8-MRSA (methicillin-resistant S. aureus) Lyon clone through WGS analysis which did not demonstrate evidence of large-scale geographic clustering. We analyzed sporadic cases along with two outbreaks of a CC1-MSSA (methicillin-susceptible S. aureus) clone containing the Panton-Valentine leukocidin (PVL) and results showed that two sporadic cases were closely related. We investigated an outbreak of PVL-positive CC30-MSSA in a school environment and were able to reconstruct the transmission history between eight families. We explored different outbreaks among newborns due to the CC5-MRSA Geraldine clone and we found evidence of an unsuspected link between two otherwise distinct outbreaks. Here, WGS provides the resolving power to disprove transmission events indicated by conventional methods (same sequence type, spa type, toxin profile, and antibiotic resistance profile) and, most importantly, WGS can reveal unsuspected transmission events. Therefore, WGS allows to better describe and understand outbreaks and (inter-)national dissemination of S. aureus lineages. Our findings underscore the importance of adding WGS for (inter-)national surveillance of infections caused by virulent clones of S. aureus but also substantiate the fact that technological optimization at the bioinformatics level is still urgently needed for routine use. However, the greatest limitation of WGS analysis is the completeness and the correctness of the reference database being used and the conversion of floods of data into actionable results. The WGS bioinformatics pipeline (EpiSeq™) we used here can easily generate a uniform database and associated metadata for epidemiological applications.
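As a rough illustration of the single nucleotide polymorphism analysis mentioned in this abstract, the sketch below counts pairwise SNP differences between aligned sequences. It is not the EpiSeq pipeline: the isolate names, the toy sequences, and the functions `snp_distance` and `pairwise_snp_matrix` are hypothetical, and the sequences are assumed to be already aligned to equal length.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, skipping gaps and Ns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ignored = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignored and b not in ignored
    )


def pairwise_snp_matrix(alignment):
    """Return {(name_x, name_y): SNP distance} for every pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment (real core-genome alignments span megabases).
alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTACGTAT",
    "isolate_C": "ACCTANGTTT",
}
distances = pairwise_snp_matrix(alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

Small pairwise distances are what would suggest a possible transmission link; any cutoff for "closely related" depends on the lineage and is not given in the abstract.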

2.

Correlation between phenotypic antibiotic susceptibility and the resistome in Pseudomonas aeruginosa.

Jaillard, Magali; van Belkum, Alex; Cady, Kyle C; Creely, David; Shortridge, Dee; Blanc, Bernadette; Barbu, E Magda; Dunne, W Michael; Zambardi, Gilles; Enright, Mark; Mugnier, Nathalie; Le Priol, Christophe; Schicklin, Stéphane; Guigon, Ghislaine; Veyrieras, Jean-Baptiste.

Int J Antimicrob Agents ; 50(2): 210-218, 2017 Aug.

Article in English | MEDLINE | ID: mdl-28554735

ABSTRACT

Genetic determinants of antibiotic resistance (AR) have been extensively investigated. High-throughput sequencing allows for the assessment of the relationship between genotype and phenotype. A panel of 672 Pseudomonas aeruginosa strains was analysed, including representatives of globally disseminated multidrug-resistant and extensively drug-resistant clones; genomes and multiple antibiograms were available. This panel was annotated for AR gene presence and polymorphism, defining a resistome in which integrons were included. Integrons were present in >70 distinct cassettes, with In5 being the most prevalent. Some cassettes closely associated with clonal complexes, whereas others spread across the phylogenetic diversity, highlighting the importance of horizontal transfer. A resistome-wide association study (RWAS) was performed for clinically relevant antibiotics by correlating the variability in minimum inhibitory concentration (MIC) values with resistome data. Resistome annotation identified 147 loci associated with AR. These loci consisted mainly of acquired genomic elements and intrinsic genes. The RWAS allowed for correct identification of resistance mechanisms for meropenem, amikacin, levofloxacin and cefepime, and added 46 novel mutations. Among these, 29 were variants of the oprD gene associated with variation in meropenem MIC. Using genomic and MIC data, phenotypic AR was successfully correlated with molecular determinants at the whole-genome sequence level.

Subjects

Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial , Genes, Bacterial , Genotype , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/genetics , Genetic Loci , Humans , Interspersed Repetitive Sequences , Microbial Sensitivity Tests , Pseudomonas Infections/microbiology , Pseudomonas aeruginosa/isolation & purification

3.

A comprehensive hybridization model allows whole HERV transcriptome profiling using high density microarray.

Becker, Jérémie; Pérot, Philippe; Cheynet, Valérie; Oriol, Guy; Mugnier, Nathalie; Mommert, Marine; Tabone, Olivier; Textoris, Julien; Veyrieras, Jean-Baptiste; Mallet, François.

BMC Genomics ; 18(1): 286, 2017 04 08.

Article in English | MEDLINE | ID: mdl-28390408

ABSTRACT

BACKGROUND: Human endogenous retroviruses (HERVs) have received much attention for their implications in the etiology of many human diseases and their profound effect on evolution. Notably, recent studies have highlighted associations between HERVs expression and cancers (Yu et al., Int J Mol Med 32, 2013), autoimmunity (Balada et al., Int Rev Immunol 29:351-370, 2010) and neurological (Christensen, J Neuroimmune Pharmacol 5:326-335, 2010) conditions. Their repetitive nature makes their study particularly challenging, where expression studies have largely focused on individual loci (De Parseval et al., J Virol 77:10414-10422, 2003) or general trends within families (Forsman et al., J Virol Methods 129:16-30, 2005; Seifarth et al., J Virol 79:341-352, 2005; Pichon et al., Nucleic Acids Res 34:e46, 2006). METHODS: To refine our understanding of HERVs activity, we introduce here a new microarray, HERV-V3. This work was made possible by the careful detection and annotation of genomic HERV/MaLR sequences as well as the development of a new hybridization model, allowing the optimization of probe performances and the control of cross-reactions. RESULTS: HERV-V3 offers an almost complete coverage of HERVs and their ancestors (mammalian apparent LTR-retrotransposons, MaLRs) at the locus level along with four other repertoires (active LINE-1 elements, lncRNA, a selection of 1559 human genes and common infectious viruses). We demonstrate that HERV-V3 analytical performances are comparable with commercial Affymetrix arrays, and that for a selection of tissue/pathological specific loci, the patterns of expression measured on HERV-V3 are consistent with those reported in the literature.
CONCLUSIONS: Given its large HERVs/MaLRs coverage and additional repertoires, HERV-V3 opens the door to multiple applications such as enhancers and alternative promoters identification, biomarkers identification as well as the characterization of genes and HERVs/MaLRs modulation caused by viral infection.

Subjects

Endogenous Retroviruses/genetics , Gene Expression Profiling , Hybridization, Genetic , Models, Genetic , Transcriptome , Algorithms , Cluster Analysis , Computational Biology/methods , Databases, Nucleic Acid , Gene Expression Profiling/methods , Genetic Loci , Humans , Nucleic Acid Hybridization , Reproducibility of Results , Workflow

4.

Optimization of alignment-based methods for taxonomic binning of metagenomics reads.

Jaillard, Magali; Tournoud, Maud; Meynier, Faustine; Veyrieras, Jean-Baptiste.

Bioinformatics ; 32(12): 1779-87, 2016 06 15.

Article in English | MEDLINE | ID: mdl-26833346

ABSTRACT

MOTIVATION: Alignment-based taxonomic binning for metagenome characterization proceeds in two steps: reads mapping against a reference database (RDB) and taxonomic assignment according to the best hits. Beyond the sequencing technology and the completeness of the RDB, selecting the optimal configuration of the workflow, in particular the mapper parameters and the best hit selection threshold, to get the highest binning performance remains quite empirical. RESULTS: We developed a statistical framework to perform such optimization at a minimal computational cost. Using an optimization experimental design and simulated datasets for three sequencing technologies, we built accurate prediction models for five performance indicators and then derived the parameter configuration providing the optimal performance. Whatever the mapper and the dataset, we observed that the optimal configuration yielded better performance than the default configuration and that the best hit selection threshold had a large impact on performance. Finally, on a reference dataset from the Human Microbiome Project, we confirmed that the optimized configuration increased the performance compared with the default configuration. AVAILABILITY AND IMPLEMENTATION: Not applicable. CONTACT: magali.dancette@biomerieux.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
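To make the role of the best hit selection threshold concrete, here is a minimal sketch of the second step (taxonomic assignment from mapper hits). It is an assumption-laden toy, not the authors' framework: the function `assign_read`, the "ambiguous" label, the relative-score rule, and the example scores are all hypothetical, and alignment scores are assumed to be positive with higher meaning better.

```python
def assign_read(hits, threshold=0.95):
    """Assign one read from its mapper hits.

    hits: list of (taxon, alignment_score) pairs.
    Hits scoring at least threshold * best_score are kept; the read is
    assigned only if every kept hit points to the same taxon.
    """
    if not hits:
        return None  # unmapped read
    best = max(score for _, score in hits)
    kept = {taxon for taxon, score in hits if score >= threshold * best}
    return kept.pop() if len(kept) == 1 else "ambiguous"


# Hypothetical hits for a single read.
hits = [
    ("Escherichia coli", 98),
    ("Salmonella enterica", 95),
    ("Escherichia coli", 90),
]
print(assign_read(hits, threshold=0.95))
print(assign_read(hits, threshold=0.99))
```

With the looser threshold both species survive the filter and the read is left ambiguous; with the stricter one only the top-scoring species remains. This shows how a single parameter shifts the trade-off between assigning more reads and assigning them correctly.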

Subjects

Metagenomics , Algorithms , Humans , Metagenome , Microbiota , Models, Theoretical

5.

Large-scale machine learning for metagenomics sequence classification.

Vervier, Kévin; Mahé, Pierre; Tournoud, Maud; Veyrieras, Jean-Baptiste; Vert, Jean-Philippe.

Bioinformatics ; 32(7): 1023-32, 2016 04 01.

Article in English | MEDLINE | ID: mdl-26589281

ABSTRACT

MOTIVATION: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Because of the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasonable computing requirements. While standard alignment-based methods provide state-of-the-art performance, compositional approaches that assign a taxonomic class to a DNA read based on the k-mers it contains have the potential to provide faster solutions. RESULTS: We propose a new rank-flexible machine learning-based compositional approach for taxonomic assignment of metagenomics reads and show that it benefits from increasing the number of fragments sampled from reference genome to tune its parameters, up to a coverage of about 10, and from increasing the k-mer size to about 12. Tuning the method involves training machine learning models on about 10^8 samples in 10^7 dimensions, which is out of reach of standard software but can be done efficiently with modern implementations for large-scale machine learning. The resulting method is competitive in terms of accuracy with well-established alignment and composition-based tools for problems involving a small to moderate number of candidate species and for reasonable amounts of sequencing errors. We show, however, that machine learning-based compositional approaches are still limited in their ability to deal with problems involving a greater number of species and more sensitive to sequencing errors.
We finally show that the new method outperforms the state-of-the-art in its ability to classify reads from species of lineage absent from the reference database and confirm that compositional approaches achieve faster prediction times, with a gain of 2-17 times with respect to the BWA-MEM short read mapper, depending on the number of candidate species and the level of sequencing noise. AVAILABILITY AND IMPLEMENTATION: Data and codes are available at http://cbio.ensmp.fr/largescalemetagenomics CONTACT: pierre.mahe@biomerieux.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
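The compositional idea described above (classifying a read from the k-mers it contains) can be sketched as follows. This is only an illustration: the linear scoring, the function names `kmer_profile`, `linear_score`, and `classify`, and the toy weights are assumptions, not the authors' model or code.

```python
from collections import Counter


def kmer_profile(read, k):
    """Count every overlapping k-mer in a read."""
    read = read.upper()
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))


def linear_score(profile, weights):
    """Dot product between a k-mer count profile and per-k-mer weights."""
    return sum(weights.get(kmer, 0.0) * count for kmer, count in profile.items())


def classify(read, k, models):
    """Return the taxon whose (hypothetical) linear model scores the read highest."""
    profile = kmer_profile(read, k)
    return max(models, key=lambda taxon: linear_score(profile, models[taxon]))


# Hypothetical per-taxon weights; a trained model would have millions of them.
models = {
    "taxon_X": {"ACG": 1.0, "GTA": 0.5},
    "taxon_Y": {"TTT": 2.0, "CGT": 0.25},
}
profile = kmer_profile("ACGTACGT", 3)
print(dict(profile))
print(classify("ACGTACGT", 3, models))
```

With k around 12, as discussed in the abstract, the feature space has 4^12 (about 1.7 × 10^7) possible k-mers, which matches the order of magnitude of dimensions quoted there.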

Subjects

Machine Learning , Metagenomics , Sequence Analysis, DNA , Algorithms , Metagenome , Software

6.

Phylogenetic Distribution of CRISPR-Cas Systems in Antibiotic-Resistant Pseudomonas aeruginosa.

van Belkum, Alex; Soriaga, Leah B; LaFave, Matthew C; Akella, Srividya; Veyrieras, Jean-Baptiste; Barbu, E Magda; Shortridge, Dee; Blanc, Bernadette; Hannum, Gregory; Zambardi, Gilles; Miller, Kristofer; Enright, Mark C; Mugnier, Nathalie; Brami, Daniel; Schicklin, Stéphane; Felderman, Martina; Schwartz, Ariel S; Richardson, Toby H; Peterson, Todd C; Hubby, Bolyn; Cady, Kyle C.

mBio ; 6(6): e01796-15, 2015 Nov 24.

Article in English | MEDLINE | ID: mdl-26604259

ABSTRACT

UNLABELLED: Pseudomonas aeruginosa is an antibiotic-refractory pathogen with a large genome and extensive genotypic diversity. Historically, P. aeruginosa has been a major model system for understanding the molecular mechanisms underlying type I clustered regularly interspaced short palindromic repeat (CRISPR) and CRISPR-associated protein (CRISPR-Cas)-based bacterial immune system function. However, little information on the phylogenetic distribution and potential role of these CRISPR-Cas systems in molding the P. aeruginosa accessory genome and antibiotic resistance elements is known. Computational approaches were used to identify and characterize CRISPR-Cas systems within 672 genomes, and in the process, we identified a previously unreported and putatively mobile type I-C P. aeruginosa CRISPR-Cas system. Furthermore, genomes harboring noninhibited type I-F and I-E CRISPR-Cas systems were on average ~300 kb smaller than those without a CRISPR-Cas system. In silico analysis demonstrated that the accessory genome (n = 22,036 genes) harbored the majority of identified CRISPR-Cas targets. We also assembled a global spacer library that aided the identification of difficult-to-characterize mobile genetic elements within next-generation sequencing (NGS) data and allowed CRISPR typing of a majority of P. aeruginosa strains. In summary, our analysis demonstrated that CRISPR-Cas systems play an important role in shaping the accessory genomes of globally distributed P. aeruginosa isolates. IMPORTANCE: P. aeruginosa is both an antibiotic-refractory pathogen and an important model system for type I CRISPR-Cas bacterial immune systems. By combining the genome sequences of 672 newly and previously sequenced genomes, we were able to provide a global view of the phylogenetic distribution, conservation, and potential targets of these systems. This analysis identified a new and putatively mobile P. aeruginosa CRISPR-Cas subtype, characterized the diverse distribution of known CRISPR-inhibiting genes, and provided a potential new use for CRISPR spacer libraries in accessory genome analysis. Our data demonstrated the importance of CRISPR-Cas systems in modulating the accessory genomes of globally distributed strains while also providing substantial data for subsequent genomic and experimental studies in multiple fields. Understanding why certain genotypes of P. aeruginosa are clinically prevalent and adept at horizontally acquiring virulence and antibiotic resistance elements is of major clinical and economic importance.

Subjects

Anti-Bacterial Agents/pharmacology , CRISPR-Cas Systems , Drug Resistance, Bacterial , Genetic Variation , Phylogeny , Pseudomonas aeruginosa/drug effects , Pseudomonas aeruginosa/genetics , Computational Biology , Genome, Bacterial , Pseudomonas aeruginosa/classification , Sequence Analysis, DNA

7.

A strategy to build and validate a prognostic biomarker model based on RT-qPCR gene expression and clinical covariates.

Tournoud, Maud; Larue, Audrey; Cazalis, Marie-Angelique; Venet, Fabienne; Pachot, Alexandre; Monneret, Guillaume; Lepape, Alain; Veyrieras, Jean-Baptiste.

BMC Bioinformatics ; 16: 106, 2015 Mar 28.

Article in English | MEDLINE | ID: mdl-25880752

ABSTRACT

BACKGROUND: Construction and validation of a prognostic model for survival data in the clinical domain is still an active field of research. Nevertheless, there is no consensus on how to develop routine prognostic tests based on a combination of RT-qPCR biomarkers and clinical or demographic variables. In particular, the estimation of the model performance requires properly accounting for the RT-qPCR experimental design. RESULTS: We present a strategy to build, select, and validate a prognostic model for survival data based on a combination of RT-qPCR biomarkers and clinical or demographic data and we provide an illustration on a real clinical dataset. First, we compare two cross-validation schemes: a classical outcome-stratified cross-validation scheme and an alternative one that accounts for the RT-qPCR plate design, especially when samples are processed by batches. The latter is intended to limit the performance discrepancies, also called the validation surprise, between the training and the test sets. Second, strategies for model building (covariate selection, functional relationship modeling, and statistical model) as well as performance indicators estimation are presented. Since in practice several prognostic models can exhibit similar performances, complementary criteria for model selection are discussed: the stability of the selected variables, the model optimism, and the impact of the omitted variables on the model performance. CONCLUSION: On the training dataset, appropriate resampling methods are expected to prevent any upward biases due to unaccounted technical and biological variability that may arise from the experimental and intrinsic design of the RT-qPCR assay. Moreover, the stability of the selected variables, the model optimism, and the impact of the omitted variables on the model performances are pivotal indicators to select the optimal model to be validated on the test dataset.
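One plausible way to make cross-validation account for the plate design is to keep all samples from the same RT-qPCR plate in the same fold, so that plate-level batch effects never straddle the training and held-out sets. The sketch below does this with a simple greedy balancing rule. It is an assumption about what such a scheme could look like, not the authors' procedure; the function `plate_grouped_folds` and the plate labels are hypothetical.

```python
from collections import defaultdict


def plate_grouped_folds(plates, n_folds):
    """Split sample indices into folds, keeping each plate inside one fold.

    plates: list giving the plate label of each sample, in sample order.
    Plates are placed largest first into the currently smallest fold.
    """
    by_plate = defaultdict(list)
    for idx, plate in enumerate(plates):
        by_plate[plate].append(idx)
    if n_folds > len(by_plate):
        raise ValueError("more folds than plates")
    folds = [[] for _ in range(n_folds)]
    for plate in sorted(by_plate, key=lambda p: (-len(by_plate[p]), p)):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(by_plate[plate])
    return folds


# Hypothetical plate label for each of eight samples.
plates = ["P1", "P1", "P2", "P2", "P2", "P3", "P4", "P4"]
folds = plate_grouped_folds(plates, n_folds=2)
print(folds)
```

A classical outcome-stratified scheme would instead balance the outcome across folds and could split a plate between training and test data, which is the situation the abstract associates with the validation surprise.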

Subjects

Gene Expression , Proportional Hazards Models , Real-Time Polymerase Chain Reaction , Reverse Transcriptase Polymerase Chain Reaction , Biomarkers , Humans , Prognosis , Shock, Septic/mortality

8.

Three-dimensional characterization of bacterial microcolonies on solid agar-based culture media.

Drazek, Laurent; Tournoud, Maud; Derepas, Frédéric; Guicherd, Maryse; Mahé, Pierre; Pinston, Frédéric; Veyrieras, Jean-Baptiste; Chatellier, Sonia.

J Microbiol Methods ; 109: 149-56, 2015 Feb.

Article in English | MEDLINE | ID: mdl-25533218

ABSTRACT

For the last century, in vitro diagnostic process in microbiology has mainly relied on the growth of bacteria on the surface of a solid agar medium. Nevertheless, few studies focused in the past on the dynamics of microcolonies growth on agar surface before 8 to 10h of incubation. In this article, chromatic confocal microscopy has been applied to characterize the early development of a bacterial colony. This technology relies on a differential focusing depth of the white light. It allows one to fully measure the tridimensional shape of microcolonies more quickly than classical confocal microscopy but with the same spatial resolution. Placing the device in an incubator, the method was able to individually track colonies growing on an agar plate, and to follow the evolution of their surface or volume. Using an appropriate statistical modeling framework, for a given microorganism, the doubling time has been estimated for each individual colony, as well as its variability between colonies, both within and between agar plates. A proof of concept led on four bacterial strains of four distinct species demonstrated the feasibility and the interest of the approach. It showed in particular that doubling times derived from early tri-dimensional measurements on microcolonies differed from classical measurements in micro-dilutions based on optical diffusion. Such a precise characterization of the tri-dimensional shape of microcolonies in their late-lag to early-exponential phase could be beneficial in terms of in vitro diagnostics. Indeed, real-time monitoring of the biomass available in a colony could allow to run well established microbial identification workflows like, for instance, MALDI-TOF mass-spectrometry, as soon as a sufficient quantity of material is available, thereby reducing the time needed to provide a diagnostic. 
Moreover, as done for pre-identification of macro-colonies, morphological indicators such as three-dimensional growth profiles derived from microcolonies could be used to perform a first pre-identification step, but in a shorter time.
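For a single colony, a doubling time can be estimated from volume measurements by fitting a straight line to log-volume against time, assuming exponential growth over the measured window. The sketch below shows that calculation only; the function `doubling_time` and the measurements are hypothetical, and the statistical framework the abstract refers to (variability within and between agar plates) is not reproduced.

```python
import math


def doubling_time(times_h, volumes):
    """Least-squares fit of ln(volume) = a + r * t; returns ln(2) / r in hours."""
    n = len(times_h)
    logs = [math.log(v) for v in volumes]
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    rate = sxy / sxx  # growth rate r, per hour
    return math.log(2) / rate


# Hypothetical colony volumes (arbitrary units) measured every 30 minutes.
times_h = [0.0, 0.5, 1.0, 1.5]
volumes = [100.0, 200.0, 400.0, 800.0]
estimate = doubling_time(times_h, volumes)
print(round(estimate, 3))
```

The same fit applies to colony surface area instead of volume, although the two need not give identical doubling times for a three-dimensional colony.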

Subjects

Bacteria/growth & development , Culture Media/chemistry , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional , Agar , Bacteriological Techniques/methods

9.

Comparison of two approaches for the classification of 16S rRNA gene sequences.

Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan.

J Med Microbiol ; 63(Pt 10): 1311-1315, 2014 Oct.

Article in English | MEDLINE | ID: mdl-25062942

ABSTRACT

The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80% of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5%. Up to 1.4% of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods.
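The lowest common ancestor idea can be illustrated with a small taxonomy stored as child-to-parent links: when a sequence matches several reference taxa equally well, it is reported at the deepest rank they all share. The sketch below is a generic LCA computation under that assumption, not the RDB-LCA implementation; the functions `lineage` and `lowest_common_ancestor` and the miniature taxonomy are hypothetical.

```python
def lineage(taxon, parent):
    """Return the path from a taxon up to the root, taxon first."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all given taxa."""
    if not taxa:
        raise ValueError("at least one taxon is required")
    lineages = [lineage(t, parent) for t in taxa]
    shared = set(lineages[0]).intersection(*lineages[1:])
    # Walk the first lineage from leaf to root; the first shared node is the LCA.
    for node in lineages[0]:
        if node in shared:
            return node
    return None


# Hypothetical miniature taxonomy (child -> parent).
parent = {
    "Staphylococcus aureus": "Staphylococcus",
    "Staphylococcus epidermidis": "Staphylococcus",
    "Staphylococcus": "Bacteria",
    "Escherichia coli": "Escherichia",
    "Escherichia": "Bacteria",
}
print(lowest_common_ancestor(["Staphylococcus aureus", "Staphylococcus epidermidis"], parent))
print(lowest_common_ancestor(["Staphylococcus aureus", "Escherichia coli"], parent))
```

This behaviour is consistent with the abstract's observation that isolates not resolved at the species level were mostly identified at the genus level.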

Subjects

Bacteria/classification , Bacteriological Techniques/methods , Computational Biology/methods , Genes, rRNA , Molecular Diagnostic Techniques/methods , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA/methods , Animals , Bacteria/genetics , Bacteria/isolation & purification , Bacterial Infections/diagnosis , Bacterial Infections/veterinary , Humans

10.

Challenges in the culture-independent analysis of oral and respiratory samples from intubated patients.

Lazarevic, Vladimir; Gaïa, Nadia; Emonet, Stéphane; Girard, Myriam; Renzi, Gesuele; Despres, Lena; Wozniak, Hannah; Yugueros Marcos, Javier; Veyrieras, Jean-Baptiste; Chatellier, Sonia; van Belkum, Alex; Pugin, Jérôme; Schrenzel, Jacques.

Front Cell Infect Microbiol ; 4: 65, 2014.

Article in English | MEDLINE | ID: mdl-24904840

ABSTRACT

The spread of microorganisms in hospitals is an important public health threat, and yet few studies have assessed how human microbial communities (microbiota) evolve in the hospital setting. Studies conducted so far have mainly focused on a limited number of bacterial species, mostly pathogenic ones and primarily during outbreaks. We explored the bacterial community diversity of the microbiota from oral and respiratory samples of intubated patients hospitalized in the intensive care unit and we discuss the technical challenges that may arise while using culture-independent approaches to study these types of samples.

Subjects

Biota , Intubation, Intratracheal , Microbiota , Mouth/microbiology , Respiratory System/microbiology , Humans , Intensive Care Units , Microbiological Techniques/methods , Molecular Biology/methods , Pilot Projects , RNA, Ribosomal, 16S/genetics

11.

Automatic identification of mixed bacterial species fingerprints in a MALDI-TOF mass-spectrum.

Mahé, Pierre; Arsac, Maud; Chatellier, Sonia; Monnin, Valérie; Perrot, Nadine; Mailler, Sandrine; Girard, Victoria; Ramjeet, Mahendrasingh; Surre, Jérémy; Lacroix, Bruno; van Belkum, Alex; Veyrieras, Jean-Baptiste.

Bioinformatics ; 30(9): 1280-6, 2014 May 01.

Article in English | MEDLINE | ID: mdl-24443381

ABSTRACT

MOTIVATION: Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry has been broadly adopted by routine clinical microbiology laboratories for bacterial species identification. An isolated colony of the targeted microorganism is the single prerequisite. Currently, MS-based microbial identification directly from clinical specimens cannot be routinely performed, as it raises two main challenges: (i) the nature of the sample itself may increase the level of technical variability and bring heterogeneity with respect to the reference database and (ii) the possibility of encountering polymicrobial samples that will yield a 'mixed' MS fingerprint. In this article, we introduce a new method to infer the composition of polymicrobial samples on the basis of a single mass spectrum. Our approach relies on a penalized non-negative linear regression framework making use of species-specific prototypes, which can be derived directly from the routine reference database of pure spectra. RESULTS: A large spectral dataset obtained from in vitro mono- and bi-microbial samples allowed us to evaluate the performance of the method in a comprehensive way. Provided that the reference matrix-assisted laser desorption/ionization time-of-flight mass spectrometry fingerprints were sufficiently distinct for the individual species, the method automatically predicted which bacterial species were present in the sample. Only a few samples (5.3%) were misidentified, and bi-microbial samples were correctly identified in up to 61.2% of the cases. This method could be used in routine clinical microbiology practice.
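The regression idea can be sketched as follows: a mixed spectrum is modelled as a non-negative weighted sum of species prototypes, and species with clearly positive weights are reported as present. The code below uses plain coordinate descent with an L1-type penalty. The penalty form, the function `penalized_nnls`, the four-bin prototypes, and the 0.1 presence cutoff are all assumptions for illustration, since the abstract does not specify them.

```python
def penalized_nnls(prototypes, spectrum, penalty=0.0, n_iter=200):
    """Coordinate descent for
        minimise 0.5 * ||spectrum - sum_j x_j * prototypes[j]||^2 + penalty * sum_j x_j
        subject to x_j >= 0.
    prototypes: {species: list of intensities}; spectrum: list of intensities.
    """
    coef = {name: 0.0 for name in prototypes}
    residual = list(spectrum)  # spectrum minus the current fitted mixture
    for _ in range(n_iter):
        for name, proto in prototypes.items():
            norm = sum(p * p for p in proto)
            if norm == 0.0:
                continue
            old = coef[name]
            # Correlation with the residual, adding back this species' own share.
            rho = sum(p * r for p, r in zip(proto, residual)) + old * norm
            new = max(0.0, (rho - penalty) / norm)
            if new != old:
                residual = [r - (new - old) * p for r, p in zip(residual, proto)]
                coef[name] = new
    return coef


# Hypothetical prototypes over four mass bins.
prototypes = {
    "species_A": [1.0, 0.0, 1.0, 0.0],
    "species_B": [0.0, 1.0, 1.0, 0.0],
    "species_C": [0.0, 0.0, 0.0, 1.0],
}
mixed = [2.0, 1.0, 3.0, 0.0]  # equals 2 * species_A + 1 * species_B
coef = penalized_nnls(prototypes, mixed)
present = sorted(name for name, weight in coef.items() if weight > 0.1)
print(present)
```

The example also hints at why the abstract requires sufficiently distinct reference fingerprints: when two prototypes share most of their peaks, the weights become hard to separate and the presence call is less reliable.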

Subjects

Gram-Negative Bacteria/chemistry , Gram-Positive Bacteria/isolation & purification , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Automation , Databases, Genetic , Gram-Negative Bacteria/isolation & purification , Linear Models

12.

De novo finished 2.8 Mbp Staphylococcus aureus genome assembly from 100 bp short and long range paired-end reads.

Hernandez, David; Tewhey, Ryan; Veyrieras, Jean-Baptiste; Farinelli, Laurent; Østerås, Magne; François, Patrice; Schrenzel, Jacques.

Bioinformatics ; 30(1): 40-9, 2014 Jan 01.

Article in English | MEDLINE | ID: mdl-24130309

ABSTRACT

MOTIVATION: Paired-end sequencing allows circumventing the shortness of the reads produced by second generation sequencers and is essential for de novo assembly of genomes. However, obtaining a finished genome from short reads is still an open challenge. We present an algorithm that exploits the pairing information issued from inserts of potentially any length. The method determines paths through an overlaps graph by using a constrained search tree. We also present a method that automatically determines suited overlaps cutoffs according to the contextual coverage, reducing thus the need for manual parameterization. Finally, we introduce an interactive mode that allows querying an assembly at targeted regions. RESULTS: We assess our methods by assembling two Staphylococcus aureus strains that were sequenced on the Illumina platform. Using 100 bp paired-end reads and minimal manual curation, we produce a finished genome sequence for the previously undescribed isolate SGH-10-168. AVAILABILITY AND IMPLEMENTATION: The presented algorithms are implemented in the standalone Edena software, freely available under the General Public License (GPLv3) at www.genomic.ch/edena.php.

Subjects

Chromosome Mapping/methods , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Staphylococcus aureus/genetics , Algorithms , Base Sequence , Molecular Sequence Data , Sequence Analysis, DNA/methods , Software

13.

The contribution of RNA decay quantitative trait loci to inter-individual variation in steady-state gene expression levels.

Pai, Athma A; Cain, Carolyn E; Mizrahi-Man, Orna; De Leon, Sherryl; Lewellen, Noah; Veyrieras, Jean-Baptiste; Degner, Jacob F; Gaffney, Daniel J; Pickrell, Joseph K; Stephens, Matthew; Pritchard, Jonathan K; Gilad, Yoav.

PLoS Genet ; 8(10): e1003000, 2012.

Article in English | MEDLINE | ID: mdl-23071454

ABSTRACT

Recent gene expression QTL (eQTL) mapping studies have provided considerable insight into the genetic basis for inter-individual regulatory variation. However, a limitation of all eQTL studies to date, which have used measurements of steady-state gene expression levels, is the inability to directly distinguish between variation in transcription and decay rates. To address this gap, we performed a genome-wide study of variation in gene-specific mRNA decay rates across individuals. Using a time-course study design, we estimated mRNA decay rates for over 16,000 genes in 70 Yoruban HapMap lymphoblastoid cell lines (LCLs), for which extensive genotyping data are available. Considering mRNA decay rates across genes, we found that: (i) as expected, highly expressed genes are generally associated with lower mRNA decay rates, (ii) genes with rapid mRNA decay rates are enriched with putative binding sites for miRNA and RNA binding proteins, and (iii) genes with similar functional roles tend to exhibit correlated rates of mRNA decay. Focusing on variation in mRNA decay across individuals, we estimate that steady-state expression levels are significantly correlated with variation in decay rates in 10% of genes. Somewhat counter-intuitively, for about half of these genes, higher expression is associated with faster decay rates, possibly due to a coupling of mRNA decay with transcriptional processes in genes involved in rapid cellular responses. Finally, we used these data to map genetic variation that is specifically associated with variation in mRNA decay rates across individuals. We found 195 such loci, which we named RNA decay quantitative trait loci ("rdQTLs"). All the observed rdQTLs are located near the regulated genes and therefore are assumed to act in cis. By analyzing our data within the context of known steady-state eQTLs, we estimate that a substantial fraction of eQTLs are associated with inter-individual variation in mRNA decay rates.

Subjects

Gene Expression , Genetic Variation , Quantitative Trait Loci , RNA Stability , Cell Line , Chromosome Mapping , Gene Expression Profiling , Gene Expression Regulation , Genome-Wide Association Study , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Interference

14.

Genetic modifiers of chromatin acetylation antagonize the reprogramming of epi-polymorphisms.

Abraham, Anne-Laure; Nagarajan, Muniyandi; Veyrieras, Jean-Baptiste; Bottin, Hélène; Steinmetz, Lars M; Yvert, Gaël.

PLoS Genet ; 8(9): e1002958, 2012 Sep.

Article in English | MEDLINE | ID: mdl-23028365

ABSTRACT

Natural populations are known to differ not only in DNA but also in their chromatin-associated epigenetic marks. When such inter-individual epigenomic differences (or "epi-polymorphisms") are observed, their stability is usually not known: they may or may not be reprogrammed over time or upon environmental changes. In addition, their origin may be purely epigenetic, or they may result from regulatory variation encoded in the DNA. Studying epi-polymorphisms requires, therefore, an assessment of their nature and stability. Here we estimate the stability of yeast epi-polymorphisms of chromatin acetylation, and we provide a genome-by-epigenome map of their genetic control. A transient epi-drug treatment was able to reprogram acetylation variation at more than one thousand nucleosomes, whereas a similar amount of variation persisted, distinguishing "labile" from "persistent" epi-polymorphisms. Hundreds of genetic loci underlied acetylation variation at 2,418 nucleosomes either locally (in cis) or distantly (in trans), and this genetic control overlapped only partially with the genetic control of gene expression. Trans-acting regulators were not necessarily associated with genes coding for chromatin modifying enzymes. Strikingly, "labile" and "persistent" epi-polymorphisms were associated with poor and strong genetic control, respectively, showing that genetic modifiers contribute to persistence. These results estimate the amount of natural epigenomic variation that can be lost after transient environmental exposures, and they reveal the complex genetic architecture of the DNA-encoded determinism of chromatin epi-polymorphisms. Our observations provide a basis for the development of population epigenetics.

Subjects

Chromatin/genetics , Epigenesis, Genetic/genetics , Histone-Lysine N-Methyltransferase , Polymorphism, Genetic , Saccharomyces cerevisiae , Acetylation , Gene Expression Regulation, Fungal , Genetics, Population , Histone-Lysine N-Methyltransferase/genetics , Histone-Lysine N-Methyltransferase/metabolism , Histones/genetics , Histones/metabolism , Nucleosomes/metabolism , Polymorphism, Single Nucleotide , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism

15.

Exon-specific QTLs skew the inferred distribution of expression QTLs detected using gene expression array data.

Veyrieras, Jean-Baptiste; Gaffney, Daniel J; Pickrell, Joseph K; Gilad, Yoav; Stephens, Matthew; Pritchard, Jonathan K.

PLoS One ; 7(2): e30629, 2012.

Article in English | MEDLINE | ID: mdl-22359548

ABSTRACT

Mapping of expression quantitative trait loci (eQTLs) is an important technique for studying how genetic variation affects gene regulation in natural populations. In a previous study using Illumina expression data from human lymphoblastoid cell lines, we reported that cis-eQTLs are especially enriched around transcription start sites (TSSs) and immediately upstream of transcription end sites (TESs). In this paper, we revisit the distribution of eQTLs using additional data from Affymetrix exon arrays and from RNA sequencing. We confirm that most eQTLs lie close to the target genes; that transcribed regions are generally enriched for eQTLs; that eQTLs are more abundant in exons than introns; and that the peak density of eQTLs occurs at the TSS. However, we find that the intriguing TES peak is greatly reduced or absent in the Affymetrix and RNA-seq data. Instead our data suggest that the TES peak observed in the Illumina data is mainly due to exon-specific QTLs that affect 3' untranslated regions, where most of the Illumina probes are positioned. Nonetheless, we do observe an overall enrichment of eQTLs in exons versus introns in all three data sets, consistent with an important role for exonic sequences in gene regulation.

Subjects

Exons/genetics , Gene Expression Profiling , Quantitative Trait Loci , Gene Expression Regulation , Humans , Statistical Distributions , Terminator Regions, Genetic , Transcription Initiation Site

16.

Dissecting the regulatory architecture of gene expression QTLs.

Gaffney, Daniel J; Veyrieras, Jean-Baptiste; Degner, Jacob F; Pique-Regi, Roger; Pai, Athma A; Crawford, Gregory E; Stephens, Matthew; Gilad, Yoav; Pritchard, Jonathan K.

Genome Biol ; 13(1): R7, 2012 Jan 31.

Article in English | MEDLINE | ID: mdl-22293038

ABSTRACT

BACKGROUND: Expression quantitative trait loci (eQTLs) are likely to play an important role in the genetics of complex traits; however, their functional basis remains poorly understood. Using the HapMap lymphoblastoid cell lines, we combine 1000 Genomes genotypes and an extensive catalogue of human functional elements to investigate the biological mechanisms that eQTLs perturb. RESULTS: We use a Bayesian hierarchical model to estimate the enrichment of eQTLs in a wide variety of regulatory annotations. We find that approximately 40% of eQTLs occur in open chromatin, and that they are particularly enriched in transcription factor binding sites, suggesting that many directly impact protein-DNA interactions. Analysis of core promoter regions shows that eQTLs also frequently disrupt some known core promoter motifs but, surprisingly, are not enriched in other well-known motifs such as the TATA box. We also show that information from regulatory annotations alone, when weighted by the hierarchical model, can provide a meaningful ranking of the SNPs that are most likely to drive gene expression variation. CONCLUSIONS: Our study demonstrates how regulatory annotation and the association signal derived from eQTL-mapping can be combined into a single framework. We used this approach to further our understanding of the biology that drives human gene expression variation, and of the putatively causal SNPs that underlie it.

Subjects

DNA-Binding Proteins/genetics , Deoxyribonuclease I , Gene Expression , Quantitative Trait Loci/genetics , Regulatory Sequences, Nucleic Acid/genetics , Bayes Theorem , Cell Line , Chromatin/genetics , Deoxyribonuclease I/genetics , Deoxyribonuclease I/metabolism , Genome, Human , Genotype , HapMap Project , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Transcription Factors/genetics

17.

DNase I sensitivity QTLs are a major determinant of human expression variation.

Degner, Jacob F; Pai, Athma A; Pique-Regi, Roger; Veyrieras, Jean-Baptiste; Gaffney, Daniel J; Pickrell, Joseph K; De Leon, Sherryl; Michelini, Katelyn; Lewellen, Noah; Crawford, Gregory E; Stephens, Matthew; Gilad, Yoav; Pritchard, Jonathan K.

Nature ; 482(7385): 390-4, 2012 Feb 05.

Article in English | MEDLINE | ID: mdl-22307276

ABSTRACT

The mapping of expression quantitative trait loci (eQTLs) has emerged as an important tool for linking genetic variation to changes in gene regulation. However, it remains difficult to identify the causal variants underlying eQTLs, and little is known about the regulatory mechanisms by which they act. Here we show that genetic variants that modify chromatin accessibility and transcription factor binding are a major mechanism through which genetic variation leads to gene expression differences among humans. We used DNase I sequencing to measure chromatin accessibility in 70 Yoruba lymphoblastoid cell lines, for which genome-wide genotypes and estimates of gene expression levels are also available. We obtained a total of 2.7 billion uniquely mapped DNase I-sequencing (DNase-seq) reads, which allowed us to produce genome-wide maps of chromatin accessibility for each individual. We identified 8,902 locations at which the DNase-seq read depth correlated significantly with genotype at a nearby single nucleotide polymorphism or insertion/deletion (false discovery rate = 10%). We call such variants 'DNase I sensitivity quantitative trait loci' (dsQTLs). We found that dsQTLs are strongly enriched within inferred transcription factor binding sites and are frequently associated with allele-specific changes in transcription factor binding. A substantial fraction (16%) of dsQTLs are also associated with variation in the expression levels of nearby genes (that is, these loci are also classified as eQTLs). Conversely, we estimate that as many as 55% of eQTL single nucleotide polymorphisms are also dsQTLs. Our observations indicate that dsQTLs are highly abundant in the human genome and are likely to be important contributors to phenotypic variation.

Subjects

DNA Footprinting , Deoxyribonuclease I/metabolism , Gene Expression Regulation/genetics , Genetic Variation/genetics , Quantitative Trait Loci/genetics , Chromatin/genetics , Chromatin/metabolism , Gene Expression Profiling , Genome, Human/genetics , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Sequence Analysis, DNA , Transcription Factors/metabolism

18.

Deciphering the genetics of flowering time by an association study on candidate genes in bread wheat (Triticum aestivum L.).

Rousset, Michel; Bonnin, Isabelle; Remoué, Carine; Falque, Matthieu; Rhoné, Bénédicte; Veyrieras, Jean-Baptiste; Madur, Delphine; Murigneux, Alain; Balfourier, François; Le Gouis, Jacques; Santoni, Sylvain; Goldringer, Isabelle.

Theor Appl Genet ; 123(6): 907-26, 2011 Oct.

Article in English | MEDLINE | ID: mdl-21761163

ABSTRACT

Earliness is very important for the adaptation of wheat to environmental conditions and the achievement of high grain yield. Detailed knowledge of the key genetic components of the life cycle would give breeders easier control over it. The objective of the study was to investigate the effect of candidate genes on flowering time. Using a collection of hexaploid wheat composed of 235 lines from diverse geographical origins, we conducted an association study for six candidate genes for flowering time and its components (vernalization sensitivity and earliness per se). The effect of polymorphisms within the copies of each gene on the variation of earliness components was tested in ANOVA models accounting for the underlying genetic structure. The collection was structured in five groups that minimized the residual covariance. Vernalization requirement and lateness tended to increase with the mean latitude of each group. Heading date for an autumnal sowing was mainly determined by earliness per se. Except for the Constans (CO) gene, an ortholog of barley HvCO3, all gene polymorphisms had a significant impact on earliness components. The three traits used to quantify vernalization requirement were primarily associated with polymorphisms at Vrn-1 and then at the Vrn-3 and Luminidependens (LD) genes. We found a good correspondence between spring/winter types and genotypes at the three homeologous copies of Vrn-1. Earliness per se was mainly explained by polymorphisms at Vrn-3 and to a lesser extent at the Vrn-1, Hd-1 and Gigantea (GI) genes. Vernalization requirement and earliness as a function of geographical origin, as well as the possible role of breeding practices in the geographical distribution of the alleles and the hypothetical adaptive value of the candidate genes, are discussed.

Subjects

Flowers/genetics , Flowers/physiology , Triticum/genetics , Triticum/physiology , Alleles , Base Sequence , Chromosome Mapping , Gene Expression Regulation, Plant , Genes, Plant , Genetic Association Studies , Genetic Variation , Genotype , Haplotypes , Linkage Disequilibrium , Multigene Family , Phenotype , Plant Proteins/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Sequence Alignment , Sequence Analysis, DNA

19.

Construction of a potato consensus map and QTL meta-analysis offer new insights into the genetic architecture of late blight resistance and plant maturity traits.

Danan, Sarah; Veyrieras, Jean-Baptiste; Lefebvre, Véronique.

BMC Plant Biol ; 11: 16, 2011 Jan 19.

Article in English | MEDLINE | ID: mdl-21247437

ABSTRACT

BACKGROUND: Integrating QTL results from independent experiments performed on related species helps to survey the genetic diversity of loci/alleles underlying complex traits, and to highlight potential targets for breeding or QTL cloning. Potato (Solanum tuberosum L.) late blight resistance has been thoroughly studied, generating mapping data for many Rpi-genes (R-genes to Phytophthora infestans) and QTLs (quantitative trait loci). Moreover, late blight resistance was often associated with plant maturity. To gain insight into the genomic organization of late blight resistance loci as compared to maturity QTLs, a QTL meta-analysis was performed for both traits. RESULTS: Nineteen QTL publications for late blight resistance were considered, seven of which reported maturity QTLs. Twenty-one QTL maps and eight reference maps were compiled to construct a 2,141-marker consensus map on which QTLs were projected and clustered into meta-QTLs. The whole-genome QTL meta-analysis reduced the number of late blight resistance QTLs six-fold (by clustering 144 QTLs into 24 meta-QTLs), the number of maturity QTLs ca. five-fold (by clustering 42 QTLs into eight meta-QTLs), and the mean QTL confidence interval ca. two-fold. Late blight resistance meta-QTLs were observed on every chromosome and maturity meta-QTLs on only six chromosomes. CONCLUSIONS: Meta-analysis helped to refine the genomic regions of interest frequently described, and provided the closest flanking markers. Meta-QTLs of late blight resistance and maturity were juxtaposed along chromosomes IV, V and VIII, and overlapped on chromosomes VI and XI. The distribution of late blight resistance meta-QTLs is significantly independent from those of Rpi-genes, resistance gene analogs and defence-related loci. The anchorage of meta-QTLs to the recently released potato genome sequence will especially improve the selection of candidate genes underlying meta-QTLs. All mapping data are available from the Sol Genomics Network (SGN) database.

Subjects

Immunity, Innate/genetics , Phytophthora infestans/physiology , Plant Diseases/immunology , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Solanum tuberosum/genetics , Solanum tuberosum/microbiology , Chromosome Mapping , Genes, Plant/genetics , Genetic Association Studies , Genetic Markers , Plant Diseases/genetics , Plant Diseases/microbiology

20.

Natural single-nucleosome epi-polymorphisms in yeast.

Nagarajan, Muniyandi; Veyrieras, Jean-Baptiste; de Dieuleveult, Maud; Bottin, Hélène; Fehrmann, Steffen; Abraham, Anne-Laure; Croze, Séverine; Steinmetz, Lars M; Gidrol, Xavier; Yvert, Gaël.

PLoS Genet ; 6(4): e1000913, 2010 Apr 22.

Article in English | MEDLINE | ID: mdl-20421933

ABSTRACT

Epigenomes commonly refer to the sequence of presence/absence of specific epigenetic marks along eukaryotic chromatin. Complete histone-borne epigenomes have now been described at single-nucleosome resolution from various organisms, tissues, developmental stages, or diseases, yet their intra-species natural variation has never been investigated. We describe here that the epigenomic sequence of histone H3 acetylation at Lysine 14 (H3K14ac) differs greatly between two unrelated strains of the yeast Saccharomyces cerevisiae. Using single-nucleosome chromatin immunoprecipitation and mapping, we interrogated 58,694 nucleosomes and found that 5,442 of them differed in their level of H3K14 acetylation, at a false discovery rate (FDR) of 0.0001. These Single Nucleosome Epi-Polymorphisms (SNEPs) were enriched at regulatory sites and conserved non-coding DNA sequences. Surprisingly, higher acetylation in one strain did not imply higher expression of the relevant gene. However, SNEPs were enriched in genes of high transcriptional variability and one SNEP was associated with the strength of gene activation upon stimulation. Our observations suggest a high level of inter-individual epigenomic variation in natural populations, with essential questions on the origin of this diversity and its relevance to gene x environment interactions.

Subjects

Epigenesis, Genetic , Nucleosomes/metabolism , Polymorphism, Single Nucleotide , Saccharomyces cerevisiae/genetics , Acetylation , Conserved Sequence , Genome, Fungal , Saccharomyces cerevisiae/metabolism
