Search | VHL Regional Portal

1.

Synteruptor: mining genomic islands for non-classical specialized metabolite gene clusters.

Haas, Drago; Barba, Matthieu; Vicente, Cláudia M; Nezbedová, Sarká; Garénaux, Amélie; Bury-Moné, Stéphanie; Lorenzi, Jean-Noël; Hôtel, Laurence; Laureti, Luisa; Thibessard, Annabelle; Le Goff, Géraldine; Ouazzani, Jamal; Leblond, Pierre; Aigle, Bertrand; Pernodet, Jean-Luc; Lespinet, Olivier; Lautru, Sylvie.

NAR Genom Bioinform ; 6(2): lqae069, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38915823

ABSTRACT

Microbial specialized metabolite biosynthetic gene clusters (SMBGCs) are a formidable source of natural products of pharmaceutical interest. With the multiplication of genomic data available, very efficient bioinformatic tools for automatic SMBGC detection have been developed. Nevertheless, most of these tools identify SMBGCs based on sequence similarity with enzymes typically involved in specialised metabolism and thus may miss SMBGCs coding for undercharacterised enzymes. Here we present Synteruptor (https://bioi2.i2bc.paris-saclay.fr/synteruptor), a program that identifies genomic islands, known to be enriched in SMBGCs, in the genomes of closely related species. With this tool, we identified a SMBGC in the genome of Streptomyces ambofaciens ATCC23877, undetected by antiSMASH versions prior to antiSMASH 5, and experimentally demonstrated that it directs the biosynthesis of two metabolites, one of which was identified as sphydrofuran. Synteruptor is also a valuable resource for the delineation of individual SMBGCs within antiSMASH regions that may encompass multiple clusters, and for refining the boundaries of these SMBGCs.
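
The island-detection idea in the abstract above can be illustrated with a toy sketch. It is not Synteruptor's actual algorithm, which the abstract does not specify. It assumes a genome is given as an ordered list of gene IDs and that the set of genes with an ortholog in a closely related genome is already known. The function name `find_islands` and the `min_genes` threshold are hypothetical.

```python
def find_islands(genome_a, orthologs, min_genes=3):
    """Return runs of consecutive genes in genome_a that lack an ortholog.

    genome_a  -- gene IDs in chromosomal order (assumed input format)
    orthologs -- gene IDs of genome_a that have an ortholog in the
                 related genome (assumed to be precomputed)
    min_genes -- hypothetical minimum run length to report as an island
    """
    islands, current = [], []
    for gene in genome_a:
        if gene in orthologs:
            # A conserved gene closes the current run of unmatched genes.
            if len(current) >= min_genes:
                islands.append(current)
            current = []
        else:
            current.append(gene)
    # Handle a run that reaches the end of the chromosome.
    if len(current) >= min_genes:
        islands.append(current)
    return islands


genome = ["a1", "a2", "x1", "x2", "x3", "a3", "y1", "a4"]
conserved = {"a1", "a2", "a3", "a4"}
print(find_islands(genome, conserved))
```

A real tool would also have to check that the conserved flanking genes keep the same order in both genomes. This sketch only looks for gaps in ortholog coverage.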

2.

CusProSe: a customizable protein annotation software with an application to the prediction of fungal secondary metabolism genes.

Oliveira, Leonor; Chevrollier, Nicolas; Dallery, Jean-Felix; O'Connell, Richard J; Lebrun, Marc-Henri; Viaud, Muriel; Lespinet, Olivier.

Sci Rep ; 13(1): 1417, 2023 01 25.

Article in English | MEDLINE | ID: mdl-36697464

ABSTRACT

We report here a new application, CustomProteinSearch (CusProSe), whose purpose is to help users to search for proteins of interest based on their domain composition. The application is customizable. It consists of two independent tools, IterHMMBuild and ProSeCDA. IterHMMBuild allows the iterative construction of Hidden Markov Model (HMM) profiles for conserved domains of selected protein sequences, while ProSeCDA scans a proteome of interest against an HMM profile database, and annotates identified proteins using user-defined rules. CusProSe was successfully used to identify, in fungal genomes, genes encoding key enzyme families involved in secondary metabolism, such as polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS), hybrid PKS-NRPS and dimethylallyl tryptophan synthases (DMATS), as well as to characterize distinct terpene synthases (TS) sub-families. The highly configurable characteristics of this application make it a generic tool, which allows the user to refine the function of predicted proteins, to extend detection to new enzyme families, and may also be applied to biological systems other than fungi and to proteins other than those involved in secondary metabolism.

Subject(s)

Fungi , Molecular Sequence Annotation , Secondary Metabolism , Software , Amino Acid Sequence , Molecular Sequence Annotation/methods , Peptide Synthases/genetics , Polyketide Synthases/genetics , Secondary Metabolism/genetics , Fungi/enzymology , Fungi/genetics , Tryptophan Synthase/genetics , Conserved Sequence/genetics
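
Annotating proteins from their domain composition with user-defined rules, as the abstract above describes, might look roughly like the sketch below. The rule format, the `annotate` function and the domain sets are assumptions made for illustration. They are not CusProSe's actual rule syntax or defaults. The domain abbreviations (KS, AT, ACP for polyketide synthases; A, C, PCP for non-ribosomal peptide synthetases) follow common usage.

```python
# Hypothetical rules: (label, set of domains that must all be present).
# Order matters: the most specific rule is tried first.
RULES = [
    ("PKS-NRPS hybrid", {"KS", "AT", "A", "C"}),
    ("PKS", {"KS", "AT"}),
    ("NRPS", {"A", "C"}),
]


def annotate(domains):
    """Return the label of the first rule satisfied by the detected domains.

    domains -- iterable of domain names found in one protein, e.g. from an
               HMM profile scan (assumed to be computed elsewhere)
    """
    found = set(domains)
    for label, required in RULES:
        if required <= found:  # every required domain was detected
            return label
    return None  # no rule matched


print(annotate(["KS", "AT", "ACP"]))    # PKS
print(annotate(["A", "PCP", "C"]))      # NRPS
print(annotate(["KS", "AT", "A", "C"])) # PKS-NRPS hybrid
```

Keeping the rules as data rather than code is what makes such a scheme configurable: a new enzyme family only needs a new entry in the rule list.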

3.

A Phylogenetic Framework to Simulate Synthetic Interspecies RNA-Seq Data.

Bastide, Paul; Soneson, Charlotte; Stern, David B; Lespinet, Olivier; Gallopin, Mélina.

Mol Biol Evol ; 40(1)2023 01 04.

Article in English | MEDLINE | ID: mdl-36508357

ABSTRACT

Interspecies RNA-Seq datasets are increasingly common, and have the potential to answer new questions about the evolution of gene expression. Single-species differential expression analysis is now a well-studied problem that benefits from sound statistical methods. Extensive reviews on biological or synthetic datasets have provided the community with a clear picture of the relative performance of the available methods in various settings. However, synthetic dataset simulation tools are still missing in the interspecies gene expression context. In this work, we develop and implement a new simulation framework. This tool builds on both the RNA-Seq and the phylogenetic comparative methods literature to generate realistic count datasets, while taking into account the phylogenetic relationships between the samples. We illustrate the usefulness of this new framework through a targeted simulation study that reproduces the features of a recently published dataset containing gene expression data in adult eye tissue across blind and sighted freshwater crayfish species. Using our simulated datasets, we perform a fair comparison of several approaches used for differential expression analysis. This benchmark reveals some of the strengths and weaknesses of both the classical and phylogenetic approaches for interspecies differential expression analysis, and allows for a reanalysis of the crayfish dataset. The tool has been integrated in the R package compcodeR, freely available on Bioconductor.

Subject(s)

Gene Expression Profiling , Software , RNA-Seq , Phylogeny , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods

4.

On the predictibility of A-minor motifs from their local contexts.

Gianfrotta, Coline; Reinharz, Vladimir; Lespinet, Olivier; Barth, Dominique; Denise, Alain.

RNA Biol ; 19(1): 1208-1227, 2022 01.

Article in English | MEDLINE | ID: mdl-36384383

ABSTRACT

This study investigates the importance of the structural context in the formation of a type I/II A-minor motif. This very frequent structural motif has been shown to be important in the spatial folding of RNA molecules. We developed an automated method to classify A-minor motif occurrences according to their 3D context similarities, and we used a graph approach to represent both the structural A-minor motif occurrences and their classes at different scales. This approach leads us to uncover new subclasses of A-minor motif occurrences according to their local 3D similarities. The majority of classes are composed of homologous occurrences, but some of them are composed of non-homologous occurrences. The different classifications we obtain allow us to better understand the importance of the context in the formation of A-minor motifs. In a second step, we investigate how much knowledge of the context around an A-minor motif can help to infer its presence (and position). More specifically, we want to determine what kind of information, contained in the structural context, can be useful to characterize and predict A-minor motifs. We show that, for some A-minor motifs, the topology combined with a sequence signal is sufficient to predict the presence and the position of an A-minor motif occurrence. In most other cases, these signals are not sufficient for predicting the A-minor motif; however, we show that they are good signals for this purpose. All the classification and prediction pipelines rely on automated processes, for which we describe the underlying algorithms and parameters.

Subject(s)

Imaging, Three-Dimensional , RNA , Algorithms , Predictive Value of Tests , Humans , RNA/chemistry

5.

Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution.

Papadopoulos, Chris; Callebaut, Isabelle; Gelly, Jean-Christophe; Hatin, Isabelle; Namy, Olivier; Renard, Maxime; Lespinet, Olivier; Lopes, Anne.

Genome Res ; 31(12): 2303-2315, 2021 Dec.

Article in English | MEDLINE | ID: mdl-34810219

ABSTRACT

The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.

6.

Dynamics of the compartmentalized Streptomyces chromosome during metabolic differentiation.

Lioy, Virginia S; Lorenzi, Jean-Noël; Najah, Soumaya; Poinsignon, Thibault; Leh, Hervé; Saulnier, Corinne; Aigle, Bertrand; Lautru, Sylvie; Thibessard, Annabelle; Lespinet, Olivier; Leblond, Pierre; Jaszczyszyn, Yan; Gorrichon, Kevin; Varoquaux, Nelle; Junier, Ivan; Boccard, Frédéric; Pernodet, Jean-Luc; Bury-Moné, Stéphanie.

Nat Commun ; 12(1): 5221, 2021 09 01.

Article in English | MEDLINE | ID: mdl-34471117

ABSTRACT

Bacteria of the genus Streptomyces are prolific producers of specialized metabolites, including antibiotics. The linear chromosome includes a central region harboring core genes, as well as extremities enriched in specialized metabolite biosynthetic gene clusters. Here, we show that chromosome structure in Streptomyces ambofaciens correlates with genetic compartmentalization during exponential phase. Conserved, large and highly transcribed genes form boundaries that segment the central part of the chromosome into domains, whereas the terminal ends tend to be transcriptionally quiescent compartments with different structural features. The onset of metabolic differentiation is accompanied by a rearrangement of chromosome architecture, from a rather 'open' to a 'closed' conformation, in which highly expressed specialized metabolite biosynthetic genes form new boundaries. Thus, our results indicate that the linear chromosome of S. ambofaciens is partitioned into structurally distinct entities, suggesting a link between chromosome folding, gene expression and genome evolution.

Subject(s)

Anti-Bacterial Agents/metabolism , Chromosomes, Bacterial , Streptomyces/genetics , Streptomyces/metabolism , Chromosome Structures , Gene Expression Regulation, Bacterial , Genome, Bacterial , Multigene Family , Transcriptome

7.

Genome Sequences of 11 Conspecific Streptomyces sp. Strains.

Tidjani, Abdoul-Razak; Lorenzi, Jean-Noël; Toussaint, Maxime; van Dijk, Erwin; Naquin, Delphine; Lespinet, Olivier; Bontemps, Cyril; Leblond, Pierre.

Microbiol Resour Announc ; 8(38)2019 Sep 19.

Article in English | MEDLINE | ID: mdl-31537669

ABSTRACT

The genomes of 11 conspecific Streptomyces strains, i.e., from the same species and inhabiting the same ecological niche, were sequenced and assembled. This data set offers an ideal framework to assess the genome evolution of Streptomyces species in their ecological context.

8.

Massive Gene Flux Drives Genome Diversity between Sympatric Streptomyces Conspecifics.

Tidjani, Abdoul-Razak; Lorenzi, Jean-Noël; Toussaint, Maxime; van Dijk, Erwin; Naquin, Delphine; Lespinet, Olivier; Bontemps, Cyril; Leblond, Pierre.

mBio ; 10(5)2019 09 03.

Article in English | MEDLINE | ID: mdl-31481382

ABSTRACT

In this work, by comparing genomes of closely related individuals of Streptomyces isolated at a spatial microscale (millimeters or centimeters), we investigated the extent and impact of horizontal gene transfer in the diversification of a natural Streptomyces population. We show that despite these conspecific strains sharing a recent common ancestor, all harbored significantly different gene contents, implying massive and rapid gene flux. The accessory genome of the strains was distributed across insertion/deletion events (indels) ranging from one to several hundreds of genes. Indels were preferentially located in the arms of the linear chromosomes (ca. 12 Mb) and appeared to form recombination hot spots. Some of them harbored biosynthetic gene clusters (BGCs) whose products confer an inhibitory capacity and may constitute public goods that can favor the cohesiveness of the bacterial population. Moreover, a significant proportion of these variable genes were either plasmid borne or harbored signatures of actinomycete integrative and conjugative elements (AICEs). We propose that conjugation is the main driver for the indel flux and diversity in Streptomyces populations. IMPORTANCE: Horizontal gene transfer is a rapid and efficient way to diversify bacterial gene pools. Currently, little is known about this gene flux within natural soil populations. Using comparative genomics of Streptomyces strains belonging to the same species and isolated at microscale, we reveal frequent transfer of a significant fraction of the pangenome. We show that it occurs at a time scale enabling the population to diversify and to cope with its changing environment, notably, through the production of public goods.

Subject(s)

Gene Transfer, Horizontal , Genes, Bacterial/genetics , Genetic Variation , Streptomyces/genetics , Actinobacteria/genetics , Biosynthetic Pathways/genetics , Chromosomes, Bacterial , Conjugation, Genetic , DNA, Bacterial/genetics , Genome, Bacterial , Multigene Family , Multilocus Sequence Typing , Phylogeny , Plasmids

9.

Subtelomeres are fast-evolving regions of the Streptomyces linear chromosome.

Lorenzi, Jean-Noël; Lespinet, Olivier; Leblond, Pierre; Thibessard, Annabelle.

Microb Genom ; 7(6)2021 Jun.

Article in English | MEDLINE | ID: mdl-33749576

ABSTRACT

Streptomyces possess a large linear chromosome (6-12 Mb) consisting of a conserved central region flanked by variable arms covering several megabases. In order to study the evolution of the chromosome across evolutionary times, a representative panel of Streptomyces strains and species (125) whose chromosomes are completely sequenced and assembled was selected. The pan-genome of the genus was modelled and shown to be open with a core-genome reaching 1018 genes. The evolution of the Streptomyces chromosome was analysed by carrying out pairwise comparisons, and by monitoring indexes measuring the conservation of genes (presence/absence) and their synteny along the chromosome. Using the phylogenetic depth offered by the chosen panel, it was possible to infer that within the central region of the chromosome, the core-genes form a highly conserved organization, which can reveal the existence of an ancestral chromosomal skeleton. Conversely, the chromosomal arms, enriched in variable genes, evolved faster than the central region under the combined effect of rearrangements and addition of new information from horizontal gene transfer. The genes hosted in these regions may be localized there because of the adaptive advantage that their rapid evolution may confer. We speculate that (i) within a bacterial population, the variability of these genes may contribute to the establishment of social characters by the production of 'public goods', and (ii) at the evolutionary scale, this variability contributes to the diversification of the genetic pool of the bacteria.

10.

Comparative Genomics among Closely Related Streptomyces Strains Revealed Specialized Metabolite Biosynthetic Gene Cluster Diversity.

Vicente, Cláudia M; Thibessard, Annabelle; Lorenzi, Jean-Noël; Benhadj, Mabrouka; Hôtel, Laurence; Gacemi-Kirane, Djamila; Lespinet, Olivier; Leblond, Pierre; Aigle, Bertrand.

Antibiotics (Basel) ; 7(4)2018 Oct 02.

Article in English | MEDLINE | ID: mdl-30279346

ABSTRACT

Specialized metabolites are of great interest due to their possible industrial and clinical applications. The increasing number of antimicrobial resistant infectious agents is a major health threat and therefore, the discovery of chemical diversity and new antimicrobials is crucial. Extensive genomic data from Streptomyces spp. confirm their production potential and great importance. Genome sequencing of the same species strains indicates that specialized metabolite biosynthetic gene cluster (SMBGC) diversity is not exhausted, and instead, a pool of novel specialized metabolites still exists. Here, we analyze the genome sequence data from six phylogenetically close Streptomyces strains. The results reveal that the closer the strains are phylogenetically, the higher the number of shared gene clusters. Eight specialized metabolites comprise the core metabolome, although some strains have only six core gene clusters. The number of conserved gene clusters common between the isolated strains and their closest phylogenetic counterparts varies from nine to 23 SMBGCs. However, the analysis of these phylogenetic relationships is not affected by the acquisition of gene clusters, probably by horizontal gene transfer events, as each strain also harbors strain-specific SMBGCs. Between one and 15 strain-specific gene clusters were identified, of which up to six gene clusters in a single strain are unknown and have no identifiable orthologs in other species, attesting to the existing SMBGC novelty at the strain level.

11.

Meet-U: Educating through research immersion.

Abdollahi, Nika; Albani, Alexandre; Anthony, Eric; Baud, Agnes; Cardon, Mélissa; Clerc, Robert; Czernecki, Dariusz; Conte, Romain; David, Laurent; Delaune, Agathe; Djerroud, Samia; Fourgoux, Pauline; Guiglielmoni, Nadège; Laurentie, Jeanne; Lehmann, Nathalie; Lochard, Camille; Montagne, Rémi; Myrodia, Vasiliki; Opuu, Vaitea; Parey, Elise; Polit, Lélia; Privé, Sylvain; Quignot, Chloé; Ruiz-Cuevas, Maria; Sissoko, Mariam; Sompairac, Nicolas; Vallerix, Audrey; Verrecchia, Violaine; Delarue, Marc; Guérois, Raphael; Ponty, Yann; Sacquin-Mora, Sophie; Carbone, Alessandra; Froidevaux, Christine; Le Crom, Stéphane; Lespinet, Olivier; Weigt, Martin; Abboud, Samer; Bernardes, Juliana; Bouvier, Guillaume; Dequeker, Chloé; Ferré, Arnaud; Fuchs, Patrick; Lelandais, Gaëlle; Poulain, Pierre; Richard, Hugues; Schweke, Hugo; Laine, Elodie; Lopes, Anne.

PLoS Comput Biol ; 14(3): e1005992, 2018 03.

Article in English | MEDLINE | ID: mdl-29543809

ABSTRACT

We present a new educational initiative called Meet-U that aims to train students for collaborative work in computational biology and to bridge the gap between education and research. Meet-U mimics the setup of collaborative research projects and takes advantage of the most popular tools for collaborative work and of cloud computing. Students are grouped in teams of 4-5 people and have to realize a project from A to Z that answers a challenging question in biology. Meet-U promotes "coopetition," as the students collaborate within and across the teams and are also in competition with each other to develop the best final product. Meet-U fosters interactions between different actors of education and research through the organization of a meeting day, open to everyone, where the students present their work to a jury of researchers and jury members give research seminars. This unique combination of education and research is strongly motivating for the students and provides a formidable opportunity for a scientific community to unite and increase its visibility. We report on our experience with Meet-U in two French universities with master's students in bioinformatics and modeling, with protein-protein docking as the subject of the course. Meet-U is easy to implement and can be straightforwardly transferred to other fields and/or universities. All the information and data are available at www.meet-u.org.

Subject(s)

Computational Biology/education , Computational Biology/methods , Research/education , Humans , Research Design , Students , Universities

12.

Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters.

Dallery, Jean-Félix; Lapalu, Nicolas; Zampounis, Antonios; Pigné, Sandrine; Luyten, Isabelle; Amselem, Joëlle; Wittenberg, Alexander H J; Zhou, Shiguo; de Queiroz, Marisa V; Robin, Guillaume P; Auger, Annie; Hainaut, Matthieu; Henrissat, Bernard; Kim, Ki-Tae; Lee, Yong-Hwan; Lespinet, Olivier; Schwartz, David C; Thon, Michael R; O'Connell, Richard J.

BMC Genomics ; 18(1): 667, 2017 Aug 29.

Article in English | MEDLINE | ID: mdl-28851275

ABSTRACT

BACKGROUND: The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. RESULTS: Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly-conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. CONCLUSION: The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.

Subject(s)

Chromosomes, Fungal/genetics , Colletotrichum/genetics , Colletotrichum/metabolism , DNA Transposable Elements/genetics , Genomics , Multigene Family/genetics , Homologous Recombination/genetics , Molecular Sequence Annotation , Phylogeny , Point Mutation/genetics

13.

Erratum: Survival trade-offs in plant roots during colonization by closely related beneficial and pathogenic fungi.

Hacquard, Stéphane; Kracher, Barbara; Hiruma, Kei; Münch, Philipp C; Garrido-Oter, Ruben; Thon, Michael R; Weimann, Aaron; Damm, Ulrike; Dallery, Jean-Félix; Hainaut, Matthieu; Henrissat, Bernard; Lespinet, Olivier; Sacristán, Soledad; van Themaat, Emiel Ver Loren; Kemen, Eric; McHardy, Alice C; Schulze-Lefert, Paul; O'Connell, Richard J.

Nat Commun ; 7: 13072, 2016 Sep 29.

Article in English | MEDLINE | ID: mdl-27681013

14.

Survival trade-offs in plant roots during colonization by closely related beneficial and pathogenic fungi.

Hacquard, Stéphane; Kracher, Barbara; Hiruma, Kei; Münch, Philipp C; Garrido-Oter, Ruben; Thon, Michael R; Weimann, Aaron; Damm, Ulrike; Dallery, Jean-Félix; Hainaut, Matthieu; Henrissat, Bernard; Lespinet, Olivier; Sacristán, Soledad; Ver Loren van Themaat, Emiel; Kemen, Eric; McHardy, Alice C; Schulze-Lefert, Paul; O'Connell, Richard J.

Nat Commun ; 7: 11362, 2016 05 06.

Article in English | MEDLINE | ID: mdl-27150427

ABSTRACT

The sessile nature of plants forced them to evolve mechanisms to prioritize their responses to simultaneous stresses, including colonization by microbes or nutrient starvation. Here, we compare the genomes of a beneficial root endophyte, Colletotrichum tofieldiae and its pathogenic relative C. incanum, and examine the transcriptomes of both fungi and their plant host Arabidopsis during phosphate starvation. Although the two species diverged only 8.8 million years ago and have similar gene arsenals, we identify genomic signatures indicative of an evolutionary transition from pathogenic to beneficial lifestyles, including a narrowed repertoire of secreted effector proteins, expanded families of chitin-binding and secondary metabolism-related proteins, and limited activation of pathogenicity-related genes in planta. We show that beneficial responses are prioritized in C. tofieldiae-colonized roots under phosphate-deficient conditions, whereas defense responses are activated under phosphate-sufficient conditions. These immune responses are retained in phosphate-starved roots colonized by pathogenic C. incanum, illustrating the ability of plants to maximize survival in response to conflicting stresses.

Subject(s)

Arabidopsis/metabolism , Colletotrichum/metabolism , Endophytes/metabolism , Phosphates/deficiency , Plant Roots/metabolism , Arabidopsis/immunology , Chitin/metabolism , Colletotrichum/genetics , Endophytes/genetics , Genome, Fungal/genetics , Starvation , Symbiosis/immunology , Symbiosis/physiology

15.

Profiling the orphan enzymes.

Sorokina, Maria; Stam, Mark; Médigue, Claudine; Lespinet, Olivier; Vallenet, David.

Biol Direct ; 9: 10, 2014 Jun 06.

Article in English | MEDLINE | ID: mdl-24906382

ABSTRACT

The emergence of Next Generation Sequencing generates an incredible amount of sequence and great potential for new enzyme discovery. Despite this huge amount of data and the profusion of bioinformatic methods for function prediction, a large part of known enzyme activities is still lacking an associated protein sequence. These particular activities are called "orphan enzymes". The present review proposes an update of previous surveys on orphan enzymes by mining the current content of public databases. While the percentage of orphan enzyme activities has decreased from 38% to 22% in ten years, there are still more than 1,000 orphans among the 5,000 entries of the Enzyme Commission (EC) classification. Taking into account all the reactions present in metabolic databases, this proportion dramatically increases to reach nearly 50% of orphans, and many of them are not associated with a known pathway. We extended our survey to "local orphan enzymes", that is, activities which have no representative sequence in a given clade but have at least one in organisms belonging to other clades. We observe an important bias in Archaea and find that in general more than 30% of the EC activities have incomplete sequence information in at least one superkingdom. To estimate if candidate proteins for local orphans could be retrieved by homology search, we applied a simple strategy based on the PRIAM software and noticed that candidates may be proposed for an important fraction of local orphan enzymes. Finally, by studying the relationship between protein domains and catalyzed activities, it appears that newly discovered enzymes are mostly associated with already known enzyme domains. Thus, the exploration of the promiscuity and the multifunctional aspect of known enzyme families may solve part of the orphan enzyme issue. We conclude this review with a presentation of recent initiatives in finding proteins for orphan enzymes and in extending the enzyme world by the discovery of new activities.

Subject(s)

Enzymes/genetics , Genomics/methods , Proteins/genetics , Proteomics/methods , Archaea/genetics , Archaea/metabolism , Bacteria/genetics , Bacteria/metabolism , Databases, Protein , Enzymes/classification , Enzymes/metabolism , Eukaryota/genetics , Eukaryota/metabolism , High-Throughput Nucleotide Sequencing , Phylogeny , Proteins/classification , Proteins/metabolism , Sequence Analysis, Protein
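
The definition of a "local orphan enzyme" given in the abstract above (an activity with no representative sequence in a given clade but at least one in another clade) reduces to a set difference. The sketch below assumes a mapping from clade name to the set of activities that have at least one known sequence in that clade. The function name and the placeholder activity labels are hypothetical, and no real database content is implied.

```python
def local_orphans(sequenced_by_clade, clade):
    """Activities with no sequence in `clade` but at least one elsewhere.

    sequenced_by_clade -- dict: clade name -> set of activity identifiers
                          (e.g. EC numbers) with a known sequence there
    clade              -- the clade to examine
    """
    here = sequenced_by_clade.get(clade, set())
    # Union of the activities sequenced in every other clade.
    elsewhere = set().union(
        *(acts for name, acts in sequenced_by_clade.items() if name != clade)
    )
    return elsewhere - here


# Placeholder labels, not real EC numbers.
data = {
    "Archaea": {"act1"},
    "Bacteria": {"act1", "act2", "act3"},
    "Eukaryota": {"act2"},
}
print(sorted(local_orphans(data, "Archaea")))  # ['act2', 'act3']
```

An activity with no sequence in any clade never appears in the result. That case corresponds to a global orphan rather than a local one.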

16.

A meta-approach for improving the prediction and the functional annotation of ortholog groups.

Pereira, Cécile; Denise, Alain; Lespinet, Olivier.

BMC Genomics ; 15 Suppl 6: S16, 2014.

Article in English | MEDLINE | ID: mdl-25573073

ABSTRACT

BACKGROUND: In comparative genomics, orthologs are used to transfer annotation from genes already characterized to newly sequenced genomes. Many methods have been developed for finding orthologs in sets of genomes. However, the application of different methods on the same proteome set can lead to distinct orthology predictions. METHODS: We developed a method based on a meta-approach that is able to combine the results of several methods for orthologous group prediction. The purpose of this method is to produce better quality results by using the overlapping results obtained from several individual orthologous gene prediction procedures. Our method proceeds in two steps. The first aims to construct seeds for groups of orthologous genes; these seeds correspond to the exact overlaps between the results of all or several methods. In the second step, these seed groups are expanded by using HMM profiles. RESULTS: We evaluated our method on two standard reference benchmarks, OrthoBench and Orthology Benchmark Service. Our method presents a higher level of accurately predicted groups than the individual input methods of orthologous group prediction. Moreover, our method increases the number of annotated orthologous pairs without decreasing the annotation quality compared to twelve state-of-the-art methods. CONCLUSIONS: The meta-approach based method appears to be a reliable procedure for predicting orthologous groups. Since a large number of methods for predicting groups of orthologous genes exist, it is quite conceivable to apply this meta-approach to several combinations of different methods.

Subject(s)

Computational Biology/methods , Genomics/methods , Molecular Sequence Annotation/methods , Software , Evolution, Molecular , Reproducibility of Results
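
The first step of the meta-approach described above, building seeds from the exact overlaps between the results of all or several methods, can be sketched as follows. This is one possible reading of "exact overlap": a group counts as a seed when an identical set of genes is predicted by at least `min_methods` methods. The function name and that parameter are hypothetical, and the second step (expansion with HMM profiles) is not shown.

```python
from collections import Counter


def seed_groups(predictions, min_methods=None):
    """Return the groups predicted identically by enough methods.

    predictions -- one entry per method; each entry is a list of
                   orthologous groups, each group a set of gene IDs
    min_methods -- number of methods that must agree exactly
                   (defaults to all of them)
    """
    if min_methods is None:
        min_methods = len(predictions)
    counts = Counter()
    for groups in predictions:
        # Each method votes at most once for any given group.
        counts.update({frozenset(g) for g in groups})
    return {group for group, n in counts.items() if n >= min_methods}


m1 = [{"a", "b"}, {"c", "d", "e"}]
m2 = [{"a", "b"}, {"c", "d"}]
m3 = [{"b", "a"}, {"c", "d", "e"}]
print(seed_groups([m1, m2, m3]))                 # only {a, b}
print(seed_groups([m1, m2, m3], min_methods=2))  # {a, b} and {c, d, e}
```

Lowering `min_methods` trades precision for coverage: more seeds are produced, but each is backed by fewer independent predictions.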

17.

Comparative genomics of emerging pathogens in the Candida glabrata clade.

Gabaldón, Toni; Martin, Tiphaine; Marcet-Houben, Marina; Durrens, Pascal; Bolotin-Fukuhara, Monique; Lespinet, Olivier; Arnaise, Sylvie; Boisnard, Stéphanie; Aguileta, Gabriela; Atanasova, Ralitsa; Bouchier, Christiane; Couloux, Arnaud; Creno, Sophie; Almeida Cruz, Jose; Devillers, Hugo; Enache-Angoulvant, Adela; Guitard, Juliette; Jaouen, Laure; Ma, Laurence; Marck, Christian; Neuvéglise, Cécile; Pelletier, Eric; Pinard, Amélie; Poulain, Julie; Recoquillay, Julien; Westhof, Eric; Wincker, Patrick; Dujon, Bernard; Hennequin, Christophe; Fairhead, Cécile.

BMC Genomics ; 14: 623, 2013 Sep 14.

Article in English | MEDLINE | ID: mdl-24034898

ABSTRACT

BACKGROUND: Candida glabrata follows C. albicans as the second or third most prevalent cause of candidemia worldwide. These two pathogenic yeasts are distantly related, C. glabrata being part of the Nakaseomyces, a group more closely related to Saccharomyces cerevisiae. Although C. glabrata was thought to be the only pathogenic Nakaseomyces, two new pathogens have recently been described within this group: C. nivariensis and C. bracarensis. To gain insight into the genomic changes underlying the emergence of virulence, we sequenced the genomes of these two, and three other non-pathogenic Nakaseomyces, and compared them to other sequenced yeasts. RESULTS: Our results indicate that the two new pathogens are more closely related to the non-pathogenic N. delphensis than to C. glabrata. We uncover duplications and accelerated evolution that specifically affected genes in the lineage preceding the group containing N. delphensis and the three pathogens, which may provide clues to the higher propensity of this group to infect humans. Finally, the number of Epa-like adhesins is specifically enriched in the pathogens, particularly in C. glabrata. CONCLUSIONS: Remarkably, some features thought to be the result of adaptation of C. glabrata to a pathogenic lifestyle, are present throughout the Nakaseomyces, indicating these are rather ancient adaptations to other environments. Phylogeny suggests that human pathogenesis evolved several times, independently within the clade. The expansion of the EPA gene family in pathogens establishes an evolutionary link between adhesion and virulence phenotypes. Our analyses thus shed light onto the relationships between virulence and the recent genomic changes that occurred within the Nakaseomyces. 
SEQUENCE ACCESSION NUMBERS: Nakaseomyces delphensis: CAPT01000001 to CAPT01000179; Candida bracarensis: CAPU01000001 to CAPU01000251; Candida nivariensis: CAPV01000001 to CAPV01000123; Candida castellii: CAPW01000001 to CAPW01000101; Nakaseomyces bacillisporus: CAPX01000001 to CAPX01000186.

Subject(s)

Candida glabrata/classification , Genome, Fungal , Phylogeny , Saccharomycetales/classification , Candida glabrata/genetics , DNA, Fungal/genetics , Evolution, Molecular , Saccharomycetales/genetics , Selection, Genetic , Sequence Analysis, DNA

18.

Genome-wide comparative analysis of pogo-like transposable elements in different Fusarium species.

Dufresne, Marie; Lespinet, Olivier; Daboussi, Marie-Josée; Hua-Van, Aurélie.

J Mol Evol ; 73(3-4): 230-43, 2011 Oct.

Article in English | MEDLINE | ID: mdl-22094890

ABSTRACT

The recent availability of genome sequences of four different Fusarium species offers the opportunity to perform extensive comparative analyses, in particular of repeated sequences. In a recent work, the overall content of such sequences in the genomes of three phylogenetically related Fusarium species, F. graminearum, F. verticillioides, and F. oxysporum f. sp. lycopersici, was estimated. In this study, we present an exhaustive characterization of pogo-like elements, named Fots, in four Fusarium genomes. Overall, 10 Fot and two Fot-related miniature inverted-repeat transposable element families were identified, revealing a diversification of multiple lineages of pogo-like elements, some of which were accompanied by a gain of introns. This analysis also showed that such elements are present in an unusually high proportion in the genomes of F. oxysporum f. sp. lycopersici and Nectria haematococca (anamorph F. solani f. sp. pisi), in contrast with most other fungal genomes, in which retroelements are the most represented. Interestingly, our analysis showed that the most numerous Fot families all contain potentially active or mobilisable copies, thus conferring a mutagenic potential on these transposable elements and consequently a role in strain adaptation and genome evolution. This role is strongly reinforced when examining their genomic distribution, which is clearly biased, with a high proportion (more than 80%) located in strain- or species-specific regions enriched in genes involved in pathogenicity and/or adaptation. Finally, the different reproductive characteristics of the four Fusarium species allowed us to investigate the impact of the process of repeat-induced point mutations on the expansion and diversification of Fot elements.

Subject(s)

DNA Transposable Elements/genetics , Fusarium/genetics , Genome, Fungal , Base Sequence , Cluster Analysis , Evolution, Molecular , Gene Dosage , Likelihood Functions , Models, Genetic , Multigene Family , Open Reading Frames , Phylogeny , Polymorphism, Genetic

19.

A general framework for optimization of probes for gene expression microarray and its application to the fungus Podospora anserina.

Bidard, Frédérique; Imbeaud, Sandrine; Reymond, Nancie; Lespinet, Olivier; Silar, Philippe; Clavé, Corinne; Delacroix, Hervé; Berteaux-Lecellier, Véronique; Debuchy, Robert.

BMC Res Notes ; 3: 171, 2010 Jun 18.

Article in English | MEDLINE | ID: mdl-20565839

ABSTRACT

BACKGROUND: The development of new microarray technologies makes custom long oligonucleotide arrays affordable for many experimental applications, notably gene expression analyses. Reliable results depend on probe design quality and selection. Probe design strategy should cope with the limited accuracy of de novo gene prediction programs, and annotation updating. We present a novel in silico procedure which addresses these issues and includes experimental screening, as an empirical approach is the best strategy to identify optimal probes in the in silico outcome. FINDINGS: We used four criteria for in silico probe selection: cross-hybridization, hairpin stability, probe location relative to coding sequence end, and intron position. The last criterion is critical when exon-intron gene structure predictions for intron-rich genes are inaccurate. For each coding sequence (CDS), we selected a sub-set of four probes. These probes were included in a test microarray, which was used to evaluate the hybridization behavior of each probe. The best probe for each CDS was selected according to three experimental criteria: signal-to-noise ratio, signal reproducibility, and representative signal intensities. This procedure was applied for the development of a gene expression Agilent platform for the filamentous fungus Podospora anserina and the selection of a single 60-mer probe for each of the 10,556 P. anserina CDS. CONCLUSIONS: A reliable gene expression microarray version based on the Agilent 44K platform was developed with four spot replicates of each probe to increase statistical significance of analysis.

20.

FUNGIpath: a tool to assess fungal metabolic pathways predicted by orthology.

Grossetête, Sandrine; Labedan, Bernard; Lespinet, Olivier.

BMC Genomics ; 11: 81, 2010 Feb 01.

Article in English | MEDLINE | ID: mdl-20122162

ABSTRACT

BACKGROUND: More and more completely sequenced fungal genomes are becoming available and many more sequencing projects are in progress. This deluge of data should improve our knowledge of the various primary and secondary metabolisms of Fungi, including their synthesis of useful compounds such as antibiotics or toxic molecules such as mycotoxins. Functional annotation of many fungal genomes is imperfect, especially of genes encoding enzymes, so we need dedicated tools to analyze their metabolic pathways in depth. DESCRIPTION: FUNGIpath is a new tool built using a two-stage approach. Groups of orthologous proteins predicted using complementary methods of detection were collected in a relational database. Each group was further mapped on to steps in the metabolic pathways published in the public databases KEGG and MetaCyc. As a result, FUNGIpath allows the primary and secondary metabolisms of the different fungal species represented in the database to be compared easily, making it possible to assess the level of specificity of various pathways at different taxonomic distances. It is freely accessible at http://www.fungipath.u-psud.fr. CONCLUSIONS: As more and more fungal genomes are expected to be sequenced during the coming years, FUNGIpath should help progressively to reconstruct the ancestral primary and secondary metabolisms of the main branches of the fungal tree of life and to elucidate the evolution of these ancestral fungal metabolisms to various specific derived metabolisms.

Subject(s)

Computational Biology/methods , Databases, Protein , Genome, Fungal , Metabolic Networks and Pathways , Data Mining , Fungi/genetics
