1.
Genome Biol Evol ; 7(10): 2896-912, 2015 Oct 09.
Article in English | MEDLINE | ID: mdl-26454013

ABSTRACT

Deciphering the genetic bases of pathogen adaptation to its host is a key question in ecology and evolution. To understand how the fungus Magnaporthe oryzae adapts to different plants, we sequenced eight M. oryzae isolates differing in host specificity (rice, foxtail millet, wheat, and goosegrass), and one Magnaporthe grisea isolate specific to crabgrass. Analysis of Magnaporthe genomes revealed small variation in genome sizes (39-43 Mb) and gene content (12,283-14,781 genes) between isolates. The whole set of Magnaporthe genes comprised 14,966 shared families, 63% of which included genes present in all nine M. oryzae genomes. The evolutionary relationships among Magnaporthe isolates were inferred using 6,878 single-copy orthologs. The resulting genealogy was mostly bifurcating among the different host-specific lineages, but was reticulate inside the rice lineage. We detected traces of introgression from a nonrice genome in the rice reference 70-15 genome. Among M. oryzae isolates and host-specific lineages, the genome composition in terms of frequencies of genes putatively involved in pathogenicity (effectors, secondary metabolism, cazome) was conserved. However, 529 shared families were found only in nonrice lineages, whereas the rice lineage possessed 86 specific families absent from the nonrice genomes. Our results confirmed that the host specificity of M. oryzae isolates was associated with a divergence between lineages without major gene flow and that, despite the strong conservation of gene families between lineages, adaptation to different hosts, especially to rice, was associated with the presence of a small number of specific gene families. All information was gathered in a public database (http://genome.jouy.inra.fr/gemo).


Subject(s)
Evolution, Molecular , Genome, Fungal , Magnaporthe/genetics , Adaptation, Biological , Base Sequence , Biological Evolution , Burkholderia/genetics , Burkholderia/isolation & purification , DNA Transposable Elements , Digitaria/microbiology , Fungal Proteins/genetics , Genes, Fungal , Genetic Variation , Magnaporthe/isolation & purification , Oryza/microbiology , Plant Diseases/microbiology , Sequence Analysis, DNA
2.
Infect Genet Evol ; 12(5): 987-96, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22406010

ABSTRACT

The rapid evolution of particular genes is essential for the adaptation of pathogens to new hosts and new environments. Powerful methods have been developed for detecting targets of selection in the genome. Here we used divergence data to compare genes among four closely related fungal pathogens adapted to different hosts to elucidate the functions putatively involved in adaptive processes. To this end, ESTs were sequenced in the specialist fungal pathogens Botrytis tulipae and Botrytis ficariarum, and compared with genome sequences of Botrytis cinerea and Sclerotinia sclerotiorum, responsible for diseases on over 200 plant species. A maximum likelihood-based analysis of 642 predicted orthologs detected 21 genes showing footprints of positive selection. These results were validated by resequencing nine of these genes in additional Botrytis species, showing they have also been rapidly evolving in other related species. Twenty of the 21 genes had not previously been identified as pathogenicity factors in B. cinerea, but some had functions related to plant-fungus interactions. The putative functions were involved in respiratory and energy metabolism, protein and RNA metabolism, signal transduction or virulence, similar to what was detected in previous studies using the same approach in other pathogens. Mutants of B. cinerea were generated for four of these genes as a first attempt to elucidate their functions.


Subject(s)
Botrytis/genetics , Evolution, Molecular , Genes, Fungal , Cell Line , Cluster Analysis , Computer Simulation , Genome, Fungal , Solanum lycopersicum/microbiology , Reproducibility of Results , Selection, Genetic , Sequence Analysis, DNA
3.
J Comput Biol ; 19(1): 13-29, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22149633

ABSTRACT

We present a general method for assessing threading score significance. The threading score of a protein sequence, threaded onto a given structure, should be compared with the threading score distribution of a random amino-acid sequence of the same length, threaded onto the same structure; small p-values indicate significantly high scores. We claim that, due to general protein contact map properties, this reference distribution is a Weibull extreme value distribution whose parameters depend on the threading method, the structure, the length of the query and the random sequence simulation model used. These parameters can be estimated off-line with simulated sequence samples, for different sequence lengths. They can further be interpolated at the exact length of a query, enabling the quick computation of the p-value.
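
As a rough illustration of the p-value step described above, the sketch below assumes a three-parameter Weibull whose upper-tail probability is exp(-((s - loc) / scale)^shape), and it interpolates the parameters linearly between tabulated sequence lengths. The exact parameterisation, the interpolation scheme, the table values and all function names are assumptions made for illustration; none of them come from the article.

```python
import math
from bisect import bisect_left

# Hypothetical off-line estimates: sequence length -> (shape, scale, location).
PARAMS = {
    100: (2.0, 10.0, 0.0),
    200: (2.0, 20.0, 0.0),
}

def interpolate_params(length, table=PARAMS):
    """Linearly interpolate Weibull parameters at the query length."""
    lengths = sorted(table)
    # Clamp to the tabulated range rather than extrapolating.
    if length <= lengths[0]:
        return table[lengths[0]]
    if length >= lengths[-1]:
        return table[lengths[-1]]
    i = bisect_left(lengths, length)
    lo, hi = lengths[i - 1], lengths[i]
    w = (length - lo) / (hi - lo)
    return tuple((1 - w) * a + w * b for a, b in zip(table[lo], table[hi]))

def weibull_p_value(score, shape, scale, loc):
    """Upper-tail probability P(S >= score) under the assumed Weibull form."""
    if score <= loc:
        return 1.0
    return math.exp(-(((score - loc) / scale) ** shape))

shape, scale, loc = interpolate_params(150)
p_value = weibull_p_value(30.0, shape, scale, loc)
```

Because the parameters are tabulated in advance, scoring a new query in this sketch costs one interpolation and one exponential rather than a fresh simulation.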


Subject(s)
Models, Statistical , Sequence Alignment/methods , Sequence Analysis/methods , Statistical Distributions , Algorithms , Amino Acid Sequence , Computational Biology/methods , Computer Simulation , Markov Chains , Protein Conformation , Proteins/chemistry
4.
BMC Bioinformatics ; 9: 456, 2008 Oct 27.
Article in English | MEDLINE | ID: mdl-18954438

ABSTRACT

BACKGROUND: The increasing availability of fungal genome sequences provides large numbers of proteins for evolutionary and phylogenetic analyses. However, the heterogeneity of data, including the quality of genome annotation and the difficulty of retrieving true orthologs, makes such investigations challenging. The aim of this study was to provide a reliable and integrated resource of orthologous gene families to perform comparative and phylogenetic analyses in fungi. DESCRIPTION: FUNYBASE is a database dedicated to the analysis of fungal single-copy genes extracted from available fungal genome sequences, their classification into reliable clusters of orthologs, and the assessment of their informative value for phylogenetic reconstruction based on amino acid sequences. The current release of FUNYBASE contains two types of protein data: (i) a complete set of protein sequences extracted from 30 public fungal genomes and classified into clusters of orthologs using a robust automated procedure, and (ii) a subset of 246 reliable ortholog clusters present as single-copy genes in 21 fungal genomes. For each of these 246 ortholog clusters, phylogenetic trees were reconstructed based on their amino acid sequences. To assess the informative value of each ortholog cluster, each was compared to a reference species tree constructed using a concatenation of roughly half of the 246 sequences that are best approximated by the WAG evolutionary model. The orthologs were classified according to a topological score, which measures their ability to recover the same topology as the reference species tree. The full results of these analyses are available on-line with a user-friendly interface that allows for searches to be performed by species name, the ortholog cluster, various keywords, or using the BLAST algorithm. Examples of fruitful utilization of FUNYBASE for investigation of fungal phylogenetics are also presented.
CONCLUSION: FUNYBASE constitutes a novel and useful resource for two types of analyses: (i) comparative studies can be greatly facilitated by reliable clusters of orthologs across sets of user-defined fungal genomes, and (ii) phylogenetic reconstruction can be improved by identifying genes with the highest informative value at the desired taxonomic level.


Subject(s)
Databases, Genetic , Genome, Fungal , Genomics/methods , Information Storage and Retrieval/methods , Phylogeny , Algorithms , Databases, Protein , Evolution, Molecular , Fungi/genetics , Genes, Fungal
5.
Theor Popul Biol ; 73(2): 289-99, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18190938

ABSTRACT

This paper provides a theoretical description of the chromosome architecture resulting from a given number of generations in a back-cross. It is worth considering chromosome architecture as being dependent on a marked point process, whose properties themselves depend on the crossing-over model used. The resulting architecture is presented here for two different models, one without interference, the other with complete interference. Exact distributions, with easy-to-compute formulae, are derived for quantities of interest, such as the lengths of donor or receiver fragments, for any chromosome length and for both crossing-over models. Examples are presented to illustrate the use of these distributions in introgression programs.


Subject(s)
Chromosomes/genetics , Crossing Over, Genetic/genetics , Models, Genetic , Plants/genetics , France , Models, Statistical , Poisson Distribution
6.
BMC Genomics ; 8: 272, 2007 Aug 10.
Article in English | MEDLINE | ID: mdl-17692127

ABSTRACT

BACKGROUND: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. RESULTS: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. CONCLUSION: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.


Subject(s)
Expressed Sequence Tags , Genes, Fungal/genetics , Genes, Mating Type, Fungal/genetics , Virulence/genetics , Base Sequence , DNA, Fungal , Databases, Nucleic Acid , Fungi , Gene Library , Plant Diseases/microbiology
7.
BMC Struct Biol ; 6: 25, 2006 Dec 13.
Article in English | MEDLINE | ID: mdl-17166267

ABSTRACT

BACKGROUND: Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but unfortunately, neural networks are not intuitively interpretable. In contrast, hidden Markov models are graphically interpretable models. Moreover, they have been successfully used in many bioinformatic applications. Because they offer a strong statistical background and allow model interpretation, we propose a method based on hidden Markov models. RESULTS: Our HMM is designed without prior knowledge. It is chosen within a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model alpha-helices, 12 that model coil and 9 that model beta-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We start by analyzing the model features and see how it offers a new vision of local structures. We then use it for secondary structure prediction. Our model appears to be very efficient on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED prediction on single sequences. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%. CONCLUSION: The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.
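
The Q3 score quoted above is conventionally the percentage of residues whose three-state label (helix, strand or coil) is predicted correctly. A minimal sketch of that computation follows; the function name, the H/E/C letter coding and the example strings are illustrative assumptions, not data from the article.

```python
def q3_score(observed, predicted):
    """Percentage of positions where the predicted state matches the observed one.

    Both arguments are equal-length strings over the three states
    H (helix), E (strand) and C (coil).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)

# Made-up example: one of eight residues is mispredicted.
example = q3_score("HHHCCEEE", "HHHCCCEE")  # 87.5
```

Averaging this quantity per protein or pooling it over all residues of a test set gives slightly different numbers, so published Q3 values should be compared with that convention in mind.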


Subject(s)
Computational Biology/methods , Markov Chains , Models, Chemical , Protein Structure, Secondary , Proteins/chemistry
8.
BMC Genomics ; 7: 194, 2006 Aug 01.
Article in English | MEDLINE | ID: mdl-16882342

ABSTRACT

BACKGROUND: Comparative mapping provides new insights into the evolutionary history of genomes. In particular, recent studies in mammals have suggested a role for segmental duplication in genome evolution. In some species such as Drosophila or maize, transposable elements (TEs) have been shown to be involved in chromosomal rearrangements. In this work, we have explored the presence of interspersed repeats in regions of chromosomal rearrangements, using an updated high-resolution integrated comparative map among cattle, man and mouse. RESULTS: The bovine, human and mouse comparative autosomal map has been constructed using data from bovine genetic and physical maps and from FISH-mapping studies. We confirm most previous results but also reveal some discrepancies. A total of 211 conserved segments have been identified between cattle and man, of which 33 are new segments and 72 correspond to extended, previously known segments. The resulting map covers 91% and 90% of the human and bovine genomes, respectively. Analysis of breakpoint regions revealed a high density of species-specific interspersed repeats in the human and mouse genomes. CONCLUSION: Analysis of the breakpoint regions has revealed specific repeat density patterns, suggesting that TEs may have played a significant role in chromosome evolution and genome plasticity. However, we cannot rule out that repeats and breakpoints accumulate independently in the same few regions where modifications are better tolerated. Likewise, we cannot ascertain whether increased TE density is the cause or the consequence of chromosome rearrangements. Nevertheless, the identification of high-density repeat clusters combined with a well-documented repeat phylogeny should highlight probable breakpoints, and permit their precise dating. Combining new statistical models taking the present information into account should help reconstruct ancestral karyotypes.


Subject(s)
Chromosome Mapping/methods , Evolution, Molecular , Genome/genetics , Animals , Cattle , Chromosome Breakage , Humans , Mice , Repetitive Sequences, Nucleic Acid , Translocation, Genetic
9.
BMC Bioinformatics ; 6: 150, 2005 Jun 16.
Article in English | MEDLINE | ID: mdl-15960854

ABSTRACT

BACKGROUND: Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the ANOVA model is the possibility to evaluate multiple sources of variation in an experiment. RESULTS: AnovArray is a package implementing ANOVA for gene expression data using SAS statistical software. The originality of the package is 1) to quantify the different sources of variation on all genes together, 2) to provide a quality control of the model, 3) to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. CONCLUSION: AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS statistical software.
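
To make the underlying idea concrete, here is a minimal sketch of the simplest case, a one-way ANOVA F statistic for a single gene measured under several conditions. It is not the AnovArray package, which runs in SAS and handles several sources of variation; the function name and the expression values below are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    `groups` is a list of non-empty lists, one list of measurements per
    condition. Requires at least two groups, more observations than groups,
    and some within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the condition effect.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each condition.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical log-expression values for one gene under three conditions.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

In a real microarray analysis this statistic would be computed for every gene, and the resulting p-values would then need a correction for multiple comparisons.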


Subject(s)
Computational Biology/methods , Gene Expression Regulation , Oligonucleotide Array Sequence Analysis/methods , Algorithms , Analysis of Variance , Animals , Cattle , Computer Simulation , Data Interpretation, Statistical , Gene Expression , Gene Expression Profiling , Gene Library , Internet , Models, Statistical , Programming Languages , Quality Control , Reproducibility of Results , Sample Size , Sensitivity and Specificity , Sequence Analysis, DNA , Software , Tissue Distribution
10.
Nucleic Acids Res ; 30(6): 1418-26, 2002 Mar 15.
Article in English | MEDLINE | ID: mdl-11884641

ABSTRACT

We present here the use of a new statistical segmentation method on the Bacillus subtilis chromosome sequence. Maximum likelihood parameter estimation of a hidden Markov model, based on the expectation-maximization algorithm, enables one to segment the DNA sequence according to its local composition. This approach is not based on sliding windows; it enables different compositional classes to be separated without prior knowledge of their content, size and localization. We compared these compositional classes, obtained from the sequence, with the annotated DNA physical map, sequence homologies and repeat regions. The first heterogeneity revealed discriminates between the two coding strands and the non-coding regions. Other main heterogeneities arise; some are related to horizontal gene transfer, some to T-enriched composition of hydrophobic protein coding strands, and others to the codon usage fitness of highly expressed genes. Concerning potential and established gene transfers, we found 9 of the 10 known prophages, plus 14 new regions of atypical composition. Some of them are surrounded by repeats; most of their genes have unknown function or possess homology to genes involved in secondary catabolism, metal and antibiotic resistance. Surprisingly, we notice that all of these detected regions are A+T-richer than the host genome, raising the question of their remote sources.
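
The sketch below illustrates how a hidden Markov model can segment a sequence by local composition without sliding windows. It uses Viterbi decoding with fixed parameters, which is a simplification: the expectation-maximization estimation described in the abstract is omitted, and the two states, their emission probabilities, the `stay` probability and the toy sequence are all hypothetical.

```python
import math

def viterbi_segment(seq, emissions, stay=0.9):
    """Most likely hidden-state path for a simple HMM.

    `emissions` holds one dict per state mapping each base to its probability.
    Start probabilities are uniform; each state keeps itself with probability
    `stay` and spreads the rest evenly over the other states (two or more
    states are required).
    """
    if not seq:
        return []
    n = len(emissions)
    switch = (1.0 - stay) / (n - 1)
    log_t = [[math.log(stay if i == j else switch) for j in range(n)] for i in range(n)]
    log_e = [{base: math.log(p) for base, p in e.items()} for e in emissions]

    score = [math.log(1.0 / n) + log_e[s][seq[0]] for s in range(n)]
    back = []  # back[t][j] = best predecessor of state j at position t + 1
    for base in seq[1:]:
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + log_t[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + log_t[best_i][j] + log_e[j][base])
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

AT_RICH = {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1}  # hypothetical state 0
GC_RICH = {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4}  # hypothetical state 1
path = viterbi_segment("ATATATATAT" + "GCGCGCGCGC", [AT_RICH, GC_RICH])
```

On this toy input the decoded path assigns the first ten bases to state 0 and the last ten to state 1, so the segment boundary falls out of the model rather than from a chosen window size.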


Subject(s)
Bacillus subtilis/genetics , Chromosomes, Bacterial , Markov Chains , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , DNA, Bacterial/classification , Gene Transfer, Horizontal , Genetic Variation , Hydrophobic and Hydrophilic Interactions , Likelihood Functions , Lysogeny , RNA, Bacterial/genetics , Repetitive Sequences, Nucleic Acid , Sequence Homology, Nucleic Acid