1.
Heredity (Edinb) ; 108(3): 273-84, 2012 Mar.
Article in English | MEDLINE | ID: mdl-21897435

ABSTRACT

In plants, knowledge about linkage disequilibrium (LD) is relevant for the design of efficient single-nucleotide polymorphism arrays in relation to their use in population and association genomics studies. Previous studies of conifer genes have shown LD to decay rapidly within gene limits, but exceptions have been reported. To evaluate the extent of heterogeneity of LD among conifer genes and its potential causes, we examined LD in 105 genes of white spruce (Picea glauca) by sequencing a panel of 48 haploid megagametophytes from natural populations and further compared it with LD in other conifer species. The average pairwise r² value was 0.19 (s.d. = 0.19), and LD dropped quickly with a half-decay being reached at a distance of 65 nucleotides between sites. However, LD was significantly heterogeneous among genes. A first group of 29 genes had stronger LD (mean r² = 0.28), and a second group of 38 genes had weaker LD (mean r² = 0.12). While a strong relationship was found with the recombination rate, there was no obvious relationship between LD and functional classification. The level of nucleotide diversity, which was highly heterogeneous across genes, was also not significantly correlated with LD. A search for selection signatures highlighted significant deviations from the standard neutral model, which could be mostly attributed to recent demographic changes. Little evidence was seen for hitchhiking and clear relationships with LD. When compared among conifer species, on average, levels of LD were similar in genes from white spruce, Norway spruce and Scots pine, whereas loblolly pine and Douglas fir genes exhibited a significantly higher LD.
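
As an illustration of the pairwise statistic reported in this abstract, the following is a minimal sketch of how the squared allelic correlation between two biallelic sites could be computed from haploid samples, using the standard definition D² / (pA(1 − pA) pB(1 − pB)). The function names, the 0/1 allele coding, and the toy data are assumptions made for this example; they are not taken from the study.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """Squared allelic correlation between two biallelic sites.

    Each argument is a list of 0/1 alleles, one entry per haploid sample
    (hypothetical coding chosen for this sketch).
    Returns None when either site is monomorphic, since r^2 is undefined.
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must be non-empty and of equal length")
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    # Frequency of the haplotype carrying allele 1 at both sites.
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


def pairwise_ld(positions, sites):
    """Return (distance in nucleotides, r^2) for every pair of sites."""
    pairs = []
    for i, j in combinations(range(len(sites)), 2):
        pairs.append((abs(positions[j] - positions[i]),
                      r_squared(sites[i], sites[j])))
    return pairs


# Toy data: three sites scored in four haploid samples.
example = pairwise_ld([10, 40, 75],
                      [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]])
```

Plotting the resulting r² values against distance, and fitting a decay curve, is one way a half-decay distance such as the one quoted above could be estimated; the fitting step is omitted here.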


Subject(s)
Genetic Heterogeneity , Linkage Disequilibrium , Picea/genetics , Tracheophyta/genetics , Genes, Plant , Molecular Sequence Data , Polymorphism, Single Nucleotide , Recombination, Genetic
2.
J Biotechnol ; 78(3): 293-9, 2000 Mar 31.
Article in English | MEDLINE | ID: mdl-10751690

ABSTRACT

Gene prediction methods for eukaryotic genomes are still not fully satisfactory. One way to improve gene prediction accuracy, proven to be relevant for prokaryotes, is to consider more than one model of genes. Thus, we used our classification of Arabidopsis thaliana genes in two classes (CU(1) and CU(2)), previously delineated according to statistical features, in the GeneMark gene identification program. For each gene class, as well as for the two classes combined, a Markov model was developed (respectively, GM-CU(1), GM-CU(2) and GM-all) and then used on a test set of 168 genes to compare their respective efficiency. We concluded from this analysis that GM-CU(1) is more sensitive than GM-CU(2), which seems to be more specific to a gene type. Moreover, GM-all does not give better results than GM-CU(1), and combining results from GM-CU(1) and GM-CU(2) greatly improves prediction efficiency in comparison with predictions made with GM-all only. Thus, this work confirms the necessity to consider more than one gene model for gene prediction in eukaryotic genomes, and to look for gene classes in order to build these models.
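
The idea of training one Markov model per gene class and then letting the models compete can be sketched as follows. This is a deliberately simplified, homogeneous first-order model with pseudocounts; it is not GeneMark, and the abstract does not say how the GM-CU(1) and GM-CU(2) results were combined, so picking the higher-scoring model is an assumption of this example. All function names and training sequences are hypothetical.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(seqs, order=1):
    """Count base occurrences after each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1
    # Convert to plain dicts so lookups never create new entries.
    return {ctx: dict(row) for ctx, row in counts.items()}


def log_likelihood(model, seq, order=1, pseudocount=1.0):
    """Log-probability of `seq` under the model, with additive smoothing."""
    total = 0.0
    for i in range(order, len(seq)):
        row = model.get(seq[i - order:i], {})
        denom = sum(row.values()) + pseudocount * len(ALPHABET)
        total += math.log((row.get(seq[i], 0.0) + pseudocount) / denom)
    return total


def best_model(models, seq):
    """Name of the class model that scores the sequence highest."""
    return max(models, key=lambda name: log_likelihood(models[name], seq))


# Toy training sets standing in for the two gene classes.
models = {
    "CU1": train_markov(["ATATATATAT"]),
    "CU2": train_markov(["GCGCGCGCGC"]),
}
```

A candidate coding region would then be assigned to whichever class model explains it better, for example `best_model(models, "ATATAT")`.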


Subject(s)
Arabidopsis/genetics , Genes, Plant , Biotechnology , Codon/genetics , DNA, Plant/genetics , Databases, Factual , Exons , Models, Genetic , Software
3.
Curr Opin Plant Biol ; 2(2): 90-5, 1999 Apr.
Article in English | MEDLINE | ID: mdl-10322203

ABSTRACT

Genome data have to be converted into knowledge to be useful to biologists. Many valuable computational tools have already been developed to help annotation of plant genome sequences, and these may be improved further, for example by identification of more gene regulatory elements. The lack of a standard computer-assisted annotation platform for eukaryotic genomes remains a major bottleneck.


Subject(s)
Genes, Plant/genetics , Genome, Plant , Arabidopsis/genetics , Databases, Factual , Genes, Plant/physiology , Internet , Sequence Alignment , Software
4.
Bioinformatics ; 15(11): 887-99, 1999 Nov.
Article in English | MEDLINE | ID: mdl-10743555

ABSTRACT

MOTIVATION: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes. RESULTS: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software. AVAILABILITY: The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/. CONTACT: Pierre.Rouze@gengenp.rug.ac.be.
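
For readers unfamiliar with the conventional exon-level metrics mentioned in this abstract, the sketch below shows one common way they are computed: an exon counts as correct only if both of its boundaries match the annotation, sensitivity is the fraction of annotated exons recovered, and specificity (in the gene-prediction sense) is the fraction of predicted exons that are correct. The function name and coordinates are hypothetical, and this is not the AraSet Perl code or its new protein-level and gene-model measures.

```python
def exon_level_accuracy(annotated, predicted):
    """Exon-level sensitivity and specificity.

    Both arguments are iterables of (start, end) tuples. An exon is
    counted as correct only when start and end both match exactly.
    Returns (sensitivity, specificity); a value is None when its
    denominator would be zero.
    """
    annotated = set(annotated)
    predicted = set(predicted)
    correct = len(annotated & predicted)
    sensitivity = correct / len(annotated) if annotated else None
    specificity = correct / len(predicted) if predicted else None
    return sensitivity, specificity


# Toy example: three annotated exons, two predicted, one exact match.
sn, sp = exon_level_accuracy(
    [(1, 100), (200, 300), (400, 500)],
    [(1, 100), (200, 310)],
)
```

Here the second predicted exon overlaps an annotated one but misses its end coordinate, so it is not counted as correct under the exact-match convention.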


Subject(s)
Arabidopsis/genetics , Computational Biology/methods , Genome, Plant , Sequence Analysis, DNA/methods , Software Validation , Alternative Splicing/genetics , Contig Mapping/methods , Databases, Factual , Evaluation Studies as Topic , Exons/genetics , Models, Genetic , Reproducibility of Results , Sequence Analysis, Protein
...