Results 1 - 4 of 4
1.
Biosystems ; 100(3): 215-24, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20350581

ABSTRACT

Statistical correlations in DNA sequences are an important source of information about processes of genome evolution. As a special case of such correlations, and building on our previous work, we study how short-range correlations in eukaryotic genomes change when various classes of repetitive DNA are eliminated. Our main result is that a residual correlation pattern, common to most mammalian species, emerges when all repetitive DNA is eliminated, suggesting features of an ancestral correlation signature. Furthermore, using this general framework, we identify classes of repeats whose deletion moves the correlation pattern toward this residual pattern (simple repeats and SINEs) or away from it (LINEs). These findings suggest that the common correlation pattern visible in mammalian species after repeat elimination can be associated with a common mammalian ancestor.


Subject(s)
DNA/genetics , Evolution, Molecular , Models, Genetic , Animals , Genomics , Humans , Long Interspersed Nucleotide Elements , Mice , Principal Component Analysis , Rats , Repetitive Sequences, Nucleic Acid , Sequence Analysis, DNA/statistics & numerical data , Short Interspersed Nucleotide Elements , Species Specificity , Systems Biology
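
The repeat-elimination step described in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how repeats were located or removed, so everything here is an assumption for illustration: the function name `remove_repeats`, the annotation format (0-based, half-open `(start, end, repeat_class)` tuples), and the choice to delete annotated spans and join the flanking sequence.

```python
def remove_repeats(seq, annotations, classes):
    """Delete annotated repeat spans of the chosen classes from seq.

    annotations: iterable of (start, end, repeat_class) tuples with
    0-based, half-open coordinates (a hypothetical format).
    classes: set of repeat class names to eliminate, e.g. {"SINE"}.
    """
    kept = []
    pos = 0  # first position not yet copied or deleted
    for start, end, repeat_class in sorted(annotations):
        if repeat_class not in classes:
            continue  # leave repeats of other classes in place
        if start > pos:
            kept.append(seq[pos:start])
        pos = max(pos, end)  # max() handles overlapping annotations
    kept.append(seq[pos:])
    return "".join(kept)


# Toy sequence and annotations, invented purely for illustration.
toy = "AAAACCCCGGGGTTTT"
toy_annotations = [(4, 8, "SINE"), (12, 16, "LINE")]
print(remove_repeats(toy, toy_annotations, {"SINE"}))
```

Deleting spans creates artificial junctions between the flanking regions. Replacing repeats with a placeholder symbol that later statistics skip would be an alternative design; the abstract does not indicate which was used.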
2.
Phys Rev E Stat Nonlin Soft Matter Phys ; 74(2 Pt 1): 021913, 2006 Aug.
Article in English | MEDLINE | ID: mdl-17025478

ABSTRACT

Attempts to identify a species from its DNA sequence on purely statistical grounds have been made for more than a decade. The most prominent of such genome signatures relies on neighborhood correlations (i.e., dinucleotide frequencies) and, consequently, attributes species identification to mechanisms operating on the dinucleotide level (e.g., neighbor-dependent mutations). Using Mus musculus and Rattus norvegicus as examples, we analyze short- and intermediate-range statistical correlations in DNA sequences. These correlation profiles are computed for all chromosomes of the two species. We find that, as the range of correlations increases, the capacity to distinguish between the species on the basis of this correlation profile improves and requires ever shorter sequence segments to obtain a full species separation. This finding suggests that distinctive traits within the sequence are situated beyond the level of a few nucleotides. The large-scale statistical patterning of DNA sequences on which such genome signatures are based is thus substantially determined by mobile elements (e.g., transposons and retrotransposons). The study and interspecies comparison of such correlation profiles can, therefore, reveal features of retrotransposition, segmental duplications, and other processes of genome evolution.


Subject(s)
Chromosome Mapping/methods , Genetic Code/genetics , Models, Genetic , Quantitative Trait Loci/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Animals , Computer Simulation , Genetic Variation/genetics , Information Storage and Retrieval/methods , Mice , Rats , Sequence Homology, Nucleic Acid , Species Specificity
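
The dinucleotide-based genome signature mentioned in the abstract above can be sketched as an observed-over-expected ratio for each of the 16 dinucleotides. This is one common normalization, used here as an assumption; the abstract does not state the exact form of the signature, and the function name `dinucleotide_odds_ratios` is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_odds_ratios(seq):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for a DNA string.

    Positions containing symbols other than A, C, G, T are skipped.
    A ratio is NaN when a base needed for the expectation is absent.
    """
    seq = seq.upper()
    mono = Counter(ch for ch in seq if ch in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    ratios = {}
    for x, y in product(BASES, repeat=2):
        expected = (mono[x] / n_mono) * (mono[y] / n_mono) if n_mono else 0.0
        observed = di[x + y] / n_di if n_di else 0.0
        ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios


# Toy periodic sequence, invented for illustration only.
signature = dinucleotide_odds_ratios("ACGT" * 25)
print(signature["AC"], signature["AA"])
```

A ratio near 1 means the dinucleotide occurs about as often as independent neighbors would predict; values above or below 1 indicate over- or under-representation.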
3.
J Comput Biol ; 12(5): 545-53, 2005 Jun.
Article in English | MEDLINE | ID: mdl-15952877

ABSTRACT

The surprising fact that global statistical properties computed on a genome-wide scale may reveal species information was first observed in studies of dinucleotide frequencies. Here we examine the same phenomenon with a different statistical approach. We show that patterns in the short-range statistical correlations in DNA sequences serve as evolutionary fingerprints of eukaryotes. All chromosomes of a species display the same characteristic pattern, markedly different from those of other species. Because of this correlation pattern, the chromosomes of a species are sorted onto the same branch of a phylogenetic tree. The average correlation between nucleotides at a distance k is quantified in two independent ways: (i) by estimating it from a higher-order Markov process and (ii) by computing the mutual information function at a distance k. We show how the quality of phylogenetic reconstruction depends on the range of correlation strengths and on the length of the underlying sequence segment. This concept of the correlation pattern as a phylogenetic signature of eukaryote species combines two rather distant domains of research, namely phylogenetic analysis based on molecular observation and the study of the correlation structure of DNA sequences.


Subject(s)
Genome , Phylogeny , Sequence Analysis, DNA , Animals , Computational Biology/methods , Humans , Plants/genetics , Sequence Analysis, DNA/methods
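
Approach (ii) in the abstract above, the mutual information function at distance k, can be sketched with the standard definition I(k) = sum over a, b of p_ab(k) log2[p_ab(k) / (p_a p_b)]. The sketch below assumes a plain-string input and estimates the marginals from the same symbol pairs as the joint distribution; the function name `mutual_information` and these estimation choices are assumptions, not details taken from the paper.

```python
from collections import Counter
from math import log2

BASES = "ACGT"


def mutual_information(seq, k):
    """Mutual information (in bits) between symbols k positions apart.

    Pairs in which either symbol is not A, C, G or T are skipped.
    """
    seq = seq.upper()
    pairs = [
        (seq[i], seq[i + k])
        for i in range(len(seq) - k)
        if seq[i] in BASES and seq[i + k] in BASES
    ]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)    # marginal of the left symbol
    second = Counter(b for _, b in pairs)   # marginal of the right symbol
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        total += p_ab * log2(p_ab / ((first[a] / n) * (second[b] / n)))
    return total


# A correlation profile is the list of I(k) over a range of distances.
# The toy sequence is invented for illustration only.
profile = [mutual_information("ACGT" * 50, k) for k in range(1, 9)]
print(profile)
```

For real chromosomes, finite-sample bias in I(k) matters at large k; the sketch makes no correction for it.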
4.
Gene ; 345(1): 81-90, 2005 Jan 17.
Article in English | MEDLINE | ID: mdl-15716116

ABSTRACT

We study short-range correlations in DNA sequences with methods from information theory and statistics. We find a persisting degree of identity between the correlation patterns of different chromosomes of a species. Except in the case of human and chimpanzee, inter-species differences in this correlation pattern allow robust species distinction: in a clustering tree based on the correlation curves of individual chromosomes, distinct clusters are found for the individual species. This capacity to distinguish species persists even when the length of the underlying sequences is drastically reduced. Compared with the standard tool for studying symbol correlations in DNA sequences, namely the mutual information function, we find that an autoregressive model for higher-order Markov processes significantly improves species distinction due to an implicit subtraction of random background.


Subject(s)
Computational Biology/methods , Eukaryotic Cells/metabolism , Genome , Sequence Analysis, DNA/statistics & numerical data , Algorithms , Animals , Humans , Phylogeny , Sequence Alignment/methods
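
The clustering tree over per-chromosome correlation curves described in the abstract above can be illustrated with a small average-linkage agglomerative clustering. The abstract does not name the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions, as are the function name `cluster_tree` and the toy curve values, which are invented and are not results from the paper.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def cluster_tree(curves):
    """Build a nested-tuple tree from {label: correlation curve}.

    curves must be non-empty and all curves must have equal length.
    """
    # Each cluster is (subtree, list of member labels).
    clusters = [(label, [label]) for label in curves]

    def linkage(a, b):
        # Average pairwise distance between members of two clusters.
        pairs = [(x, y) for x in a[1] for y in b[1]]
        return sum(dist(curves[x], curves[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical curves: correlation strength at distances k = 1, 2, 3.
toy_curves = {
    "mouse_chr1": [0.10, 0.05, 0.02],
    "mouse_chr2": [0.11, 0.05, 0.02],
    "rat_chr1": [0.30, 0.20, 0.10],
    "rat_chr2": [0.31, 0.21, 0.10],
}
print(cluster_tree(toy_curves))
```

With these toy values, the two chromosomes of each species merge first, which is the kind of species-level grouping the abstract reports.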