1.
Mol Biol Evol; 24(2): 513-21, 2007 Feb.
Article in English | MEDLINE | ID: mdl-17119011

ABSTRACT

Evolutionary studies commonly model single nucleotide substitutions and assume that they occur as independent draws from a unique probability distribution across the sequence studied. This assumption is violated for protein-coding sequences, and we consider modeling approaches where codon positions (CPs) are treated as separate categories of sites because within each category the assumption is more reasonable. Such "codon-position" models have been shown to explain the evolution of codon data better than homogeneous models in previous studies. This paper examines the ways in which codon-position models outperform homogeneous models and characterizes the differences in estimates of model parameters across CPs. Using the PANDIT database of multiple species DNA sequence alignments, we quantify the differences in the evolutionary processes at the 3 CPs in a systematic and comprehensive manner, characterizing previously undescribed features of protein evolution. We relate our findings to the functional constraints imposed by the genetic code, protein function, and the types of mutation that cause synonymous and nonsynonymous codon changes. The results increase our understanding of selective constraints and could be incorporated into phylogenetic analyses or gene-finding techniques in the future. The methods used are extended to an overlapping reading frame data set, and we discover that overlapping reading frames do not necessarily cause more stringent evolutionary constraints.


Subject(s)
Codon/genetics; Evolution, Molecular; Proteins/genetics; Base Sequence; Computational Biology; Databases, Nucleic Acid; Models, Genetic; Models, Statistical; Sequence Alignment
2.
Proc Natl Acad Sci U S A; 103(14): 5320-5, 2006 Apr 04.
Article in English | MEDLINE | ID: mdl-16569694

ABSTRACT

There is abundant transcription from eukaryotic genomes unaccounted for by protein coding genes. A high-resolution genome-wide survey of transcription in a well annotated genome will help relate transcriptional complexity to function. By quantifying RNA expression on both strands of the complete genome of Saccharomyces cerevisiae using a high-density oligonucleotide tiling array, this study identifies the boundary, structure, and level of coding and noncoding transcripts. A total of 85% of the genome is expressed in rich media. Apart from expected transcripts, we found operon-like transcripts, transcripts from neighboring genes not separated by intergenic regions, and genes with complex transcriptional architecture where different parts of the same gene are expressed at different levels. We mapped the positions of 3' and 5' UTRs of coding genes and identified hundreds of RNA transcripts distinct from annotated genes. These nonannotated transcripts, on average, have lower sequence conservation and lower rates of deletion phenotype than protein coding genes. Many other transcripts overlap known genes in antisense orientation, and for these pairs global correlations were discovered: UTR lengths correlated with gene function, localization, and requirements for regulation; antisense transcripts overlapped 3' UTRs more than 5' UTRs; UTRs with overlapping antisense tended to be longer; and the presence of antisense associated with gene function. These findings may suggest a regulatory role of antisense transcription in S. cerevisiae. Moreover, the data show that even this well studied genome has transcriptional complexity far beyond current annotation.


Subject(s)
Genome, Fungal; Saccharomyces cerevisiae/genetics; Transcription, Genetic; 5' Untranslated Regions; DNA, Complementary; Nucleic Acid Hybridization; Oligonucleotide Array Sequence Analysis; RNA, Fungal/genetics; RNA, Messenger/genetics
3.
J Biomed Inform; 39(1): 51-61, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16226061

ABSTRACT

Molecular evolutionary studies provide a means of investigating how cells function and how organisms adapt to their environment. The products of evolutionary studies provide medically important insights into the source of major diseases, such as HIV, and hold the key to understanding the developing immunity of pathogenic bacteria to antibiotics. They have also helped mankind understand its place in nature, casting light on the selective forces and environmental conditions that resulted in modern humans. The use of likelihood as a framework for statistical modeling in phylogenetics has played a fundamental role in studying molecular evolution, enabling rigorous and robust conclusions to be drawn from sequence data. The first half of this article is a general introduction to the likelihood method for inferring phylogenies, the properties of the models used, and how it can be used for statistical testing. The latter half of the article focuses on the emerging new generation of phylogenetic models that describe heterogeneity in the evolutionary process along sequences, including the recoding of protein coding sequence data to amino acids and codons, and various approaches for describing dependencies between sites in a sequence. We conclude with a detailed case study examining how modern modeling approaches have been successfully employed to identify adaptive evolution in proteins.


Subject(s)
Biological Evolution; Chromosome Mapping/methods; Genetic Variation/genetics; Models, Genetic; Phylogeny; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Animals; Humans; Likelihood Functions; Models, Statistical; Software