Results 1 - 4 of 4
1.
Bioinformatics ; 35(15): 2545-2554, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30541063

ABSTRACT

MOTIVATION: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. Although it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. We show that commonly used thresholds need not yield conservative tests, but instead give larger than expected Type I error rates. Statistical regularity can be restored by using a modified likelihood ratio test.

RESULTS: We give theoretical results to prove that, if the number of sites is not too small, the modified likelihood ratio test gives approximately correct Type I error probabilities regardless of the parameter settings of the underlying null hypothesis. Simulations show that modification gives Type I error rates closer to those stated without a loss of power. The simulations also show that parameter estimation for mixture models of codon evolution can be challenging in certain data-generation settings with very different mixing distributions giving nearly identical site pattern distributions unless the number of taxa and tree length are large. Because mixture models are widely used for a variety of problems in molecular evolution, the challenges and general approaches to solving them presented here are applicable in a broader context.

AVAILABILITY AND IMPLEMENTATION: https://github.com/jehops/codeml_modl.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
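
For orientation, the conventional thresholds that the abstract refers to can be sketched as follows. This is a minimal illustration, not the modified test from the paper: it assumes a one-parameter difference between the nested models, so that the reference distributions are a chi-square with 1 degree of freedom and a 50:50 mixture of a point mass at zero and that chi-square. The function names and the log-likelihood values are hypothetical.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def lrt_pvalues(lnl_null, lnl_alt):
    """Return (statistic, p-value under chi2_1, p-value under the 50:50 mixture).

    lnl_null and lnl_alt are the maximized log-likelihoods of the nested
    null model and the alternative model (hypothetical inputs here).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_chi2 = chi2_sf_df1(stat)
    # Mixture: half the null mass sits at zero, half follows chi2_1.
    p_mix = 1.0 if stat == 0.0 else 0.5 * chi2_sf_df1(stat)
    return stat, p_chi2, p_mix


if __name__ == "__main__":
    # Illustrative values only; they do not come from any dataset in the article.
    stat, p_chi2, p_mix = lrt_pvalues(lnl_null=-1000.0, lnl_alt=-998.0)
    print(f"2*delta lnL = {stat:.3f}, p(chi2_1) = {p_chi2:.4f}, p(mixture) = {p_mix:.4f}")
```

According to the abstract, p-values obtained this way need not be conservative, which is why the authors propose a modified likelihood ratio test instead.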


Subject(s)
Software , Biometry , Chi-Square Distribution , Evolution, Molecular , Likelihood Functions
2.
Mol Biol Evol ; 33(11): 2976-2989, 2016 11.
Article in English | MEDLINE | ID: mdl-27486222

ABSTRACT

To detect positive selection at individual amino acid sites, most methods use an empirical Bayes approach. After parameters of a Markov process of codon evolution are estimated via maximum likelihood, they are passed to Bayes formula to compute the posterior probability that a site evolved under positive selection. A difficulty with this approach is that parameter estimates with large errors can negatively impact Bayesian classification. By assigning priors to some parameters, Bayes Empirical Bayes (BEB) mitigates this problem. However, as implemented, it imposes uniform priors, which causes it to be overly conservative in some cases. When standard regularity conditions are not met and parameter estimates are unstable, inference, even under BEB, can be negatively impacted. We present an alternative to BEB called smoothed bootstrap aggregation (SBA), which bootstraps site patterns from an alignment of protein coding DNA sequences to accommodate the uncertainty in the parameter estimates. We show that deriving the correction for parameter uncertainty from the data in hand, in combination with kernel smoothing techniques, improves site specific inference of positive selection. We compare BEB to SBA by simulation and real data analysis. Simulation results show that SBA balances accuracy and power at least as well as BEB, and when parameter estimates are unstable, the performance gap between BEB and SBA can widen in favor of SBA. SBA is applicable to a wide variety of other inference problems in molecular evolution.
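
The two ingredients named in the abstract, Bayes formula applied to estimated parameters and bootstrapping of site patterns, can be sketched in a few lines. This is a simplified illustration only: it is neither BEB nor SBA (there are no priors on parameters and no kernel smoothing), and the class weights, per-class site likelihoods, and function names are hypothetical stand-ins for quantities a codon-model fit would supply.

```python
import random


def site_class_posteriors(weights, site_likelihoods):
    """Bayes formula for one site.

    weights[k] is the estimated proportion of sites in class k, and
    site_likelihoods[k] is the likelihood of the site's data given class k.
    Returns the posterior probability of each class for that site.
    """
    numerators = [w * f for w, f in zip(weights, site_likelihoods)]
    total = sum(numerators)
    return [n / total for n in numerators]


def bootstrap_sites(columns, rng):
    """Resample alignment columns (site patterns) with replacement."""
    return rng.choices(columns, k=len(columns))


if __name__ == "__main__":
    # Hypothetical three-class mixture; the last class stands for positive selection.
    weights = [0.7, 0.2, 0.1]
    site_likelihoods = [0.001, 0.002, 0.01]
    post = site_class_posteriors(weights, site_likelihoods)
    print("posterior for positive-selection class:", round(post[-1], 4))

    # Toy alignment columns; a real analysis would refit the model to each replicate.
    columns = ["AAA", "AAG", "CCC", "CCT", "GGG"]
    replicate = bootstrap_sites(columns, random.Random(0))
    print("bootstrap replicate:", replicate)
```

In a bootstrap-based scheme, the model would be re-estimated on each resampled alignment and the resulting posteriors combined, so that uncertainty in the parameter estimates carries through to the site-level classification.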


Subject(s)
Amino Acids/genetics , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Bayes Theorem , Biological Evolution , Codon/genetics , Computer Simulation , Evolution, Molecular , Likelihood Functions , Markov Chains , Models, Genetic , Models, Statistical , Probability , Selection, Genetic , Uncertainty
3.
Curr Protoc Bioinformatics ; 54: 6.15.1-6.15.32, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27322407

ABSTRACT

This unit provides protocols for using the CODEML program from the PAML package to make inferences about episodic natural selection in protein-coding sequences. The protocols cover inference tasks such as maximum likelihood estimation of selection intensity, testing the hypothesis of episodic positive selection, and identifying sites with a history of episodic evolution. We provide protocols for using the rich set of models implemented in CODEML to assess robustness, and for using bootstrapping to assess if the requirements for reliable statistical inference have been met. An example dataset is used to illustrate how the protocols are used with real protein-coding sequences. The workflow of this design, through automation, is readily extendable to a larger-scale evolutionary survey. © 2016 by John Wiley & Sons, Inc.


Subject(s)
Computational Biology/methods , Selection, Genetic , Software , Codon/chemistry , Evolution, Molecular , Likelihood Functions
4.
Genetics ; 203(2): 905-22, 2016 06.
Article in English | MEDLINE | ID: mdl-27075724

ABSTRACT

Genes encoding nuclear receptors (NRs) are attractive as candidates for investigating the evolution of gene regulation because they (1) have a direct effect on gene expression and (2) modulate many cellular processes that underlie development. We employed a three-phase investigation linking NR molecular evolution among primates with direct experimental assessment of NR function. Phase 1 was an analysis of NR domain evolution and the results were used to guide the design of phase 2, a codon-model-based survey for alterations of natural selection within the hominids. By using a series of reliability and robustness analyses we selected a single gene, NR2C1, as the best candidate for experimental assessment. We carried out assays to determine whether changes between the ancestral and extant NR2C1s could have impacted stem cell pluripotency (phase 3). We evaluated human, chimpanzee, and ancestral NR2C1 for transcriptional modulation of Oct4 and Nanog (key regulators of pluripotency and cell lineage commitment), promoter activity for Pepck (a proxy for differentiation in numerous cell types), and average size of embryological stem cell colonies (a proxy for the self-renewal capacity of pluripotent cells). Results supported the signal for alteration of natural selection identified in phase 2. We suggest that adaptive evolution of gene regulation has impacted several aspects of pluripotentiality within primates. Our study illustrates that the combination of targeted evolutionary surveys and experimental analysis is an effective strategy for investigating the evolution of gene regulation with respect to developmental phenotypes.


Subject(s)
Cell Differentiation/genetics , Evolution, Molecular , Hominidae/genetics , Nuclear Receptor Subfamily 2, Group C, Member 1/genetics , Pluripotent Stem Cells/cytology , Animals , Cell Line , Conserved Sequence , Humans , Mice , Nanog Homeobox Protein/genetics , Nanog Homeobox Protein/metabolism , Nuclear Receptor Subfamily 2, Group C, Member 1/chemistry , Octamer Transcription Factor-3/genetics , Octamer Transcription Factor-3/metabolism , Phosphoenolpyruvate Carboxykinase (ATP)/genetics , Phosphoenolpyruvate Carboxykinase (ATP)/metabolism , Pluripotent Stem Cells/metabolism , Protein Domains