Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 12 de 12
Filtrar
Mais filtros











Base de dados
Intervalo de ano de publicação
2.
Bioinformatics ; 35(10): 1720-1728, 2019 05 15.
Artigo em Inglês | MEDLINE | ID: mdl-30321307

RESUMO

MOTIVATION: Approximate Bayesian computation (ABC) has grown into a standard methodology that manages Bayesian inference for models associated with intractable likelihood functions. Most ABC implementations require the preliminary selection of a vector of informative statistics summarizing raw data. Furthermore, in almost all existing implementations, the tolerance level that separates acceptance from rejection of simulated parameter values needs to be calibrated. RESULTS: We propose to conduct likelihood-free Bayesian inferences about parameters with no prior selection of the relevant components of the summary statistics and bypassing the derivation of the associated tolerance level. The approach relies on the random forest (RF) methodology of Breiman (2001) applied in a (non-parametric) regression setting. We advocate the derivation of a new RF for each component of the parameter vector of interest. When compared with earlier ABC solutions, this method offers significant gains in terms of robustness to the choice of the summary statistics, does not depend on any type of tolerance level, and is a good trade-off in term of quality of point estimator precision and credible interval estimations for a given computing time. We illustrate the performance of our methodological proposal and compare it with earlier ABC methods on a Normal toy example and a population genetics example dealing with human population evolution. AVAILABILITY AND IMPLEMENTATION: All methods designed here have been incorporated in the R package abcrf (version 1.7.1) available on CRAN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Teorema de Bayes , Biometria , Simulação por Computador , Genética Populacional , Humanos , Funções Verossimilhança
3.
Artigo em Inglês | MEDLINE | ID: mdl-30167072

RESUMO

Markov chain Monte Carlo algorithms are used to simulate from complex statistical distributions by way of a local exploration of these distributions. This local feature avoids heavy requests on understanding the nature of the target, but it also potentially induces a lengthy exploration of this target, with a requirement on the number of simulations that grows with the dimension of the problem and with the complexity of the data behind it. Several techniques are available toward accelerating the convergence of these Monte Carlo algorithms, either at the exploration level (as in tempering, Hamiltonian Monte Carlo and partly deterministic methods) or at the exploitation level (with Rao-Blackwellization and scalable methods). This article is categorized under: Statistical and Graphical Methods of Data Analysis > Markov Chain Monte Carlo (MCMC)Algorithms and Computational Methods > AlgorithmsStatistical and Graphical Methods of Data Analysis > Monte Carlo Methods.

4.
PLoS One ; 11(4): e0147311, 2016.
Artigo em Inglês | MEDLINE | ID: mdl-27073897

RESUMO

The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and in a case study example provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a 'magnitude-based inference' approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.


Assuntos
Exercício Físico/fisiologia , Modelos Biológicos , Esportes/fisiologia , Teorema de Bayes , Humanos , Medicina Esportiva
5.
Bioinformatics ; 32(6): 859-66, 2016 03 15.
Artigo em Inglês | MEDLINE | ID: mdl-26589278

RESUMO

MOTIVATION: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. RESULTS: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. AVAILABILITY AND IMPLEMENTATION: The proposed methodology is implemented in the R package abcrf available on the CRAN. CONTACT: jean-michel.marin@umontpellier.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Genética Populacional , Algoritmos , Teorema de Bayes , Simulação por Computador , Modelos Genéticos
7.
Proc Natl Acad Sci U S A ; 110(4): 1321-6, 2013 Jan 22.
Artigo em Inglês | MEDLINE | ID: mdl-23297233

RESUMO

Approximate Bayesian computation has become an essential tool for the analysis of complex stochastic models when the likelihood function is numerically unavailable. However, the well-established statistical method of empirical likelihood provides another route to such settings that bypasses simulations from the model and the choices of the approximate Bayesian computation parameters (summary statistics, distance, tolerance), while being convergent in the number of observations. Furthermore, bypassing model simulations may lead to significant time savings in complex models, for instance those found in population genetics. The Bayesian computation with empirical likelihood algorithm we develop in this paper also provides an evaluation of its own performance through an associated effective sample size. The method is illustrated using several examples, including estimation of standard distributions, time series, and population genetics models.


Assuntos
Teorema de Bayes , Funções Verossimilhança , Algoritmos , Fenômenos Biofísicos , Bioestatística , Genética Populacional/estatística & dados numéricos , Modelos Estatísticos , Processos Estocásticos
8.
Mol Ecol Resour ; 12(5): 846-55, 2012 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-22571382

RESUMO

Comparison of demo-genetic models using Approximate Bayesian Computation (ABC) is an active research field. Although large numbers of populations and models (i.e. scenarios) can be analysed with ABC using molecular data obtained from various marker types, methodological and computational issues arise when these numbers become too large. Moreover, Robert et al. (Proceedings of the National Academy of Sciences of the United States of America, 2011, 108, 15112) have shown that the conclusions drawn on ABC model comparison cannot be trusted per se and required additional simulation analyses. Monte Carlo inferential techniques to empirically evaluate confidence in scenario choice are very time-consuming, however, when the numbers of summary statistics (Ss) and scenarios are large. We here describe a methodological innovation to process efficient ABC scenario probability computation using linear discriminant analysis (LDA) on Ss before computing logistic regression. We used simulated pseudo-observed data sets (pods) to assess the main features of the method (precision and computation time) in comparison with traditional probability estimation using raw (i.e. not LDA transformed) Ss. We also illustrate the method on real microsatellite data sets produced to make inferences about the invasion routes of the coccinelid Harmonia axyridis. We found that scenario probabilities computed from LDA-transformed and raw Ss were strongly correlated. Type I and II errors were similar for both methods. The faster probability computation that we observed (speed gain around a factor of 100 for LDA-transformed Ss) substantially increases the ability of ABC practitioners to analyse large numbers of pods and hence provides a manageable way to empirically evaluate the power available to discriminate among a large set of complex scenarios.


Assuntos
Bioestatística/métodos , Biologia Computacional/métodos , Modelos Genéticos , Animais , Besouros/genética , Marcadores Genéticos , Genética Populacional
9.
Proc Natl Acad Sci U S A ; 108(37): 15112-7, 2011 Sep 13.
Artigo em Inglês | MEDLINE | ID: mdl-21876135

RESUMO

Approximate Bayesian computation (ABC) have become an essential tool for the analysis of complex stochastic models. Grelaud et al. [(2009) Bayesian Anal 3:427-442] advocated the use of ABC for model choice in the specific case of Gibbs random fields, relying on an intermodel sufficiency property to show that the approximation was legitimate. We implemented ABC model choice in a wide range of phylogenetic models in the Do It Yourself-ABC (DIY-ABC) software [Cornuet et al. (2008) Bioinformatics 24:2713-2719]. We now present arguments as to why the theoretical arguments for ABC model choice are missing, because the algorithm involves an unknown loss of information induced by the use of insufficient summary statistics. The approximation error of the posterior probabilities of the models under comparison may thus be unrelated with the computational effort spent in running an ABC algorithm. We then conclude that additional empirical verifications of the performances of the ABC procedure as those available in DIY-ABC are necessary to conduct model choice.


Assuntos
Teorema de Bayes , Biologia Computacional/métodos , Simulação por Computador , Genética Populacional
11.
Proc Natl Acad Sci U S A ; 107(3): E5; author reply E6-7, 2010 Jan 19.
Artigo em Inglês | MEDLINE | ID: mdl-20080613
12.
Bioinformatics ; 24(23): 2713-9, 2008 Dec 01.
Artigo em Inglês | MEDLINE | ID: mdl-18842597

RESUMO

UNLABELLED: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in which scenarios can be customized by the user to fit many complex situations involving any number of populations and samples. Such scenarios involve any combination of population divergences, admixtures and population size changes. DIY ABC can be used to compare competing scenarios, estimate parameters for one or more scenarios and compute bias and precision measures for a given scenario and known values of parameters (the current version applies to unlinked microsatellite data). This article describes key methods used in the program and provides its main features. The analysis of one simulated and one real dataset, both with complex evolutionary scenarios, illustrates the main possibilities of DIY ABC. AVAILABILITY: The software DIY ABC is freely available at http://www.montpellier.inra.fr/CBGP/diyabc.


Assuntos
Genética Populacional/métodos , Software , Algoritmos , Teorema de Bayes , Evolução Molecular , Humanos , Grupos Populacionais/genética
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA