1.
J Comput Biol ; 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38957993

ABSTRACT

The estimation of haplotype structure and frequencies provides crucial information about the composition of genomes. Techniques, such as single-individual haplotyping, aim to reconstruct individual haplotypes from diploid genome sequencing data. However, our focus is distinct. We address the challenge of reconstructing haplotype structure and frequencies from pooled sequencing samples where multiple individuals are sequenced simultaneously. A frequentist method to address this issue has recently been proposed. In contrast to this and other methods that compute point estimates, our proposed Bayesian hierarchical model delivers a posterior that permits us to also quantify uncertainty. Since matching permutations in both haplotype structure and corresponding frequency matrix lead to the same reconstruction of their product, we introduce an order-preserving shrinkage prior that ensures identifiability with respect to permutations. For inference, we introduce a blocked Gibbs sampler that enforces the required constraints. In a simulation study, we assessed the performance of our method. Furthermore, by using our approach on two distinct sets of real data, we demonstrate that our Bayesian approach can reconstruct the dominant haplotypes in a challenging, high-dimensional set-up.

2.
R Soc Open Sci ; 10(8): 221469, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37538742

ABSTRACT

Transcription is a complex phenomenon that permits the conversion of genetic information into phenotype by means of an enzyme called RNA polymerase, which erratically moves along and scans the DNA template. We perform Bayesian inference over a paradigmatic mechanistic model of non-equilibrium statistical physics, i.e. the asymmetric exclusion process in the hydrodynamic limit, assuming a Gaussian process prior for the polymerase progression rate as a latent variable. Our framework allows us to infer the speed of polymerases during transcription given their spatial distribution, while avoiding the explicit inversion of the system's dynamics. The results, which show processing rates varying strongly with genomic position and a minor role for traffic-like congestion, may have strong implications for the understanding of gene expression.

3.
Front Physiol ; 13: 985905, 2022.
Article in English | MEDLINE | ID: mdl-36311230

ABSTRACT

The transport of platelets in blood is commonly assumed to obey an advection-diffusion equation with a diffusion constant given by the so-called Zydney-Colton theory. Here we reconsider this hypothesis based on experimental observations and numerical simulations including a fully resolved suspension of red blood cells and platelets subject to shear. We observe that the transport of platelets perpendicular to the flow can be characterized by a non-trivial distribution of velocities with an exponentially decreasing bulk, followed by a power-law tail. We conclude that such a distribution of velocities leads to a diffusion of platelets about two orders of magnitude higher than predicted by Zydney-Colton theory. We tested this distribution with a minimal stochastic model of platelet deposition to cover space and time scales similar to our experimental results, and confirm that the exponential-power-law distribution of velocities results in a diffusion coefficient significantly larger than predicted by the Zydney-Colton theory.

4.
PLoS Comput Biol ; 18(3): e1009910, 2022 03.
Article in English | MEDLINE | ID: mdl-35271585

ABSTRACT

Cardio/cerebrovascular diseases (CVD) have become one of the major health issues in our societies. However, recent studies show that the present pathology tests to detect CVD are ineffectual, as they do not consider different stages of platelet activation or the molecular dynamics involved in platelet interactions and are incapable of accounting for inter-individual variability. Here we propose a stochastic platelet deposition model and an inferential scheme to estimate the biologically meaningful model parameters using approximate Bayesian computation with a summary statistic that maximally discriminates between different types of patients. Parameters inferred from data collected on healthy volunteers and different patient types help us to identify specific biological parameters, and hence the biological reasoning behind the dysfunction, for each type of patient. This work opens up an unprecedented opportunity for personalized pathology tests for CVD detection and medical treatment.


Subjects
Cardiovascular Diseases , Vascular Diseases , Bayes Theorem , Cardiovascular Diseases/diagnosis , Humans
5.
PLoS Comput Biol ; 17(8): e1009236, 2021 08.
Article in English | MEDLINE | ID: mdl-34383756

ABSTRACT

A mathematical model for the COVID-19 pandemic spread, which integrates age-structured Susceptible-Exposed-Infected-Recovered-Deceased dynamics with real mobile phone data accounting for the population mobility, is presented. The dynamical model adjustment is performed via Approximate Bayesian Computation. Optimal lockdown and exit strategies are determined based on nonlinear model predictive control, constrained to public-health and socio-economic factors. Through an extensive computational validation of the methodology, it is shown that it is possible to compute robust exit strategies with realistic reduced mobility values to inform public policy making, and we exemplify the applicability of the methodology using datasets from England and France.


Subjects
COVID-19/epidemiology , Pandemics , Quarantine , Travel , Bayes Theorem , COVID-19/virology , England , France , Humans , SARS-CoV-2/isolation & purification , Smartphone
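
The Susceptible-Exposed-Infected-Recovered-Deceased dynamics named in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it has no age structure and no mobility data, it uses a simple forward-Euler step, and the function name `simulate_seird` and all parameter values are hypothetical choices made for illustration only.

```python
def simulate_seird(beta, sigma, gamma, mu, initial, days, dt=0.1):
    """Forward-Euler integration of a single-population SEIRD model.

    beta: transmission rate, sigma: rate of leaving the exposed class,
    gamma: recovery rate, mu: death rate (all per day, hypothetical values).
    initial: tuple (S, E, I, R, D). Returns one state tuple per day.
    """
    s, e, i, r, d = initial
    trajectory = [initial]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            n_alive = s + e + i + r
            new_exposed = beta * s * i / n_alive * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            new_deceased = mu * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_deceased
            r += new_recovered
            d += new_deceased
        trajectory.append((s, e, i, r, d))
    return trajectory


# Illustrative run with made-up parameters and a made-up population.
trajectory = simulate_seird(
    beta=0.3, sigma=0.2, gamma=0.1, mu=0.01,
    initial=(9990.0, 0.0, 10.0, 0.0, 0.0), days=100,
)
```

In an approximate Bayesian computation setting, a simulator of this kind would be run repeatedly with parameter values drawn from a prior and compared against observed case counts.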
6.
J Chem Phys ; 149(15): 154110, 2018 Oct 21.
Article in English | MEDLINE | ID: mdl-30342443

ABSTRACT

Molecular dynamics (MD) simulations give access to equilibrium structures and dynamic properties given an ergodic sampling and an accurate force-field. The force-field parameters are calibrated to reproduce properties measured by experiments or simulations. The main contribution of this paper is an approximate Bayesian framework for the calibration and uncertainty quantification of the force-field parameters, without assuming parameter uncertainty to be Gaussian. To this aim, since the likelihood function of the MD simulation models is intractable in the absence of a Gaussianity assumption, we use a likelihood-free inference scheme known as approximate Bayesian computation (ABC) and propose an adaptive population Monte Carlo ABC algorithm, which is shown to converge faster and scale better than the previously used ABCsubsim algorithm for the calibration of the force-field of a helium system. The second contribution is the adaptation of ABC algorithms for High Performance Computing to MD simulations within the Python ecosystem ABCpy. This adaptation includes a novel use of a dynamic allocation scheme for Message Passing Interface (MPI). We illustrate the performance of the developed methodology in learning the posterior distribution and Bayesian estimates of Lennard-Jones force-field parameters of helium and the TIP4P system of water, implemented for both simulated and experimental datasets collected using neutron and X-ray diffraction. For simulated data, the Bayesian estimate is in close agreement with the true parameter value used to generate the dataset. For experimental as well as for simulated data, the Bayesian posterior distribution shows a strong correlation pattern between the force-field parameters. By providing an estimate of the entire posterior distribution, our methodology also allows us to perform uncertainty quantification of model predictions. This research opens up the possibility of rigorously calibrating force-fields from available experimental datasets of any structural and dynamic property.

7.
Front Physiol ; 9: 1128, 2018.
Article in English | MEDLINE | ID: mdl-30177886

ABSTRACT

Cardio/cerebrovascular diseases (CVD) have become one of the major health issues in our societies. Recent studies show that the existing clinical tests to detect CVD are ineffectual, as they do not consider different stages of platelet activation or the molecular dynamics involved in platelet interactions. Further, they are also incapable of accounting for inter-individual variability. A physical description of platelet deposition was introduced recently in Chopard et al. (2017), by integrating fundamental understandings of how platelets interact in a numerical model, parameterized by five parameters. These parameters specify the deposition process and are relevant for a biomedical understanding of the phenomena. One of the main intuitions is that these parameters are precisely the information needed for a pathological test identifying CVD and that they capture the inter-individual variability. Following this intuition, here we devise a Bayesian inferential scheme for estimation of these parameters, using experimental observations, at different time intervals, on the average size of the aggregation clusters, their number per mm², the number of platelets, and the ones activated per µℓ still in suspension. As the likelihood function of the numerical model is intractable due to the complex stochastic nature of the model, we use a likelihood-free inference scheme, approximate Bayesian computation (ABC), to calibrate the parameters in a data-driven manner. As ABC requires the generation of many pseudo-data by expensive simulation runs, we use a high performance computing (HPC) framework for ABC to make the inference possible for this model. We consider a collective dataset of seven volunteers and use this inference scheme to get an approximate posterior distribution and the Bayes estimate of these five parameters. The mean posterior prediction of the platelet deposition pattern matches the experimental dataset closely with a tight posterior prediction error margin, justifying our main intuition and providing a methodology to infer these parameters given patient data. The present approach can be used to build a new generation of personalized platelet functionality tests for CVD detection, using numerical modeling of platelet deposition, Bayesian uncertainty quantification, and high performance computing.

8.
Proc Math Phys Eng Sci ; 474(2215): 20180129, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30100809

ABSTRACT

Infectious diseases are studied to understand their spreading mechanisms, to evaluate control strategies and to predict the risk and course of future outbreaks. Because people only interact with a few other individuals, and the structure of these interactions influences spreading processes, the pairwise relationships between individuals can be usefully represented by a network. Although the underlying transmission processes are different, the network approach can be used to study the spread of pathogens in a contact network or the spread of rumours in a social network. We study simulated simple and complex epidemics on synthetic networks and on two empirical networks, a social/contact network in an Indian village and an online social network. Our goal is to learn simultaneously the spreading process parameters and the first infected node, given a fixed network structure and the observed state of nodes at several time points. Our inference scheme is based on approximate Bayesian computation, a likelihood-free inference technique. Our method is agnostic about the network topology and the spreading process. It generally performs well and, somewhat counter-intuitively, the inference problem appears to be easier on more heterogeneous network topologies, which enhances its future applicability to real-world settings where few networks have homogeneous topologies.

9.
Stat Comput ; 28(2): 411-425, 2018.
Article in English | MEDLINE | ID: mdl-31997856

ABSTRACT

Increasingly complex generative models are being used across disciplines as they allow for realistic characterization of data, but a common difficulty with them is the prohibitively large computational cost to evaluate the likelihood function and thus to perform likelihood-based statistical inference. A likelihood-free inference framework has emerged where the parameters are identified by finding values that yield simulated data resembling the observed data. While widely applicable, a major difficulty in this framework is how to measure the discrepancy between the simulated and observed data. Transforming the original problem into a problem of classifying the data into simulated versus observed, we find that classification accuracy can be used to assess the discrepancy. The complete arsenal of classification methods becomes thereby available for inference of intractable generative models. We validate our approach using theory and simulations for both point estimation and Bayesian inference, and demonstrate its use on real data by inferring an individual-based epidemiological model for bacterial infections in child care centers.
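
The idea of using classification accuracy as a discrepancy can be sketched as follows. This is a simplified illustration rather than the paper's implementation: it assumes one-dimensional data, a nearest-class-mean classifier, and a single train/test split, and the function name `classification_discrepancy` is hypothetical.

```python
import random
import statistics


def classification_discrepancy(observed, simulated, rng):
    """Held-out accuracy of a nearest-class-mean classifier that tries to
    tell observed (label 0) from simulated (label 1) data points.

    Accuracy near 0.5 suggests the samples are hard to distinguish;
    accuracy near 1.0 suggests they differ clearly.
    """
    labelled = [(x, 0) for x in observed] + [(x, 1) for x in simulated]
    rng.shuffle(labelled)
    half = len(labelled) // 2
    train, test = labelled[:half], labelled[half:]
    mean0 = statistics.fmean([x for x, y in train if y == 0])
    mean1 = statistics.fmean([x for x, y in train if y == 1])
    correct = sum(
        (0 if abs(x - mean0) <= abs(x - mean1) else 1) == y for x, y in test
    )
    return correct / len(test)


# Synthetic illustration: "observed" data come from N(0, 1).
rng = random.Random(1)
observed = [rng.gauss(0.0, 1.0) for _ in range(500)]
simulated_close = [rng.gauss(0.0, 1.0) for _ in range(500)]  # same distribution
simulated_far = [rng.gauss(3.0, 1.0) for _ in range(500)]    # shifted mean

acc_close = classification_discrepancy(observed, simulated_close, rng)
acc_far = classification_discrepancy(observed, simulated_far, rng)
```

Parameter values whose simulated data drive the accuracy towards chance level are the ones a likelihood-free procedure of this kind would favour.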

10.
Syst Biol ; 66(1): e66-e82, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28175922

ABSTRACT

Bayesian inference plays an important role in phylogenetics, evolutionary biology, and in many other branches of science. It provides a principled framework for dealing with uncertainty and quantifying how it changes in the light of new evidence. For many complex models and inference problems, however, only approximate quantitative answers are obtainable. Approximate Bayesian computation (ABC) refers to a family of algorithms for approximate inference that makes a minimal set of assumptions by only requiring that sampling from a model is possible. We explain here the fundamentals of ABC, review the classical algorithms, and highlight recent developments. [ABC; approximate Bayesian computation; Bayesian inference; likelihood-free inference; phylogenetics; simulator-based models; stochastic simulation models; tree-based models.]


Subjects
Classification , Models, Biological , Phylogeny , Algorithms , Bayes Theorem
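
The basic requirement described in this abstract, that ABC only needs the ability to sample from the model, can be illustrated with a rejection sampler. The sketch below is a toy example under stated assumptions: the model is a Gaussian with unknown mean and unit variance, the prior is uniform on [-5, 5], the summary statistic is the sample mean, and the tolerance `epsilon` is chosen arbitrarily. The function name `rejection_abc` is hypothetical.

```python
import random
import statistics


def rejection_abc(observed, simulate, prior_sample, distance, epsilon, n_draws, rng):
    """Keep prior draws whose simulated data lie within epsilon of the observed data."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)            # draw a candidate parameter
        simulated = simulate(theta, rng)     # run the simulator, no likelihood needed
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
true_mean = 2.0  # used only to generate the synthetic "observed" data
observed = [rng.gauss(true_mean, 1.0) for _ in range(50)]

posterior_draws = rejection_abc(
    observed=observed,
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    distance=lambda a, b: abs(statistics.fmean(a) - statistics.fmean(b)),
    epsilon=0.1,
    n_draws=5000,
    rng=rng,
)
```

The accepted values in `posterior_draws` approximate the posterior of the mean; shrinking `epsilon` tightens the approximation at the cost of fewer accepted draws.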
11.
Bioinformatics ; 32(9): 1388-94, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26740526

ABSTRACT

MOTIVATION: Public and private repositories of experimental data are growing to sizes that require dedicated methods for finding relevant data. To improve on the state of the art of keyword searches from annotations, methods for content-based retrieval have been proposed. In the context of gene expression experiments, most methods retrieve gene expression profiles, requiring each experiment to be expressed as a single profile, typically of case versus control. A more general, recently suggested alternative is to retrieve experiments whose models are good for modelling the query dataset. However, for very noisy and high-dimensional query data, this retrieval criterion turns out to be very noisy as well. RESULTS: We propose doing retrieval using a denoised model of the query dataset, instead of the original noisy dataset itself. To this end, we introduce a general probabilistic framework, where each experiment is modelled separately and the retrieval is done by finding related models. For retrieval of gene expression experiments, we use a probabilistic model called product partition model, which induces a clustering of genes that show similar expression patterns across a number of samples. The suggested metric for retrieval using clusterings is the normalized information distance. Empirical results finally suggest that inference for the full probabilistic model can be approximated with good performance using computationally faster heuristic clustering approaches (e.g. k-means). The method is highly scalable and straightforward to apply to construct a general-purpose gene expression experiment retrieval method. AVAILABILITY AND IMPLEMENTATION: The method can be implemented using standard clustering algorithms and normalized information distance, available in many statistical software packages. CONTACT: paul.blomstedt@aalto.fi or samuel.kaski@aalto.fi SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Expression , Models, Genetic , Algorithms , Cluster Analysis , Gene Expression Profiling
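
The normalized information distance between two clusterings, mentioned in this abstract as the retrieval metric, can be sketched as below. The sketch assumes the common definition NID = 1 - I(U; V) / max(H(U), H(V)), with cluster labels given as equal-length lists; whether the paper uses exactly this normalization is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same items."""
    n = len(a)
    count_a, count_b, joint = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_information_distance(a, b):
    """1 - I(a; b) / max(H(a), H(b)); defined as 0 when both entropies are 0."""
    denom = max(entropy(a), entropy(b))
    if denom == 0.0:
        return 0.0
    return 1.0 - mutual_information(a, b) / denom


# Same partition with different label names, and two independent partitions.
nid_same = normalized_information_distance([0, 0, 1, 1, 2, 2], [5, 5, 7, 7, 9, 9])
nid_indep = normalized_information_distance([0, 1, 0, 1], [0, 0, 1, 1])
```

Because the distance depends only on how items are grouped, relabelling the clusters leaves it unchanged, which is what makes it usable for comparing clusterings from separately fitted models.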
12.
Biomed Eng Online ; 5: 65, 2006 Dec 18.
Article in English | MEDLINE | ID: mdl-17176476

ABSTRACT

Electronic nose based identification of ENT bacteria in a hospital environment is a classical and challenging classification problem. In this paper an electronic nose (e-nose), comprising a hybrid array of 12 tin oxide sensors (SnO2) and 6 conducting polymer sensors, has been used to identify three species of bacteria, Escherichia coli (E. coli), Staphylococcus aureus (S. aureus), and Pseudomonas aeruginosa (P. aeruginosa), responsible for ear, nose and throat (ENT) infections, when collected as swab samples from infected patients and kept in ISO agar solution in the hospital environment. In the next stage a sub-classification technique has been developed for the classification of two different types of S. aureus, namely Methicillin-Resistant S. aureus (MRSA) and Methicillin-Susceptible S. aureus (MSSA). An innovative Intelligent Bayes Classifier (IBC) based on "Bayes' theorem" and the "maximum probability rule" was developed and investigated for these three main groups of ENT bacteria. Along with the IBC, three other supervised classifiers (namely, Multilayer Perceptron (MLP), Probabilistic Neural Network (PNN), and Radial Basis Function Network (RBFN)) were used to classify the three main bacteria classes. A comparative evaluation of the classifiers was conducted for this application. IBC outperformed MLP, PNN and RBFN. The best results suggest that we are able to identify and classify the three main bacteria classes with up to 100% accuracy using IBC. We have also achieved 100% classification accuracy for the classification of MRSA and MSSA samples with IBC. We conclude that an IBC based e-nose can provide a very robust and rapid solution for the identification of ENT infections in a hospital environment.


Subjects
Bacteria/classification , Environmental Monitoring/instrumentation , Laryngeal Diseases/microbiology , Nasopharyngeal Diseases/microbiology , Otitis/microbiology , Algorithms , Bayes Theorem , Electronics, Medical , Escherichia coli/isolation & purification , Hospitals , Humans , Neural Networks, Computer , Pseudomonas aeruginosa/isolation & purification , Reproducibility of Results , Staphylococcus aureus/isolation & purification
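
A classifier built on Bayes' theorem and a maximum probability rule can be sketched generically as follows. This is not the paper's Intelligent Bayes Classifier, whose details are not given in the abstract: the sketch assumes independent Gaussian features per class (a Gaussian naive Bayes model), and the class labels and sensor readings are made-up values for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_bayes(samples, labels):
    """Estimate class priors and per-feature Gaussian means/variances."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        n_features = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
        variances = [
            sum((r[j] - means[j]) ** 2 for r in rows) / len(rows) + 1e-6
            for j in range(n_features)
        ]  # small constant avoids zero variance
        model[label] = (len(rows) / len(samples), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest posterior probability (maximum probability rule)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood  # Bayes' theorem up to a constant
    return max(model, key=log_posterior)


# Made-up two-sensor readings for three hypothetical classes "A", "B", "C".
samples = [
    [1.0, 0.2], [1.1, 0.3], [0.9, 0.1],
    [0.2, 1.0], [0.3, 1.1], [0.1, 0.9],
    [2.0, 2.0], [2.1, 1.9], [1.9, 2.1],
]
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
model = fit_gaussian_bayes(samples, labels)
prediction = predict(model, [1.0, 0.25])
```

Working with log-probabilities avoids numerical underflow when many sensor channels are combined, which would matter for an 18-sensor array like the one described.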