Results 1 - 16 of 16
1.
Mol Biol Evol ; 35(4): 984-1002, 2018 Apr 01.
Article in English | MEDLINE | ID: mdl-29149300

ABSTRACT

Most phylogenetic models assume that the evolutionary process is stationary and reversible. In addition to being biologically improbable, these assumptions also impair inference by generating models under which the likelihood does not depend on the position of the root. Consequently, the root of the tree cannot be inferred as part of the analysis. Yet identifying the root position is a key component of phylogenetic inference because it provides a point of reference for polarizing ancestor-descendant relationships and therefore interpreting the tree. In this paper, we investigate the effect of relaxing the unrealistic reversibility assumption and allowing the position of the root to be another unknown. We propose two hierarchical models that are centered on a reversible model but perturbed to allow nonreversibility. The models differ in the degree of structure imposed on the perturbations. The analysis is performed in the Bayesian framework using Markov chain Monte Carlo methods for which software is provided. We illustrate the performance of the two nonreversible models in analyses of simulated data using two types of topological priors. We then apply the models to a real biological data set, the radiation of polyploid yeasts, for which there is robust biological opinion about the root position. Finally, we apply the models to a second biological alignment for which the rooted tree is controversial: the ribosomal tree of life. We compare the two nonreversible models and conclude that both are useful in inferring the position of the root from real biological data.


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Markov Chains , Monte Carlo Method , Ribosomes , Saccharomyces cerevisiae
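
The reversibility assumption discussed in the abstract above can be illustrated with a small numerical check. The following is a minimal sketch, not the authors' models or software: it builds a reversible rate matrix from a composition vector and symmetric exchangeabilities, applies an arbitrary asymmetric log-scale perturbation, and tests reversibility with Kolmogorov's cycle criterion (for strictly positive off-diagonal rates, checking all three-state cycles is sufficient). The function names, the form of the perturbation, and all numerical values are assumptions chosen for illustration.

```python
import math
from itertools import permutations


def reversible_rate_matrix(pi, rho):
    """Build Q with Q[i][j] = rho[i][j] * pi[j] for i != j (rho symmetric)."""
    n = len(pi)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = rho[i][j] * pi[j]
        # Diagonal is set so that each row sums to zero.
        q[i][i] = -sum(q[i])
    return q


def perturb(q, eps):
    """Multiply each off-diagonal rate by exp(eps[i][j]) and reset the diagonal."""
    n = len(q)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = q[i][j] * math.exp(eps[i][j])
        out[i][i] = -sum(out[i])
    return out


def satisfies_kolmogorov(q, tol=1e-12):
    """Check q_ij q_jk q_ki == q_ik q_kj q_ji for every ordered triple of states."""
    n = len(q)
    for i, j, k in permutations(range(n), 3):
        forward = q[i][j] * q[j][k] * q[k][i]
        backward = q[i][k] * q[k][j] * q[j][i]
        if abs(forward - backward) > tol * max(forward, backward, 1.0):
            return False
    return True


if __name__ == "__main__":
    pi = [0.1, 0.2, 0.3, 0.4]                       # illustrative composition
    rho = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
    q_rev = reversible_rate_matrix(pi, rho)
    # Illustrative asymmetric perturbation: only rates with i < j are scaled.
    eps = [[0.3 if i < j else 0.0 for j in range(4)] for i in range(4)]
    q_nonrev = perturb(q_rev, eps)
    print(satisfies_kolmogorov(q_rev), satisfies_kolmogorov(q_nonrev))
```

Setting every entry of `eps` to zero recovers the reversible matrix, which mirrors the idea of a nonreversible model centered on a reversible one.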
2.
Syst Biol ; 67(2): 320-327, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29029295

ABSTRACT

Most existing measures of distance between phylogenetic trees are based on the geometry or topology of the trees. Instead, we consider distance measures which are based on the underlying probability distributions on genetic sequence data induced by trees. Monte Carlo schemes are necessary to calculate these distances approximately, and we describe efficient sampling procedures. Key features of the distances are the ability to include substitution model parameters and to handle trees with different taxon sets in a principled way. We demonstrate some of the properties of these new distance measures and compare them to existing distances, in particular by applying multidimensional scaling to data sets previously reported as containing phylogenetic islands. [Metric; probability distribution; multidimensional scaling; information geometry.]


Subject(s)
Classification/methods , Models, Genetic , Phylogeny , Monte Carlo Method , Probability
3.
Biol Lett ; 13(11), 2017 Nov.
Article in English | MEDLINE | ID: mdl-29118237

ABSTRACT

Antlers function as primary weapons during fights for many species of ungulate. We examined the association between antler damage and (i) contest dynamics: the behavioural tactics used during fighting including fight duration, and (ii) mating success, fighting rate and dominance. Structural damage of the antlers was associated with contest dynamics: damage was negatively associated with jump clash attacks by individuals with damaged antlers, whereas opponents were more likely to physically displace individuals with damaged antlers during fighting. We found a positive association between dominance and damage indicating that high-ranking individuals were likely to have breaks to their antlers. We found no evidence that damage was associated with either mating success or the number of fights individuals engaged in. Our study provides a new perspective on understanding the association between contest dynamics and weapon structure, while also showing that damage has limited fitness consequences for individuals.


Subject(s)
Aggression , Antlers/injuries , Behavior, Animal/physiology , Deer/physiology , Animals , Deer/psychology , Female , Male , Sexual Behavior, Animal , Social Dominance
4.
Philos Trans R Soc Lond B Biol Sci ; 370(1678): 20140336, 2015 Sep 26.
Article in English | MEDLINE | ID: mdl-26323766

ABSTRACT

The root of a phylogenetic tree is fundamental to its biological interpretation, but standard substitution models do not provide any information on its position. Here, we describe two recently developed models that relax the usual assumptions of stationarity and reversibility, thereby facilitating root inference without the need for an outgroup. We compare the performance of these models on a classic test case for phylogenetic methods, before considering two highly topical questions in evolutionary biology: the deep structure of the tree of life and the root of the archaeal radiation. We show that all three alignments contain meaningful rooting information that can be harnessed by these new models, thus complementing and extending previous work based on outgroup rooting. In particular, our analyses exclude the root of the tree of life from the eukaryotes or Archaea, placing it on the bacterial stem or within the Bacteria. They also exclude the root of the archaeal radiation from several major clades, consistent with analyses using other rooting methods. Overall, our results demonstrate the utility of non-reversible and non-stationary models for rooting phylogenetic trees, and identify areas where further progress can be made.


Subject(s)
Computer Simulation , Models, Genetic , Phylogeny , Archaea/genetics , Bacteria/genetics , Genetic Variation
5.
Stat Appl Genet Mol Biol ; 13(5): 589-609, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25153609

ABSTRACT

In molecular phylogenetics, standard models of sequence evolution generally assume that sequence composition remains constant over evolutionary time. However, this assumption is violated in many datasets which show substantial heterogeneity in sequence composition across taxa. We propose a model which allows compositional heterogeneity across branches, and formulate the model in a Bayesian framework. Specifically, the root and each branch of the tree is associated with its own composition vector whilst a global matrix of exchangeability parameters applies everywhere on the tree. We encourage borrowing of strength between branches by developing two possible priors for the composition vectors: one in which information can be exchanged equally amongst all branches of the tree and another in which more information is exchanged between neighbouring branches than between distant branches. We also propose a Markov chain Monte Carlo (MCMC) algorithm for posterior inference which uses data augmentation of substitutional histories to yield a simple complete data likelihood function that factorises over branches and allows Gibbs updates for most parameters. Standard phylogenetic models are not informative about the root position. Therefore a significant advantage of the proposed model is that it allows inference about rooted trees. The position of the root is fundamental to the biological interpretation of trees, both for polarising trait evolution and for establishing the order of divergence among lineages. Furthermore, unlike some other related models from the literature, inference in the model we propose can be carried out through a simple MCMC scheme which does not require problematic dimension-changing moves. We investigate the performance of the model and priors in analyses of two alignments for which there is strong biological opinion about the tree topology and root position.


Subject(s)
Bayes Theorem , Phylogeny , Markov Chains , Monte Carlo Method
6.
Stat Appl Genet Mol Biol ; 13(5): 531-51, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25153608

ABSTRACT

In this paper we develop a Bayesian statistical inference approach to the unified analysis of isobaric labelled MS/MS proteomic data across multiple experiments. An explicit probabilistic model of the log-intensity of the isobaric labels' reporter ions across multiple pre-defined groups and experiments is developed. This is then used to develop a full Bayesian statistical methodology for the identification of differentially expressed proteins, with respect to a control group, across multiple groups and experiments. This methodology is implemented and then evaluated on simulated data and on two model experimental datasets (for which the differentially expressed proteins are known) that use a TMT labelling protocol.


Subject(s)
Bayes Theorem , Proteins/chemistry , Tandem Mass Spectrometry/methods , Models, Theoretical , Proteomics
7.
Phys Rev E Stat Nonlin Soft Matter Phys ; 86(1 Pt 2): 016105, 2012 Jul.
Article in English | MEDLINE | ID: mdl-23005489

ABSTRACT

We consider a wave-front model for the spread of neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from southern and western Europe. Our wave-front model allows for both an isotropic background spread (incorporating the effects of local geography) and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wave front, and use Gaussian process emulators to further increase the efficiency of our model, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and discuss the inferred distribution of the parameter specifying this uncertainty, along with the distributions of the parameters of our wave-front model. We subsequently use predictive distributions, taking account of parameter uncertainty, to identify radiocarbon sites which do not agree well with our model. These sites may warrant further archaeological study or motivate refinements to the model.


Subject(s)
Bayes Theorem , Human Migration/statistics & numerical data , Models, Statistical , Population Dynamics , Computer Simulation , Europe
8.
BMC Res Notes ; 3: 81, 2010 Mar 19.
Article in English | MEDLINE | ID: mdl-20302631

ABSTRACT

BACKGROUND: Large scale microarray experiments are becoming increasingly routine, particularly those which track a number of different cell lines through time. This time-course information provides valuable insight into the dynamic mechanisms underlying the biological processes being observed. However, proper statistical analysis of time-course data requires the use of more sophisticated tools and complex statistical models. FINDINGS: Using the open source CRAN and Bioconductor repositories for R, we provide example analysis and protocol which illustrate a variety of methods that can be used to analyse time-course microarray data. In particular, we highlight how to construct appropriate contrasts to detect differentially expressed genes and how to generate plausible pathways from the data. A maintained version of the R commands can be found at http://www.mas.ncl.ac.uk/~ncsg3/microarray/. CONCLUSIONS: CRAN and Bioconductor are stable repositories that provide a wide variety of appropriate statistical tools to analyse time course microarray data.

9.
Brief Bioinform ; 11(3): 278-89, 2010 May.
Article in English | MEDLINE | ID: mdl-20056731

ABSTRACT

Dynamic simulation modelling of complex biological processes forms the backbone of systems biology. Discrete stochastic models are particularly appropriate for describing sub-cellular molecular interactions, especially when critical molecular species are thought to be present at low copy-numbers. For example, these stochastic effects play an important role in models of human ageing, where ageing results from the long-term accumulation of random damage at various biological scales. Unfortunately, realistic stochastic simulation of discrete biological processes is highly computationally intensive, requiring specialist hardware, and can benefit greatly from parallel and distributed approaches to computation and analysis. For these reasons, we have developed the BASIS system for the simulation and storage of stochastic SBML models together with associated simulation results. This system is exposed as a set of web services to allow users to incorporate its simulation tools into their workflows. Parameter inference for stochastic models is also difficult and computationally expensive. The CaliBayes system provides a set of web services (together with an R package for consuming these and formatting data) which addresses this problem for SBML models. It uses a sequential Bayesian MCMC method, which is powerful and flexible, providing very rich information. However this approach is exceptionally computationally intensive and requires the use of a carefully designed architecture. Again, these tools are exposed as web services to allow users to take advantage of this system. In this article, we describe these two systems and demonstrate their integrated use with an example workflow to estimate the parameters of a simple model of Saccharomyces cerevisiae growth on agar plates.


Subject(s)
Algorithms , Computer Simulation , Models, Biological , Programming Languages , Software , Biology/methods , Software Design , Systems Integration
10.
J Math Biol ; 55(2): 223-47, 2007 Aug.
Article in English | MEDLINE | ID: mdl-17361423

ABSTRACT

Stochastic compartmental models of the SEIR type are often used to make inferences on epidemic processes from partially observed data in which only removal times are available. For many epidemics, the assumption of constant removal rates is not plausible. We develop methods for models in which these rates are a time-dependent step function. A reversible jump MCMC algorithm is described that permits Bayesian inferences to be made on model parameters, particularly those associated with the step function. The method is applied to two datasets on outbreaks of smallpox and a respiratory disease. The analyses highlight the importance of allowing for time dependence by contrasting the predictive distributions for the removal times and comparing them with the observed data.


Subject(s)
Bayes Theorem , Epidemiology , Models, Biological , Respiratory Tract Infections/epidemiology , Smallpox/epidemiology , Stochastic Processes , Algorithms , Basic Reproduction Number , Computer Simulation , Humans , Likelihood Functions , Markov Chains , Monte Carlo Method
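
The model class described in the abstract above, a stochastic SEIR epidemic whose removal rate is a step function of time, can be sketched as a forward simulation. This is a hedged illustration only: it does not implement the reversible jump MCMC inference from the paper, and the function names, parameter names (`beta`, `sigma`, `gammas`, `change_points`) and the frequency-dependent infection rate are assumptions. Because the removal rate is piecewise constant, the sketch restarts the exponential clock at each change point, which is valid by the memoryless property. All rates are assumed strictly positive.

```python
import bisect
import random


def step_rate(t, change_points, values):
    """Piecewise-constant rate; needs len(values) == len(change_points) + 1."""
    return values[bisect.bisect_right(change_points, t)]


def simulate_seir(s, e, i, beta, sigma, change_points, gammas, rng):
    """Gillespie-style simulation; returns removal times and final (S, E, I, R)."""
    n = s + e + i
    t, r = 0.0, 0
    removal_times = []
    while e + i > 0:
        gamma = step_rate(t, change_points, gammas)
        # Event rates: infection (S->E), end of latency (E->I), removal (I->R).
        rates = (beta * s * i / n, sigma * e, gamma * i)
        total = sum(rates)
        t_next = t + rng.expovariate(total)
        k = bisect.bisect_right(change_points, t)
        if k < len(change_points) and t_next >= change_points[k]:
            # The removal rate changes before the proposed event: jump to the
            # change point and redraw with the new rate.
            t = change_points[k]
            continue
        t = t_next
        u = rng.random() * total
        if u < rates[0]:
            s, e = s - 1, e + 1
        elif u < rates[0] + rates[1]:
            e, i = e - 1, i + 1
        else:
            i, r = i - 1, r + 1
            removal_times.append(t)
    return removal_times, (s, e, i, r)


if __name__ == "__main__":
    rng = random.Random(1)
    times, final = simulate_seir(
        s=50, e=0, i=1, beta=1.5, sigma=0.5,
        change_points=[10.0], gammas=[0.2, 1.0], rng=rng,
    )
    print(len(times), final)
```

Only `removal_times` would correspond to the partially observed data mentioned in the abstract; the infection and latency events are simulated but not recorded.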
11.
Bioinformatics ; 22(5): 628-9, 2006 Mar 01.
Article in English | MEDLINE | ID: mdl-16410323

ABSTRACT

MOTIVATION: SBML is quickly becoming the standard format to exchange biochemical models. The tools presented in this paper are loosely-coupled, and are intended to be incorporated into SBML aware applications. The rationale for this is to reduce the amount of repeated work carried out within the community and to create tools that offer a greater number of features to the end-user. AVAILABILITY: All tools described are available from http://www.basis.ncl.ac.uk/software and are licensed under GNU General Public License.


Subject(s)
Database Management Systems , Databases, Factual , Models, Biological , Software , Systems Biology/methods , User-Computer Interface , Biochemistry/methods , Computer Simulation
12.
Mech Ageing Dev ; 126(1): 119-31, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15610770

ABSTRACT

Many molecular chaperones are also known as heat shock proteins because they are synthesised in increased amounts after brief exposure of cells to elevated temperatures. They have many cellular functions and are involved in the folding of nascent proteins, the re-folding of denatured proteins, the prevention of protein aggregation, and assisting the targeting of proteins for degradation by the proteasome and lysosomes. They also have a role in apoptosis and are involved in modulating signals for immune and inflammatory responses. Stress-induced transcription of heat shock proteins requires the activation of heat shock factor (HSF). Under normal conditions, HSF is bound to heat shock proteins resulting in feedback repression. During stress, cellular proteins undergo denaturation and sequester heat shock proteins bound to HSF, which is then able to become transcriptionally active. The induction of heat shock proteins is impaired with age and there is also a decline in chaperone function. Aberrant/damaged proteins accumulate with age and are implicated in several important age-related conditions (e.g. Alzheimer's disease, Parkinson's disease, and cataract). Therefore, the balance between damaged proteins and available free chaperones may be greatly disturbed during ageing. We have developed a mathematical model to describe the heat shock system. The aim of the model is two-fold: to explore the heat shock system and its implications in ageing; and to demonstrate how to build a model of a biological system using our simulation system (biology of ageing e-science integration and simulation (BASIS)).


Subject(s)
Cellular Senescence/physiology , DNA-Binding Proteins/metabolism , Models, Biological , Molecular Chaperones/metabolism , Protein Folding , Animals , Heat Shock Transcription Factors , Heat-Shock Response/physiology , Humans , Mice , Rats , Transcription Factors
13.
Biometrics ; 60(3): 573-81; discussion 581-8, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15339274

ABSTRACT

Many deoxyribonucleic acid (DNA) sequences display compositional heterogeneity in the form of segments of similar structure. This article describes a Bayesian method that identifies such segments by using a Markov chain governed by a hidden Markov model. Markov chain Monte Carlo (MCMC) techniques are employed to compute all posterior quantities of interest and, in particular, allow inferences to be made regarding the number of segment types and the order of Markov dependence in the DNA sequence. The method is applied to the segmentation of the bacteriophage lambda genome, a common benchmark sequence used for the comparison of statistical segmentation algorithms.


Subject(s)
Bayes Theorem , Sequence Analysis, DNA/statistics & numerical data , Algorithms , Bacteriophage lambda/genetics , Biometry , DNA, Viral/genetics , Genome, Viral , Markov Chains , Models, Statistical , Monte Carlo Method
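
To make the idea of hidden Markov model segmentation in the abstract above concrete, here is a minimal sketch that decodes the most probable segment labels with the Viterbi algorithm. This is a deliberately simpler stand-in: the paper describes a Bayesian approach using MCMC that also infers the number of segment types and the order of Markov dependence, none of which is attempted here. The two state labels ("AT" and "GC"), all probabilities, and the example sequence are illustrative assumptions, and the sequence is assumed to be non-empty.

```python
import math


def viterbi(seq, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for a non-empty sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at t.
    best = [{s: log_start[s] + log_emit[s][seq[0]] for s in states}]
    back = []
    for sym in seq[1:]:
        prev = best[-1]
        cur, ptr = {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            cur[s] = prev[p] + log_trans[p][s] + log_emit[s][sym]
            ptr[s] = p
        best.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def log_table(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical two-segment-type model: AT-rich versus GC-rich composition.
STATES = ["AT", "GC"]
LOG_START = {"AT": math.log(0.5), "GC": math.log(0.5)}
LOG_TRANS = log_table({"AT": {"AT": 0.9, "GC": 0.1}, "GC": {"AT": 0.1, "GC": 0.9}})
LOG_EMIT = log_table({
    "AT": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
})

if __name__ == "__main__":
    print(viterbi("ATATATGCGCGCGC", STATES, LOG_START, LOG_TRANS, LOG_EMIT))
```

Working in log-probabilities avoids numerical underflow on long sequences such as whole genomes.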
14.
J Theor Biol ; 229(2): 189-96, 2004 Jul 21.
Article in English | MEDLINE | ID: mdl-15207474

ABSTRACT

Budding yeast, Saccharomyces cerevisiae, is commonly used as a system to study cellular ageing. Yeast mother cells are capable of only a limited number of divisions before they undergo senescence, whereas newly formed daughters usually have their replicative age "reset" to zero. Accumulation of extrachromosomal ribosomal DNA circles (ERCs) appears to be an important contributor to ageing in yeast, and we describe a mathematical model that we developed to examine this process. We show that an age-related accumulation of ERCs readily explains the observed features of yeast ageing but that in order to match the experimental survival curves quantitatively, it is necessary that the probability of ERC formation increases with the age of the cell. This implies that some other mechanism(s), in addition to ERC accumulation, must underlie yeast ageing. We also demonstrate that the model can be used to gain insight into how an extra copy of the Sir2 gene might extend lifespan and we show how the model makes novel, testable predictions about patterns of age-specific mortality in yeast populations.


Subject(s)
Models, Genetic , Saccharomyces cerevisiae/physiology , Aging/genetics , Cell Division , DNA, Circular , DNA, Ribosomal , Gene Duplication , Histone Deacetylases/genetics , Silent Information Regulator Proteins, Saccharomyces cerevisiae/genetics , Sirtuin 2 , Sirtuins/genetics
15.
Nucleic Acids Res ; 31(20): 6043-52, 2003 Oct 15.
Article in English | MEDLINE | ID: mdl-14530452

ABSTRACT

We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a 'molecular clock' to determine the rate of population and species divergence.


Subject(s)
Algorithms , DNA, Mitochondrial/genetics , Guanosine Triphosphate/genetics , Gene Frequency , Genetic Heterogeneity , Genetic Variation , Humans , Mutation , Probability , Time Factors
16.
Nat Rev Mol Cell Biol ; 4(3): 243-9, 2003 Mar.
Article in English | MEDLINE | ID: mdl-12612643

ABSTRACT

Ageing is a highly complex process; it involves interactions between numerous biochemical and cellular mechanisms that affect many tissues in an organism. Although work on the biology of ageing is now advancing quickly, this inherent complexity means that information remains highly fragmented. We describe how a new web-based modelling initiative is seeking to integrate data and hypotheses from diverse biological sources.


Subject(s)
Aging , Models, Biological , Animals , Cellular Senescence , Computer Simulation , Humans , Internet , Software , Time Factors