Results 1 - 9 of 9
1.
Syst Biol ; 67(4): 735-740, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29514307

ABSTRACT

PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyses and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.


Subject(s)
Evolution, Molecular , Phylogeny , Software , Bayes Theorem , Hybridization, Genetic , Sequence Alignment
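
The abstract above mentions that networks are written in the extended Newick format. As a minimal sketch of what that looks like: in extended Newick, a reticulation (hybrid) node carries a label such as `#H1` and appears once per parent. The network string and the helper name `reticulation_labels` below are illustrative assumptions, not PhyloNet output or part of its API.

```python
import re


def reticulation_labels(enewick: str) -> set[str]:
    """Return the distinct hybrid-node tags (e.g. 'H1') in an extended Newick string.

    Assumes tags follow the common '#H<n>', '#LGT<n>' or '#R<n>' pattern.
    """
    return set(re.findall(r"#((?:H|LGT|R)\d+)", enewick))


# Hypothetical network: taxon B descends from a hybrid node H1 that has
# one parent on the A side and one parent on the C side.
network = "((A,(B)#H1),(#H1,C));"
print(sorted(reticulation_labels(network)))  # ['H1']
```

The tag `#H1` occurs twice in the string but names a single node, which is why the helper collects tags into a set.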
2.
PLoS Comput Biol ; 14(1): e1005932, 2018 Jan.
Article in English | MEDLINE | ID: mdl-29320496

ABSTRACT

Phylogenetic networks are rooted, directed, acyclic graphs that model reticulate evolutionary histories. Recently, statistical methods were devised for inferring such networks from either gene tree estimates or the sequence alignments of multiple unlinked loci. Bi-allelic markers, most notably single nucleotide polymorphisms (SNPs) and amplified fragment length polymorphisms (AFLPs), provide a powerful source of genome-wide data. In a recent paper, a method called SNAPP was introduced for statistical inference of species trees from unlinked bi-allelic markers. The generative process assumed by the method combined both a model of evolution for the bi-allelic markers, as well as the multispecies coalescent. A novel component of the method was a polynomial-time algorithm for exact computation of the likelihood of a fixed species tree via integration over all possible gene trees for a given marker. Here we report on a method for Bayesian inference of phylogenetic networks from bi-allelic markers. Our method significantly extends the algorithm for exact computation of phylogenetic network likelihood via integration over all possible gene trees. Unlike the case of species trees, the algorithm is no longer polynomial-time on all instances of phylogenetic networks. Furthermore, the method utilizes a reversible-jump MCMC technique to sample the posterior of phylogenetic networks given bi-allelic marker data. Our method has a very good performance in terms of accuracy and robustness as we demonstrate on simulated data, as well as a data set of multiple New Zealand species of the plant genus Ourisia (Plantaginaceae). We implemented the method in the publicly available, open-source PhyloNet software package.


Subject(s)
Genes, Plant , Genetic Markers , Phylogeny , Plantaginaceae/genetics , Algorithms , Alleles , Bayes Theorem , Computational Biology , Computer Simulation , Likelihood Functions , Models, Genetic , New Zealand , Nucleic Acid Hybridization , Plantaginaceae/physiology , Polymorphism, Single Nucleotide , Probability , Recombination, Genetic , Software
3.
Syst Biol ; 67(3): 439-457, 2018 May 01.
Article in English | MEDLINE | ID: mdl-29088409

ABSTRACT

The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of "intermixture," we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.


Subject(s)
Models, Genetic , Phylogeny , Bayes Theorem , Computer Simulation , Gene Flow , Genetic Speciation , Saccharomyces cerevisiae/classification , Saccharomyces cerevisiae/genetics
4.
PLoS Genet ; 13(2): e1006598, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28178269

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pgen.1006006.].

5.
PLoS Genet ; 12(5): e1006006, 2016 May.
Article in English | MEDLINE | ID: mdl-27144273

ABSTRACT

The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.


Subject(s)
Evolution, Molecular , Models, Genetic , Phylogeny , Bayes Theorem , Computational Biology/methods , Gene Transfer, Horizontal , Genome , Markov Chains
6.
Mol Ecol ; 25(11): 2361-72, 2016 Jun.
Article in English | MEDLINE | ID: mdl-26808290

ABSTRACT

The role of hybridization and subsequent introgression has been demonstrated in an increasing number of species. Recently, Fontaine et al. (Science, 347, 2015, 1258524) conducted a phylogenomic analysis of six members of the Anopheles gambiae species complex. Their analysis revealed a reticulate evolutionary history and pointed to extensive introgression on all four autosomal arms. The study further highlighted the complex evolutionary signals that the co-occurrence of incomplete lineage sorting (ILS) and introgression can give rise to in phylogenomic analyses. While tree-based methodologies were used in the study, phylogenetic networks provide a more natural model to capture reticulate evolutionary histories. In this work, we reanalyse the Anopheles data using a recently devised framework that combines the multispecies coalescent with phylogenetic networks. This framework allows us to capture ILS and introgression simultaneously, and forms the basis for statistical methods for inferring reticulate evolutionary histories. The new analysis reveals a phylogenetic network with multiple hybridization events, some of which differ from those reported in the original study. To elucidate the extent and patterns of introgression across the genome, we devise a new method that quantifies the use of reticulation branches in the phylogenetic network by each genomic region. Applying the method to the mosquito data set reveals the evolutionary history of all the chromosomes. This study highlights the utility of 'network thinking' and the new insights it can uncover, in particular in phylogenomic analyses of large data sets with extensive gene tree incongruence.


Subject(s)
Culicidae/genetics , Evolution, Molecular , Hybridization, Genetic , Animals , Genome, Insect , Models, Genetic , Phylogeny
8.
PLoS One ; 8(9): e75649, 2013.
Article in English | MEDLINE | ID: mdl-24228087

ABSTRACT

Engineered TAL-effector nucleases (TALENs) and TALE-based constructs have become powerful tools for eukaryotic genome editing. Although many methods have been reported, it remains a challenge to assemble designer-based TALE repeats in a fast, precise and cost-effective manner. We present an ULtiMATE (USER-based Ligation Mediated Assembly of TAL Effector) system for speedy and accurate assembly of customized TALE constructs. This method takes advantage of uracil-specific excision reagent (USER) to create multiple distinct sticky ends between any neighboring DNA fragments for specific ligation. With pre-assembled templates, multiple TALE DNA-binding domains could be efficiently assembled in order within hours with minimal manual operation. This system has been demonstrated to produce both functional TALENs for effective gene knockout and TALE-mediated gene-specific transcription activation (TALE-TA). The ease of operation and high efficiency of the ULtiMATE system make it not only an ideal method for biology labs, but also an approach well suited for large-scale assembly of TALENs and any other TALE-based constructs.


Subject(s)
Endonucleases/genetics , Endonucleases/metabolism , Trans-Activators/genetics , Trans-Activators/metabolism , Base Sequence , Binding Sites , Cell Line , Genetic Vectors/genetics , Humans , Molecular Sequence Data , Protein Binding , Protein Engineering
9.
PLoS One ; 8(2): e57482, 2013.
Article in English | MEDLINE | ID: mdl-23468999

ABSTRACT

The concept of microbial consortia is highly attractive in synthetic biology. Despite all its benefits, however, problems remain for large-scale multicellular gene circuits, for example, how to reliably design and distribute the circuits in microbial consortia with a limited number of well-behaved genetic modules and wiring quorum-sensing molecules. To address this problem, here we propose a formalized design process: (i) determine the basic logic units (AND, OR and NOT gates) based on mathematical and biological considerations; (ii) establish rules to search and distribute the simplest logic design; (iii) assemble assigned basic logic units in each logic operating cell; and (iv) fine-tune the circuiting interface between logic operators. We analyzed in silico gene circuits with inputs ranging from two to four, comparing our method with the pre-existing ones. Results showed that this formalized design process is more feasible concerning numbers of cells required. Furthermore, as a proof of principle, an Escherichia coli consortium that performs the XOR function, a typical complex computing operation, was designed. The construction and characterization of logic operators is independent of "wiring" and provides predictive information for fine-tuning. This formalized design process provides guidance for the design of microbial consortia that perform distributed biological computation.


Subject(s)
Bacteria/metabolism , Bacteria/genetics , Synthetic Biology
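
The abstract above names AND, OR and NOT gates as the basic logic units and XOR as the demonstrated function. As a purely Boolean illustration, XOR can be composed from those three units as follows. This is one standard decomposition and is only a sketch: it is not claimed to be the gate layout or cell assignment used in the paper, and the function names are hypothetical.

```python
def gate_and(a: bool, b: bool) -> bool:
    return a and b


def gate_or(a: bool, b: bool) -> bool:
    return a or b


def gate_not(a: bool) -> bool:
    return not a


def xor(a: bool, b: bool) -> bool:
    """XOR built only from the three basic units: (a OR b) AND NOT (a AND b)."""
    return gate_and(gate_or(a, b), gate_not(gate_and(a, b)))


# Print the full truth table for the composed gate.
for a in (False, True):
    for b in (False, True):
        print(a, b, xor(a, b))
```

This decomposition uses four basic gates in total. A distributed design would additionally have to decide which cell hosts which gate, a step this sketch does not model.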