Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 20 de 33
Filter
Add more filters










Publication year range
1.
PLoS One ; 17(11): e0274448, 2022.
Article in English | MEDLINE | ID: mdl-36395273

ABSTRACT

Divide-and-conquer dividing by a half recurrences, of the form [Formula: see text] appear in many areas of applied mathematics, from the analysis of algorithms to the optimization of phylogenetic balance indices. These equations are usually "solved" by means of a Master Theorem that provides a bound for the growing order of xn, but not the solution's explicit expression. In this paper we give a finite explicit expression for this solution, in terms of the binary decomposition of n, when the independent term p(n) is a polynomial in ⌈n/2⌉ and ⌊n/2⌋. As an application, we obtain explicit formulas for several sequences of interest in phylogenetics, combinatorics, and computer science, for which no such formulas were known so far: for instance, for the Total Cophenetic index and the rooted Quartet index of the maximally balanced bifurcating phylogenetic trees with n leaves, and the sum of the bitwise AND operator applied to pairs of complementary numbers up to n.


Subject(s)
Algorithms , Records , Phylogeny
2.
J Comput Biol ; 28(12): 1181-1195, 2021 12.
Article in English | MEDLINE | ID: mdl-34714118

ABSTRACT

The Robinson-Foulds (RF) distance, one of the most widely used metrics for comparing phylogenetic trees, has the advantage of being intuitive, with a natural interpretation in terms of common splits, and it can be computed in linear time, but it has a very low resolution, and it may become trivial for phylogenetic trees with overlapping taxa, that is, phylogenetic trees that share some but not all of their leaf labels. In this article, we study the properties of the Generalized Robinson-Foulds (GRF) distance, a recently proposed metric for comparing any structures that can be described by multisets of multisets of labels, when applied to rooted phylogenetic trees with overlapping taxa, which are described by sets of clusters, that is, by sets of sets of labels. We show that the GRF distance has a very high resolution, it can also be computed in linear time, and it is not (uniformly) equivalent to the RF distance.


Subject(s)
Classification/methods , Computational Biology/methods , Algorithms , Models, Genetic , Phylogeny
3.
Math Biosci ; 331: 108503, 2021 01.
Article in English | MEDLINE | ID: mdl-33253745

ABSTRACT

The Colless index for bifurcating phylogenetic trees, introduced by Colless (1982), is defined as the sum, over all internal nodes v of the tree, of the absolute value of the difference of the sizes of the clades defined by the children of v. It is one of the most popular phylogenetic balance indices, because, in addition to measuring the balance of a tree in a very simple and intuitive way, it turns out to be one of the most powerful and discriminating phylogenetic shape indices. But it has some drawbacks. On the one hand, although its minimum value is reached at the so-called maximally balanced trees, it is almost always reached also at trees that are not maximally balanced. On the other hand, its definition as a sum of absolute values of differences makes it difficult to study analytically its distribution under probabilistic models of bifurcating phylogenetic trees. In this paper we show that if we replace in its definition the absolute values of the differences of clade sizes by the squares of these differences, all these drawbacks are overcome and the resulting index is still more powerful and discriminating than the original Colless index.


Subject(s)
Biological Evolution , Phylogeny , Algorithms , Humans , Mathematical Concepts , Models, Biological , Models, Genetic , Models, Statistical
4.
BMC Bioinformatics ; 21(Suppl 6): 434, 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33203352

ABSTRACT

BACKGROUND: The alignment of protein-protein interaction networks was recently formulated as an integer quadratic programming problem, along with a linearization that can be solved by integer linear programming software tools. However, the resulting integer linear program has a huge number of variables and constraints, rendering it of no practical use. RESULTS: We present a compact integer linear programming reformulation of the protein-protein interaction network alignment problem, which can be solved using state-of-the-art mathematical modeling and integer linear programming software tools, along with empirical results showing that small biological networks, such as virus-host protein-protein interaction networks, can be aligned in a reasonable amount of time on a personal computer and the resulting alignments are structurally coherent and biologically meaningful. CONCLUSIONS: The implementation of the integer linear programming reformulation using current mathematical modeling and integer linear programming software tools provided biologically meaningful alignments of virus-host protein-protein interaction networks.


Subject(s)
Programming, Linear , Protein Interaction Maps , Software , Algorithms , Models, Theoretical
5.
BMC Bioinformatics ; 21(Suppl 6): 265, 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33203353

ABSTRACT

BACKGROUND: All molecular functions and biological processes are carried out by groups of proteins that interact with each other. Metaproteomic data continuously generates new proteins whose molecular functions and relations must be discovered. A widely accepted structure to model functional relations between proteins are protein-protein interaction networks (PPIN), and their analysis and alignment has become a key ingredient in the study and prediction of protein-protein interactions, protein function, and evolutionary conserved assembly pathways of protein complexes. Several PPIN aligners have been proposed, but attaining the right balance between network topology and biological information is one of the most difficult and key points in the design of any PPIN alignment algorithm. RESULTS: Motivated by the challenge of well-balanced and efficient algorithms, we have designed and implemented AligNet, a parameter-free pairwise PPIN alignment algorithm aimed at bridging the gap between topologically efficient and biologically meaningful matchings. A comparison of the results obtained with AligNet and with the best aligners shows that AligNet achieves indeed a good balance between topological and biological matching. CONCLUSION: In this paper we present AligNet, a new pairwise global PPIN aligner that produces biologically meaningful alignments, by achieving a good balance between structural matching and protein function conservation, and more efficient computations than state-of-the-art tools.


Subject(s)
Protein Interaction Mapping , Protein Interaction Maps , Proteins , Algorithms , Biological Evolution , Proteins/metabolism
6.
BMC Bioinformatics ; 21(1): 154, 2020 Apr 23.
Article in English | MEDLINE | ID: mdl-32326884

ABSTRACT

BACKGROUND: The Sackin indexS of a rooted phylogenetic tree, defined as the sum of its leaves' depths, is one of the most popular balance indices in phylogenetics, and Sackin's paper (Syst Zool 21:225-6, 1972) is usually cited as the source for this index. However, what Sackin actually proposed in his paper as a measure of the imbalance of a rooted tree was not the sum of its leaves' depths, but their "variation". This proposal was later implemented as the variance of the leaves' depths by Kirkpatrick and Slatkin in (Evolution 47:1171-81, 1993), where they also posed the problem of finding a closed formula for its expected value under the Yule model. Nowadays, Sackin's original proposal seems to have passed into oblivion in the phylogenetics literature, replaced by the index bearing his name, which, in fact, was introduced a decade later by Sokal. RESULTS: In this paper we study the properties of the variance of the leaves' depths, V, as a balance index. Firstly, we prove that the rooted trees with n leaves and maximum V value are exactly the combs with n leaves. But although V achieves its minimum value on every space [Formula: see text] of bifurcating rooted phylogenetic trees with n≤183 leaves at the so-called "maximally balanced trees" with n leaves, this property fails for almost every n≥184. We provide then an algorithm that finds the trees in [Formula: see text] with minimum V value in time O(n log(n)). Secondly, we obtain closed formulas for the expected V value of a bifurcating rooted tree with any number n of leaves under the Yule and the uniform models and, as a by-product of the computations leading to these formulas, we also obtain closed formulas for the variance under the uniform model of the Sackin index and the total cophenetic index (Mir et al., Math Biosci 241:125-36, 2013) of a bifurcating rooted tree, as well as of their covariance, thus filling this gap in the literature. CONCLUSION: The phylogenetics community has been wise in preferring the sum S(T) of the leaves' depths of a phylogenetic tree T over their variance V(T) as a balance index, because the latter does not seem to capture correctly the notion of balance of large bifurcating rooted trees. But it is still a valid and useful shape index.


Subject(s)
Models, Theoretical , Algorithms , Phylogeny
7.
J Math Biol ; 80(7): 1993-2054, 2020 06.
Article in English | MEDLINE | ID: mdl-32266429

ABSTRACT

Measures of tree balance play an important role in the analysis of phylogenetic trees. One of the oldest and most popular indices in this regard is the Colless index for rooted bifurcating trees, introduced by Colless (Syst Zool 31:100-104, 1982). While many of its statistical properties under different probabilistic models for phylogenetic trees have already been established, little is known about its minimum value and the trees that achieve it. In this manuscript, we fill this gap in the literature. To begin with, we derive both recursive and closed expressions for the minimum Colless index of a tree with n leaves. Surprisingly, these expressions show a connection between the minimum Colless index and the so-called Blancmange curve, a fractal curve. We then fully characterize the tree shapes that achieve this minimum value and we introduce both an algorithm to generate them and a recurrence to count them. After focusing on two extremal classes of trees with minimum Colless index (the maximally balanced trees and the greedy from the bottom trees), we conclude by showing that all trees with minimum Colless index also have minimum Sackin index, another popular balance index.


Subject(s)
Models, Biological , Phylogeny , Algorithms , Biological Evolution , Computational Biology , Fractals , Mathematical Concepts , Models, Statistical
8.
J Math Biol ; 79(3): 1105-1148, 2019 08.
Article in English | MEDLINE | ID: mdl-31209515

ABSTRACT

We define a new balance index for rooted phylogenetic trees based on the symmetry of the evolutive history of every set of 4 leaves. This index makes sense for multifurcating trees and it can be computed in time linear in the number of leaves. We determine its maximum and minimum values for arbitrary and bifurcating trees, and we provide exact formulas for its expected value and variance on bifurcating trees under Ford's [Formula: see text]-model and Aldous' [Formula: see text]-model and on arbitrary trees under the [Formula: see text]-[Formula: see text]-model.


Subject(s)
Algorithms , Biological Evolution , Mathematical Concepts , Models, Biological , Phylogeny , Animals , Humans
9.
PLoS One ; 13(9): e0203401, 2018.
Article in English | MEDLINE | ID: mdl-30252858

ABSTRACT

The Colless index is one of the most popular and natural balance indices for bifurcating phylogenetic trees, but it makes no sense for multifurcating trees. In this paper we propose a family of Colless-like balance indices [Formula: see text] that generalize the Colless index to multifurcating phylogenetic trees. Each [Formula: see text] is determined by the choice of a dissimilarity D and a weight function [Formula: see text]. A balance index is sound when the most balanced phylogenetic trees according to it are exactly the fully symmetric ones. Unfortunately, not every Colless-like balance index is sound in this sense. We prove then that taking f(n) = ln(n + e) or f(n) = en as weight functions, the resulting index [Formula: see text] is sound for every dissimilarity D. Next, for each one of these two functions f and for three popular dissimilarities D (the variance, the standard deviation, and the mean deviation from the median), we find the most unbalanced phylogenetic trees according to [Formula: see text] with any given number n of leaves. The results show that the growth pace of the function f influences the notion of "balance" measured by the indices it defines. Finally, we introduce our R package "CollessLike," which, among other functionalities, allows the computation of Colless-like indices of trees and their comparison to their distribution under Chen-Ford-Winkel's α-γ-model for multifurcating phylogenetic trees. As an application, we show that the trees in TreeBASE do not seem to follow either the uniform model for multifurcating trees or the α-γ-model, for any values of α and γ.


Subject(s)
Models, Theoretical
10.
ScientificWorldJournal ; 2018: 1916094, 2018.
Article in English | MEDLINE | ID: mdl-29849509

ABSTRACT

Ford's α-model is one of the most popular random parametric models of bifurcating phylogenetic tree growth, having as specific instances both the uniform and the Yule models. Its general properties have been used to study the behavior of phylogenetic tree shape indices under the probability distribution it defines. But the explicit formulas provided by Ford for the probabilities of unlabeled trees and phylogenetic trees fail in some cases. In this paper we give correct explicit formulas for these probabilities.

11.
J Comput Biol ; 25(3): 348-360, 2018 03.
Article in English | MEDLINE | ID: mdl-29028181

ABSTRACT

The classification of reads from a metagenomic sample using a reference taxonomy is usually based on first mapping the reads to the reference sequences and then classifying each read at a node under the lowest common ancestor of the candidate sequences in the reference taxonomy with the least classification error. However, this taxonomic annotation can be biased by an imbalanced taxonomy and also by the presence of multiple nodes in the taxonomy with the least classification error for a given read. In this article, we show that the Rand index is a better indicator of classification error than the often used area under the receiver operating characteristic (ROC) curve and F-measure for both balanced and imbalanced reference taxonomies, and we also address the second source of bias by reducing the taxonomic annotation problem for a whole metagenomic sample to a set cover problem, for which a logarithmic approximation can be obtained in linear time and an exact solution can be obtained by integer linear programming. Experimental results with a proof-of-concept implementation of the set cover approach to taxonomic annotation in a next release of the TANGO software show that the set cover approach further reduces ambiguity in the taxonomic annotation obtained with TANGO without distorting the relative abundance profile of the metagenomic sample.


Subject(s)
DNA Barcoding, Taxonomic/methods , Metagenome , Phylogeny , Software , DNA Barcoding, Taxonomic/standards , Humans , Microbiota
12.
Math Biosci ; 295: 73-85, 2018 01.
Article in English | MEDLINE | ID: mdl-29155134

ABSTRACT

The cophenetic metrics dφ,p, for p ∈ {0} ∪ [1, ∞), are a recent addition to the kit of available distances for the comparison of phylogenetic trees. Based on a fifty years old idea of Sokal and Rohlf, these metrics compare phylogenetic trees on a same set of taxa by encoding them by means of their vectors of cophenetic values of pairs of taxa and depths of single taxa, and then computing the Lp norm of the difference of the corresponding vectors. In this paper we compute the expected value of the square of dφ,2 on the space of fully resolved rooted phylogenetic trees with n leaves, under the Yule and the uniform probability distributions.


Subject(s)
Phylogeny , Algorithms , Mathematical Concepts , Models, Statistical , Probability , Statistical Distributions
13.
Rev. esp. med. legal ; 43(1): 35-40, ene.-mar. 2017. tab
Article in Spanish | IBECS | ID: ibc-159902

ABSTRACT

La trascendencia social de los derechos de los pacientes ha comportado en España una prolija promulgación legislativa desde el año 2000, que culmina con el desarrollo del documento conocido como testamento vital, de instrucciones previas o de voluntades anticipadas. El fundamento del testamento vital radica en el reconocimiento de la autonomía del paciente. Este respeto va más allá de la situación de competencia del paciente al permitir anticipar situaciones clínicas y sus correspondientes decisiones. El documento de voluntades anticipadas es un documento escrito que refleja un acto de responsabilidad personal, siendo de especial ayuda en enfermos crónicos que pueden evolucionar hacia situaciones de dependencia y deterioro cognitivo. Los profesionales de la salud deben conocer la legislación vigente en materia del principio de autonomía. Resulta esencial que den a conocer a sus pacientes la posibilidad de realizar este procedimiento. Es necesaria una tarea divulgativa y pedagógica ante los profesionales y la población general (AU)


The social significance of the rights of patients has produced in Spain a diffuse legislative enactment since 2000 that culminated with the drafting of the living will document, or advanced directives. The basis of this document is the recognition of the autonomy of the patient. This respect goes beyond the competencies of the patient by enabling clinical situations to be anticipated and related decisions to be taken in advance. Living Will is a written document that reflects an act of personal responsibility. It is a tool for making clinical decisions that is particularly applicable to chronic patients susceptible of developing cognitive impairment and a state of dependency. Health professionals should be aware of current legislation on the principle of autonomy, and must make their patients aware of the option to implement this procedure. An educational and public awareness campaign aimed at professionals and the general public is required (AU)


Subject(s)
Humans , Male , Female , Living Wills/history , Living Wills/legislation & jurisprudence , Living Wills/trends , Informed Consent/legislation & jurisprudence , Physician-Patient Relations , Bioethics/trends , Legislative Decree/methods , 51725 , Patient Care/trends , Personal Autonomy
14.
Algorithms Mol Biol ; 10: 28, 2015.
Article in English | MEDLINE | ID: mdl-26691555

ABSTRACT

BACKGROUND: Lateral, or Horizontal, Gene Transfers are a type of asymmetric evolutionary events where genetic material is transferred from one species to another. In this paper we consider LGT networks, a general model of phylogenetic networks with lateral gene transfers which consist, roughly, of a principal rooted tree with its leaves labelled on a set of taxa, and a set of extra secondary arcs between nodes in this tree representing lateral gene transfers. An LGT network gives rise in a natural way to a principal phylogenetic subtree and a set of secondary phylogenetic subtrees, which, roughly, represent, respectively, the main line of evolution of most genes and the secondary lines of evolution through lateral gene transfers. RESULTS: We introduce a set of simple conditions on an LGT network that guarantee that its principal and secondary phylogenetic subtrees are pairwise different and that these subtrees determine, up to isomorphism, the LGT network. We then give an algorithm that, given a set of pairwise different phylogenetic trees [Formula: see text] on the same set of taxa, outputs, when it exists, the LGT network that satisfies these conditions and such that its principal phylogenetic tree is [Formula: see text] and its secondary phylogenetic trees are [Formula: see text].

15.
ScientificWorldJournal ; 2014: 254279, 2014.
Article in English | MEDLINE | ID: mdl-24982934

ABSTRACT

Several polynomial time computable metrics on the class of semibinary tree-sibling time consistent phylogenetic networks are available in the literature; in particular, the problem of deciding if two networks of this kind are isomorphic is in P. In this paper, we show that if we remove the semibinarity condition, then the problem becomes much harder. More precisely, we prove that the isomorphism problem for generic tree-sibling time consistent phylogenetic networks is polynomially equivalent to the graph isomorphism problem. Since the latter is believed not to belong to P, the chances are that it is impossible to define a metric on the class of all tree-sibling time consistent phylogenetic networks that can be computed in polynomial time.


Subject(s)
Algorithms , Phylogeny , Computational Biology , Humans
16.
Photosynth Res ; 117(1-3): 45-59, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23670217

ABSTRACT

A key objective for sustainable agriculture and forestry is to breed plants with both high carbon gain and water-use efficiency (WUE). At the level of leaf physiology, this implies increasing net photosynthesis (A N) relative to stomatal conductance (g s). Here, we review evidence for CO2 diffusional constraints on photosynthesis and WUE. Analyzing past observations for an extensive pool of crop and wild plant species that vary widely in mesophyll conductance to CO2 (g m), g s, and foliage A N, it was shown that both g s and g m limit A N, although the relative importance of each of the two conductances depends on species and conditions. Based on Fick's law of diffusion, intrinsic WUE (the ratio A N/g s) should correlate on the ratio g m/g s, and not g m itself. Such a correlation is indeed often observed in the data. However, since besides diffusion A N also depends on photosynthetic capacity (i.e., V c,max), this relationship is not always sustained. It was shown that only in a very few cases, genotype selection has resulted in simultaneous increases of both A N and WUE. In fact, such a response has never been observed in genetically modified plants specifically engineered for either reduced g s or enhanced g m. Although increasing g m alone would result in increasing photosynthesis, and potentially increasing WUE, in practice, higher WUE seems to be only achieved when there are no parallel changes in g s. We conclude that for simultaneous improvement of A N and WUE, genetic manipulation of g m should avoid parallel changes in g s, and we suggest that the appropriate trait for selection for enhanced WUE is increased g m/g s.


Subject(s)
Carbon Dioxide/metabolism , Photosynthesis , Plants/metabolism , Water/metabolism , Abscisic Acid/pharmacology , Arabidopsis/drug effects , Arabidopsis/genetics , Arabidopsis/physiology , Diffusion/drug effects , Genotype , Mesophyll Cells/drug effects , Mesophyll Cells/physiology , Photosynthesis/drug effects , Plant Stomata/drug effects , Plant Stomata/physiology , Plants/drug effects , Plants/genetics , Plants, Genetically Modified , Species Specificity , Vitis/drug effects , Vitis/physiology
17.
BMC Bioinformatics ; 14: 3, 2013 Jan 16.
Article in English | MEDLINE | ID: mdl-23323711

ABSTRACT

BACKGROUND: Phylogenetic tree comparison metrics are an important tool in the study of evolution, and hence the definition of such metrics is an interesting problem in phylogenetics. In a paper in Taxon fifty years ago, Sokal and Rohlf proposed to measure quantitatively the difference between a pair of phylogenetic trees by first encoding them by means of their half-matrices of cophenetic values, and then comparing these matrices. This idea has been used several times since then to define dissimilarity measures between phylogenetic trees but, to our knowledge, no proper metric on weighted phylogenetic trees with nested taxa based on this idea has been formally defined and studied yet. Actually, the cophenetic values of pairs of different taxa alone are not enough to single out phylogenetic trees with weighted arcs or nested taxa. RESULTS: For every (rooted) phylogenetic tree T, let its cophenetic vectorφ(T) consist of all pairs of cophenetic values between pairs of taxa in T and all depths of taxa in T. It turns out that these cophenetic vectors single out weighted phylogenetic trees with nested taxa. We then define a family of cophenetic metrics dφ,p by comparing these cophenetic vectors by means of Lp norms, and we study, either analytically or numerically, some of their basic properties: neighbors, diameter, distribution, and their rank correlation with each other and with other metrics. CONCLUSIONS: The cophenetic metrics can be safely used on weighted phylogenetic trees with nested taxa and no restriction on degrees, and they can be computed in O(n2) time, where n stands for the number of taxa. The metrics dφ,1 and dφ,2 have positive skewed distributions, and they show a low rank correlation with the Robinson-Foulds metric and the nodal metrics, and a very high correlation with each other and with the splitted nodal metrics. The diameter of dφ,p, for p⩾1 , is in O(n(p+2)/p), and thus for low p they are more discriminative, having a wider range of values.


Subject(s)
Phylogeny , Biological Evolution
18.
J Math Biol ; 67(6-7): 1833-46, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23117499

ABSTRACT

One of the main applications of balance indices is in tests of nullmodels of evolutionary processes. The knowledge of an exact formula for a statistic of a balance index, holding for any number n of leaves, is necessary in order to use this statistic in tests of this kind involving trees of any size. In this paper we obtain exact formulas for the variance under the Yule model of the Sackin, the Colless and the total cophenetic indices of binary rooted phylogenetic trees with n leaves.


Subject(s)
Biological Evolution , Models, Genetic , Phylogeny
19.
Math Biosci ; 241(1): 125-36, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23142312

ABSTRACT

Several indices that measure the degree of balance of a rooted phylogenetic tree have been proposed so far in the literature. In this work we define and study a new index of this kind, which we call the total cophenetic index: the sum, over all pairs of different leaves, of the depth of their lowest common ancestor. This index makes sense for arbitrary trees, can be computed in linear time and it has a larger range of values and a greater resolution power than other indices like Colless' or Sackin's. We compute its maximum and minimum values for arbitrary and binary trees, as well as exact formulas for its expected value for binary trees under the Yule and the uniform models of evolution. As a byproduct of this study, we obtain an exact formula for the expected value of the Sackin index under the uniform model, a result that seems to be new in the literature.


Subject(s)
Phylogeny , Biological Evolution , Mathematical Concepts , Models, Genetic
20.
Article in English | MEDLINE | ID: mdl-20660951

ABSTRACT

Galled trees, directed acyclic graphs that model evolutionary histories with isolated hybridization events, have become very popular due to both their biological significance and the existence of polynomial-time algorithms for their reconstruction. In this paper, we establish to which extent several distance measures for the comparison of evolutionary networks are metrics for galled trees, and hence, when they can be safely used to evaluate galled tree reconstruction methods.


Subject(s)
Phylogeny , Computational Biology/methods , Evolution, Molecular , Gene Expression Profiling/methods , Hybridization, Genetic , Models, Genetic
SELECTION OF CITATIONS
SEARCH DETAIL
...