1.
PLoS One ; 15(9): e0227842, 2020.
Article in English | MEDLINE | ID: mdl-32947609

ABSTRACT

Constructing phylogenetic networks is one of the most important challenges in phylogenetics. These networks can represent complex non-treelike events such as gene flow, horizontal gene transfer, recombination, or hybridization. Among phylogenetic networks, rooted structures are commonly used to represent the evolutionary history of a set of species explicitly. Triplets are a well-known input for constructing rooted networks. Obtaining an optimal rooted network that contains all given triplets is the main problem in network construction. The optimality criteria include minimizing the level or the number of reticulation nodes. This problem is known to be NP-hard. In this research, a new algorithm called Netcombin is introduced to construct an approximately optimal network that is consistent with the input triplets. The innovation of this algorithm lies in its binarization and expansion processes. The binarization process uses a measure to construct a binary rooted tree T consistent with an approximately maximum number of input triplets. T is then expanded using a heuristic function that adds a minimum number of edges to obtain a final network with an approximately minimum number of reticulation nodes. To evaluate the proposed algorithm, Netcombin is compared with four state-of-the-art algorithms: RPNCH, NCHB, TripNet, and SIMPLISTIC. The experimental results on simulated data obtained from biologically generated sequence data indicate that, considering the trade-off between speed and precision, Netcombin outperforms the others.
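The notion of a tree being "consistent with" a rooted triplet can be illustrated with a minimal sketch. It is not the Netcombin implementation; the child-to-parent dictionary and the function names are assumptions made for illustration. A triplet ab|c is taken to be consistent with a rooted tree when the lowest common ancestor of a and b lies strictly below the lowest common ancestor of a and c.

```python
def ancestors(parent, node):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, x, y):
    """Lowest common ancestor of x and y in a rooted tree."""
    seen = set(ancestors(parent, x))
    for node in ancestors(parent, y):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def depth(parent, node):
    """Number of edges between node and the root."""
    return len(ancestors(parent, node)) - 1


def is_consistent(parent, triplet):
    """True if the triplet (a, b, c), read as ab|c, is displayed by the tree."""
    a, b, c = triplet
    # Both LCAs are ancestors of a, so comparing depths is enough.
    return depth(parent, lca(parent, a, b)) > depth(parent, lca(parent, a, c))


def count_consistent(parent, triplets):
    """Number of input triplets the tree is consistent with."""
    return sum(is_consistent(parent, t) for t in triplets)


# Hypothetical tree ((a,b),c) stored as a child -> parent map.
tree = {"a": "u", "b": "u", "u": "r", "c": "r"}
print(count_consistent(tree, [("a", "b", "c"), ("a", "c", "b")]))
```

A count of this kind is one way to score candidate trees against an input triplet set; the measure actually used in the binarization step is not specified in the abstract.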


Subject(s)
Algorithms , Genomics/methods , Heuristics , Models, Genetic , Phylogeny
2.
J Theor Biol ; 489: 110144, 2020 Mar 21.
Article in English | MEDLINE | ID: mdl-31911141

ABSTRACT

Phylogenetics is a field that studies and models the evolutionary history of currently living species. The rooted phylogenetic network is an important approach that models non-tree-like events between currently living species. Rooted triplets are one type of input for constructing rooted phylogenetic networks. Constructing an optimal rooted phylogenetic network that contains all given rooted triplets is an NP-hard problem. To address this challenge efficiently, a novel heuristic method called NCHB is introduced in this paper. NCHB produces an optimal rooted phylogenetic network that covers all given rooted triplets. The NCHB optimality criteria in building a rooted phylogenetic network are minimizing the number of reticulation nodes and minimizing the level of the final network. NCHB makes novel use of two concepts: the height function and the binarization of a network. To study the performance of NCHB, our proposed method is compared with three state-of-the-art algorithms, LEV1ATHAN, SIMPLISTIC, and TripNet, in two scenarios. In the first scenario, triplet sets are generated under biological presumptions and our proposed method is compared with SIMPLISTIC and TripNet. The results show that NCHB outperforms TripNet and SIMPLISTIC according to the optimality criteria. In the second scenario, we designed software for generating level-k networks. All triplets consistent with each network are then obtained and used as input for NCHB, LEV1ATHAN, SIMPLISTIC, and TripNet. LEV1ATHAN is applicable only to level-1 networks, while the other algorithms can also produce higher-level networks. The results show that the NCHB and LEV1ATHAN outputs are almost the same when we are restricted to level-1 networks. The results also show that NCHB outperforms TripNet and SIMPLISTIC. Moreover, NCHB outputs are very close to the generated networks (which are optimal) with respect to the criteria.
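One of the two optimality criteria above, the number of reticulation nodes, can be made concrete with a short sketch. It assumes the network is given as a list of directed (parent, child) edges and treats any node with more than one incoming edge as a reticulation node; the edge list and function name are hypothetical and are not taken from NCHB.

```python
from collections import Counter


def reticulation_nodes(edges):
    """Return the set of nodes with in-degree of at least two."""
    in_degree = Counter(child for _parent, child in edges)
    return {node for node, degree in in_degree.items() if degree >= 2}


# Hypothetical rooted network: leaf b hangs below h, which has two parents.
network = [
    ("r", "u"), ("r", "v"),
    ("u", "a"), ("u", "h"),
    ("v", "h"), ("v", "c"),
    ("h", "b"),
]
print(sorted(reticulation_nodes(network)))
```

Computing the level of a network additionally requires its biconnected components, which this sketch does not attempt.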


Subject(s)
Algorithms , Software , Biological Evolution , Heuristics , Models, Genetic , Phylogeny
3.
PLoS One ; 9(9): e106531, 2014.
Article in English | MEDLINE | ID: mdl-25208028

ABSTRACT

Constructing an optimal rooted phylogenetic network from an arbitrary set of rooted triplets is an NP-hard problem. In this paper, we present a heuristic algorithm called TripNet, which tries to construct a rooted phylogenetic network with the minimum number of reticulation nodes from an arbitrary set of rooted triplets. Unlike current methods, which work only for dense sets of rooted triplets, a key innovation of TripNet is its applicability to non-dense sets of rooted triplets. We prove some theorems to clarify the performance of the algorithm. To demonstrate the efficiency of TripNet, we compared it with SIMPLISTIC, the only available software able to return a rooted phylogenetic network consistent with a given dense set of rooted triplets. The results show that for complex networks with high levels, the running time of SIMPLISTIC increases abruptly, whereas in all cases TripNet outputs an appropriate rooted phylogenetic network in an acceptable time. We also tested TripNet on the Yeast data. The results show that the TripNet network and the optimal network have the same clustering, and that TripNet produced a level-3 network containing only one more reticulation node than the optimal network.
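The distinction between dense and non-dense triplet sets can be sketched as follows. The sketch assumes the usual definition, in which a set is dense when every three taxa are covered by at least one of their three possible rooted triplets; the tuple encoding (a, b, c) for ab|c and the function names are illustrative and are not part of TripNet.

```python
from itertools import combinations


def normalize(triplet):
    """Encode ab|c so that ab|c and ba|c compare equal."""
    a, b, c = triplet
    return (frozenset((a, b)), c)


def is_dense(taxa, triplets):
    """True if every 3-subset of taxa has at least one triplet in the set."""
    present = {normalize(t) for t in triplets}
    for x, y, z in combinations(sorted(taxa), 3):
        # The three possible rooted topologies on {x, y, z}.
        candidates = [(x, y, z), (x, z, y), (y, z, x)]
        if not any(normalize(t) in present for t in candidates):
            return False
    return True


taxa = {"a", "b", "c", "d"}
dense_set = [("a", "b", "c"), ("a", "b", "d"), ("c", "d", "a"), ("c", "d", "b")]
print(is_dense(taxa, dense_set))       # every 3-subset is covered
print(is_dense(taxa, dense_set[:-1]))  # {b, c, d} is no longer covered
```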


Subject(s)
Algorithms , Computational Biology/methods , Phylogeny , Yeasts/classification , Yeasts/genetics
4.
J Theor Biol ; 251(2): 380-7, 2008 Mar 21.
Article in English | MEDLINE | ID: mdl-18177672

ABSTRACT

With large amounts of experimental data, modern molecular biology needs appropriate methods to deal with biological sequences. In this work, we apply a statistical method (Pearson's chi-square test) to recognize signals that appear across the whole genome of Escherichia coli. To show the effectiveness of the method, we compare Pearson's chi-square test with linguistic complexity on the complete genome of E. coli. The results suggest that Pearson's chi-square test is an efficient method for distinguishing genes (coding regions) from pseudogenes (noncoding regions). By contrast, the performance of linguistic complexity is much lower than that of the chi-square test. We also use Pearson's chi-square test to determine which parts of the Open Reading Frame (ORF) have a significant effect on discriminating genes from pseudogenes. Moreover, different complexity measures and Pearson's chi-square test are applied to the genes with high values of Pearson's chi-square statistic. We also compute the measures on homologs of these genes. The results illustrate that there is a region near the start codon with a high value of the chi-square statistic and low complexity that is conserved between homologous genes.
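For orientation, Pearson's chi-square statistic is the sum over categories of (observed - expected)^2 / expected. The sketch below applies it to nucleotide counts of a short sequence against a uniform expectation. The choice of single nucleotides as categories, the uniform null model, and the function name are assumptions made for illustration; the abstract does not state which counts or null model the authors used.

```python
from collections import Counter


def chi_square_uniform(sequence, alphabet="ACGT"):
    """Pearson's chi-square statistic of symbol counts against a uniform
    expectation over the alphabet (an illustrative null model)."""
    counts = Counter(sequence)
    expected = len(sequence) / len(alphabet)
    return sum((counts[symbol] - expected) ** 2 / expected for symbol in alphabet)


# Counts are A=4, C=2, G=1, T=1 with an expected count of 2 per symbol.
print(chi_square_uniform("AAAACCGT"))
```

A larger statistic indicates a composition further from the assumed expectation; turning it into a p-value requires the chi-square distribution with the appropriate degrees of freedom (three for four categories under this null model).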


Subject(s)
Escherichia coli/genetics , Genome, Bacterial , Open Reading Frames , Base Sequence , Chi-Square Distribution , Computational Biology , Conserved Sequence , Molecular Sequence Data , Pseudogenes , Sequence Homology