1.
IEEE Trans Cybern ; 48(9): 2656-2669, 2018 Sep.
Article in English | MEDLINE | ID: mdl-28945604

ABSTRACT

Fuzzy associative classification has not been widely analyzed in the literature, although associative classifiers (ACs) have proved to be very effective in different real domain applications. The main reason is that learning fuzzy ACs is a computationally heavy task, especially when dealing with large datasets. To overcome this drawback, in this paper, we propose an efficient distributed fuzzy associative classification approach based on the MapReduce paradigm. The approach exploits a novel distributed discretizer based on fuzzy entropy for efficiently generating fuzzy partitions of the attributes. Then, a set of candidate fuzzy association rules is generated by employing a distributed fuzzy extension of the well-known FP-Growth algorithm. Finally, this set is pruned by using three purposely adapted types of pruning. We implemented our approach on the popular Hadoop framework, which allows distributing the storage and processing of very large datasets on computer clusters built from commodity hardware. We performed extensive experiments and a detailed analysis of the results using six very large datasets with up to 11 000 000 instances. We also experimented with different types of reasoning methods. Focusing on accuracy, model complexity, computation time, and scalability, we compare the results achieved by our approach with those obtained by two distributed nonfuzzy ACs recently proposed in the literature. We highlight that, although the accuracies are comparable, the complexity of the classifiers generated by the fuzzy distributed approach, evaluated in terms of number of rules, is lower than that of the nonfuzzy classifiers.
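
As a rough illustration of what a fuzzy association rule is scored on, the sketch below computes the fuzzy support of an itemset over a handful of records. It is not the distributed algorithm from the paper: the triangular membership functions, the product t-norm, and the names `triangular` and `fuzzy_support` are all assumptions chosen for illustration, and everything runs on a single machine.

```python
from typing import Dict, List, Tuple

# A triangular fuzzy set described by its left foot, peak, and right foot.
# (Hypothetical representation; the paper derives partitions via fuzzy entropy.)
Triangle = Tuple[float, float, float]


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in the triangular fuzzy set (a, b, c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records: List[Dict[str, float]],
                  itemset: Dict[str, Triangle]) -> float:
    """Average matching degree of the itemset over all records.

    The matching degree of one record is the product (an assumed t-norm)
    of its membership degrees in each fuzzy set of the itemset.
    """
    total = 0.0
    for record in records:
        degree = 1.0
        for attribute, (a, b, c) in itemset.items():
            degree *= triangular(record[attribute], a, b, c)
        total += degree
    return total / len(records)


if __name__ == "__main__":
    data = [{"x": 0.5}, {"x": 1.0}, {"x": 0.0}]
    print(fuzzy_support(data, {"x": (0.0, 1.0, 2.0)}))
```

In a MapReduce setting, the per-record matching degrees could plausibly be computed in the map phase and summed in the reduce phase, but how the paper actually distributes this work is not stated in the abstract.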

2.
Bioinformatics ; 32(21): 3252-3259, 2016 11 01.
Article in English | MEDLINE | ID: mdl-27378291

ABSTRACT

MOTIVATION: Riboswitches are cis-regulatory elements in mRNA, mostly found in Bacteria, which exhibit two main secondary structure conformations. Although one of them prevents the gene from being expressed, the other conformation allows its expression, and this switching process is typically driven by the presence of a specific ligand. Although there are a handful of known riboswitches, our knowledge in this field has been greatly limited due to our inability to identify their alternate structures from their sequences. Indeed, current methods are not able to predict the presence of the two functionally distinct conformations just from the knowledge of the plain RNA nucleotide sequence. Whether this would be possible, for which cases, and what prediction accuracy can be achieved, are currently open questions. RESULTS: Here we show that the two alternate secondary structures of riboswitches can be accurately predicted once the 'switching sequence' of the riboswitch has been properly identified. The proposed SwiSpot approach is capable of identifying the switching sequence inside a putative, complete riboswitch sequence, on the basis of pairing behaviors, which are evaluated on proper sets of configurations. Moreover, it is able to model the switching behavior of riboswitches whose generated ensemble covers both alternate configurations. Beyond structural predictions, the approach can also be paired with homology-based riboswitch searches. AVAILABILITY AND IMPLEMENTATION: SwiSpot software, along with the reference dataset files, is available at: http://www.iet.unipi.it/a.bechini/swispot/ SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: a.bechini@ing.unipi.it.


Subject(s)
Bacteria , Riboswitch , Ligands , Models, Genetic , Sequence Analysis, RNA , Software
3.
PLoS One ; 8(3): e59504, 2013.
Article in English | MEDLINE | ID: mdl-23555684

ABSTRACT

Lattice models of proteins have been widely used as a practical means to computationally investigate general properties of the system. In lattice models any sterically feasible conformation is represented as a self-avoiding walk on a lattice, and residue types are limited in number. So far, only two- or three-dimensional lattices have been used. The inspection of the neighborhood of alpha carbons in the core of real proteins reveals that lattices with higher coordination numbers, possibly in higher dimensional spaces, can also be adopted. In this paper, a new general parametric lattice model for simplified protein conformations is proposed and investigated. It is shown how the supporting software can be consistently designed to let algorithms that operate on protein structures be implemented in a lattice-agnostic way. The necessary theoretical foundations are developed and organically presented, pinpointing the role of the concept of main directions in lattice-agnostic model handling. Subsequently, the model features across dimensions and lattice types are explored in tests performed on benchmark protein sequences, using a Python implementation. Simulations give insights into the use of square and triangular lattices in a range of dimensions. The trend of potential minimum for sequences of different lengths, varying the lattice dimension, is uncovered. Moreover, an extensive quantitative characterization of the usage of the so-called "move types" is reported for the first time. The proposed general framework for the development of lattice models is simple yet complete, and an object-oriented architecture can be proficiently employed for the supporting software, by designing ad-hoc classes. The proposed framework represents a new general viewpoint that potentially subsumes a number of solutions previously studied. The adoption of the described model encourages looking at protein structure issues from a more general and essential perspective, and also makes computational investigations over simplified models more straightforward.
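
The idea of a conformation as a self-avoiding walk on a lattice of arbitrary dimension can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software described in the paper: the function names are hypothetical, only the square (hypercubic) lattice is covered, and the paper's notion of "main directions" is not modeled. The lattice is represented purely by its list of step vectors, so the walk-handling code does not depend on the dimension.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[int, ...]


def square_lattice_directions(dim: int) -> List[Vector]:
    """Unit steps (+/- along each axis) of a square lattice in `dim` dimensions."""
    directions = []
    for axis in range(dim):
        for sign in (1, -1):
            step = [0] * dim
            step[axis] = sign
            directions.append(tuple(step))
    return directions


def walk_positions(directions: Sequence[Vector],
                   moves: Sequence[int]) -> List[Vector]:
    """Lattice sites visited by a chain that starts at the origin.

    Each entry of `moves` is an index into `directions`.
    """
    position = (0,) * len(directions[0])
    positions = [position]
    for move in moves:
        step = directions[move]
        position = tuple(p + s for p, s in zip(position, step))
        positions.append(position)
    return positions


def is_self_avoiding(positions: Sequence[Vector]) -> bool:
    """A walk is self-avoiding if no lattice site is visited twice."""
    return len(set(positions)) == len(positions)


if __name__ == "__main__":
    dirs_2d = square_lattice_directions(2)
    print(is_self_avoiding(walk_positions(dirs_2d, [0, 2, 1])))     # open chain
    print(is_self_avoiding(walk_positions(dirs_2d, [0, 2, 1, 3])))  # closes on itself
```

Because `walk_positions` and `is_self_avoiding` only see step vectors, a different lattice (for instance a triangular one) could be tried by supplying another direction list, which is one simple way to read "lattice-agnostic" here.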


Subject(s)
Models, Molecular , Proteins/chemistry , Software , Algorithms , Protein Conformation , Proteins/metabolism