Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 5 de 5
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
Artif Intell Med ; 71: 62-9, 2016 07.
Artigo em Inglês | MEDLINE | ID: mdl-27506132

RESUMO

OBJECTIVE: Proteins are vital biological molecules driving many fundamental cellular processes. They rarely act alone, but form interacting groups called protein complexes. The study of protein complexes is a key goal in systems biology. Recently, large protein-protein interaction (PPI) datasets have been published and a plethora of computational methods that provide new ideas for the prediction of protein complexes have been implemented. However, most of the methods suffer from two major limitations: First, they do not account for proteins participating in multiple functions and second, they are unable to handle weighted PPI graphs. Moreover, the problem remains open as existing algorithms and tools are insufficient in terms of predictive metrics. METHOD: In the present paper, we propose gradually expanding neighborhoods with adjustment (GENA), a new algorithm that gradually expands neighborhoods in a graph starting from highly informative "seed" nodes. GENA considers proteins as multifunctional molecules allowing them to participate in more than one protein complex. In addition, GENA accepts weighted PPI graphs by using a weighted evaluation function for each cluster. RESULTS: In experiments with datasets from Saccharomyces cerevisiae and human, GENA outperformed Markov clustering, restricted neighborhood search and clustering with overlapping neighborhood expansion, three state-of-the-art methods for computationally predicting protein complexes. Seven PPI networks and seven evaluation datasets were used in total. GENA outperformed existing methods in 16 out of 18 experiments achieving an average improvement of 5.5% when the maximum matching ratio metric was used. Our method was able to discover functionally homogeneous protein clusters and uncover important network modules in a Parkinson expression dataset. When used on the human networks, around 47% of the detected clusters were enriched in gene ontology (GO) terms with depth higher than five in the GO hierarchy. CONCLUSIONS: In the present manuscript, we introduce a new method for the computational prediction of protein complexes by making the realistic assumption that proteins participate in multiple protein complexes and cellular functions. Our method can detect accurate and functionally homogeneous clusters.


Assuntos
Algoritmos , Biologia Computacional , Análise por Conglomerados , Humanos , Mapeamento de Interação de Proteínas , Mapas de Interação de Proteínas , Saccharomyces cerevisiae
3.
Artigo em Inglês | MEDLINE | ID: mdl-26451829

RESUMO

MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of support vector machines (SVM) with genetic algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.


Assuntos
Algoritmos , MicroRNAs/genética , Reconhecimento Automatizado de Padrão/métodos , Alinhamento de Sequência/métodos , Análise de Sequência de RNA/métodos , Software , Sequência de Bases , Simulação por Computador , Evolução Molecular , Sequenciamento de Nucleotídeos em Larga Escala/métodos , Humanos , Modelos Genéticos , Dados de Sequência Molecular , Máquina de Vetores de Suporte
4.
Artif Intell Med ; 63(3): 181-9, 2015 Mar.
Artigo em Inglês | MEDLINE | ID: mdl-25765008

RESUMO

OBJECTIVE: Proteins are considered to be the most important individual components of biological systems and they combine to form physical protein complexes which are responsible for certain molecular functions. Despite the large availability of protein-protein interaction (PPI) information, not much information is available about protein complexes. Experimental methods are limited in terms of time, efficiency, cost and performance constraints. Existing computational methods have provided encouraging preliminary results, but they phase certain disadvantages as they require parameter tuning, some of them cannot handle weighted PPI data and others do not allow a protein to participate in more than one protein complex. In the present paper, we propose a new fully unsupervised methodology for predicting protein complexes from weighted PPI graphs. METHODS AND MATERIALS: The proposed methodology is called evolutionary enhanced Markov clustering (EE-MC) and it is a hybrid combination of an adaptive evolutionary algorithm and a state-of-the-art clustering algorithm named enhanced Markov clustering. EE-MC was compared with state-of-the-art methodologies when applied to datasets from the human and the yeast Saccharomyces cerevisiae organisms. RESULTS: Using public available datasets, EE-MC outperformed existing methodologies (in some datasets the separation metric was increased by 10-20%). Moreover, when applied to new human datasets its performance was encouraging in the prediction of protein complexes which consist of proteins with high functional similarity. In specific, 5737 protein complexes were predicted and 72.58% of them are enriched for at least one gene ontology (GO) function term. CONCLUSIONS: EE-MC is by design able to overcome intrinsic limitations of existing methodologies such as their inability to handle weighted PPI networks, their constraint to assign every protein in exactly one cluster and the difficulties they face concerning the parameter tuning. This fact was experimentally validated and moreover, new potentially true human protein complexes were suggested as candidates for further validation using experimental techniques.


Assuntos
Análise por Conglomerados , Cadeias de Markov , Mapeamento de Interação de Proteínas/métodos , Algoritmos , Biologia Computacional/métodos , Bases de Dados de Proteínas , Humanos , Mapas de Interação de Proteínas/fisiologia , Saccharomyces cerevisiae
5.
J Biomed Inform ; 46(3): 563-73, 2013 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-23501016

RESUMO

Traditional biology was forced to restate some of its principles when the microRNA (miRNA) genes and their regulatory role were firstly discovered. Typically, miRNAs are small non-coding RNA molecules which have the ability to bind to the 3'untraslated region (UTR) of their mRNA target genes for cleavage or translational repression. Existing experimental techniques for their identification and the prediction of the target genes share some important limitations such as low coverage, time consuming experiments and high cost reagents. Hence, many computational methods have been proposed for these tasks to overcome these limitations. Recently, many researchers emphasized on the development of computational approaches to predict the participation of miRNA genes in regulatory networks and to analyze their transcription mechanisms. All these approaches have certain advantages and disadvantages which are going to be described in the present survey. Our work is differentiated from existing review papers by updating the methodologies list and emphasizing on the computational issues that arise from the miRNA data analysis. Furthermore, in the present survey, the various miRNA data analysis steps are treated as an integrated procedure whose aims and scope is to uncover the regulatory role and mechanisms of the miRNA genes. This integrated view of the miRNA data analysis steps may be extremely useful for all researchers even if they work on just a single step.


Assuntos
Biologia Computacional , MicroRNAs/genética , Redes Reguladoras de Genes , Máquina de Vetores de Suporte
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...