Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
BMC Bioinformatics ; 16: 291, 2015 Sep 15.
Artigo em Inglês | MEDLINE | ID: mdl-26374744

RESUMO

BACKGROUND: So far many algorithms have been proposed towards the detection of significant genes in microarray analysis problems. Several of those approaches are freely available as R-packages though their engagement in gene expression analysis by non-bioinformaticians is usually a frustrating task. Besides, only some of those packages offer a complete suite of tools starting from initial data import and ending to analysis report. Here we present an R/Bioconductor package that implements a hybrid gene selection method along with a bunch of functions to facilitate a thorough and convenient gene expression profiling analysis. RESULTS: mAPKL is an open-source R/Bioconductor package that implements the mAP-KL hybrid gene selection method. The advantage of this method is that selects a small number of gene exemplars while achieving comparable classification results to other well established algorithms on a variety of datasets and dataset sizes. The mAPKL package is accompanied with extra functionalities including (i) solid data import; (ii) data sampling following a user-defined proportion; (iii) preprocessing through several normalization and transformation alternatives; (iv) classification with the aid of SVM and performance evaluation; (v) network analysis of the significant genes (exemplars), including degree of centrality, closeness, betweeness, clustering coefficient as well as the construction of an edge list table; (vi) gene annotation analysis, (vii) pathway analysis and (viii) auto-generated analysis reporting. CONCLUSIONS: Users are able to run a thorough gene expression analysis in a timely manner starting from raw data and concluding to network characteristics of the selected gene exemplars. Detailed instructions and example data are provided in the R package, which is freely available at Bioconductor under the GPL-2 or later license http://www.bioconductor.org/packages/3.1/bioc/html/mAPKL.html.


Assuntos
Algoritmos , Biologia Computacional/métodos , Perfilação da Expressão Gênica/métodos , Redes Reguladoras de Genes , Anotação de Sequência Molecular , Software , Análise por Conglomerados , Humanos
2.
Artigo em Inglês | MEDLINE | ID: mdl-25380778

RESUMO

The publicly available online database MelGene provides a comprehensive, regularly updated, collection of data from genetic association studies in cutaneous melanoma (CM), including random-effects meta-analysis results of all eligible polymorphisms. The updated database version includes data from 192 publications with information on 1114 significantly associated polymorphisms across 280 genes, along with new front-end and back-end capabilities. Various types of relationships between data are calculated and visualized as networks. We constructed 13 different networks containing the polymorphisms and the genes included in MelGene. We explored the derived network representations under the following questions: (i) are there nodes that deserve consideration regarding their network connectivity characteristics? (ii) What is the relation of either the genome-wide or nominally significant CM polymorphisms/genes with the ones highlighted by the network representation? We show that our network approach using the MelGene data reveals connections between statistically significant genes/ polymorphisms and other genes/polymorphisms acting as 'hubs' in the reconstructed networks. To the best of our knowledge, this is the first database containing data from a comprehensive field synopsis and systematic meta-analyses of genetic polymorphisms in CM that provides user-friendly tools for in-depth molecular network visualization and exploration. The proposed network connections highlight potentially new loci requiring further investigation of their relation to melanoma risk. Database URL: http://www.melgene.org.


Assuntos
Biologia Computacional/métodos , Sistemas de Gerenciamento de Base de Dados , Bases de Dados Genéticas , Estudos de Associação Genética , Internet , Melanoma/genética , Humanos , Neoplasias Cutâneas , Melanoma Maligno Cutâneo
3.
BMC Bioinformatics ; 13: 270, 2012 Oct 17.
Artigo em Inglês | MEDLINE | ID: mdl-23075381

RESUMO

BACKGROUND: A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. RESULTS: We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and affinity propagation (AP)-clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. CONCLUSIONS: mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes, which classify unknown samples from both, small and large patient cohorts with high accuracy.


Assuntos
Análise por Conglomerados , Perfilação da Expressão Gênica/métodos , Algoritmos , Simulação por Computador , Humanos , Família Multigênica , Neoplasias/genética , Doenças Neuromusculares/genética , Análise de Sequência com Séries de Oligonucleotídeos/métodos , Tamanho da Amostra , Máquina de Vetores de Suporte
4.
IEEE Trans Inf Technol Biomed ; 15(3): 349-55, 2011 May.
Artigo em Inglês | MEDLINE | ID: mdl-21427026

RESUMO

The discovery of potential microarray markers, which will expedite molecular diagnosis/prognosis and provide reliable results to clinical decision-making and treatment selection for patients, is of paramount importance. Feature selection techniques, which aim at minimizing the dimensionality of the microarray data by keeping the most statistically significant genes, are a powerful approach toward this goal. In this paper, we investigate the minimum required subsets of genes, which best classify neuromuscular disease data. For this purpose, we implemented a methodology pipeline that facilitated the use of multiple feature selection methods and subsequent performance of data classification. Five feature selection methods on datasets from ten different neuromuscular diseases were utilized. Our findings reveal subsets of very small number of genes, which can successfully classify normal/disease samples. Interestingly, we observe that similar classification results may be obtained from different subsets of genes. The proposed methodology can expedite the identification of small gene subsets with high-classification accuracy that could ultimately be used in the genetics clinics for diagnostic, prognostic, and pharmacogenomic purposes.


Assuntos
Algoritmos , Biologia Computacional/métodos , Perfilação da Expressão Gênica/métodos , Técnicas de Diagnóstico Molecular/métodos , Doenças Neuromusculares/genética , Inteligência Artificial , Humanos , Doenças Neuromusculares/classificação , Análise de Sequência com Séries de Oligonucleotídeos
SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...