Results 1 - 7 of 7
1.
Artif Intell Med ; 53(1): 57-71, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21767937

ABSTRACT

OBJECTIVE: Gene expression patterns that distinguish clinically significant disease subclasses may not only play a prominent role in diagnosis, but also lead to therapeutic strategies that tailor treatment to the particular biology of each disease. Nevertheless, gene expression signatures derived through statistical feature-extraction procedures on population datasets have received rightful criticism, since they share few genes in common, even when derived from the same dataset. We focus on the complementary knowledge conveyed by two or more gene-expression signatures through their embedded biological processes and pathways, which offer an alternative, meta-knowledge platform of analysis towards a more global, robust and powerful solution. METHODS: The main contribution of this work is the introduction and study of an approach for integrating different gene signatures based on the underlying biological knowledge, in an attempt to derive a unified global solution. It is further recognized that one group's signature does not perform well on another group's data, due to incompatibilities of microarray technologies and experimental design. We assess this cross-platform aspect, showing that a unified solution derived on the basis of both statistical and biological validation may also help in overcoming such inconsistencies. RESULTS: Based on the proposed approach we derived a unified 69-gene signature, which significantly outperforms the initial signatures, achieving an accuracy of 0.73 on 234 new patients with 81% sensitivity and 64% specificity. The same signature reveals the two prognostic groups on an additional dataset of 286 new patients obtained through a different experimental protocol and microarray platform. Furthermore, it derives two clusters in a dataset from a different platform, showing a remarkable difference at both the gene-expression and survival-prediction levels.


Subjects
Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Breast Neoplasms/genetics; Databases, Factual; Female; Humans; Knowledge Bases; Sensitivity and Specificity
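
The accuracy, sensitivity and specificity figures quoted in the abstract above are linked by simple arithmetic. The following is a minimal sketch of that relationship, not the authors' evaluation code; the function names and the confusion-matrix counts in the usage note are hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of positives correctly flagged
    specificity = tn / (tn + fp)          # fraction of negatives correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def accuracy_from_rates(sensitivity, specificity, positive_fraction):
    """Accuracy is the class-weighted average of sensitivity and specificity."""
    return sensitivity * positive_fraction + specificity * (1.0 - positive_fraction)


# Hypothetical counts, chosen only to illustrate the calculation.
sens, spec, acc = confusion_metrics(tp=8, fn=2, tn=6, fp=4)
```

Under this identity, 81% sensitivity and 64% specificity would yield an accuracy near 0.73 when roughly half of the patients belong to the positive class. The abstract does not report the class split, so that reading is an inference rather than a reported result.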
2.
IEEE Trans Inf Technol Biomed ; 15(1): 155-63, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20813648

ABSTRACT

The concept of gene signature overlap has been addressed previously in a number of research papers. A common conclusion is the absence of significant overlap. In this paper, we verify this finding, but we also assess similarity not at the gene level, but at the level of the biology underlying a given signature. We proceed by taking into account the biological knowledge shared among different signatures, and use it as a means of integrating them and refining their statistical significance on the datasets. In this way, by integrating biological knowledge with information stemming from data distributions, we derive a unified signature that is significantly improved over its predecessors in terms of performance and robustness. Our motive behind this approach is to evaluate different signatures not in a competitive but rather in a complementary manner, where each one is treated as a pool of knowledge contributing to a global and unified solution.


Subjects
Biomarkers, Tumor/genetics; Computational Biology/methods; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis; Algorithms; Area Under Curve; Breast Neoplasms/genetics; Cluster Analysis; Databases, Nucleic Acid; Female; Humans; Kaplan-Meier Estimate; ROC Curve
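
The distinction drawn in the abstract above, between overlap at the gene level and overlap at the biology level, can be illustrated with a small sketch. It assumes Jaccard similarity as the overlap measure and a hypothetical gene-to-pathway annotation table; neither is stated in the abstract, and the gene and pathway names are placeholders.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pathways_of(signature, gene_to_pathways):
    """Collect every pathway annotated to at least one gene of the signature."""
    found = set()
    for gene in signature:
        found.update(gene_to_pathways.get(gene, ()))
    return found


# Hypothetical signatures and annotations, for illustration only.
signature_a = {"G1", "G2", "G3"}
signature_b = {"G3", "G4", "G5"}
gene_to_pathways = {
    "G1": {"cell cycle"},
    "G2": {"apoptosis"},
    "G3": {"cell cycle"},
    "G4": {"apoptosis"},
    "G5": {"cell cycle", "dna repair"},
}

gene_level = jaccard(signature_a, signature_b)
pathway_level = jaccard(
    pathways_of(signature_a, gene_to_pathways),
    pathways_of(signature_b, gene_to_pathways),
)
```

In this toy case the two signatures share one gene out of five, yet two of the three pathways they touch are common to both, which is the kind of complementarity the abstract describes.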
3.
Article in English | MEDLINE | ID: mdl-19963602

ABSTRACT

The problem of deriving a gene signature in breast cancer has been addressed by different research groups, each proposing a different solution with only minor overlap among them. Unifying results across research groups remains an open issue. In this study we evaluate two published signatures, namely the 70-gene signature of the Netherlands group and a 57-gene signature published in our previous study, and propose an evaluation platform under which these signatures can be compared effectively. After such an evaluation, we derive a unified signature and assess its performance, which improves on that of the initial signatures.


Subjects
Breast Neoplasms/diagnosis; Breast Neoplasms/genetics; Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Oligonucleotide Array Sequence Analysis/instrumentation; Algorithms; Area Under Curve; Computational Biology/methods; Female; Gene Expression Regulation; Genomics; Humans; Models, Biological; Oligonucleotide Array Sequence Analysis/methods; Prognosis; Recurrence; Reproducibility of Results
4.
BMC Bioinformatics ; 10: 53, 2009 Feb 07.
Article in English | MEDLINE | ID: mdl-19200394

ABSTRACT

BACKGROUND: Information extraction from microarrays has not yet been widely used in diagnostic or prognostic decision-support systems, due to the diversity of results produced by the available techniques, their instability on different data sets and the inability to relate statistical significance to biological relevance. Thus, there is an urgent need to address the statistical framework of microarray analysis and identify its drawbacks and limitations, which will enable us to thoroughly compare methodologies under the same experimental set-up and associate results with confidence intervals meaningful to clinicians. In this study we consider gene-selection algorithms with the aim of revealing inefficiencies in performance evaluation and addressing aspects that can reduce uncertainty in algorithmic validation. RESULTS: A computational study is performed on the performance of several gene selection methodologies on publicly available microarray data. Three basic types of experimental scenarios are evaluated, namely the independent test-set and 10-fold cross-validation (CV) using either maximum or average performance measures. Feature selection methods behave differently under different validation strategies. The performance results from CV do not match well those from the independent test-set, except for the support vector machines (SVM) and the least squares SVM methods. However, these wrapper methods achieve variable (often low) performance, whereas the hybrid methods attain consistently higher accuracies. The use of an independent test-set within CV is important for the evaluation of the predictive power of algorithms. The optimal size of the selected gene-set also appears to depend on the evaluation scheme. The consistency of selected genes over variation of the training-set is another aspect important in reducing uncertainty in the evaluation of the derived gene signature. In all cases the presence of outlier samples can seriously affect algorithmic performance. CONCLUSION: Multiple parameters can influence the selection of a gene-signature and its predictive power, thus possible biases in validation methods must always be accounted for. This paper illustrates that independent test-set evaluation reduces the bias of CV, and case-specific measures reveal stability characteristics of the gene-signature over changes of the training set. Moreover, frequency measures on gene selection address the algorithmic consistency in selecting the same gene signature under different training conditions. These issues contribute to the development of an objective evaluation framework and aid the derivation of statistically consistent gene signatures that could eventually be correlated with biological relevance. The benefits of the proposed framework are supported by the evaluation results and methodological comparisons performed for several gene-selection algorithms on three publicly available datasets.


Subjects
Algorithms; Oligonucleotide Array Sequence Analysis/methods; Computational Biology/methods; Databases, Genetic; Gene Expression Profiling/methods
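
The frequency measure on gene selection mentioned in the abstract above can be sketched as follows: select the top-ranked genes on each training fold and count how often each gene is chosen. This is an illustrative sketch only. The signal-to-noise ranking criterion, the interleaved fold split and all function names are assumptions, not the protocol of the paper.

```python
from collections import Counter
from statistics import mean, pstdev


def snr_scores(X, y):
    """Absolute signal-to-noise ratio of each feature for labels 0 and 1."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg)
        scores.append(abs(mean(pos) - mean(neg)) / (spread + 1e-12))
    return scores


def top_k(X, y, k):
    """Indices of the k highest-scoring features."""
    scores = snr_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def selection_frequency(X, y, k, n_folds):
    """Fraction of training folds in which each feature is among the top k.

    Sample i is held out in fold i % n_folds; every training fold must
    still contain samples of both classes.
    """
    counts = Counter()
    for fold in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != fold]
        counts.update(top_k([X[i] for i in train], [y[i] for i in train], k))
    return {j: counts[j] / n_folds for j in counts}
```

A gene with a frequency close to 1.0 is selected under almost every training condition, while a low frequency suggests that its selection depends on the particular samples in the training set.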
5.
Comput Biol Med ; 38(8): 894-912, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18656182

ABSTRACT

OBJECTIVE: The problem of marker selection in DNA microarray analysis has been addressed so far by two basic types of approaches, the so-called filter and wrapper methods. Wrapper methods operate in a recursive fashion, where feature (gene) weights are re-evaluated and change dynamically from iteration to iteration, while in filter methods feature weights remain fixed. Our objective in this study is to show that the application of filter criteria in a recursive fashion, where weights are potentially adjusted from cycle to cycle, produces a noticeable improvement in the generalization performance measured on independent test sets. METHODS AND MATERIALS: To this end we explore the behavior of two well-known and broadly accepted pattern recognition approaches, namely the support vector machine (SVM) and a single linear neuron (LN), properly adapted to the problem of marker selection. Within this context we also show how the kernel ability of SVM could be employed in a practical manner to provide alternative ways to approach the problem of reliable marker selection. RESULTS: We explore how the proposed approaches behave in two application domains (breast cancer and leukemia), achieving comparable or even better results than those reported in the related literature. An important advantage of these approaches is their ability to deliver stable performance without deteriorating due to the complexity of the application domain. Validation is performed using internal leave-one-out (ILOO) and 10-fold cross-validation as well as independent test set evaluation. CONCLUSIONS: Results show that the proposed methodologies achieve remarkable performance and indicate that applying filter criteria in a wrapper fashion ('wrapper filtering criteria') provides a useful tool for marker selection. The contribution of this study is threefold. First, it provides a methodology to apply filter criteria in a wrapper way (which is a new approach); second, it introduces a fundamental pattern recognition component, namely the single neuron (which is a linear estimator), and explores its behavior on marker selection; and third, it demonstrates an approach to exploit the kernel ability of SVMs in a practical and effective manner.


Subjects
Neurons; Algorithms; Gene Expression; Oligonucleotide Array Sequence Analysis
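
The recursive scheme described in the abstract above, in which feature weights are re-evaluated after each elimination instead of staying fixed, can be sketched with a single linear neuron. This is a minimal illustration under stated assumptions: the neuron is trained with the delta (LMS) rule, the feature with the smallest absolute weight is dropped in each cycle, and the hyperparameters and function names are hypothetical. The abstract does not specify the authors' exact criteria.

```python
def train_linear_neuron(X, y, epochs=200, lr=0.05):
    """Fit weights and bias with the delta (LMS) rule on targets in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for row, target in zip(X, y):
            output = sum(wj * xj for wj, xj in zip(w, row)) + b
            error = target - output
            w = [wj + lr * error * xj for wj, xj in zip(w, row)]
            b += lr * error
    return w, b


def recursive_elimination(X, y, n_keep):
    """Retrain on the surviving features and drop the weakest one each cycle."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_neuron(X_active, y)
        # Weights are re-estimated in every cycle, unlike a fixed filter ranking.
        weakest = min(range(len(active)), key=lambda i: abs(w[i]))
        del active[weakest]
    return active
```

A plain filter method would rank all features once and keep the top ones. Here the ranking is recomputed on the reduced feature set after every removal, which is the recursive behavior the abstract attributes to wrapper methods.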
6.
Comput Methods Programs Biomed ; 91(1): 22-35, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18423925

ABSTRACT

OBJECTIVE: The problem of gene selection has been extensively studied in a number of scientific works using various kinds of methods. However, the application of a linear neuron is a novel approach possessing several advantages. In this work we study the behavior of such a linear neuron, appropriately adapted and trained for the problem of gene selection in DNA microarray experiments. METHODS AND MATERIALS: We explore the proposed approach in terms of an accuracy evaluation criterion, which is used to assess the performance of the proposed methodology, but we also evaluate the produced results in terms of cluster quality and survival prediction. Cluster quality reflects the ability of the method to select differentially expressed genes, which in turn leads to better clustering and survival prediction. RESULTS: We directly compare the proposed methodology with RFE-SVM, a well-known and broadly accepted method demonstrating remarkable performance on various data sets of clinical interest. CONCLUSIONS: The computational experiments conducted show that the proposed approach can be used efficiently for gene selection, producing high-quality results in terms of accuracy and robustness.


Subjects
Biomarkers, Tumor/analysis; Diagnosis, Computer-Assisted/methods; Gene Expression Profiling/methods; Neoplasm Proteins/analysis; Neoplasms/diagnosis; Neoplasms/mortality; Risk Assessment/methods; Survival Analysis; Humans; Neoplasms/metabolism; Neural Networks, Computer; Pattern Recognition, Automated/methods; Prognosis; Reproducibility of Results; Risk Factors; Sensitivity and Specificity; Survival Rate
7.
Article in English | MEDLINE | ID: mdl-18002933

ABSTRACT

The problem of marker selection in DNA microarray analysis has been mostly addressed by linear methods. RFE-SVM is a representative example, in which a linear kernel is used as the basic tool to address the problem. On the other hand, a single neuron is known to be a linear estimator. In this study we explore such a single neuron to address the problem of marker selection.


Subjects
Biomarkers, Tumor/genetics; Breast Neoplasms/genetics; Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Neural Networks, Computer; Oligonucleotide Array Sequence Analysis/methods; Biomarkers, Tumor/biosynthesis; Breast Neoplasms/metabolism; Female; Humans; Neurons