ABSTRACT
Microarray experiments can determine the relative expression of tens of thousands of genes simultaneously, and thus produce very large databases. Analyzing these databases and extracting biologically relevant knowledge from them are challenging tasks. The identification of potential cancer biomarker genes is one of the most important aims of microarray analysis and, as such, has been widely targeted in the literature. However, identifying a set of these genes consistently across different experiments, studies, microarray platforms, or cancer types is still an elusive endeavor. Besides the inherent difficulty of the large and nonconstant variability in these experiments and the incommensurability between different microarray technologies, users must adjust a series of parameters that significantly affect the outcome of the analyses and that have no biological or medical meaning. In this study, the identification of potential cancer biomarkers from microarray data is cast as a multiple criteria optimization (MCO) problem. The efficient solutions to this problem, found here through data envelopment analysis (DEA), are associated with genes that are proposed as potential cancer biomarkers. The method does not require any parameter adjustment by the user, and thus fosters repeatability. The approach also allows the analysis of different microarray experiments, microarray platforms, and cancer types simultaneously. The results include the analysis of three publicly available microarray databases related to cervical cancer. This study points to the feasibility of modeling the selection of potential cancer biomarkers from microarray data as an MCO problem and solving it using DEA.
Using MCO offers a new perspective on the identification of potential cancer biomarkers: it does not require defining a threshold value to establish significance for a particular gene, and selecting a normalization procedure to compare different experiments is no longer necessary.
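The notion of an efficient solution used in this abstract can be illustrated with a minimal sketch. It assumes that each gene is described by two criteria to be minimized (for example, one p-value per experiment) and keeps the genes that no other gene dominates. This pairwise dominance filter is a simpler stand-in for the DEA formulation the abstract refers to, not the authors' implementation, and the gene names and values are hypothetical.

```python
def dominates(a, b):
    """Return True if a is no worse than b on every criterion and
    strictly better on at least one (all criteria are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def efficient_genes(criteria):
    """Return the sorted names of genes that no other gene dominates.

    criteria maps a gene name to a tuple of values to minimize.
    """
    return sorted(
        gene for gene, values in criteria.items()
        if not any(dominates(other, values)
                   for name, other in criteria.items() if name != gene)
    )


# Hypothetical p-values for five genes in two experiments.
example_p_values = {
    "gene_A": (0.01, 0.20),
    "gene_B": (0.05, 0.03),
    "gene_C": (0.20, 0.01),
    "gene_D": (0.10, 0.25),  # dominated by gene_A
    "gene_E": (0.06, 0.04),  # dominated by gene_B
}

frontier = efficient_genes(example_p_values)
print(frontier)  # ['gene_A', 'gene_B', 'gene_C']
```

No significance threshold appears anywhere in the sketch: a gene is kept only because of how it compares with the other genes, which is the property the abstract highlights.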
Subjects
Biomarkers, Tumor/genetics , Oligonucleotide Array Sequence Analysis/methods , Uterine Cervical Neoplasms/genetics , Female , Gene Expression/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Humans , Systems Biology/methods
ABSTRACT
OBJECTIVE: A new method using Multiple Criteria Optimization (MCO) proposed by our research group has shown evidence of being able to identify gene-based biomarkers for the detection of cancer using microarray data. Herein, we explore this method with more than two conflicting criteria in the MCO problem. Doing so would yield stronger outcomes when combining different results from microarray analyses, and would also demonstrate that the method is suitable for carrying out meta-analysis. METHODS: Statistical comparisons between normal and cancer tissues were performed using a colon cancer microarray database. The different comparisons were carried out with a Mann-Whitney non-parametric test using partial permutations of the data. An MCO problem was built using the different p-values obtained. The associated solution was the set of genes reaching the best compromises between the p-values under consideration, that is, the genes located on the so-called efficient frontier. Data Envelopment Analysis (DEA) was used to find the efficient frontier of the MCO problem. The capacity of DEA was explored using different numbers of p-values (criteria) in the model. RESULTS: The set of identified genes was consistent across the instances using different numbers of p-values in the DEA model, thereby providing evidence of the outcome stability of the proposed strategy. It was also observed that convergence to a larger number of potential biomarkers is faster with additional criteria, i.e., more p-values. CONCLUSION: The MCO problem proposed for the cancer biomarker search using microarray data can be solved efficiently with DEA using more than two conflicting criteria. This approach can produce robust results when using different analyses of microarray data and, indeed, a faster convergence to high-potential biomarkers.
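As an illustration of how one criterion in the METHODS step could be obtained, the sketch below computes a two-sided Mann-Whitney p-value for a single gene using the normal approximation, in pure Python. It is a simplified sketch rather than the authors' procedure: it applies no tie or continuity correction, the expression values are hypothetical, and the abstract does not specify how the partial permutations were formed, so only the single test is shown. Repeating it on different portions of the data would give one p-value per comparison.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_p(x, y):
    """Two-sided p-value from the normal approximation to the U statistic
    (no tie or continuity correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical expression values of one gene in two tissue groups.
normal_tissue = [2.1, 2.4, 1.9, 2.2, 2.0, 2.3]
tumor_tissue = [3.0, 2.8, 3.4, 2.9, 3.1, 2.7]

p_value = mann_whitney_p(normal_tissue, tumor_tissue)
print(f"p = {p_value:.4f}")
```

For small samples, an exact test from a statistics library would be preferable to the normal approximation used here.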
Subjects
Colonic Neoplasms/diagnosis , Colonic Neoplasms/genetics , Biomarkers , Humans , Microarray Analysis
ABSTRACT
Diagnosing cancer using microarray analysis to study differential gene expression has been a recent focus of intense research. Although several very sophisticated analysis tools have been developed with this aim in mind, it remains a challenge to keep these methods free of parametric adjustments and to maintain their transparency for the end user. Nonparametric methods in general have been associated with these last two characteristics, and have thus become attractive tools for microarray analysis in cancer research. In particular, diagnosing cancer via microarray analysis is an exercise whereby tissue is characterized according to its differential gene expression levels. In this manuscript, two novel nonparametric methods for cancer diagnosis using microarray data are described and their performance is assessed against a baseline approach that uses the Mann-Whitney test for median differences. Both methods show promising results in terms of their potential use in making diagnoses.