RESUMO
MOTIVATION: This paper gives a new and efficient algorithm for the sparse logistic regression problem. The proposed algorithm is based on the Gauss-Seidel method and is asymptotically convergent. It is simple and extremely easy to implement; it neither uses any sophisticated mathematical programming software nor needs any matrix operations. It can be applied to a variety of real-world problems like identifying marker genes and building a classifier in the context of cancer diagnosis using microarray data. RESULTS: The gene selection method suggested in this paper is demonstrated on two real-world data sets and the results were found to be consistent with the literature. AVAILABILITY: The implementation of this algorithm is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml SUPPLEMENTARY INFORMATION: Supplementary material is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml
Assuntos
Algoritmos , Análise por Conglomerados , Perfilação da Expressão Gênica/métodos , Testes Genéticos/métodos , Neoplasias/diagnóstico , Neoplasias/genética , Análise de Sequência com Séries de Oligonucleotídeos/métodos , Análise de Regressão , Biomarcadores Tumorais/genética , Neoplasias da Mama/classificação , Neoplasias da Mama/diagnóstico , Neoplasias da Mama/genética , Neoplasias do Colo/classificação , Neoplasias do Colo/diagnóstico , Neoplasias do Colo/genética , Humanos , Neoplasias/classificação , Reconhecimento Automatizado de Padrão , Reprodutibilidade dos Testes , Sensibilidade e EspecificidadeRESUMO
This article extends the well-known SMO algorithm of support vector machines (SVMs) to least-squares SVM formulations that include LS-SVM classification, kernel ridge regression, and a particular form of regularized kernel Fisher discriminant. The algorithm is shown to be asymptotically convergent. It is also extremely easy to implement. Computational experiments show that the algorithm is fast and scales efficiently (quadratically) as a function of the number of examples.
Assuntos
Algoritmos , Análise dos Mínimos QuadradosRESUMO
In this paper we give a new fast iterative algorithm for support vector machine (SVM) classifier design. The basic problem treated is one that does not allow classification violations. The problem is converted to a problem of computing the nearest point between two convex polytopes. The suitability of two classical nearest point algorithms, due to Gilbert, and Mitchell et al., is studied. Ideas from both these algorithms are combined and modified to derive our fast algorithm. For problems which require classification violations to be allowed, the violations are quadratically penalized and an idea due to Cortes and Vapnik and Friess is used to convert it to a problem in which there are no classification violations. Comparative computational evaluation of our algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that our algorithm is very competitive.
RESUMO
This paper points out an important source of inefficiency in Smola and Schölkopf's sequential minimal optimization (SMO) algorithm for support vector machine (SVM) regression that is caused by the use of a single threshold value. Using clues from the KKT conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression. These modified algorithms perform significantly faster than the original SMO on the datasets tried.