Search | Portal Regional da BVS

Kernel Stability for Model Selection in Kernel-Based Algorithms.

Liu, Yong; Liao, Shizhong; Zhang, Hua; Ren, Wenqi; Wang, Weiping.

IEEE Trans Cybern ; 51(12): 5647-5658, 2021 Dec.

Article in English | MEDLINE | ID: mdl-31283520

ABSTRACT

Model selection is one of the fundamental problems in kernel-based algorithms and is commonly done by minimizing an estimate of the generalization error. The notion of stability and the cross-validation (CV) error of learning machines are two widely used tools for analyzing generalization performance. However, both tools have disadvantages when applied to model selection: 1) the stability of learning machines is not practical because its specific value is difficult to estimate, and 2) the CV-based estimate of the generalization error usually has relatively high variance, so it is prone to overfitting. To overcome these two limitations, we present a novel notion of kernel stability (KS) for deriving generalization error bounds and variance bounds of CV, and provide an effective approach to applying KS in practical model selection. Unlike existing notions of stability of the learning machine, KS is defined on the kernel matrix; hence, it avoids the difficulty of estimating its value. We establish the relationship between KS and the popular uniform stability of the learning algorithm, and further propose several KS-based generalization error bounds and variance bounds of CV. By minimizing the proposed bounds, we present two novel KS-based criteria that can ensure good performance. Finally, we empirically analyze the performance of the proposed criteria on many benchmark data sets; the results demonstrate that our KS-based criteria are sound and effective.

Subjects

Algorithms

Approximate Kernel Selection via Matrix Approximation.

Ding, Lizhong; Liao, Shizhong; Liu, Yong; Liu, Li; Zhu, Fan; Yao, Yazhou; Shao, Ling; Gao, Xin.

IEEE Trans Neural Netw Learn Syst ; 31(11): 4881-4891, 2020 Nov.

Article in English | MEDLINE | ID: mdl-31945003

ABSTRACT

Kernel selection is of fundamental importance for the generalization of kernel methods. This article proposes an approximate approach for kernel selection by exploiting the approximability of kernel selection and the computational virtue of kernel matrix approximation. We define approximate consistency to measure the approximability of the kernel selection problem. Based on the analysis of approximate consistency, we solve the theoretical problem of whether, under what conditions, and at what speed the approximate criterion is close to the accurate one, establishing the foundations of approximate kernel selection. We introduce two selection criteria based on error estimation and prove the approximate consistency of the multilevel circulant matrix (MCM) approximation and the Nyström approximation under these criteria. Under the theoretical guarantees of approximate consistency, we design approximate algorithms for kernel selection, which exploit the computational advantages of the MCM and Nyström approximations to conduct kernel selection in linear or quasi-linear complexity. We experimentally validate the theoretical results for approximate consistency and evaluate the effectiveness of the proposed kernel selection algorithms.
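The Nyström approximation named in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm or selection criterion; the Gaussian kernel, the uniform column sampling, the function names, and the synthetic data are all assumptions made for illustration. The full kernel matrix is formed here only to measure the approximation error.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom(X, m, gamma, rng):
    """Nystrom approximation K ~= C W^+ C^T built from m uniformly sampled columns."""
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = rbf_kernel(X, X[idx], gamma)   # n x m block of sampled columns
    W = C[idx, :]                      # m x m block among the sampled points
    return C @ np.linalg.pinv(W) @ C.T

# Synthetic data (assumption): 200 points in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
K = rbf_kernel(X, X, gamma=0.1)
K_hat = nystrom(X, m=50, gamma=0.1, rng=rng)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(f"relative Frobenius error with 50 of 200 columns: {rel_err:.3e}")
```

In a scalable implementation one would keep the factors `C` and `W` rather than forming the n x n product, which is where the computational savings come from.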

Fast Cross-Validation for Kernel-Based Algorithms.

Liu, Yong; Liao, Shizhong; Jiang, Shali; Ding, Lizhong; Lin, Hailun; Wang, Weiping.

IEEE Trans Pattern Anal Mach Intell ; 42(5): 1083-1096, 2020 May.

Article in English | MEDLINE | ID: mdl-30640598

ABSTRACT

Cross-validation (CV) is a widely adopted approach for selecting the optimal model. However, computing the empirical cross-validation error (CVE) has high complexity because the learner must be trained multiple times. In this paper, we develop a novel approximation theory of CVE and present an approximate approach to CV based on the Bouligand influence function (BIF) for kernel-based algorithms. We first represent the BIF and higher order BIFs in Taylor expansions, and approximate CV via the Taylor expansions. We then derive an upper bound on the discrepancy between the original and approximate CV. Furthermore, we provide a novel method for calculating the BIF for a general distribution, and evaluate the BIF criterion for the sample distribution to approximate CV. The proposed approximate CV requires training on the full data set only once and is suitable for a wide variety of kernel-based algorithms. Experimental results demonstrate that the proposed approximate CV is sound and effective.
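To make the cost mentioned in the abstract above concrete, the following sketch runs ordinary k-fold CV for kernel ridge regression over a small grid of Gaussian kernel widths. It shows the baseline that requires one training run per fold and per candidate, not the BIF-based approximation, whose details the abstract does not give. The learner, the grid, the regularization value, and the synthetic data are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kfold_cv_error(X, y, gamma, lam, k=5):
    """Mean squared k-fold CV error of kernel ridge regression (k training runs)."""
    n = X.shape[0]
    folds = np.array_split(np.arange(n), k)
    sq_err = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        K = rbf_kernel(X[train], X[train], gamma)
        # Solve (K + lam * n_train * I) alpha = y_train: one O(n_train^3) fit per fold.
        alpha = np.linalg.solve(K + lam * len(train) * np.eye(len(train)), y[train])
        pred = rbf_kernel(X[test], X[train], gamma) @ alpha
        sq_err += ((pred - y[test]) ** 2).sum()
    return sq_err / n

# Synthetic regression problem (assumption): noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(120, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=120)

gammas = [0.01, 0.1, 1.0, 10.0]
errors = {g: kfold_cv_error(X, y, g, lam=1e-3) for g in gammas}
best_gamma = min(errors, key=errors.get)
print(errors, "selected gamma:", best_gamma)
```

With 4 candidates and 5 folds this already fits 20 models; an approach that trains once on the full data set per candidate, as the abstract describes, would reduce that to 4.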

ROC-Boosting: A Feature Selection Method for Health Identification Using Tongue Image.

Cui, Yan; Liao, Shizhong; Wang, Hongwu.

Comput Math Methods Med ; 2015: 362806, 2015.

Article in English | MEDLINE | ID: mdl-26543494

ABSTRACT

OBJECTIVE: To select significant Haar-like features extracted from tongue images for health identification. MATERIALS AND METHODS: 1,322 tongue cases were included in this study. Health information and tongue images of each case were collected. Cases were classified into the following groups: group A, containing 148 cases diagnosed as healthy; group B, containing 332 cases diagnosed as ill based on health information, even though the tongue image is normal; and group C, containing 842 cases diagnosed as ill. Haar-like features were extracted from tongue images. Then, we proposed a new boosting method in the ROC space for selecting significant features from the features extracted from these images. RESULTS: A total of 27 features were obtained from groups A, B, and C. Seven features were selected from groups A and B, while 25 features were selected from groups A and C. CONCLUSIONS: The selected features in this study were mainly obtained from the root, top, and side areas of the tongue. This is consistent with the tongue partitions employed in traditional Chinese medicine (TCM). These results provide scientific evidence for TCM tongue diagnosis in health identification.
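As background for the Haar-like features mentioned in the abstract above, here is a minimal sketch of a two-rectangle Haar-like feature computed with an integral image (summed-area table). The feature layout, the function names, and the toy 4 x 4 image are assumptions; the abstract does not specify which feature templates were used.

```python
import numpy as np

def integral_image(img):
    """Summed-area table padded with a leading zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left rectangle minus right rectangle of a (height x 2*width) window."""
    left_sum = rect_sum(ii, top, left, height, width)
    right_sum = rect_sum(ii, top, left + width, height, width)
    return left_sum - right_sum

# Toy image (assumption): pixel values 0..15 in row-major order.
img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
feature = haar_two_rect_horizontal(ii, top=0, left=0, height=4, width=2)
print(feature)  # left columns sum to 52, right columns to 68, so this prints -16.0
```

Because each rectangle sum costs four lookups regardless of its size, a very large pool of candidate features can be evaluated cheaply before a selection step such as boosting.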

Subjects

Algorithms; Diagnostic Imaging/methods; Health Status Indicators; Tongue/pathology; China; Computational Biology; Diagnostic Imaging/statistics & numerical data; Health Status; Humans; Image Interpretation, Computer-Assisted/methods; Medicine, Chinese Traditional/methods; ROC Curve

Relationship between Hyperuricemia and Haar-Like Features on Tongue Images.

Cui, Yan; Liao, Shizhong; Wang, Hongwu; Liu, Hongyu; Wang, Wenhua; Yin, Liqun.

Biomed Res Int ; 2015: 363216, 2015.

Article in English | MEDLINE | ID: mdl-25961013

ABSTRACT

OBJECTIVE: To investigate differences in tongue images of subjects with and without hyperuricemia. MATERIALS AND METHODS: This population-based case-control study was performed in 2012-2013. We collected data from 46 case subjects with hyperuricemia and 46 control subjects, including results of biochemical examinations and tongue images. Symmetrical Haar-like features based on integral images were extracted from tongue images. T-tests were performed to determine the ability of extracted features to distinguish between the case and control groups. We first selected features using the common criterion P < 0.05, then conducted further examination of feature characteristics and feature selection using means and standard deviations of distributions in the case and control groups. RESULTS: A total of 115,683 features were selected using the criterion P < 0.05. The maximum area under the receiver operating characteristic curve (AUC) of these features was 0.877. The sensitivity of the feature with the maximum AUC value was 0.800 and specificity was 0.826 when the Youden index was maximized. Features that performed well were concentrated in the tongue root region. CONCLUSIONS: Symmetrical Haar-like features enabled discrimination of subjects with and without hyperuricemia in our sample. The locations of these discriminative features were in agreement with the interpretation of tongue appearance in traditional Chinese and Western medicine.
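The Youden index used in the abstract above is J = sensitivity + specificity - 1, so the reported operating point gives J = 0.800 + 0.826 - 1 = 0.626. The sketch below computes the AUC and the Youden-optimal threshold for a single feature; the eight scores and labels are made-up toy values, not data from the study, and the function names are assumptions.

```python
import numpy as np

def auc_rank(scores, labels):
    """AUC as the probability that a case outscores a control (ties count 1/2)."""
    cases, controls = scores[labels == 1], scores[labels == 0]
    diff = cases[:, None] - controls[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(cases) * len(controls))

def youden_optimum(scores, labels):
    """Threshold t (rule: score >= t means case) that maximizes J = sens + spec - 1."""
    thresholds = np.unique(scores)[::-1]
    cases, controls = scores[labels == 1], scores[labels == 0]
    sens = np.array([(cases >= t).mean() for t in thresholds])
    spec = np.array([(controls < t).mean() for t in thresholds])
    j = sens + spec - 1.0
    i = int(np.argmax(j))
    return thresholds[i], sens[i], spec[i], j[i]

# Toy feature values (assumption): label 1 = case, label 0 = control.
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1, 1, 0, 1, 1, 0, 0, 0])

auc = auc_rank(scores, labels)
t_best, sens_best, spec_best, j_best = youden_optimum(scores, labels)
print(auc, t_best, sens_best, spec_best, j_best)

# Youden index implied by the sensitivity and specificity reported in the abstract.
reported_j = 0.800 + 0.826 - 1.0
print(round(reported_j, 3))
```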

Subjects

Hyperuricemia; Tongue/physiopathology; Uric Acid/blood; Adult; Aged; Aged, 80 and over; Case-Control Studies; Female; Humans; Hyperuricemia/blood; Hyperuricemia/physiopathology; Male; Middle Aged; Pattern Recognition, Automated; ROC Curve

MACT: a manageable minimization allocation system.

Cui, Yan; Bu, Huaien; Wang, Hongwu; Liao, Shizhong.

Comput Math Methods Med ; 2014: 645064, 2014.

Article in English | MEDLINE | ID: mdl-24701251

ABSTRACT

BACKGROUND: Minimization is a case allocation method for randomized controlled trials (RCTs). Evidence suggests that the minimization method achieves balanced groups with respect to numbers and participant characteristics, and can incorporate more prognostic factors than other randomization methods. Although several automatic allocation systems exist (e.g., randoWeb and MagMin), the minimization method is still difficult to implement, and RCTs seldom employ minimization. Therefore, we developed the minimization allocation controlled trials (MACT) system, a generic manageable minimization allocation system. SYSTEM OUTLINE: The MACT system implements minimization allocation by Web and email. It has a unified interface that manages trials, participants, and allocation. It simultaneously supports multiple trials, multiple centers, multiple groups, multiple prognostic factors, and multiple levels. METHODS: Unlike previous systems, MACT utilizes an optimized database that greatly improves manageability. SIMULATIONS AND RESULTS: MACT was assessed in a series of experiments and evaluations. Relative to simple randomization, minimization produces better balance among groups and similar unpredictability. APPLICATIONS: MACT has been employed in two RCTs that lasted three years. During this period, MACT steadily and simultaneously satisfied the requirements of the trials. CONCLUSIONS: MACT is a manageable, easy-to-use case allocation system. Its outstanding features are attracting more RCTs to use the minimization allocation method.
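The abstract above does not describe MACT's allocation rule in detail, so the following is only a generic sketch of minimization under stated assumptions: imbalance is measured as the sum, over prognostic factors, of the range of marginal counts across arms, and the arm that minimizes it is chosen with probability `p_best`. The function name, the `counts` layout, and the example factors are hypothetical.

```python
import random

def minimization_assign(participant, arms, counts, p_best=0.8, rng=random):
    """Pick an arm for `participant` (dict: factor -> level) and update `counts`.

    counts[arm][factor][level] is the number of earlier participants with that
    level who were allocated to that arm.
    """
    def imbalance_if(arm):
        total = 0
        for factor, level in participant.items():
            sizes = [counts[a][factor].get(level, 0) + (a == arm) for a in arms]
            total += max(sizes) - min(sizes)  # range of the marginal counts
        return total

    scores = {arm: imbalance_if(arm) for arm in arms}
    best_score = min(scores.values())
    best = [a for a in arms if scores[a] == best_score]
    others = [a for a in arms if scores[a] != best_score]
    # Favor the balancing arm, but keep a random element for unpredictability.
    if not others or rng.random() < p_best:
        chosen = rng.choice(best)
    else:
        chosen = rng.choice(others)
    for factor, level in participant.items():
        counts[chosen][factor][level] = counts[chosen][factor].get(level, 0) + 1
    return chosen

# Example (assumption): arm A already holds one male participant under 60.
arms = ["A", "B"]
counts = {"A": {"sex": {"M": 1}, "age": {"<60": 1}},
          "B": {"sex": {}, "age": {}}}
chosen = minimization_assign({"sex": "M", "age": "<60"}, arms, counts,
                             p_best=1.0, rng=random.Random(0))
print(chosen)  # with p_best=1.0 the balancing arm "B" is always chosen
```

Setting `p_best` below 1 trades a little balance for unpredictability, which is the trade-off the abstract's comparison with simple randomization refers to.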

Subjects

Randomized Controlled Trials as Topic; Research Design; Algorithms; Computer Simulation; Databases, Factual; Humans; Random Allocation; Software
