Results 1 - 8 of 8
1.
IEEE Trans Image Process ; 32: 3442-3454, 2023.
Article in English | MEDLINE | ID: mdl-37227917

ABSTRACT

Self-supervised learning enables networks to learn discriminative features from massive unlabeled data. Most state-of-the-art methods are based on contrastive learning and maximize the similarity between two augmentations of the same image; by exploiting the consistency between the two augmentations, the need for manual annotation is removed. Contrastive learning thus exploits instance-level information to learn robust features, but the learned information is likely confined to different views of the same instance. In this paper, we instead attempt to leverage the similarity between two distinct images to improve the learned representations; compared with instance-level information, such cross-image similarity can provide more useful information. We also analyze the relation between the similarity loss and the feature-level cross-entropy loss. Both losses are central to many deep learning methods, yet the relation between them has not been clear: the similarity loss helps obtain instance-level representations, while the feature-level cross-entropy loss helps mine the similarity between distinct images. We provide theoretical analyses and experiments showing that a suitable combination of these two losses achieves state-of-the-art results. Code is available at https://github.com/guijiejie/ICCL.
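As a rough illustration of the abstract's central idea, the sketch below combines an instance-level similarity loss with a feature-level cross-entropy loss in PyTorch. It is a minimal sketch under assumptions (the function names, temperature, and weighting are invented here), not the authors' released ICCL code.

```python
# Hedged sketch: instance-level similarity loss plus feature-level
# cross-entropy loss for two augmented views z1, z2 of shape (batch, dim).
# The temperature tau and weight lam are illustrative assumptions.
import torch
import torch.nn.functional as F

def similarity_loss(z1, z2):
    # Instance-level term: pull the two views of the same image together.
    return -F.cosine_similarity(z1, z2, dim=-1).mean()

def feature_ce_loss(z1, z2, tau=0.1):
    # Feature-level cross-entropy: treat each feature vector as a
    # distribution over feature dimensions, so images with similar
    # feature distributions (possibly distinct images) attract.
    p1 = F.softmax(z1 / tau, dim=-1)
    log_p2 = F.log_softmax(z2 / tau, dim=-1)
    return -(p1 * log_p2).sum(dim=-1).mean()

def combined_loss(z1, z2, lam=0.5):
    # A weighted combination of the two losses, per the abstract.
    return similarity_loss(z1, z2) + lam * feature_ce_loss(z1, z2)
```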

2.
IEEE Trans Neural Netw Learn Syst ; 28(5): 1178-1191, 2017 May.
Article in English | MEDLINE | ID: mdl-26929066

ABSTRACT

In online convex optimization, adaptive algorithms, which exploit second-order information about the loss function's (sub)gradients, have shown improvements over standard gradient methods. This paper presents a framework, Follow the Bregman Divergence Leader, that unifies various existing adaptive algorithms and reveals new insights. Under the proposed framework, two simple adaptive online algorithms with improved performance guarantees are derived. Furthermore, a general equation derived from matrix analysis generalizes adaptive learning to the nonlinear case via the kernel trick.
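For readers unfamiliar with the family being unified, the sketch below shows one concrete adaptive online update of the AdaGrad type, which exploits accumulated second-order (sub)gradient information. It is an illustration of the algorithm class, not the paper's Follow the Bregman Divergence Leader framework itself.

```python
# Hedged sketch: a per-coordinate adaptive online gradient step
# (AdaGrad-style), one member of the family such frameworks unify.
import numpy as np

def adaptive_step(w, grad, accum, lr=0.1, eps=1e-8):
    # Accumulate squared (sub)gradients: crude second-order information.
    accum = accum + grad ** 2
    # Step sizes shrink on coordinates that have seen large gradients.
    w = w - lr * grad / (np.sqrt(accum) + eps)
    return w, accum
```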

3.
IEEE Trans Syst Man Cybern B Cybern ; 41(4): 1003-14, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21278022

ABSTRACT

The particle swarm optimizer (PSO) is a powerful optimization algorithm that has been applied to a variety of problems. It can, however, suffer from premature convergence and a slow convergence rate. Motivated by these two problems, this paper presents a hybrid global optimization strategy that combines the PSO with a modified Broyden-Fletcher-Goldfarb-Shanno (BFGS) method. The modified BFGS method is integrated into the PSO to improve the particles' local search ability. In addition, in conjunction with a territory technique, a reposition technique that maintains the diversity of the particles is proposed to improve the PSO's global search ability. One advantage of the hybrid strategy is that it can effectively find multiple local or global solutions of multimodal functions in a box-constrained space; based on these local solutions, a reconstruction technique can then be adopted to estimate still better solutions. The proposed method is compared with several recently developed optimization algorithms on a set of 20 standard benchmark problems. Experimental results demonstrate that the proposed approach obtains high-quality solutions on multimodal function optimization problems.
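The sketch below illustrates the hybrid idea in Python: a standard PSO loop whose global best is periodically refined by a BFGS local search (here SciPy's stock BFGS rather than the paper's modified variant; the territory and reposition techniques are omitted). All parameter values are illustrative assumptions.

```python
# Hedged sketch: PSO with periodic BFGS refinement of the global best.
import numpy as np
from scipy.optimize import minimize

def hybrid_pso_bfgs(f, bounds, n_particles=30, iters=100,
                    w=0.7, c1=1.5, c2=1.5, refine_every=10):
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    dim = len(bounds)
    x = np.random.uniform(lo, hi, (n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pval = np.apply_along_axis(f, 1, x)
    g = pbest[pval.argmin()]
    for t in range(iters):
        r1, r2 = np.random.rand(2, n_particles, dim)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)              # box-constrained space
        val = np.apply_along_axis(f, 1, x)
        better = val < pval
        pbest[better], pval[better] = x[better], val[better]
        g = pbest[pval.argmin()]
        if t % refine_every == 0:               # BFGS local search
            res = minimize(f, g, method='BFGS')
            if res.fun < f(g):
                g = np.clip(res.x, lo, hi)
    return g, f(g)
```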

4.
IEEE Trans Neural Netw ; 20(5): 827-39, 2009 May.
Article in English | MEDLINE | ID: mdl-19342346

ABSTRACT

Support vector machines (SVMs) have been very successful in many machine learning problems. However, they can be slow at test time because of the possibly large number of support vectors obtained. Recently, Wu (2005) proposed a sparse formulation that restricts the SVM to use a small number of expansion vectors. In this paper, we extend this idea further by integrating it with techniques from multiple-kernel learning (MKL): the kernel in the sparse SVM formulation no longer needs to be fixed, but can be automatically learned as a linear combination of kernels. Two formulations of such sparse multiple-kernel classifiers are proposed. The first is based on a convex combination of the given base kernels, while the second uses a convex combination of the so-called "equivalent" kernels; empirically, the second formulation is particularly competitive. Experiments on a large number of toy and real-world data sets show that the resulting classifier is compact and accurate, and can be trained easily by simply alternating between a linear program and a standard SVM solver.
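The basic ingredient, a classifier kernel built as a convex combination of base kernels, can be sketched with scikit-learn as below. The kernel weights are fixed by hand here purely for illustration; the paper learns them (together with the sparse expansion) by alternating a linear program with a standard SVM solver.

```python
# Hedged sketch: an SVM on a hand-weighted convex combination of base
# kernels; the weights betas would be learned in the actual method.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel

def combined_kernel(X, Y, betas=(0.6, 0.4), gamma=0.5):
    # Convex combination: betas are nonnegative and sum to one.
    return betas[0] * rbf_kernel(X, Y, gamma=gamma) + betas[1] * linear_kernel(X, Y)

# Usage with the precomputed-kernel interface (X_train, y_train, X_test given):
#   K_train = combined_kernel(X_train, X_train)
#   clf = SVC(kernel='precomputed').fit(K_train, y_train)
#   preds = clf.predict(combined_kernel(X_test, X_train))
```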

5.
IEEE Trans Neural Netw ; 17(5): 1126-40, 2006 Sep.
Article in English | MEDLINE | ID: mdl-17001975

ABSTRACT

Kernel methods, such as the support vector machine (SVM), are often formulated as quadratic programming (QP) problems. However, given m training patterns, a naive implementation of the QP solver takes O(m^3) training time and at least O(m^2) space. Hence, scaling up these QPs is a major stumbling block in applying kernel methods on very large data sets, and a replacement of the naive method for finding the QP solutions is highly desirable. Recently, by using approximation algorithms for the minimum enclosing ball (MEB) problem, we proposed the core vector machine (CVM) algorithm that is much faster and can handle much larger data sets than existing SVM implementations. However, the CVM can only be used with certain kernel functions and kernel methods. For example, the very popular support vector regression (SVR) cannot be used with the CVM. In this paper, we introduce the center-constrained MEB problem and subsequently extend the CVM algorithm. The generalized CVM algorithm can now be used with any linear/nonlinear kernel and can also be applied to kernel methods such as SVR and the ranking SVM. Moreover, like the original CVM, its asymptotic time complexity is again linear in m and its space complexity is independent of m. Experiments show that the generalized CVM has comparable performance with state-of-the-art SVM and SVR implementations, but is faster and produces fewer support vectors on very large data sets.
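The MEB approximation at the heart of the CVM can be illustrated with the simple Badoiu-Clarkson-style iteration below, which yields a (1+eps)-approximate minimum enclosing ball in O(1/eps^2) iterations. It is a sketch of the underlying geometric idea in input space, not the kernelized CVM algorithm itself.

```python
# Hedged sketch: (1+eps)-approximate minimum enclosing ball by repeatedly
# stepping the center toward the current farthest point.
import numpy as np

def approx_meb(points, eps=0.1):
    iters = int(np.ceil(1.0 / eps ** 2))
    c = points[0].astype(float)
    for k in range(1, iters + 1):
        far = points[np.argmax(np.linalg.norm(points - c, axis=1))]
        c = c + (far - c) / (k + 1)   # shrinking steps toward farthest point
    r = np.linalg.norm(points - c, axis=1).max()
    return c, r
```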


Subject(s)
Algorithms , Artificial Intelligence , Computing Methodologies , Pattern Recognition, Automated/methods , Cluster Analysis , Neural Networks, Computer
6.
IEEE Trans Neural Netw ; 17(1): 48-58, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16526475

ABSTRACT

The kernel function plays a central role in kernel methods. Most existing methods can only adapt the kernel parameters or the kernel matrix based on empirical data. Recently, Ong et al. introduced the method of hyperkernels, which can be used to learn the kernel function directly in an inductive setting. However, the associated optimization problem is a semidefinite program (SDP), which is very computationally expensive, even with recent advances in interior point methods. In this paper, we show that this learning problem can be equivalently reformulated as a second-order cone program (SOCP), which can then be solved more efficiently than an SDP. Comparison is also made with the kernel matrix learning method proposed by Lanckriet et al. Experimental results on both classification and regression problems, with toy and real-world data sets, show that the proposed SOCP formulation has a significant speedup over the original SDP formulation. Moreover, it yields better generalization than Lanckriet et al.'s method, at a speed that is comparable to, and sometimes even faster than, their quadratically constrained quadratic program (QCQP) formulation.
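To give a feel for the problem class involved, the toy CVXPY model below solves a generic second-order cone program. It is not the hyperkernel learning problem itself, only an illustration of the SOCP form that, per the abstract, admits much cheaper solvers than an SDP.

```python
# Hedged sketch: a toy SOCP in CVXPY (minimize t subject to ||Ax - b|| <= t).
import cvxpy as cp
import numpy as np

A = np.random.randn(5, 3)
b = np.random.randn(5)
x = cp.Variable(3)
t = cp.Variable()
prob = cp.Problem(cp.Minimize(t), [cp.SOC(t, A @ x - b)])
prob.solve()  # handled by conic (SOCP) solvers, far cheaper than SDP solvers
```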


Subject(s)
Artificial Intelligence , Software , Algorithms , Data Interpretation, Statistical , Disease/classification , Normal Distribution , Regression Analysis
7.
IEEE Trans Neural Netw ; 15(6): 1517-25, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15565778

ABSTRACT

In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method, which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra, and does not suffer from numerical instability or local minimum problems. Evaluations of kernel PCA and kernel clustering on the USPS data set show much improved performance.
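For the Gaussian kernel, the distance-constraint idea admits a compact sketch: feature-space distances to a few neighbors invert in closed form to input-space distances, and the pre-image location then follows from a least-squares multilateration. The code below is a hedged illustration of that pipeline under assumptions, not the paper's exact derivation.

```python
# Hedged sketch: noniterative pre-image from distance constraints
# (Gaussian kernel). neighbors: (n, d) known points; feat_dist2: (n,)
# squared feature-space distances from the target to those points.
import numpy as np

def preimage_from_distances(neighbors, feat_dist2, sigma):
    # Gaussian kernel: ||phi(x)-phi(y)||^2 = 2 - 2*exp(-||x-y||^2/(2*sigma^2)),
    # so squared input-space distances follow in closed form.
    d2 = -2.0 * sigma ** 2 * np.log(1.0 - feat_dist2 / 2.0)
    # Linearize ||x - p_i||^2 = d_i^2 by subtracting the first equation
    # (standard multilateration), then solve by least squares.
    p0, pi = neighbors[0], neighbors[1:]
    A = 2.0 * (pi - p0)
    b = d2[0] - d2[1:] + (pi ** 2).sum(axis=1) - (p0 ** 2).sum()
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x   # only linear algebra; no iteration, no local minima
```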


Subject(s)
Algorithms , Artificial Intelligence , Decision Support Techniques , Feedback , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Cluster Analysis , Computer Simulation , Neural Networks, Computer , Principal Component Analysis
8.
IEEE Trans Neural Netw ; 15(6): 1555-61, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15565781

ABSTRACT

Many vision-related processing tasks, such as edge detection, image segmentation, and stereo matching, can be performed more easily when all objects in the scene are in good focus. In practice, however, this may not always be feasible, as optical lenses, especially those with long focal lengths, have only a limited depth of field. One common approach to recovering an everywhere-in-focus image is wavelet-based image fusion. First, several source images of the same scene, taken with different focuses, are processed with the discrete wavelet transform (DWT). Among these wavelet decompositions, the wavelet coefficient with the largest magnitude is selected at each pixel location. Finally, the fused image is recovered by performing the inverse DWT. In this paper, we improve this fusion procedure by applying the discrete wavelet frame transform (DWFT) and a support vector machine (SVM). Unlike the DWT, the DWFT yields a translation-invariant signal representation. Using features extracted from the DWFT coefficients, an SVM is trained to select the source image that has the best focus at each pixel location, and the corresponding DWFT coefficients are then incorporated into the composite wavelet representation. Experimental results show that the proposed method outperforms the traditional approach both visually and quantitatively.
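The baseline fusion procedure described above is easy to sketch with PyWavelets: decompose both source images, keep the larger-magnitude detail coefficient at each location, and invert. This is the traditional DWT approach the paper improves upon (by substituting the DWFT and an SVM-based selector); a single decomposition level and the db2 wavelet are assumptions made here for brevity.

```python
# Hedged sketch: baseline max-magnitude DWT fusion of two same-size images.
import numpy as np
import pywt

def fuse_dwt(img_a, img_b, wavelet='db2'):
    ca, (ha, va, da) = pywt.dwt2(img_a, wavelet)
    cb, (hb, vb, db) = pywt.dwt2(img_b, wavelet)
    pick = lambda p, q: np.where(np.abs(p) >= np.abs(q), p, q)
    # Max-magnitude selection on detail bands; average the approximations.
    fused = ((ca + cb) / 2.0, (pick(ha, hb), pick(va, vb), pick(da, db)))
    return pywt.idwt2(fused, wavelet)
```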


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Computer Simulation , Computing Methodologies , Fixation, Ocular , Image Enhancement/methods , Neural Networks, Computer