Results 1 - 18 of 18
1.
IEEE Trans Neural Netw Learn Syst ; 30(4): 1180-1190, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30176608

ABSTRACT

In most domain adaptation approaches, all features are used for adaptation. However, not every feature is beneficial, and incorrectly involving all of them can degrade performance. In other words, to make a model trained on the source domain work well on the target domain, it is desirable to find invariant features for domain adaptation rather than using all features. However, invariant features across domains may lie in a higher-order space rather than in the original feature space. Moreover, the discriminative ability of some invariant features, such as shared background information, is weak and needs further filtering. Therefore, in this paper, we propose a novel domain adaptation algorithm based on an explicit feature map and feature selection. The data are first represented by a kernel-induced explicit feature map, so that high-order invariant features can be revealed. Then, by minimizing the marginal distribution difference, the conditional distribution difference, and the model error, invariant discriminative features are effectively selected. The resulting problem is NP-hard, so we relax it and solve it with a cutting plane algorithm. Experimental results on six real-world benchmarks demonstrate the effectiveness and efficiency of the proposed algorithm, which outperforms many state-of-the-art domain adaptation approaches.
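The abstract does not name the exact statistic used for the marginal distribution difference; a common choice in domain adaptation is the maximum mean discrepancy (MMD), sketched here as an illustrative stand-in:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Pairwise RBF kernel between rows of A and rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def mmd2(Xs, Xt, gamma=0.1):
    # Squared empirical MMD: distance between the kernel mean embeddings
    # of the source sample Xs and the target sample Xt.
    Kss = rbf_kernel(Xs, Xs, gamma).mean()
    Ktt = rbf_kernel(Xt, Xt, gamma).mean()
    Kst = rbf_kernel(Xs, Xt, gamma).mean()
    return Kss + Ktt - 2.0 * Kst

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(100, 5)), rng.normal(size=(100, 5)))
shifted = mmd2(rng.normal(size=(100, 5)), rng.normal(3.0, 1.0, size=(100, 5)))
print(same < shifted)  # a distribution shift yields a larger MMD
```

Minimizing such a term over feature subsets, as the paper does jointly with a conditional term and the model error, drives the selected features toward domain invariance.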

2.
IEEE Trans Cybern ; 44(7): 1180-90, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24951546

ABSTRACT

We propose a new graph-based hashing method called spectral embedded hashing (SEH) for large-scale image retrieval. We first introduce a new regularizer into the objective function of the recent spectral hashing work to control the mismatch between the resultant Hamming embedding and the low-dimensional data representation, which is obtained using a linear regression function. This linear regression function effectively handles out-of-sample data, and the new regularizer makes SEH better cope with data sampled from a nonlinear manifold. Because SEH cannot efficiently cope with high-dimensional data, we further extend SEH to kernel SEH (KSEH) to improve efficiency and effectiveness, in which a nonlinear regression function can also be employed to obtain the low-dimensional data representation. We also develop a new method to efficiently compute an approximate solution to the eigenvalue decomposition problem in SEH and KSEH. Moreover, we show that some existing hashing methods are special cases of our KSEH. Our comprehensive experiments on the CIFAR, Tiny-580K, NUS-WIDE, and Caltech-256 datasets clearly demonstrate the effectiveness of our methods.


Subject(s)
Image Processing, Computer-Assisted/methods , Algorithms , Animals , Databases, Factual , Humans , Linear Models
3.
IEEE Trans Image Process ; 23(2): 623-34, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24239999

ABSTRACT

This paper targets fine-grained image categorization by learning a category-specific dictionary for each category and a shared dictionary for all categories. The category-specific dictionaries encode subtle visual differences among categories, while the shared dictionary encodes visual patterns common to all of them. To this end, we impose incoherence constraints among the different dictionaries in the feature coding objective. In addition, to make the learnt dictionaries stable, we constrain each dictionary to be self-incoherent. Our dictionary learning formulation not only applies to fine-grained classification but also improves conventional basic-level object categorization and other tasks such as event recognition. Experimental results on five datasets show that our method outperforms state-of-the-art fine-grained image categorization frameworks as well as sparse-coding-based dictionary learning frameworks, demonstrating the effectiveness of our method.

4.
IEEE Trans Pattern Anal Mach Intell ; 35(9): 2051-63, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23868769

ABSTRACT

Feature selection with specific multivariate performance measures is key to the success of many applications such as image retrieval and text classification. Existing feature selection methods are usually designed for classification error. In this paper, we propose a generalized sparse regularizer and, based on it, a unified feature selection framework for general loss functions. In particular, we study the novel paradigm of feature selection by optimizing multivariate performance measures. The resulting formulation is challenging for high-dimensional data, so we propose a two-layer cutting plane algorithm to solve it and analyze its convergence. In addition, we adapt the method to optimize multivariate measures for multiple-instance learning problems. Analyses comparing against state-of-the-art feature selection methods show that the proposed method is superior. Extensive experiments on large-scale, high-dimensional real-world datasets show that the proposed method outperforms l1-SVM and SVM-RFE when choosing a small subset of features, and achieves significantly better F1-scores than SVMperf.

5.
IEEE Trans Image Process ; 22(4): 1585-97, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23247859

ABSTRACT

Automatic image annotation, usually formulated as a multi-label classification problem, is one of the major tools used to enhance the semantic understanding of web images. Many multimedia applications (e.g., tag-based image retrieval) can greatly benefit from image annotation, but the insufficient performance of annotation methods prevents these applications from being practical. On the other hand, specific measures are usually designed to evaluate how well an annotation method performs for a specific objective or application, yet most image annotation methods do not optimize these measures and are thus inevitably trapped in suboptimal performance on them. To address this issue, we first summarize a variety of objective-guided performance measures under a unified representation. Our analysis reveals that macro-averaging measures are very sensitive to infrequent keywords, and the Hamming measure is easily affected by skewed distributions. We then propose a unified multi-label learning framework that directly optimizes a variety of objective-specific measures of multi-label learning tasks. Specifically, we first present a multilayer hierarchical structure of learning hypotheses for multi-label problems, based on which a variety of loss functions with respect to objective-guided measures are defined. We then formulate these loss functions as relaxed surrogate functions and optimize them with structural SVMs. Given the analysis of the various measures and the high time complexity of optimizing micro-averaging measures, we focus in this paper on example-based measures, which are tailor-made for image annotation tasks but seldom explored in the literature.
Experiments on two widely used multi-label datasets are consistent with the formal analysis, and experiments on four image annotation datasets demonstrate the superior performance of our method over state-of-the-art baselines in terms of example-based measures.
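As a concrete illustration of the example-based measures discussed above (a sketch of the standard definition, not necessarily the paper's exact formulation), example-based F1 averages the per-image F1 between predicted and ground-truth label sets:

```python
def example_f1(y_true, y_pred):
    # Example-based F1: average the per-instance F1 over label sets.
    scores = []
    for t, p in zip(y_true, y_pred):
        t, p = set(t), set(p)
        if not t and not p:
            scores.append(1.0)  # both empty: perfect agreement
            continue
        tp = len(t & p)
        if tp == 0:
            scores.append(0.0)
            continue
        prec = tp / len(p)
        rec = tp / len(t)
        scores.append(2 * prec * rec / (prec + rec))
    return sum(scores) / len(scores)

truth = [{"sky", "sea"}, {"dog"}, {"cat", "grass"}]
preds = [{"sky"}, {"dog"}, {"cat", "tree"}]
print(round(example_f1(truth, preds), 4))  # → 0.7222
```

Unlike macro-averaging, which aggregates per keyword and is dominated by rare labels, this measure aggregates per image, which is why it suits annotation tasks.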

6.
IEEE Trans Pattern Anal Mach Intell ; 35(1): 92-104, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22392702

ABSTRACT

Sparse coding exhibits good performance in many computer vision applications. However, due to the overcomplete codebook and the independent coding process, the locality and similarity among the instances to be encoded are lost. To preserve this information, we propose a Laplacian sparse coding (LSc) framework. By incorporating a similarity-preserving term into the sparse coding objective, Laplacian sparse coding alleviates the instability of sparse codes. Furthermore, we propose Hypergraph Laplacian sparse coding (HLSc), which extends LSc to the case where the similarity among instances is defined by a hypergraph. Specifically, HLSc simultaneously captures the similarity among instances within the same hyperedge and encourages their sparse codes to be similar to each other. Both LSc and HLSc enhance the robustness of sparse coding. We apply Laplacian sparse coding to feature quantization in Bag-of-Words image representation, where it outperforms sparse coding and achieves good performance on image classification. Hypergraph Laplacian sparse coding is also successfully applied to semi-automatic image tagging. The good performance in these applications demonstrates the effectiveness of our formulations in preserving locality and similarity.
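In graph-regularized methods of this kind, the similarity-preserving term is typically a Laplacian penalty tr(S L Sᵀ) that is small when similar instances receive similar codes; the exact objective here is an assumption, but the penalty itself can be sketched as:

```python
import numpy as np

def laplacian_penalty(S, W):
    # S: (k, n) sparse codes, one column per instance.
    # W: (n, n) symmetric similarity matrix among instances.
    L = np.diag(W.sum(axis=1)) - W       # graph Laplacian
    # tr(S L S^T) = 0.5 * sum_ij W_ij * ||s_i - s_j||^2
    return np.trace(S @ L @ S.T)

W = np.array([[0.0, 1.0], [1.0, 0.0]])         # two similar instances
S_close = np.array([[1.0, 1.0], [0.0, 0.0]])   # similar codes
S_far = np.array([[1.0, -1.0], [0.0, 0.0]])    # dissimilar codes
print(laplacian_penalty(S_close, W), laplacian_penalty(S_far, W))
```

Adding this penalty to the reconstruction-plus-sparsity objective is what ties the otherwise independent per-instance coding problems together.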


Subject(s)
Algorithms , Artificial Intelligence , Data Compression/methods , Decision Support Techniques , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods
7.
IEEE Trans Image Process ; 22(2): 423-34, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23014744

ABSTRACT

Recent research has shown the initial success of sparse coding (Sc) in solving many computer vision tasks. Motivated by the fact that the kernel trick can capture the nonlinear similarity of features, which helps in finding a sparse representation of nonlinear features, we propose kernel sparse representation (KSR). Essentially, KSR is sparse coding in a high-dimensional feature space mapped by an implicit mapping function. We apply KSR to feature coding in image classification, face recognition, and kernel matrix approximation. More specifically, by incorporating KSR into spatial pyramid matching (SPM), we develop KSRSPM, which achieves good performance for image classification. Moreover, KSR-based feature coding can be viewed as a generalization of the efficient match kernel and an extension of Sc-based SPM. We further show that KSR with a histogram intersection kernel (HIK) can be considered a soft-assignment extension of HIK-based feature quantization in the feature coding process. Beyond feature coding, compared with sparse coding, KSR learns more discriminative sparse codes and achieves higher accuracy for face recognition. KSR can also be applied to kernel matrix approximation in large-scale learning tasks, where it is robust, especially when only a small fraction of the data is used. Extensive experiments show promising results for KSR in image classification, face recognition, and kernel matrix approximation, demonstrating its effectiveness in computer vision and machine learning tasks.


Subject(s)
Algorithms , Biometric Identification/methods , Image Processing, Computer-Assisted/methods , Databases, Factual , Female , Humans , Male
8.
IEEE Trans Neural Netw Learn Syst ; 23(3): 504-18, 2012 Mar.
Article in English | MEDLINE | ID: mdl-24808555

ABSTRACT

In this paper, we propose a new framework called the domain adaptation machine (DAM) for the multiple-source domain adaptation problem. Under this framework, we learn a robust decision function (referred to as the target classifier) for label prediction on instances from the target domain by leveraging a set of base classifiers prelearned using labeled instances either from the source domains or from both the source domains and the target domain. With these base classifiers, we propose a new domain-dependent regularizer based on a smoothness assumption, which enforces that the target classifier shares similar decision values with the relevant base classifiers on unlabeled instances from the target domain. This regularizer can be readily incorporated into many kernel methods (e.g., support vector machines (SVM), support vector regression, and least-squares SVM (LS-SVM)). Based on it, we develop two new domain adaptation methods, FastDAM and UniverDAM. In FastDAM, we introduce the domain-dependent regularizer into LS-SVM and additionally employ a sparsity regularizer to learn a sparse target classifier whose support vectors come only from the target domain, which makes label prediction on any test instance very fast. In UniverDAM, we additionally use the instances from the source domains as a Universum to further enhance the generalization ability of the target classifier. We evaluate the two methods on the challenging TRECVID 2005 dataset for large-scale video concept detection, as well as on the 20 Newsgroups and email spam datasets for document retrieval. Comprehensive experiments demonstrate that FastDAM and UniverDAM outperform existing multiple-source domain adaptation methods on both applications.
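The domain-dependent regularizer described above penalizes disagreement between the target classifier and the relevant base classifiers on unlabeled target instances. A minimal sketch (the particular weighting scheme below is an assumption, not the paper's exact form):

```python
import numpy as np

def domain_regularizer(f_target, base_preds, relevance):
    # f_target:   (n,) target-classifier decision values on unlabeled target data
    # base_preds: (m, n) decision values of m prelearned base classifiers
    # relevance:  (m,) nonnegative weights, larger for more relevant sources
    diffs = (base_preds - f_target[None, :]) ** 2
    return float((relevance[:, None] * diffs).sum())

f = np.array([0.5, -0.2, 0.9])
bases = np.array([[0.5, -0.2, 0.9],    # agrees with the target classifier
                  [1.0,  1.0, 1.0]])   # disagrees
rel = np.array([1.0, 1.0])
print(domain_regularizer(f, bases, rel))
```

In FastDAM this smoothness term is added to the LS-SVM objective alongside a sparsity regularizer, so the learned target classifier both agrees with relevant sources and stays sparse.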

9.
IEEE Trans Pattern Anal Mach Intell ; 34(9): 1667-80, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22201057

ABSTRACT

We propose a visual event recognition framework for consumer videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). Observing that consumer videos generally contain large intraclass variations within the same type of events, we first propose a new method, called Aligned Space-Time Pyramid Matching (ASTPM), to measure the distance between any two video clips. Second, we propose a new transfer learning method, referred to as Adaptive Multiple Kernel Learning (A-MKL), in order to 1) fuse the information from multiple pyramid levels and features (i.e., space-time features and static SIFT features) and 2) cope with the considerable variation in feature distributions between videos from two domains (i.e., web video domain and consumer video domain). For each pyramid level and each type of local features, we first train a set of SVM classifiers based on the combined training set from two domains by using multiple base kernels from different kernel types and parameters, which are then fused with equal weights to obtain a prelearned average classifier. In A-MKL, for each event class we learn an adapted target classifier based on multiple base kernels and the prelearned average classifiers from this event class or all the event classes by minimizing both the structural risk functional and the mismatch between data distributions of two domains. Extensive experiments demonstrate the effectiveness of our proposed framework that requires only a small number of labeled consumer videos by leveraging web data. We also conduct an in-depth investigation on various aspects of the proposed method A-MKL, such as the analysis on the combination coefficients on the prelearned classifiers, the convergence of the learning algorithm, and the performance variation by using different proportions of labeled consumer videos. 
Moreover, we show that A-MKL using the prelearned classifiers from all the event classes leads to better performance than A-MKL using the prelearned classifiers from each individual event class only.


Subject(s)
Human Activities/classification , Internet , Pattern Recognition, Automated/methods , Support Vector Machine , Video Recording/methods , Humans , Reproducibility of Results
10.
Int J Comput Biol Drug Des ; 4(2): 194-215, 2011.
Article in English | MEDLINE | ID: mdl-21712568

ABSTRACT

Most phenotype-identification methods in cell-based screening assume prior knowledge about the expected phenotypes or involve intricate parameter setting. They are useful for analyses targeting known phenotype properties, but there remains a need to explore, with minimal presumptions, the potentially interesting phenotypes derivable from the data. We present a method for this exploration, using clustering to remove the phenotype-labelling requirement and GUI visualisation to facilitate parameter setting. The steps are: outlier removal, cell clustering, and interactive visualisation for phenotype refinement. For drug-siRNA studies, we introduce an auto-merging procedure to reduce phenotype redundancy. We validated the method on two Golgi apparatus screens and showcase its contribution to a better understanding of screening images.


Subject(s)
Drug Evaluation, Preclinical/statistics & numerical data , RNA, Small Interfering/genetics , Cluster Analysis , Data Interpretation, Statistical , Databases, Factual , Golgi Apparatus/drug effects , Golgi Apparatus/genetics , HeLa Cells , Humans , Phenotype , Systems Biology/statistics & numerical data , User-Computer Interface
11.
IEEE Trans Image Process ; 20(11): 3280-90, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21659029

ABSTRACT

Given a textual query in traditional text-based image retrieval (TBIR), relevant images are reranked using visual features after the initial text-based search. In this paper, we propose a new bag-based reranking framework for large-scale TBIR. Specifically, we first cluster relevant images using both textual and visual features. By treating each cluster as a "bag" and the images in it as "instances," we formulate reranking as a multi-instance (MI) learning problem, so MI learning methods such as mi-SVM can be readily incorporated into our framework. Observing that at least a portion of a positive bag consists of positive instances while a negative bag may also contain positive instances, we adopt a more suitable generalized MI (GMI) setting for this application. To address the ambiguity of instance labels in the positive and negative bags under this setting, we develop a new method, GMI-SVM, which enhances retrieval performance by propagating labels from the bag level to the instance level. To acquire bag annotations for (G)MI learning, we propose a bag ranking method that ranks all bags according to a defined bag ranking score. The top-ranked bags are used as pseudo-positive training bags, while pseudo-negative training bags are obtained by randomly sampling a few irrelevant images not associated with the textual query. Comprehensive experiments on the challenging real-world dataset NUS-WIDE demonstrate that our framework with automatic bag annotation achieves the best performance among existing image reranking methods, and that GMI-SVM performs even better when using manually labeled training bags obtained from relevance feedback.


Subject(s)
Algorithms , Image Enhancement/methods , Internet , Pattern Recognition, Automated/methods , Databases, Factual , Image Interpretation, Computer-Assisted/methods , Semantics
12.
IEEE Trans Pattern Anal Mach Intell ; 33(5): 1022-36, 2011 May.
Article in English | MEDLINE | ID: mdl-20714015

ABSTRACT

The rapid popularization of digital cameras and mobile phone cameras has led to an explosive growth of personal photo collections by consumers. In this paper, we present a real-time textual query-based personal photo retrieval system by leveraging millions of Web images and their associated rich textual descriptions (captions, categories, etc.). After a user provides a textual query (e.g., "water"), our system exploits the inverted file to automatically find the positive Web images that are related to the textual query "water" as well as the negative Web images that are irrelevant to it. Based on these automatically retrieved relevant and irrelevant Web images, we employ three simple but effective classification methods, k-Nearest Neighbor (kNN), decision stumps, and linear SVM, to rank personal photos. To further improve retrieval performance, we propose two relevance feedback methods via cross-domain learning, which effectively utilize both the Web images and the personal images. In particular, our proposed cross-domain learning methods can learn robust classifiers in real time with only a very limited amount of labeled personal photos from the user by leveraging the prelearned linear SVM classifiers. We further propose an incremental cross-domain learning method to significantly accelerate the relevance feedback process on large consumer photo databases. Extensive experiments on two consumer photo data sets demonstrate the effectiveness and efficiency of our system, which is also inherently not limited by any predefined lexicon.
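The kNN ranking step can be sketched as follows. This is a simplified sketch under assumed details: each personal photo is scored by its mean similarity to its k most similar positive Web images for the query, and the function names are illustrative:

```python
import numpy as np

def knn_rank(personal_feats, web_pos_feats, k=3):
    # Score each personal photo by the mean similarity to its k nearest
    # positive Web images, then rank photos by descending score.
    sims = personal_feats @ web_pos_feats.T          # dot-product similarity
    topk = np.sort(sims, axis=1)[:, -k:]             # k most similar Web images
    scores = topk.mean(axis=1)
    return np.argsort(-scores), scores

rng = np.random.default_rng(1)
water = rng.normal(size=8); water /= np.linalg.norm(water)
web = water + 0.1 * rng.normal(size=(5, 8))          # Web images for "water"
photos = np.stack([water, -water])                   # one relevant, one not
order, scores = knn_rank(photos, web)
print(order[0])  # the relevant photo ranks first → 0
```

The appeal of this baseline is that it needs no training at query time, which is why it fits the system's real-time requirement.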

13.
IEEE Trans Image Process ; 19(7): 1921-32, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20215078

ABSTRACT

We propose a unified manifold learning framework for semi-supervised and unsupervised dimension reduction by employing a simple but effective linear regression function to map new data points. For semi-supervised dimension reduction, we aim to find the optimal prediction labels F for all the training samples X, the linear regression function h(X), and the regression residue F0 = F - h(X) simultaneously. Our new objective function integrates two terms related to label fitness and manifold smoothness, as well as a flexible penalty term defined on the residue F0. Our semi-supervised learning framework, referred to as flexible manifold embedding (FME), can effectively utilize label information from labeled data as well as the manifold structure of both labeled and unlabeled data. By modeling the mismatch between h(X) and F, we show that FME relaxes the hard linear constraint F = h(X) in manifold regularization (MR), making it better cope with data sampled from a nonlinear manifold. In addition, we propose a simplified version (referred to as FME/U) for unsupervised dimension reduction. We also show that our framework provides a unified view for explaining and understanding many semi-supervised, supervised, and unsupervised dimension reduction techniques. Comprehensive experiments on several benchmark databases demonstrate significant improvements over existing dimension reduction algorithms.
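The relaxation at the core of FME can be sketched as follows: rather than forcing F = h(X), fit a linear map h(X) = XᵀW + 1b to given targets and keep the residue F0 = F - h(X). The closed-form ridge fit below is an illustrative stand-in for the full objective, which also includes the label-fitness and manifold-smoothness terms:

```python
import numpy as np

def fit_flexible(X, F, lam=1e-2):
    # X: (d, n) data, one column per sample; F: (n, c) target labels/embedding.
    # Fit h(X) = X^T W + 1 b by ridge regression, return W, b, and the
    # residue F0 = F - h(X) that FME leaves free instead of forcing to zero.
    Xc = X - X.mean(axis=1, keepdims=True)
    Fc = F - F.mean(axis=0, keepdims=True)
    W = np.linalg.solve(Xc @ Xc.T + lam * np.eye(X.shape[0]), Xc @ Fc)
    b = F.mean(axis=0) - X.mean(axis=1) @ W
    F0 = F - (X.T @ W + b)
    return W, b, F0

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))
F = X.T @ rng.normal(size=(4, 2)) + 0.01 * rng.normal(size=(50, 2))
W, b, F0 = fit_flexible(X, F)
print(np.abs(F0).max() < 0.1)  # near-linear data leaves only a small residue
```

For data on a nonlinear manifold the residue F0 would be large, and penalizing it softly rather than forbidding it is exactly the flexibility FME adds over manifold regularization.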

14.
IEEE Trans Neural Netw ; 18(3): 778-85, 2007 May.
Article in English | MEDLINE | ID: mdl-17526343

ABSTRACT

Single-class minimax probability machines (MPMs) offer robust novelty detection with distribution-free worst case bounds on the probability that a pattern will fall inside the normal region. However, in practice, they are too cautious in labeling patterns as outlying and so have a high false negative rate (FNR). In this paper, we propose a more aggressive version of the single-class MPM that bounds the best case probability that a pattern will fall inside the normal region. These two MPMs can then be used together to delimit the solution space. By using the hyperplane lying in the middle of this pair of MPMs, a better compromise between false positives (FPs) and false negatives (FNs), and between recall and precision can be obtained. Experiments on the real-world data sets show encouraging results.
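The compromise between the two single-class MPMs can be sketched as taking the hyperplane midway between the cautious (worst-case) and aggressive (best-case) boundaries. This is a deliberately simplified illustration that assumes both MPMs share the direction w; the MPM optimization itself is omitted:

```python
def midpoint_boundary(w, b_cautious, b_aggressive):
    # Take the offset halfway between the worst-case boundary (high FNR)
    # and the best-case boundary (high FPR).
    return w, (b_cautious + b_aggressive) / 2.0

w, b = midpoint_boundary([1.0, 0.0], -2.0, -1.0)
# A pattern x is labeled normal when w . x + b >= 0.
is_normal = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b >= 0
print(is_normal([1.8, 0.0]), is_normal([1.2, 0.0]))  # → True False
```

Patterns accepted by the cautious boundary but rejected by the aggressive one fall in the disputed band; the midpoint splits that band, trading false negatives against false positives.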


Subject(s)
Algorithms , Artificial Intelligence , Creativity , Decision Support Techniques , Game Theory , Models, Statistical , Pattern Recognition, Automated/methods , Computer Simulation , Information Storage and Retrieval/methods , Neural Networks, Computer
15.
IEEE Trans Neural Netw ; 17(5): 1126-40, 2006 Sep.
Article in English | MEDLINE | ID: mdl-17001975

ABSTRACT

Kernel methods, such as the support vector machine (SVM), are often formulated as quadratic programming (QP) problems. However, given m training patterns, a naive implementation of the QP solver takes O(m^3) training time and at least O(m^2) space. Hence, scaling up these QPs is a major stumbling block in applying kernel methods on very large data sets, and a replacement of the naive method for finding the QP solutions is highly desirable. Recently, by using approximation algorithms for the minimum enclosing ball (MEB) problem, we proposed the core vector machine (CVM) algorithm that is much faster and can handle much larger data sets than existing SVM implementations. However, the CVM can only be used with certain kernel functions and kernel methods. For example, the very popular support vector regression (SVR) cannot be used with the CVM. In this paper, we introduce the center-constrained MEB problem and subsequently extend the CVM algorithm. The generalized CVM algorithm can now be used with any linear/nonlinear kernel and can also be applied to kernel methods such as SVR and the ranking SVM. Moreover, like the original CVM, its asymptotic time complexity is again linear in m and its space complexity is independent of m. Experiments show that the generalized CVM has comparable performance with state-of-the-art SVM and SVR implementations, but is faster and produces fewer support vectors on very large data sets.
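The (1+ε)-approximate minimum enclosing ball underlying the CVM can be sketched with the simple Badoiu-Clarkson core-set iteration, which repeatedly pulls the center toward the farthest point. This is a generic MEB sketch in input space, not the CVM implementation itself (which works in the kernel-induced feature space):

```python
import numpy as np

def approx_meb(points, iters=1000):
    # Badoiu-Clarkson iteration: c_{t+1} = c_t + (p_far - c_t) / (t + 1).
    # After t steps the radius is within a (1 + 1/t) factor of optimal.
    c = points[0].astype(float)
    for t in range(1, iters + 1):
        far = points[np.argmax(np.linalg.norm(points - c, axis=1))]
        c = c + (far - c) / (t + 1)
    radius = np.linalg.norm(points - c, axis=1).max()
    return c, radius

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
c, r = approx_meb(pts)  # exact MEB: center (1, 0), radius 1
print(np.round(c, 2), round(float(r), 2))
```

Each iteration touches only one farthest-point query, which is what gives the CVM its time complexity linear in m.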


Subject(s)
Algorithms , Artificial Intelligence , Computing Methodologies , Pattern Recognition, Automated/methods , Cluster Analysis , Neural Networks, Computer
16.
IEEE Trans Neural Netw ; 17(1): 48-58, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16526475

ABSTRACT

The kernel function plays a central role in kernel methods. Most existing methods can only adapt the kernel parameters or the kernel matrix based on empirical data. Recently, Ong et al. introduced the method of hyperkernels which can be used to learn the kernel function directly in an inductive setting. However, the associated optimization problem is a semidefinite program (SDP), which is very computationally expensive, even with the recent advances in interior point methods. In this paper, we show that this learning problem can be equivalently reformulated as a second-order cone program (SOCP), which can then be solved more efficiently than SDPs. Comparison is also made with the kernel matrix learning method proposed by Lanckriet et al. Experimental results on both classification and regression problems, with toy and real-world data sets, show that our proposed SOCP formulation has significant speedup over the original SDP formulation. Moreover, it yields better generalization than Lanckriet et al.'s method, with a speed that is comparable, or sometimes even faster, than their quadratically constrained quadratic program (QCQP) formulation.


Subject(s)
Artificial Intelligence , Software , Algorithms , Data Interpretation, Statistical , Disease/classification , Normal Distribution , Regression Analysis
17.
IEEE Trans Neural Netw ; 15(6): 1517-25, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15565778

ABSTRACT

In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as when using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method, which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra, and does not suffer from numerical instability or local minimum problems. Evaluations on kernel PCA and kernel clustering on the USPS data set show much improved performance.
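The underlying idea of locating a point from distance constraints can be illustrated in input space: given distances from the unknown pre-image to a set of known anchor points, subtracting one constraint from the rest linearizes the problem into a least-squares system (classic trilateration). This is only an analogy for the paper's feature-space construction, not its exact algorithm:

```python
import numpy as np

def preimage_from_distances(anchors, d):
    # Recover x from distances d_i = ||x - a_i|| to known anchors a_i by
    # subtracting the first constraint to eliminate the ||x||^2 term.
    a0, d0 = anchors[0], d[0]
    A = 2.0 * (anchors[1:] - a0)
    b = d0**2 - d[1:]**2 + (anchors[1:]**2).sum(1) - (a0**2).sum()
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
true_x = np.array([0.3, 0.7])
d = np.linalg.norm(anchors - true_x, axis=1)
print(np.allclose(preimage_from_distances(anchors, d), true_x))  # → True
```

Because the system is linear, the solution is noniterative and free of local minima, which mirrors the advantage the paper claims over nonlinear optimization.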


Subject(s)
Algorithms , Artificial Intelligence , Decision Support Techniques , Feedback , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Cluster Analysis , Computer Simulation , Neural Networks, Computer , Principal Component Analysis
18.
IEEE Trans Neural Netw ; 15(6): 1555-61, 2004 Nov.
Article in English | MEDLINE | ID: mdl-15565781

ABSTRACT

Many vision-related processing tasks, such as edge detection, image segmentation, and stereo matching, can be performed more easily when all objects in the scene are in good focus. However, in practice this may not always be feasible, as optical lenses, especially those with long focal lengths, have only a limited depth of field. One common approach to recovering an everywhere-in-focus image is wavelet-based image fusion. First, several source images of the same scene with different focuses are taken and processed with the discrete wavelet transform (DWT). Among these wavelet decompositions, the wavelet coefficient with the largest magnitude is selected at each pixel location. Finally, the fused image is recovered by performing the inverse DWT. In this paper, we improve this fusion procedure by applying the discrete wavelet frame transform (DWFT) and support vector machines (SVM). Unlike the DWT, the DWFT yields a translation-invariant signal representation. Using features extracted from the DWFT coefficients, an SVM is trained to select the source image that has the best focus at each pixel location, and the corresponding DWFT coefficients are then incorporated into the composite wavelet representation. Experimental results show that the proposed method outperforms the traditional approach both visually and quantitatively.
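The baseline DWT fusion rule described above (keep the larger-magnitude detail coefficient at each location) can be sketched in one dimension with a Haar transform; the Haar wavelet and the averaging of approximation bands are illustrative choices, and the paper's contribution is to replace this max-magnitude rule with an SVM over DWFT features:

```python
import numpy as np

def haar1(x):
    # Single-level 1-D Haar transform: (approximation, detail) bands.
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def ihaar1(a, d):
    # Inverse single-level Haar transform.
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def fuse(x1, x2):
    # Max-magnitude rule: keep the larger detail coefficient at each
    # position (sharper focus); average the approximation bands.
    a1, d1 = haar1(x1)
    a2, d2 = haar1(x2)
    d = np.where(np.abs(d1) >= np.abs(d2), d1, d2)
    return ihaar1((a1 + a2) / 2, d)

sharp = np.array([1.0, -1.0, 1.0, -1.0])   # strong detail: in focus
blurry = np.array([0.0, 0.0, 0.0, 0.0])    # weak detail: out of focus
print(np.allclose(fuse(sharp, blurry), sharp))  # in-focus signal survives
```

Because in-focus regions produce large-magnitude detail coefficients, this simple rule already recovers the sharper source at each location; the SVM generalizes the decision beyond raw magnitude.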


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Subtraction Technique , Computer Simulation , Computing Methodologies , Fixation, Ocular , Image Enhancement/methods , Neural Networks, Computer