1.
Article in English | MEDLINE | ID: mdl-37342945

ABSTRACT

Data augmentation is an effective way to improve the generalization of deep learning models. However, the underlying augmentation methods mainly rely on handcrafted operations, such as flipping and cropping for image data. These augmentation methods are often designed based on human expertise or repeated trials. Meanwhile, automated data augmentation (AutoDA) is a promising research direction that frames the data augmentation process as a learning task and finds the most effective way to augment the data. In this survey, we categorize recent AutoDA methods into the composition-, mixing-, and generation-based approaches and analyze each category in detail. Based on the analysis, we discuss the challenges and future prospects as well as provide guidelines for applying AutoDA methods by considering the dataset, computation effort, and availability of domain-specific transformations. It is hoped that this article can provide a useful list of AutoDA methods and guidelines for data practitioners when deploying AutoDA in practice. The survey can also serve as a reference for further study by researchers in this emerging research area.
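
As a point of reference for the handcrafted operations named in the abstract, flipping and cropping can be sketched as follows. This is a minimal illustration in plain Python on a nested-list image; the function names `hflip` and `random_crop` and the toy 3 x 3 image are hypothetical and do not come from the surveyed article.

```python
import random

def hflip(image):
    """Mirror each row of a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def random_crop(image, height, width, rng=random):
    """Cut a height x width window at a random position inside the image."""
    rows, cols = len(image), len(image[0])
    if height > rows or width > cols:
        raise ValueError("crop window is larger than the image")
    top = rng.randint(0, rows - height)    # randint is inclusive on both ends
    left = rng.randint(0, cols - width)
    return [row[left:left + width] for row in image[top:top + height]]

# Hypothetical toy image used only for illustration.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
flipped = hflip(image)
patch = random_crop(image, 2, 2, random.Random(0))
```

An AutoDA method, as the abstract describes it, would learn which such operations to apply and how strongly, rather than fixing them by hand.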

2.
IEEE Trans Cybern ; 46(1): 27-38, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26208374

ABSTRACT

In recent years, multimedia retrieval has sparked much research interest in the multimedia, pattern recognition, and data mining communities. Although some attempts have been made in this direction, performing fast multimodal search at very large scale remains a major challenge in the area. While hashing-based methods have recently achieved promising successes in speeding up large-scale similarity search, most existing methods are designed only for unimodal data, making them unsuitable for multimodal multimedia retrieval. In this paper, we propose a new hashing-based method for fast multimodal multimedia retrieval. The method is based on spectral analysis of the correlation matrix of different modalities. We also develop an efficient algorithm that learns some parameters from the data distribution for obtaining the binary codes. We empirically compare our method with some state-of-the-art methods on two real-world multimedia data sets.

3.
IEEE Trans Syst Man Cybern B Cybern ; 42(2): 308-19, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22010154

ABSTRACT

Traditional learning algorithms use only labeled data for training. However, labeled examples are often difficult or time consuming to obtain since they require substantial human labeling efforts. On the other hand, unlabeled data are often relatively easy to collect. Semisupervised learning addresses this problem by using large quantities of unlabeled data with labeled data to build better learning algorithms. In this paper, we use the manifold regularization approach to formulate the semisupervised learning problem where a regularization framework which balances a tradeoff between loss and penalty is established. We investigate different implementations of the loss function and identify the methods which have the least computational expense. The regularization hyperparameter, which determines the balance between loss and penalty, is crucial to model selection. Accordingly, we derive an algorithm that can fit the entire path of solutions for every value of the hyperparameter. Its computational complexity after preprocessing is quadratic only in the number of labeled examples rather than the total number of labeled and unlabeled examples.


Subject(s)
Algorithms , Artificial Intelligence , Models, Theoretical , Biometric Identification , Databases, Factual , Humans , Pattern Recognition, Automated/methods
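
The tradeoff between loss and penalty described in the abstract of item 3 can be sketched with a graph-Laplacian smoothness penalty. This is only an illustrative sketch: the squared loss, the chain graph, and the names `laplacian_penalty` and `regularized_objective` are assumptions, not the formulation used in the paper, which compares several loss functions.

```python
def laplacian_penalty(f, weights):
    """Return f^T L f for the graph Laplacian L = D - W.

    f       : list of predictions, one per labeled or unlabeled point
    weights : symmetric matrix of nonnegative edge weights, weights[i][j]
    """
    n = len(f)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # For symmetric W, f^T L f equals the sum over i < j of
            # w_ij * (f_i - f_j)^2.
            total += weights[i][j] * (f[i] - f[j]) ** 2
    return total

def regularized_objective(f, labels, weights, lam):
    """Squared loss on labeled points plus lam times the manifold penalty.

    labels maps a point index to its target; unlabeled points are absent.
    The squared loss is an assumption made for this sketch.
    """
    loss = sum((f[i] - y) ** 2 for i, y in labels.items())
    return loss + lam * laplacian_penalty(f, weights)

# Hypothetical chain graph 0 - 1 - 2 with the middle point unlabeled.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
f = [1.0, 0.0, -1.0]
labels = {0: 1.0, 2: -1.0}
penalty = laplacian_penalty(f, W)
objective = regularized_objective(f, labels, W, lam=0.5)
```

Here `lam` plays the role of the regularization hyperparameter: a larger value favors predictions that vary smoothly over the graph, while a smaller value favors fitting the labeled points.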
4.
IEEE Trans Neural Netw ; 22(8): 1207-17, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21724506

ABSTRACT

Generalized discriminant analysis (GDA) is a commonly used method for dimensionality reduction. In its general form, it seeks a nonlinear projection that simultaneously maximizes the between-class dissimilarity and minimizes the within-class dissimilarity to increase class separability. In real-world applications where labeled data are scarce, GDA may not work very well. However, unlabeled data are often available in large quantities at very low cost. In this paper, we propose a novel GDA algorithm which is abbreviated as semisupervised generalized discriminant analysis (SSGDA). We utilize unlabeled data to maximize an optimality criterion of GDA and formulate the problem as an optimization problem that is solved using the constrained concave-convex procedure. The optimization procedure leads to estimation of the class labels for the unlabeled data. We propose a novel confidence measure and a method for selecting those unlabeled data points whose labels are estimated with high confidence. The selected unlabeled data can then be used to augment the original labeled dataset for performing GDA. We also propose a variant of SSGDA, called M-SSGDA, which adopts the manifold assumption to utilize the unlabeled data. Extensive experiments on many benchmark datasets demonstrate the effectiveness of our proposed methods.


Subject(s)
Algorithms , Databases, Factual/classification , Discriminant Analysis , Databases, Factual/economics
5.
IEEE Trans Neural Netw ; 19(10): 1753-67, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18842479

ABSTRACT

Regularization path algorithms were recently proposed as a novel approach to the model selection problem, exploring the path of possibly all solutions with respect to some regularization hyperparameter in an efficient way. This approach was later extended to a support vector regression (SVR) model called epsilon-SVR. However, the method requires that the error parameter epsilon be set a priori. This is only possible if the desired accuracy of the approximation can be specified in advance. In this paper, we analyze the solution space for epsilon-SVR and propose a new solution path algorithm, called the epsilon-path algorithm, which traces the solution path with respect to the hyperparameter epsilon rather than lambda. Although both solution path algorithms possess the desirable piecewise linearity property, our epsilon-path algorithm overcomes some limitations of the original lambda-path algorithm and offers additional advantages. It is thus more appealing for practical use.


Subject(s)
Algorithms , Artificial Intelligence , Models, Statistical , Neural Networks, Computer , Numerical Analysis, Computer-Assisted , Computer Simulation , Feedback , Regression Analysis
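
To make the role of the error parameter epsilon in item 5 concrete, the standard epsilon-insensitive loss used in SVR can be sketched as below. This shows only the loss, not the epsilon-path algorithm itself; the function names and the toy targets and predictions are hypothetical.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Residuals inside the epsilon tube cost nothing; larger ones cost linearly."""
    return max(0.0, abs(y_true - y_pred) - eps)

def total_loss(targets, preds, eps):
    """Sum the epsilon-insensitive loss over all points."""
    return sum(eps_insensitive_loss(t, p, eps) for t, p in zip(targets, preds))

# Hypothetical data with residuals 0, 0.5, and 2.0.
targets = [1.0, 2.0, 3.0]
preds = [1.0, 2.5, 1.0]

# The same predictions are penalized differently as epsilon varies.
losses = {eps: total_loss(targets, preds, eps) for eps in (0.0, 0.5, 2.0)}
```

Because the loss changes with epsilon for a fixed model, choosing epsilon in advance amounts to fixing the desired accuracy, which is the requirement the abstract says a path over epsilon avoids.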
6.
Neural Comput ; 20(11): 2839-61, 2008 Nov.
Article in English | MEDLINE | ID: mdl-18439136

ABSTRACT

In recent years, metric learning in the semisupervised setting has aroused a lot of research interest. One type of semisupervised metric learning utilizes supervisory information in the form of pairwise similarity or dissimilarity constraints. However, most methods proposed so far are either limited to linear metric learning or unable to scale well with the data set size. In this letter, we propose a nonlinear metric learning method based on the kernel approach. By applying low-rank approximation to the kernel matrix, our method can handle significantly larger data sets. Moreover, our low-rank approximation scheme can naturally lead to out-of-sample generalization. Experiments performed on both artificial and real-world data show very promising results.


Subject(s)
Algorithms , Artificial Intelligence , Learning , Pattern Recognition, Automated/methods , Computer Simulation , Humans
7.
IEEE Trans Neural Netw ; 18(1): 141-9, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17278468

ABSTRACT

While distance function learning for supervised learning tasks has a long history, extending it to learning tasks with weaker supervisory information has only been studied recently. In particular, some methods have been proposed for semisupervised metric learning based on pairwise similarity or dissimilarity information. In this paper, we propose a kernel approach for semisupervised metric learning and present in detail two special cases of this kernel approach. The metric learning problem is thus formulated as an optimization problem for kernel learning. An attractive property of the optimization problem is that it is convex and, hence, has no local optima. While a closed-form solution exists for the first special case, the second case is solved using an iterative majorization procedure to estimate the optimal solution asymptotically. Experimental results based on both synthetic and real-world data show that this new kernel approach is promising for nonlinear metric learning.


Subject(s)
Algorithms , Artificial Intelligence , Computing Methodologies , Models, Theoretical , Pattern Recognition, Automated/methods , Cluster Analysis , Computer Simulation
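
The link between a kernel and a nonlinear metric in item 7 rests on the fact that any kernel induces a distance in its feature space. The sketch below illustrates that identity and checks pairwise similarity and dissimilarity constraints against it. The fixed RBF kernel, the distance threshold, and all function names are assumptions for illustration; the paper learns the kernel by convex optimization rather than fixing it.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, chosen here only as an example kernel."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance(x, y, kernel):
    """Distance in the feature space induced by the kernel:
    d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y)."""
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

def violated(similar, dissimilar, kernel, threshold):
    """Count pairs on the wrong side of a (hypothetical) distance threshold."""
    bad = sum(1 for x, y in similar
              if kernel_distance(x, y, kernel) > threshold)
    bad += sum(1 for x, y in dissimilar
               if kernel_distance(x, y, kernel) <= threshold)
    return bad

# Hypothetical constraint sets: one similar pair, one dissimilar pair.
similar = [((0.0, 0.0), (0.1, 0.0))]
dissimilar = [((0.0, 0.0), (3.0, 0.0))]
n_violated = violated(similar, dissimilar, rbf_kernel, threshold=1.0)
```

A learned kernel would be judged by the same kind of check: similar pairs should end up close and dissimilar pairs far apart under the induced distance.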