Results 1 - 20 of 25
1.
Article in English | MEDLINE | ID: mdl-37022881

ABSTRACT

Motivated both by the commonly used "from wholly coarse to locally fine" cognitive behavior and by the recent finding that a simple yet interpretable linear regression model can serve as a basic component of a classifier, a novel hybrid ensemble classifier called the hybrid Takagi-Sugeno-Kang fuzzy classifier (H-TSK-FC) and its residual sketch learning (RSL) method are proposed. H-TSK-FC shares the virtues of both deep and wide interpretable fuzzy classifiers and simultaneously offers both feature-importance-based and linguistic interpretability. The RSL method proceeds as follows: 1) a global linear regression subclassifier on all original features of all training samples is generated quickly by a sparse-representation-based training procedure to identify the importance of each feature and to partition the output residuals of the incorrectly classified training samples into several residual sketches; 2) using the enhanced soft subspace clustering method (ESSC) for the linguistically interpretable antecedents of the fuzzy rules and the least learning machine (LLM) for their consequents, several interpretable Takagi-Sugeno-Kang (TSK) fuzzy subclassifiers are generated on the residual sketches and stacked in parallel to achieve local refinements; and 3) the final prediction is made by assigning minimal-distance-based priority over all constructed subclassifiers, which further enhances H-TSK-FC's generalization capability and determines which interpretable prediction route is used. In contrast to existing deep or wide interpretable TSK fuzzy classifiers, and benefiting from its feature-importance-based interpretability, H-TSK-FC is experimentally shown to run faster and to offer better linguistic interpretability (i.e., fewer rules and/or TSK fuzzy subclassifiers and lower model complexity) while keeping at least comparable generalization capability.
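The following is a minimal sketch of the "coarse global linear model first, local corrections on clustered residuals" idea described above. It is not the authors' H-TSK-FC/RSL implementation: the sparse-representation training procedure, ESSC antecedents, and LLM consequents are replaced here with off-the-shelf Lasso, KMeans, and Ridge, and the function names (e.g., fit_residual_sketches) are hypothetical.

```python
# Illustrative sketch only: a coarse global linear model followed by local
# corrections fitted on clustered residuals of misclassified samples.
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.cluster import KMeans

def fit_residual_sketches(X, y, n_sketches=3):
    # 1) global sparse linear subclassifier (feature importance = |coef|)
    glob = Lasso(alpha=0.01).fit(X, y)
    importance = np.abs(glob.coef_)

    # 2) residuals of misclassified samples, partitioned into "sketches"
    pred = glob.predict(X)
    wrong = np.sign(pred) != np.sign(y)              # assumes y in {-1, +1}
    resid = y[wrong] - pred[wrong]
    km = KMeans(n_clusters=n_sketches, n_init=10).fit(X[wrong])
    locals_ = [Ridge(alpha=1.0).fit(X[wrong][km.labels_ == k],
                                    resid[km.labels_ == k])
               for k in range(n_sketches)]
    return glob, km, locals_, importance

def predict(glob, km, locals_, X):
    # 3) minimal-distance priority: add the correction of the nearest sketch
    base = glob.predict(X)
    nearest = km.predict(X)
    corr = np.array([locals_[k].predict(x[None, :])[0]
                     for x, k in zip(X, nearest)])
    return np.sign(base + corr)
```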

2.
IEEE Trans Cybern ; 52(7): 6857-6871, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33284765

ABSTRACT

While input- or output-perturbation-based adversarial training techniques have been exploited to enhance the generalization capability of a variety of nonfuzzy and fuzzy classifiers by means of dynamic regularization, their performance may be very sensitive to inappropriate adversarial samples. To avoid this weakness while still ensuring enhanced generalization capability, this work explores a novel knowledge adversarial attack model for zero-order Takagi-Sugeno-Kang (TSK) fuzzy classifiers. The proposed model is motivated by the existence of special knowledge adversarial attacks, viewed from the perspective of the human-like thinking process involved in training an interpretable zero-order TSK fuzzy classifier. Unlike input- or output-perturbation-based adversarial attacks, and without any direct use of adversarial samples, the proposed model considers adversarial perturbations of interpretable zero-order fuzzy rules under knowledge oblivion, knowledge bias, or their ensemble, mimicking the robust use of knowledge in the human thinking process. Through dynamic regularization, the proposed model is theoretically justified to have strong generalization capability. Accordingly, a novel knowledge adversarial training method called KAT is devised to achieve promising generalization performance, interpretability, and fast training for zero-order TSK fuzzy classifiers. The effectiveness of KAT is demonstrated by experimental results on 15 benchmarking UCI and KEEL datasets.


Subject(s)
Fuzzy Logic , Neural Networks, Computer , Algorithms , Humans
3.
IEEE Trans Neural Netw Learn Syst ; 32(5): 1935-1948, 2021 May.
Article in English | MEDLINE | ID: mdl-32497008

ABSTRACT

Network embedding is a highly effective method to learn low-dimensional node vector representations that preserve the original network structure well. However, existing network embedding algorithms are mostly developed for a single network and thus fail to learn generalized feature representations across different networks. In this article, we study the cross-network node classification problem, which aims at leveraging the abundant labeled information from a source network to help classify the unlabeled nodes in a target network. To succeed in such a task, transferable features should be learned for nodes across different networks. To this end, a novel cross-network deep network embedding (CDNE) model is proposed that incorporates domain adaptation into deep network embedding in order to learn label-discriminative and network-invariant node vector representations. On the one hand, CDNE leverages network structures to capture the proximities between nodes within a network, mapping more strongly connected nodes to more similar latent vector representations. On the other hand, node attributes and labels are leveraged to capture the proximities between nodes across different networks, making nodes with the same labels across networks have aligned latent vector representations. Extensive experiments demonstrate that the proposed CDNE model significantly outperforms state-of-the-art network embedding algorithms in cross-network node classification.
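A central ingredient of learning "network-invariant" representations is a distribution-matching penalty between source and target embeddings. The sketch below shows one common choice, the maximum mean discrepancy (MMD) with an RBF kernel; it only illustrates the kind of domain-adaptation term the abstract refers to, not CDNE's exact objective, and the function name rbf_mmd and the bandwidth gamma are assumptions.

```python
# Illustrative sketch: RBF-kernel maximum mean discrepancy (MMD) between
# source and target node embeddings, a common domain-adaptation penalty.
import numpy as np

def rbf_mmd(Zs, Zt, gamma=1.0):
    """Squared MMD between embedding matrices Zs (source) and Zt (target)."""
    def k(A, B):
        d2 = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-gamma * d2)
    return k(Zs, Zs).mean() + k(Zt, Zt).mean() - 2.0 * k(Zs, Zt).mean()

# Usage idea: add lambda * rbf_mmd(Zs, Zt) to the embedding loss so that
# source and target embeddings become hard to tell apart.
```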

4.
IEEE Trans Cybern ; 50(4): 1556-1568, 2020 Apr.
Article in English | MEDLINE | ID: mdl-30307885

ABSTRACT

Network embedding has attracted increasing attention over the past few years. As an effective approach to graph mining problems, network embedding aims to learn a low-dimensional feature vector representation for each node of a given network. The vast majority of existing network embedding algorithms, however, are designed only for unsigned networks, whereas signed networks, which contain both positive and negative links, have quite distinct properties from their unsigned counterparts. In this paper, we propose a deep network embedding model that learns low-dimensional node vector representations with structural balance preservation for signed networks. The model employs a semisupervised stacked auto-encoder to reconstruct the adjacency connections of a given signed network. Because adjacency connections are overwhelmingly positive in real-world signed networks, we impose a larger penalty to make the auto-encoder focus more on reconstructing the scarce negative links than the abundant positive links. In addition, to preserve the structural balance property of signed networks, we design pairwise constraints that place positively connected nodes much closer than negatively connected nodes in the embedding space. Based on the network representations learned by the proposed model, we conduct link sign prediction and community detection in signed networks. Extensive experimental results on real-world datasets demonstrate the superiority of the proposed model over state-of-the-art network embedding algorithms for graph representation learning in signed networks.
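The loss function below is a hedged sketch of the two ideas mentioned in the abstract: a reconstruction error that weights the scarce negative links more heavily, and a pairwise structural-balance term that keeps positively connected pairs closer than negatively connected ones. The weights beta_neg, margin, and lam, and the hinge form of the balance term, are illustrative assumptions, not the authors' published model.

```python
# Illustrative sketch: weighted reconstruction + structural-balance penalty
# for signed-network embeddings Z with signed adjacency A and reconstruction
# A_hat.  Not the authors' semisupervised stacked auto-encoder.
import numpy as np

def signed_embedding_loss(A, A_hat, Z, beta_neg=10.0, margin=1.0, lam=0.1):
    # (a) weighted reconstruction: entries with A < 0 get a larger weight
    W = np.where(A < 0, beta_neg, 1.0)
    recon = np.sum(W * (A - A_hat) ** 2)

    # (b) pairwise structural-balance term: for each node i, positively
    # linked neighbors j should be closer to i than negatively linked ones k
    balance = 0.0
    pos = np.argwhere(A > 0)
    neg = np.argwhere(A < 0)
    neg_by_row = {i: Z[neg[neg[:, 0] == i][:, 1]] for i in set(neg[:, 0])}
    for i, j in pos:
        if i in neg_by_row:
            d_pos = np.sum((Z[i] - Z[j]) ** 2)
            d_neg = np.sum((Z[i] - neg_by_row[i]) ** 2, axis=1)
            balance += np.sum(np.maximum(0.0, margin + d_pos - d_neg))
    return recon + lam * balance
```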

5.
IEEE Trans Neural Syst Rehabil Eng ; 27(4): 630-642, 2019 04.
Article in English | MEDLINE | ID: mdl-30872235

ABSTRACT

Electroencephalogram (EEG) signal recognition based on machine learning models is becoming increasingly attractive in epilepsy detection. For multiclass epileptic EEG signal recognition tasks, including the detection of epileptic EEG signals within different blends of background and epilepsy EEG data and the classification of different types of seizures, two serious challenges arise: (1) a large amount of EEG signal data for training may not be available, and (2) the models for epileptic EEG signal recognition are often so complicated that they are not as easy to explain as a linear model. In this paper, we use the proposed transfer learning technique to circumvent the first challenge and design a novel linear model to circumvent the second. Concretely, we combine γ-LSR with transfer learning to propose a novel knowledge- and label-space inductive transfer learning model for multiclass EEG signal recognition. By transferring both knowledge and the proposed generalized label space from the source domain to the target domain, the proposed model achieves enhanced classification performance on the target domain without the use of the kernel trick. In contrast to other inductive transfer learning methods, the proposed method uses a generalized linear model, making it simpler and more interpretable. Experimental results indicate the effectiveness of the proposed method for multiclass epileptic EEG signal recognition.


Subject(s)
Electroencephalography/methods , Epilepsy/physiopathology , Algorithms , Databases, Factual , Epilepsy/diagnosis , Humans , Linear Models , Machine Learning , Seizures/diagnosis , Seizures/physiopathology , Support Vector Machine , Transfer, Psychology
6.
IEEE Trans Neural Syst Rehabil Eng ; 25(12): 2270-2284, 2017 12.
Article in English | MEDLINE | ID: mdl-28880184

ABSTRACT

Recognition of epileptic seizures from offline EEG signals is very important in the clinical diagnosis of epilepsy. Compared with manual labeling of EEG signals by doctors, machine learning approaches can be faster and more consistent. However, the classification accuracy is usually not satisfactory for two main reasons: the distributions of the data used for training and testing may differ, and the amount of training data may not be sufficient. In addition, most machine learning approaches generate black-box models that are difficult to interpret. In this paper, we integrate transductive transfer learning, semi-supervised learning, and the TSK fuzzy system to tackle these three problems. More specifically, we use transfer learning to reduce the discrepancy in data distribution between the training and testing data, employ semi-supervised learning to exploit the unlabeled testing data to remedy the shortage of training data, and adopt the TSK fuzzy system to increase model interpretability. Two learning algorithms are proposed to train the system. Our experimental results show that the proposed approaches achieve better performance than many state-of-the-art seizure classification algorithms.


Subject(s)
Electroencephalography/classification , Fuzzy Logic , Seizures/classification , Supervised Machine Learning , Transfer, Psychology , Algorithms , Epilepsy/diagnosis , Humans , Models, Statistical , Pattern Recognition, Automated , Reproducibility of Results , Software , Support Vector Machine
7.
Neural Netw ; 93: 256-266, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28715693

ABSTRACT

Guessing what a user may like is now a typical interface for video recommendation. Today's highly popular user-generated-content sites provide various sources of information, such as tags, for recommendation tasks. Motivated by a real-world online video recommendation problem, this work targets the long-tail phenomenon of user behavior and the sparsity of item features. A personalized compound recommendation framework for online video recommendation, called the Dirichlet mixture probit model for information scarcity (DPIS), is hence proposed. Assuming that each clicking sample is generated from a representation of user preferences, DPIS models the sample-level topic proportions as a multinomial item vector and utilizes topical clustering on the user side for recommendation through a probit classifier. As demonstrated on the real-world application, the proposed DPIS achieves better accuracy, perplexity, and coverage diversity than traditional methods.


Subject(s)
Neural Networks, Computer , Cluster Analysis , Datasets as Topic/standards , Datasets as Topic/statistics & numerical data , Internet
8.
Neural Netw ; 75: 110-25, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26773824

ABSTRACT

In this paper, a novel L2-SVM-based classifier, Multi-view L2-SVM, is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier has no bias in its objective function and hence offers flexibility similar to ν-SVC, in the sense that the number of yielded support vectors can be controlled by a pre-specified parameter. The classifier makes full use of the coherence and the differences among views by imposing consensus among them, improving overall classification performance. In addition, based on the generalized core vector machine (GCVM), the Multi-view L2-SVM classifier is extended into its GCVM version, MvCVM, which enables fast training on large-scale multi-view datasets, with time complexity asymptotically linear in the sample size and space complexity independent of the sample size. Our experimental results demonstrate the effectiveness of the proposed Multi-view L2-SVM classifier for small-scale multi-view datasets and of the MvCVM classifier for large-scale multi-view datasets.


Subject(s)
Support Vector Machine , Algorithms , Artificial Intelligence , Databases, Factual/statistics & numerical data
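Below is a heavily hedged sketch of the general idea of view consensus: per-view linear classifiers trained jointly with a penalty that makes their decision values agree. The published Multi-view L2-SVM objective differs in detail (no bias term, ν-SVC-style control of support vectors, and a proper joint solver); here a squared-hinge loss is minimized by plain gradient descent, and the parameters lam and gamma are assumptions.

```python
# Illustrative sketch: per-view linear classifiers coupled by a consensus
# penalty on their decision values.  Not the published Multi-view L2-SVM.
import numpy as np

def train_multiview(X_views, y, lam=1.0, gamma=1.0, lr=1e-3, epochs=500):
    """X_views: list of (n, d_v) arrays, one per view; y in {-1, +1}."""
    Ws = [np.zeros(X.shape[1]) for X in X_views]
    for _ in range(epochs):
        scores = [X @ w for X, w in zip(X_views, Ws)]
        mean_s = np.mean(scores, axis=0)      # treated as fixed per update
        for v, (X, w) in enumerate(zip(X_views, Ws)):
            margin = 1.0 - y * scores[v]
            # gradient of squared hinge loss sum(max(0, 1 - y*s)^2)
            hinge_grad = X.T @ (-y * 2.0 * np.maximum(margin, 0.0))
            # consensus term pulls this view's scores toward the view average
            consensus_grad = gamma * X.T @ (scores[v] - mean_s)
            Ws[v] -= lr * (lam * w + (hinge_grad + consensus_grad) / len(y))
    return Ws

def predict(X_views, Ws):
    # combine views by averaging their decision values
    return np.sign(np.mean([X @ w for X, w in zip(X_views, Ws)], axis=0))
```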
9.
IEEE Trans Cybern ; 46(12): 2924-2937, 2016 Dec.
Article in English | MEDLINE | ID: mdl-26571545

ABSTRACT

Many traditional semi-supervised learning algorithms not only train on the labeled samples but also incorporate the unlabeled samples into the training sets through an automated labeling process such as manifold preserving. If some samples are falsely labeled, the automated labeling process will generally propagate their negative impact to the classifier in quite a serious manner. To avoid such error propagation, the unlabeled samples should not be directly incorporated into the training sets by the automated labeling strategy. In this paper, a new semi-supervised support vector machine with extended hidden features (SSVM-EHF) is presented to address this issue. According to the maximum margin principle and the minimum integrated squared error between the probability distributions of the labeled and unlabeled samples, the dimensionality of the labeled and unlabeled samples is extended through an orthonormal transformation to generate corresponding hidden features shared by both. The final training step of SSVM-EHF is then performed only on the labeled samples with their original and hidden features, and the unlabeled samples are no longer explicitly used. Experimental results confirm the effectiveness of the proposed method.

10.
IEEE Trans Cybern ; 45(4): 688-701, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25069132

ABSTRACT

Clustering with multiview data is becoming a hot topic in data mining, pattern recognition, and machine learning. To realize effective multiview clustering, two issues must be addressed: how to combine the clustering results from the individual views and how to identify the importance of each view. In this paper, based on a newly proposed objective function that explicitly incorporates two penalty terms, a basic multiview fuzzy clustering algorithm, called collaborative fuzzy c-means (Co-FCM), is first proposed. It is then extended into its weighted-view version, called weighted view collaborative fuzzy c-means (WV-Co-FCM), by identifying the importance of each view. The WV-Co-FCM algorithm thus tackles both issues simultaneously. Its relationship with the recent multiview fuzzy clustering algorithm Collaborative Fuzzy K-Means (Co-FKM) is also revealed. Extensive experimental results on various multiview datasets indicate that the proposed WV-Co-FCM algorithm outperforms, or is at least comparable to, existing state-of-the-art multitask and multiview clustering algorithms, and that the importance of the different views of a dataset can be effectively identified.

11.
IEEE Trans Cybern ; 45(9): 1953-66, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25423663

ABSTRACT

When facing multitask learning problems, it is desirable that the learning method find the correct input-output features, share the commonality among multiple domains, and scale up to large multitask datasets. We introduce a multitask coupled logistic regression (LR) framework, the LR-based multitask classification learning algorithm (MTC-LR), a new method for generating a classifier for each task that is capable of sharing the commonality among multitask domains. The basic idea of MTC-LR is to use an individual LR-based classifier for each task domain but, in contrast to other support vector machine (SVM)-based proposals, to learn the parameter vectors of all individual classifiers globally with the conjugate gradient method, without the kernel trick, so that the framework is easily extended into a scaled version. We theoretically show that adding a new term to the cost function of the set of LRs, one that penalizes the diversity among multiple tasks, produces a coupling of the tasks that allows MTC-LR to improve the learning performance in an LR way. This finding lets us integrate MTC-LR with a state-of-the-art fast LR algorithm, the dual coordinate descent method (CDdual), to develop its fast version, MTC-LR-CDdual, for large multitask datasets. The proposed MTC-LR-CDdual algorithm is also theoretically analyzed. Our experimental results on artificial and real datasets indicate the effectiveness of MTC-LR-CDdual in classification accuracy, speed, and robustness.
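The sketch below shows the shape of such a coupled multitask objective: a per-task logistic loss plus a penalty that discourages the task weight vectors from drifting away from their average. MTC-LR itself uses the conjugate gradient method and a CDdual-based fast solver; the plain gradient descent, the specific coupling-to-the-mean penalty, and the parameters lam and mu below are illustrative assumptions.

```python
# Illustrative sketch of multitask logistic regression with a coupling
# penalty toward the average task weight vector.  Not the MTC-LR solver.
import numpy as np

def train_coupled_lr(tasks, lam=1.0, mu=1.0, lr=0.1, epochs=1000):
    """tasks: list of (X_t, y_t) with y_t in {-1, +1}; returns W (T x d)."""
    d = tasks[0][0].shape[1]
    W = np.zeros((len(tasks), d))
    for _ in range(epochs):
        w_bar = W.mean(axis=0)
        G = np.zeros_like(W)
        for t, (X, y) in enumerate(tasks):
            z = y * (X @ W[t])
            # gradient of the mean logistic loss  log(1 + exp(-z))
            G[t] = X.T @ (-y / (1.0 + np.exp(z))) / len(y)
            # L2 term plus coupling toward the average task vector
            G[t] += lam * W[t] + mu * (W[t] - w_bar)
        W -= lr * G
    return W
```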

12.
IEEE Trans Neural Netw Learn Syst ; 26(9): 2005-18, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25376045

ABSTRACT

Recently, a time-adaptive support vector machine (TA-SVM) was proposed for handling nonstationary datasets. While attractive performance has been reported, and the classifier is distinctive in simultaneously solving several SVM subclassifiers locally and globally through an elegant SVM formulation in an alternative kernel space, the coupling of the subclassifiers requires matrix inversion, resulting in a high computational burden for large nonstationary dataset applications. To overcome this shortcoming, an improved TA-SVM (ITA-SVM) is proposed that uses a common vector shared by all the SVM subclassifiers involved. ITA-SVM not only keeps an SVM formulation but also avoids matrix inversion. We can therefore realize its fast version, the improved time-adaptive core vector machine (ITA-CVM), for large nonstationary datasets by using the CVM technique. ITA-CVM has the merit of asymptotically linear time complexity for large nonstationary datasets and inherits the advantages of TA-SVM. The effectiveness of the proposed ITA-SVM and ITA-CVM classifiers is experimentally confirmed.

13.
IEEE Trans Cybern ; 45(3): 548-61, 2015 Mar.
Article in English | MEDLINE | ID: mdl-24988602

ABSTRACT

Classical fuzzy system modeling methods implicitly assume that the data are generated from a single task, which does not accord with many practical scenarios where data are acquired from multiple tasks. Although one can build an individual fuzzy system model for each task, such individual modeling yields poor generalization because the hidden intertask correlation is ignored. To circumvent this shortcoming, we consider a general framework that preserves the independent information of the individual tasks while mining the hidden correlation information shared among all tasks in multitask fuzzy modeling. In this framework, a low-dimensional subspace (structure) is assumed to be shared among all tasks and hence to encode the hidden intertask correlation. Under this framework, a multitask Takagi-Sugeno-Kang (TSK) fuzzy system model called MTCS-TSK-FS (TSK-FS for multiple tasks with common hidden structure), based on the classical L2-norm TSK fuzzy system, is proposed in this paper. The proposed model not only takes advantage of the independent sample information from the original space of each task, but also effectively uses the common hidden structure among the tasks to enhance the generalization performance of the built fuzzy systems. Experiments on synthetic and real-world datasets demonstrate the applicability and distinctive performance of the proposed multitask fuzzy system model in multitask regression learning scenarios.

14.
Neural Netw ; 49: 96-106, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24211342

ABSTRACT

Most dimensionality reduction techniques are based on a single metric or kernel; hence, it is necessary to select an appropriate kernel for kernel-based dimensionality reduction. Multiple kernel learning for dimensionality reduction (MKL-DR) has recently been proposed to learn a kernel from a set of base kernels that are seen as different descriptions of the data. Because MKL-DR does not involve regularization, it might be ill-posed under some conditions, which hinders its application. This paper proposes a multiple kernel learning framework for dimensionality reduction based on a regularized trace ratio, termed MKL-TR. Our method aims at learning a transformation into a lower-dimensional space together with a corresponding kernel from the given base kernels, some of which may not be suitable for the given data. Solutions for the proposed framework can be found via trace ratio maximization. Experimental results demonstrate its effectiveness on benchmark text, image, and sound datasets in supervised, unsupervised, and semi-supervised settings.


Subject(s)
Artificial Intelligence , Algorithms , Cluster Analysis , Discriminant Analysis
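For reference, the generic trace-ratio maximization step that the abstract says the solutions rest on can be sketched as below: iterate between setting the ratio value and taking the top eigenvectors of A - lambda*B. The kernel-combination part of MKL-TR (learning the base-kernel weights and the regularization) is deliberately omitted; matrices A and B stand for generic between- and within-class scatter-like matrices, with B assumed symmetric positive definite.

```python
# Illustrative sketch of generic trace-ratio maximization:
#   maximize  tr(W' A W) / tr(W' B W)   subject to  W' W = I.
import numpy as np

def trace_ratio(A, B, dim, iters=50):
    lam = 0.0
    W = None
    for _ in range(iters):
        vals, vecs = np.linalg.eigh(A - lam * B)       # symmetric eigensolve
        W = vecs[:, np.argsort(vals)[::-1][:dim]]      # top-`dim` eigenvectors
        lam = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
    return W, lam
```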
15.
IEEE Trans Cybern ; 44(1): 1-20, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23797315

ABSTRACT

Kernel methods such as standard support vector machine and support vector regression training take O(N^3) time and O(N^2) space in their naive implementations, where N is the training set size. It is thus computationally infeasible to apply them to large data sets, and a replacement for the naive method of finding the quadratic programming (QP) solutions is highly desirable. Observing that many kernel methods can be linked to kernel density estimation (KDE), which can be efficiently implemented by approximation techniques, a new learning method called fast KDE (FastKDE) is proposed to scale up kernel methods. It is based on establishing a connection between KDE and the QP problems formulated for kernel methods using an entropy-based integrated-squared-error criterion. As a result, FastKDE approximation methods can be applied to solve these QP problems. In this paper, the latest advance in fast data reduction via KDE is exploited. With just a simple sampling strategy, the resulting FastKDE method can be used to scale up various kernel methods with a theoretical guarantee that their performance does not degrade much. It has a time complexity of O(m^3), where m is the number of data points sampled from the training set. Experiments on different benchmarking data sets demonstrate that the proposed method has performance comparable to the state-of-the-art methods and is effective for a wide range of kernel methods in achieving fast learning on large data sets.
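In practical terms, the "simple sampling strategy" amounts to training the kernel method on m << N sampled points, which drops the QP cost to roughly O(m^3). The snippet below shows that idea with an off-the-shelf kernel SVM and plain uniform sampling; the paper's KDE/integrated-squared-error derivation of the sampling is not reproduced here, and the function name and parameters are assumptions.

```python
# Illustrative sketch: train the kernel method on a random subsample of
# size m << N so the QP cost drops from O(N^3) to roughly O(m^3).
import numpy as np
from sklearn.svm import SVC

def fit_on_subsample(X, y, m=2000, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(m, len(X)), replace=False)
    return SVC(kernel="rbf", C=1.0, gamma="scale").fit(X[idx], y[idx])
```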

16.
Artif Intell Med ; 57(1): 59-71, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23177025

ABSTRACT

OBJECTIVE: Detecting discords in time series is a special novelty detection task that has found many interesting applications. Unlike traditional novelty detection methods, which can use a separate set of normal samples to build the model, discord detection is often provided with mixed data containing both normal and abnormal samples. The objective of this work is to present an effective method to detect discords in unsynchronized periodic time series data. METHODS: Discord detection is treated as a problem of unsupervised learning with noisy data. A new clustering algorithm named weighted spherical 1-mean with phase shift (PS-WS1M) is proposed. It introduces a phase adjustment procedure into the iterative clustering process and produces a set of anomaly scores, based upon which an unsupervised approach is employed to locate the discords automatically. A theoretical analysis of the robustness and convergence of PS-WS1M is also given. RESULTS: The proposed algorithm is evaluated on real-world electrocardiogram datasets extracted from the MIT-BIH database. The experimental results show that the proposed algorithm is effective and competitive for discord detection in periodic time series, and its robustness is also experimentally verified. Compared with other discord detection methods, the proposed algorithm consistently achieves high F-score values, most of which exceed 0.98. CONCLUSION: The proposed PS-WS1M algorithm integrates a phase adjustment procedure into the iterative clustering process and can be successfully applied to detect discords in time series.


Subject(s)
Algorithms , Artificial Intelligence , Electrocardiography , Heart Rate , Periodicity , Signal Processing, Computer-Assisted , Cluster Analysis , Computer Simulation , Humans , Models, Theoretical , Predictive Value of Tests , Reproducibility of Results , Time Factors
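The sketch below illustrates the phase-adjustment idea on unsynchronized periodic series: align each (unit-length) series to the current centroid by the circular shift that maximizes cross-correlation, re-estimate the centroid, and score discords by their cosine distance to it. This is only the general idea, not the authors' weighted spherical 1-mean (PS-WS1M), and the unweighted mean and the FFT-based alignment are assumptions.

```python
# Illustrative sketch: phase-aligned spherical averaging with cosine-distance
# anomaly scores.  Rows of X are equal-length periodic time series.
import numpy as np

def align(x, c):
    # best circular shift of x against c via FFT cross-correlation
    xcorr = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(c))).real
    shift = int(np.argmax(xcorr))
    return np.roll(x, -shift)

def discord_scores(X, iters=10):
    X = X / np.linalg.norm(X, axis=1, keepdims=True)     # spherical data
    c = X.mean(axis=0)
    for _ in range(iters):
        A = np.array([align(x, c) for x in X])           # phase adjustment
        c = A.mean(axis=0)
        c /= np.linalg.norm(c)
    return 1.0 - A @ c          # higher score = more likely discord
```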
17.
Neural Netw ; 36: 120-8, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23103971

ABSTRACT

Recent research indicates that the standard Minimum Enclosing Ball (MEB) and the center-constrained MEB can be used for effective training on large datasets through the core vector machine (CVM) or the generalized CVM (GCVM). However, for another extensively used MEB, the MEB with total soft margin (T-MEB for brevity), the CVM or GCVM cannot be employed directly to realize fast training on large datasets, because the involved inequality constraint is violated. In this paper, a fast learning algorithm called FL-TMEB for scaling up T-MEB is presented. First, FL-TMEB slightly relaxes the constraints of T-MEB so that it becomes equivalent to the corresponding center-constrained MEB, which can be solved with the corresponding core set (CS) by CVM. Then, with the help of the sub-optimal solution theorem for T-MEB, FL-TMEB obtains the extended core set (ECS) by including the neighbors of some samples of the CS in the ECS. Finally, FL-TMEB takes the optimal weights of the ECS as the approximate solution of T-MEB. Experimental results on UCI and USPS datasets demonstrate that the proposed method is effective.


Subject(s)
Image Processing, Computer-Assisted , Support Vector Machine , Classification
18.
IEEE Trans Syst Man Cybern B Cybern ; 42(3): 672-87, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22318491

ABSTRACT

Although graph-based relaxed clustering (GRC) is a spectral clustering algorithm noted for its straightforwardness and self-adaptability, it is sensitive to the parameters of the adopted similarity measure and has a high time complexity of O(N^3), which severely weakens its usefulness for large data sets. To overcome these shortcomings, certain constraints are first introduced into GRC to obtain an enhanced version, constrained GRC (CGRC), which increases the robustness of GRC to the parameters of the adopted similarity measure. A novel algorithm called fast GRC (FGRC), based on CGRC, is then developed in this paper by using the core-set-based minimal enclosing ball approximation. A distinctive advantage of FGRC is that its asymptotic time complexity is linear in the data set size N. At the same time, FGRC inherits the straightforwardness and self-adaptability of GRC, making it a fast and effective clustering algorithm for large data sets. The advantages of FGRC are validated on various benchmarking and real data sets.


Subject(s)
Algorithms , Artificial Intelligence , Cluster Analysis , Databases, Factual , Information Storage and Retrieval/methods , Models, Theoretical , Pattern Recognition, Automated/methods , Computer Simulation
19.
Neural Netw ; 27: 60-73, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22057092

ABSTRACT

Although pattern classification has been studied extensively in the past decades, how to effectively train on large datasets is a problem that still requires particular attention. Many kernelized classification methods, such as SVM and SVDD, can be formulated as quadratic programming (QP) problems, but computing the associated kernel matrices requires O(n^2) (or even up to O(n^3)) computation, where n is the number of training patterns, which heavily limits the applicability of these methods to large datasets. In this paper, a new classification method called the maximum vector-angular margin classifier (MAMC) is first proposed; it is based on the vector-angular margin and finds an optimal vector c in the pattern feature space so that all testing patterns can be classified in terms of the maximum vector-angular margin ρ between the vector c and the training data points. It is then proved that the kernelized MAMC can be equivalently formulated as a kernelized Minimum Enclosing Ball (MEB), which leads to a distinctive merit of MAMC: it has the flexibility of controlling the number of support vectors, like ν-SVC, and can be extended to a maximum vector-angular margin core vector machine (MAMCVM) by connecting the core vector machine (CVM) method with MAMC, so that fast training on large datasets can be achieved. Experimental results on artificial and real datasets validate the power of the proposed methods.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Pattern Recognition, Automated , Computer Simulation
20.
Neural Netw ; 24(4): 360-9, 2011 May.
Article in English | MEDLINE | ID: mdl-21353975

ABSTRACT

While the uniqueness of the Support Vector Machine (SVM) solution is well established, whether Support Vector Data Description (SVDD), another well-known machine learning method, has a unique solution remains unsolved. Because the primal optimization problem of SVDD is not a convex programming problem, it is difficult to analyze the SVDD solution theoretically in a way analogous to SVM. In this paper, we concentrate on the theoretical analysis of the solution to the primal optimization problem of SVDD. We first reformulate the primal optimization problem of SVDD equivalently as a convex programming problem, then prove that the optimal solution with respect to the sphere center is unique, and derive the necessary and sufficient conditions for non-uniqueness of the optimal solution with respect to the sphere radius. Moreover, we explore the properties of the SVDD solution from the perspective of its dual form. Furthermore, according to the geometric interpretation of SVDD, a method of computing the sphere radius is proposed for the case where the optimal radius in the primal problem is non-unique. Finally, several examples illustrate these findings.


Subject(s)
Algorithms , Databases, Factual , Neural Networks, Computer , Pattern Recognition, Automated/methods , Humans , Models, Statistical
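For reference, the SVDD primal problem discussed above is usually stated as follows (standard formulation; the paper's equivalent convex reformulation is not reproduced here, though one common route is to treat R^2 as a single variable):

```latex
% Standard SVDD primal: smallest sphere (center a, radius R) enclosing the
% data, with slack variables xi_i allowing some points to fall outside.
\begin{aligned}
\min_{R,\,a,\,\xi}\quad & R^{2} + C\sum_{i=1}^{n}\xi_{i} \\
\text{s.t.}\quad & \lVert x_{i}-a\rVert^{2} \le R^{2}+\xi_{i},\qquad \xi_{i}\ge 0,\quad i=1,\dots,n .
\end{aligned}
```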