1.
IEEE Trans Neural Netw Learn Syst ; 33(9): 4785-4799, 2022 Sep.
Article in English | MEDLINE | ID: mdl-33684046

ABSTRACT

In a modern e-commerce recommender system, it is important to understand the relationships among products. Accurately recognizing product relationships, such as complements or substitutes, is essential for generating better recommendations and for improving the explainability of recommendations. Products and their associated relationships naturally form a product graph, yet existing efforts do not fully exploit the product graph's topological structure: they usually consider only the information from directly connected products. In fact, the connectivity of products a few hops away also carries rich semantics and can be exploited for improved relationship prediction. In this work, we formulate the problem as a multilabel link prediction task and propose a novel graph neural network-based framework, the item relationship graph neural network (IRGNN), for discovering multiple complex relationships simultaneously. We incorporate multihop relationships among products by recursively updating node embeddings using messages from their neighbors. An edge relational network is designed to effectively capture relational information between products. Extensive experiments on real-world product data validate the effectiveness of IRGNN, especially on large and sparse product graphs.
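
As a rough illustration of the two ingredients named above (multihop message passing over the product graph and an edge network scoring multiple relation labels per product pair), here is a minimal PyTorch sketch. It is not the authors' IRGNN implementation; the layer structure, dimensions, and the toy graph are invented for illustration.

```python
# Minimal sketch (not the authors' code): multihop message passing over a
# product graph plus an edge network that scores several relation labels per pair.
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One hop of neighbor aggregation; stacking K layers gives K-hop context."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                      # source / destination node ids, shape [E]
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, x[src])             # sum messages from neighbors
        deg = torch.bincount(dst, minlength=x.size(0)).clamp(min=1).unsqueeze(1)
        agg = agg / deg                            # mean aggregation
        return torch.relu(self.lin(torch.cat([x, agg], dim=1)))

class EdgeRelationalNet(nn.Module):
    """Scores each candidate relation type for a pair of product embeddings."""
    def __init__(self, dim, num_relations):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, num_relations))

    def forward(self, h_u, h_v):
        return self.mlp(torch.cat([h_u, h_v], dim=1))  # multilabel logits

# Toy usage: 5 products, 6 directed edges, 3 relation types.
x = torch.randn(5, 16)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 0], [1, 2, 3, 4, 0, 2]])
layers = nn.ModuleList([MessagePassingLayer(16) for _ in range(2)])  # 2 hops
for layer in layers:
    x = layer(x, edge_index)
logits = EdgeRelationalNet(16, 3)(x[edge_index[0]], x[edge_index[1]])
loss = nn.BCEWithLogitsLoss()(logits, torch.randint(0, 2, (6, 3)).float())
```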


Subject(s)
Algorithms , Neural Networks, Computer , Commerce , Semantics
2.
IEEE Trans Cybern ; 49(11): 3844-3858, 2019 Nov.
Article in English | MEDLINE | ID: mdl-29994699

ABSTRACT

Images are uploaded to the Internet over time, which makes concept drift and distribution change in semantic classes unavoidable. Current hashing methods, trained on a given static database, may therefore not be suitable for nonstationary semantic image retrieval problems. Moreover, retraining a whole hash table from scratch to incorporate knowledge from newly arriving image data may not be efficient. Therefore, this paper proposes a new incremental hash-bit learning method. When new data arrive, hash bits are selected from both existing and newly trained hash bits by iteratively maximizing a three-component objective function. This objective function is also used to weight the selected hash bits to re-rank retrieved images for better semantic image retrieval results. The three components evaluate a hash bit from three different angles: 1) information preservation; 2) partition balancing; and 3) bit angular difference. The proposed method combines knowledge retained from previously trained hash bits with new semantic knowledge learned by training new hash bits on the new data. In comparison to table-based incremental hashing, the proposed method automatically adjusts the number of bits taken from old and new data according to the concept drift in the given data via maximization of the objective function. Experimental results show that the proposed method outperforms existing stationary hashing methods, table-based incremental hashing, and online hashing methods in 15 different simulated nonstationary data environments.
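
A toy sketch of the bit-selection idea: candidate bits from old and new hash functions are greedily chosen by a weighted score loosely modeled on the three criteria above. The scoring functions and weights below are stand-ins for illustration, not the paper's objective.

```python
# Illustrative only: greedy hash-bit selection in the spirit of the three criteria.
# `bits` holds candidate bit outputs (+1/-1) on a sample set, old and new mixed.
import numpy as np

def balance_score(b):
    """Partition balancing: best when the +1/-1 split is close to 50/50."""
    return 1.0 - abs(b.mean())

def angular_difference(b, selected):
    """Mean (1 - |cosine similarity|) to already selected bits (diversity)."""
    if not selected:
        return 1.0
    sims = [abs(np.dot(b, s)) / (np.linalg.norm(b) * np.linalg.norm(s))
            for s in selected]
    return 1.0 - float(np.mean(sims))

def select_bits(bits, k, w=(1.0, 1.0, 1.0)):
    """Greedily pick k bits from the candidate pool by a weighted score."""
    selected, chosen_idx = [], []
    remaining = list(range(bits.shape[0]))
    for _ in range(k):
        def score(i):
            b = bits[i]
            info = b.var()        # crude stand-in for information preservation
            return (w[0] * info + w[1] * balance_score(b)
                    + w[2] * angular_difference(b, selected))
        best = max(remaining, key=score)
        remaining.remove(best)
        chosen_idx.append(best)
        selected.append(bits[best])
    return chosen_idx

rng = np.random.default_rng(0)
candidate_bits = np.sign(rng.standard_normal((32, 500)))  # 32 candidate bits, 500 samples
print(select_bits(candidate_bits, k=8))
```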

3.
IEEE Trans Cybern ; 47(11): 3814-3826, 2017 Nov.
Article in English | MEDLINE | ID: mdl-27390201

ABSTRACT

A very large volume of images is uploaded to the Internet daily. However, current hashing methods for image retrieval are designed for static databases only. They fail to consider the fact that the distribution of images can change when new images are added to the database over time. These changes include both the emergence of new classes and shifts in the distribution of images within a class owing to concept drift. Retraining hash tables using all images in the database requires a large computational effort. It is also biased toward old data owing to the huge volume of old images, which leads to poor retrieval performance over time. In this paper, we propose the incremental hashing (ICH) method to deal with the two aforementioned types of change in the data distribution. ICH uses multihashing to retain knowledge from images arriving over time and a weight-based ranking to make the retrieval results adaptive to the new data environment. Experimental results show that the proposed method is effective in dealing with changes in the database.
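
A small illustrative sketch of the "multihashing plus weight-based ranking" idea: codes from several hash tables are combined with per-table weights into a single ranking. The codes and weights here are made up, and the paper's actual weighting scheme is not reproduced.

```python
# Hypothetical sketch: weighted ranking across several hash tables.
import numpy as np

def hamming(a, b):
    return np.count_nonzero(a != b, axis=-1)

def weighted_rank(query_codes, db_codes, weights):
    """query_codes[t], db_codes[t]: binary codes produced by hash table t."""
    score = np.zeros(db_codes[0].shape[0])
    for q, db, w in zip(query_codes, db_codes, weights):
        score += w * hamming(db, q)              # weighted Hamming distance
    return np.argsort(score)                     # smaller distance ranks first

rng = np.random.default_rng(1)
db = [rng.integers(0, 2, (100, 32)) for _ in range(3)]   # 3 tables, 100 images
query = [t[0] for t in db]                               # query = image 0's codes
print(weighted_rank(query, db, weights=[0.5, 0.3, 0.2])[:5])
```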

4.
IEEE Trans Cybern ; 46(3): 766-77, 2016 Mar.
Article in English | MEDLINE | ID: mdl-25910268

ABSTRACT

Pattern recognition and machine learning techniques have been increasingly adopted in adversarial settings such as spam, intrusion, and malware detection, although their security against well-crafted attacks that aim to evade detection by manipulating data at test time has not yet been thoroughly assessed. While previous work has mainly focused on devising adversary-aware classification algorithms to counter evasion attempts, only a few authors have considered the impact of using reduced feature sets on classifier security against the same attacks. An interesting, preliminary result is that classifier security to evasion may even be worsened by the application of feature selection. In this paper, we provide a more detailed investigation of this aspect, shedding some light on the security properties of feature selection against evasion attacks. Inspired by previous work on adversary-aware classifiers, we propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks by incorporating specific assumptions on the adversary's data manipulation strategy. We focus on an efficient, wrapper-based implementation of our approach and experimentally validate its soundness on different application examples, including spam and malware detection.
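
The following is a hedged sketch of what a wrapper-style, adversary-aware selection criterion could look like: each candidate feature subset is scored by clean accuracy blended with the accuracy retained under a crude simulated evasion (malicious points shifted toward the benign mean under a budget). The attack model, the budget, and the weighting are illustrative assumptions, not the paper's formulation.

```python
# Toy wrapper criterion: accuracy on clean data blended with accuracy under
# a simple simulated evasion restricted to the selected features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def evade(X_mal, benign_mean, feats, budget=1.0):
    """Shift malicious samples toward the benign mean on the selected features only."""
    X_adv = X_mal.copy()
    delta = benign_mean[feats] - X_mal[:, feats]
    norm = np.abs(delta).sum(axis=1, keepdims=True) + 1e-12
    X_adv[:, feats] += delta * np.minimum(1.0, budget / norm)
    return X_adv

def security_aware_score(X, y, feats, alpha=0.5):
    """Clean accuracy blended with accuracy kept under the simulated evasion."""
    clf = LogisticRegression(max_iter=1000).fit(X[:, feats], y)
    clean_acc = clf.score(X[:, feats], y)
    X_mal = X[y == 1]
    X_adv = evade(X_mal, X[y == 0].mean(axis=0), feats)
    evasion_acc = clf.score(X_adv[:, feats], np.ones(len(X_mal), dtype=int))
    return (1 - alpha) * clean_acc + alpha * evasion_acc

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 10))
y = (X[:, 0] + X[:, 3] > 0).astype(int)             # class 1 plays the "malicious" role
candidate_subsets = [[0, 3], [1, 2], [0, 1, 2, 3]]  # wrapper search over candidate subsets
print(max(candidate_subsets, key=lambda f: security_aware_score(X, y, f)))
```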

5.
IEEE Trans Neural Netw Learn Syst ; 27(5): 978-92, 2016 May.
Article in English | MEDLINE | ID: mdl-26054075

ABSTRACT

The training of a multilayer perceptron neural network (MLPNN) concerns the selection of its architecture and connection weights via the minimization of both the training error and a penalty term. Different penalty terms have been proposed to control the smoothness of the MLPNN for better generalization capability. However, controlling its smoothness using, for instance, the norm of the weights or the Vapnik-Chervonenkis dimension cannot distinguish individual MLPNNs with the same number of free parameters or the same norm. In this paper, to enhance generalization capability, we propose a stochastic sensitivity measure (ST-SM) to realize a new penalty term for MLPNN training. The ST-SM is the expectation of the squared output differences between the training samples and unseen samples located within their Q-neighborhoods for a given MLPNN. It provides a direct measurement of the MLPNN's output fluctuations, i.e., its smoothness. We adopt a two-phase Pareto-based multiobjective training algorithm that minimizes the training error and the ST-SM as the two objectives. Experiments on 20 UCI data sets show that MLPNNs trained by the proposed algorithm yield better accuracies on testing data than several recent and classical MLPNN training methods.
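
A minimal sketch of how such a sensitivity penalty can be estimated in practice: the expected squared output change under random perturbations drawn from a Q-neighborhood, approximated by Monte Carlo sampling and added to the loss with a fixed weight. The single-objective weighting below is a simplification; the paper uses a two-phase Pareto-based multiobjective scheme.

```python
# Sketch: Monte Carlo estimate of a stochastic sensitivity penalty for an MLP.
import torch
import torch.nn as nn

def stochastic_sensitivity(model, x, q=0.1, n_samples=8):
    """E[(f(x + dx) - f(x))^2] with dx ~ U(-q, q), averaged over the batch."""
    base = model(x)
    total = 0.0
    for _ in range(n_samples):
        dx = (torch.rand_like(x) * 2 - 1) * q        # uniform draw in the Q-neighborhood
        total = total + ((model(x + dx) - base) ** 2).mean()
    return total / n_samples

mlp = nn.Sequential(nn.Linear(4, 16), nn.Tanh(), nn.Linear(16, 3))
x = torch.randn(32, 4)
y = torch.randint(0, 3, (32,))
opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for _ in range(10):
    opt.zero_grad()
    loss = nn.CrossEntropyLoss()(mlp(x), y) + 0.1 * stochastic_sensitivity(mlp, x)
    loss.backward()
    opt.step()
```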

6.
IEEE Trans Cybern ; 45(11): 2402-12, 2015 Nov.
Article in English | MEDLINE | ID: mdl-25474818

ABSTRACT

Undersampling is a widely adopted method for dealing with imbalanced pattern classification problems. Current methods mainly depend on either random resampling of the majority class or resampling at the decision boundary. Random undersampling fails to take informative samples in the data into consideration, while resampling at the decision boundary is sensitive to class overlap. Both techniques ignore the distribution information of the training dataset. In this paper, we propose a diversified sensitivity-based undersampling method. Samples of the majority class are clustered to capture the distribution information and enhance the diversity of the resampling. A stochastic sensitivity measure is applied to select samples from both the clusters of the majority class and the minority class. By iteratively clustering and sampling, a balanced set of samples yielding high classifier sensitivity is selected. The proposed method yields good generalization capability on 14 UCI datasets.
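
A simplified sketch of the cluster-then-sample step: the majority class is clustered to preserve its distribution and an equal share is drawn from each cluster. The paper additionally scores candidates with a stochastic sensitivity measure; plain per-cluster random draws stand in for that step here.

```python
# Simplified cluster-based undersampling of the majority class.
import numpy as np
from sklearn.cluster import KMeans

def diversified_undersample(X_maj, n_keep, n_clusters=5, seed=0):
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X_maj)
    keep = []
    per_cluster = max(1, n_keep // n_clusters)
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        keep.extend(rng.choice(idx, size=min(per_cluster, len(idx)), replace=False))
    return np.array(keep[:n_keep])

rng = np.random.default_rng(3)
X_majority = rng.standard_normal((1000, 2))
X_minority = rng.standard_normal((100, 2)) + 3
kept = diversified_undersample(X_majority, n_keep=len(X_minority))
X_balanced = np.vstack([X_majority[kept], X_minority])   # balanced training set
```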

7.
IEEE Trans Neural Netw ; 21(7): 1149-57, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20659864

ABSTRACT

A novel adaptive filter is proposed using a recurrent cerebellar-model-articulation-controller (CMAC). The proposed locally recurrent, globally feedforward recurrent CMAC (RCMAC) has the favorable properties of small size, good generalization, rapid learning, and dynamic response, making it well suited for high-speed signal processing. To provide fast training, an efficient parameter learning algorithm based on the normalized gradient descent method is presented, in which the learning rates are adapted online. A Lyapunov function is then used to derive the conditions on the adaptive learning rates, so that the stability of the filtering error can be guaranteed. To demonstrate its performance, the proposed adaptive RCMAC filter is applied to a nonlinear channel equalization system and an adaptive noise cancellation system. The advantages of the proposed filter over other adaptive filters are verified through simulations.


Subject(s)
Algorithms , Artificial Intelligence , Cerebellum/physiology , Nonlinear Dynamics , Animals , Biomimetics , Computer Simulation , Nerve Net/physiology
8.
IEEE Trans Neural Netw ; 18(5): 1294-305, 2007 Sep.
Article in English | MEDLINE | ID: mdl-18220181

ABSTRACT

The generalization error bounds found by current error models using the number of effective parameters of a classifier and the number of training samples are usually very loose. These bounds are intended for the entire input space. However, the support vector machine (SVM), radial basis function neural network (RBFNN), and multilayer perceptron neural network (MLPNN) are local learning machines and treat unseen samples near the training samples as more important. In this paper, we propose a localized generalization error model which bounds from above the generalization error within a neighborhood of the training samples using a stochastic sensitivity measure. It is then used to develop an architecture selection technique for a classifier with maximal coverage of unseen samples, by specifying a generalization error threshold. Experiments using 17 University of California at Irvine (UCI) data sets show that, in comparison with cross validation (CV), sequential learning, and two other ad hoc methods, our technique consistently yields the best testing classification accuracy with fewer hidden neurons and less training time.


Subject(s)
Algorithms , Models, Statistical , Neural Networks, Computer , Pattern Recognition, Automated/methods , Computer Simulation , Reproducibility of Results , Sensitivity and Specificity
9.
IEEE Trans Neural Netw ; 18(5): 1453-62, 2007 Sep.
Article in English | MEDLINE | ID: mdl-18220193

ABSTRACT

The support vector machine (SVM) has been demonstrated to be a very effective classifier in many applications, but its performance is still limited because data distribution information is underutilized in determining the decision hyperplane. Most of the existing kernels employed in nonlinear SVMs measure the similarity between a pair of pattern images based on the Euclidean inner product or the Euclidean distance of the corresponding input patterns, which ignores data distribution tendency and makes the SVM essentially a "local" classifier. In this paper, we take a step toward a new paradigm of kernels by incorporating data-specific knowledge into existing kernels. We first find the data structure for each class adaptively in the input space via agglomerative hierarchical clustering (AHC), and then construct weighted Mahalanobis distance (WMD) kernels using the detected data distribution information. In WMD kernels, the similarity between two pattern images is determined not only by the Mahalanobis distance (MD) between their corresponding input patterns but also by the sizes of the clusters they reside in. Although WMD kernels are not guaranteed to be positive definite (pd) or conditionally positive definite (cpd), satisfactory classification results can still be achieved because the regularizers in SVMs with WMD kernels are empirically positive in pseudo-Euclidean (pE) spaces. Experimental results on both synthetic and real-world data sets show the effectiveness of "plugging" data structure into existing kernels.
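
A rough sketch of building a Mahalanobis-distance kernel from detected cluster structure, with each cluster's contribution scaled by its size. The clustering call, regularization, and exact kernel form below are illustrative assumptions rather than the paper's construction.

```python
# Sketch: cluster a class, then build a cluster-weighted Mahalanobis kernel.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_stats(X, n_clusters=2):
    """Agglomerative clustering, returning (inverse covariance, size weight) per cluster."""
    labels = fcluster(linkage(X, method="ward"), t=n_clusters, criterion="maxclust")
    stats = []
    for c in np.unique(labels):
        Xc = X[labels == c]
        cov = np.cov(Xc, rowvar=False) + 1e-3 * np.eye(X.shape[1])  # regularized covariance
        stats.append((np.linalg.inv(cov), len(Xc) / len(X)))
    return stats

def wmd_kernel(x, z, stats, gamma=1.0):
    """Cluster-size-weighted sum of exp(-gamma * squared Mahalanobis distance)."""
    d = x - z
    return sum(w * np.exp(-gamma * d @ inv_cov @ d) for inv_cov, w in stats)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(5, 2, (50, 3))])  # two clusters
stats = cluster_stats(X, n_clusters=2)
print(wmd_kernel(X[0], X[1], stats))
```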


Subject(s)
Algorithms , Artificial Intelligence , Models, Statistical , Pattern Recognition, Automated/methods , Computer Simulation , Reproducibility of Results , Sensitivity and Specificity
10.
IEEE Trans Syst Man Cybern B Cybern ; 36(6): 1283-95, 2006 Dec.
Article in English | MEDLINE | ID: mdl-17186805

ABSTRACT

The one-class classification problem aims to distinguish a target class from outliers. The spherical one-class classifier (SOCC) solves this problem by finding a minimum-volume hypersphere that contains the target data while keeping outlier samples outside. SOCC achieves satisfactory performance only when the target samples have the same distribution tendency in all orientations; otherwise, its performance is limited in that many superfluous outliers may be mistakenly enclosed. The authors propose to exploit target data structures obtained via unsupervised methods, such as agglomerative hierarchical clustering, and use them to calculate a set of hyperellipsoidal separating boundaries. This method is named the structured one-class classifier (TOCC). The optimization problem in TOCC can be formulated as a series of second-order cone programming problems that can be solved with acceptable efficiency by primal-dual interior-point methods. Experimental results on artificially generated data sets and benchmark data sets demonstrate the advantages of TOCC.
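
To see why an ellipsoidal boundary can enclose an elongated target class more tightly than a sphere, here is a crude numeric illustration comparing a quantile-radius sphere with a Mahalanobis (ellipsoidal) rule. This is only a baseline comparison on synthetic data; the paper instead solves minimum-volume problems via second-order cone programming.

```python
# Crude illustration: spherical vs. ellipsoidal acceptance regions for an elongated class.
import numpy as np

rng = np.random.default_rng(5)
target = rng.multivariate_normal([0, 0], [[4.0, 0.0], [0.0, 0.2]], size=300)  # elongated class
outliers = rng.uniform(-4, 4, size=(200, 2))

# Spherical rule: accept if Euclidean distance to the center is below a quantile radius.
center = target.mean(axis=0)
radius = np.quantile(np.linalg.norm(target - center, axis=1), 0.95)
sphere_accept = np.linalg.norm(outliers - center, axis=1) <= radius

# Ellipsoidal rule: accept by Mahalanobis distance under the target covariance.
inv_cov = np.linalg.inv(np.cov(target, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", outliers - center, inv_cov, outliers - center)
thresh = np.quantile(np.einsum("ij,jk,ik->i", target - center, inv_cov, target - center), 0.95)
ellipse_accept = d2 <= thresh

print("outliers wrongly enclosed: sphere =", sphere_accept.sum(),
      ", ellipsoid =", ellipse_accept.sum())
```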

11.
Conf Proc IEEE Eng Med Biol Soc ; 2005: 5896-9, 2005.
Article in English | MEDLINE | ID: mdl-17281602

ABSTRACT

A nonlinear canonical correlation analysis (CCA) for detecting neural activation in fMRI data is proposed in this paper. We use the blood-oxygen-level-dependent (BOLD) response, based on hemodynamic response (HDR) models with various parameters, as reference signals. Instead of characterizing the relationship between the paradigm and the time series using an oversimplified linear model, we employ the kernel trick, which maps the intensities of the voxels within a small cube at each time point into a high-dimensional kernel space, where linear combinations correspond to nonlinear ones in the original space. The experimental results show that the proposed nonlinear CCA improves detection performance over traditional linear CCA.

12.
Article in English | MEDLINE | ID: mdl-16685892

ABSTRACT

In this paper, we propose a new approach to detecting activated time series in functional MRI using support vector clustering (SVC). We extract Fourier coefficients as features of the fMRI time series and cluster these features with SVC. In SVC, the features are mapped from their original feature space to a very high-dimensional kernel space. By finding a compact sphere that encloses the mapped features in the kernel space, one obtains a set of cluster boundaries in the feature space. SVC is an effective and robust fMRI activation detection method because of (1) better discovery of the real data structure, since there is no restriction on cluster shape; (2) high-quality detection results without explicitly specifying the number of clusters; and (3) stronger robustness, owing to its outlier-elimination mechanism. Experimental results on simulated and real fMRI data demonstrate the effectiveness of SVC.
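
A sketch of the feature-extraction and boundary-finding part of such a pipeline: Fourier-coefficient magnitudes as per-voxel features, followed by a kernel one-class boundary (scikit-learn's OneClassSVM, which is closely related to the enclosing sphere used in SVC). The synthetic paradigm and signals are invented, and SVC's cluster-labeling step is omitted.

```python
# Sketch: Fourier features per voxel time series, then a kernel one-class boundary.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(6)
n_time, n_voxels = 128, 200
t = np.arange(n_time)
paradigm = (np.sin(2 * np.pi * t / 32) > 0).astype(float)        # toy block design
signals = rng.standard_normal((n_voxels, n_time))
active = np.arange(20)                                            # first 20 voxels "activated"
signals[active] += 1.5 * paradigm                                 # add task-locked signal

# Feature extraction: magnitudes of the first few Fourier coefficients.
features = np.abs(np.fft.rfft(signals, axis=1))[:, 1:9]

boundary = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(features)
scores = boundary.decision_function(features)                     # low score = atypical voxel
print("most atypical voxels:", np.argsort(scores)[:10])
```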


Subject(s)
Algorithms , Artificial Intelligence , Brain Mapping/methods , Brain/physiology , Evoked Potentials/physiology , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Pattern Recognition, Automated/methods , Brain/anatomy & histology , Humans , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
13.
IEEE Trans Syst Man Cybern B Cybern ; 34(5): 1979-87, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15503494

ABSTRACT

When fuzzy production rules are used for approximate reasoning, interaction exists among rules that have the same consequent. Owing to this interaction, the weighted average model frequently used in approximate reasoning does not work well in many real-world problems. To model and handle this interaction, this paper proposes to replace the weights assigned to rules having the same consequent with a nonadditive, nonnegative set function, and to draw the reasoning conclusion from an integral with respect to this set function rather than from the weighted average model. Handling interaction in fuzzy production rule reasoning in this way can lead to a better understanding of the rule base and an improvement in reasoning accuracy. This paper also investigates how to determine from data a nonadditive set function that cannot be specified by a domain expert.
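
A worked sketch of aggregating rule activation degrees with a Choquet integral with respect to a nonadditive set function, contrasted with a plain weighted average. The measure values below are made up for illustration; the paper also addresses learning the set function from data.

```python
# Choquet integral of rule activations w.r.t. a nonadditive measure vs. a weighted average.

# Nonadditive measure over subsets of three rules {A, B, C} (monotone, mu({})=0, mu(all)=1).
mu = {frozenset(): 0.0,
      frozenset("A"): 0.3, frozenset("B"): 0.3, frozenset("C"): 0.2,
      frozenset("AB"): 0.9,            # A and B interact positively
      frozenset("AC"): 0.5, frozenset("BC"): 0.5,
      frozenset("ABC"): 1.0}

def choquet(values, mu):
    """Choquet integral of rule activation degrees with respect to measure mu."""
    items = sorted(values.items(), key=lambda kv: kv[1])        # ascending activations
    total, prev = 0.0, 0.0
    for i, (rule, v) in enumerate(items):
        upper = frozenset(r for r, _ in items[i:])              # rules activated at least v
        total += (v - prev) * mu[upper]
        prev = v
    return total

activations = {"A": 0.8, "B": 0.7, "C": 0.2}                    # degrees of rule firing
print("Choquet aggregate:", choquet(activations, mu))
weights = {"A": 0.375, "B": 0.375, "C": 0.25}                   # an additive alternative
print("Weighted average :", sum(weights[r] * v for r, v in activations.items()))
```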


Subject(s)
Algorithms , Artificial Intelligence , Decision Support Systems, Clinical , Decision Support Techniques , Diagnosis, Computer-Assisted/methods , Fuzzy Logic , Pattern Recognition, Automated , Expert Systems , Humans
14.
IEEE Trans Syst Man Cybern B Cybern ; 34(1): 409-18, 2004 Feb.
Article in English | MEDLINE | ID: mdl-15369082

ABSTRACT

Fuzzy production rules (FPRs) have been used for years to capture and represent fuzzy, vague, imprecise, and uncertain domain knowledge in many fuzzy systems. There has been a great deal of research on how to generate or obtain FPRs. Two methods exist: one is painstaking, repeated, and time-consuming interviewing of domain experts to extract the domain knowledge; the other uses machine learning techniques to generate and extract FPRs from training samples. The extracted rules, however, are found to be nonoptimal and sometimes redundant. Furthermore, the generated rules suffer from low accuracy in classifying or recognizing unseen examples. The reasons for these problems are that 1) the FPRs generated are not powerful enough to represent the domain knowledge, 2) the techniques used to generate FPRs are premature, ad hoc, or unsuitable for the problem, and 3) no further refinement of the extracted rules has been done. In this paper we address these problems by 1) enhancing the representation power of FPRs with local and global weights, 2) developing a fuzzy neural network (FNN) with an enhanced learning algorithm, and 3) using this FNN to refine the local and global weights of the FPRs. In experiments on existing benchmark examples, the proposed method achieves high accuracy in classifying unseen samples without increasing the number of FPRs extracted, and the time required to consult domain experts is greatly reduced.

15.
IEEE Trans Syst Man Cybern B Cybern ; 34(2): 834-44, 2004 Apr.
Article in English | MEDLINE | ID: mdl-15376833

ABSTRACT

This paper introduces a rough set technique for mining Pinyin-to-character (PTC) conversion rules. It first presents a text-structuring method that constructs a language information table for each pinyin from a corpus, which is then applied to a free-form textual corpus. Data generalization and rule extraction algorithms are then used to eliminate redundant information and extract consistent PTC conversion rules. The design of our model also addresses a number of important issues, such as the long-distance dependency problem, the storage requirements of the rule base, and the consistency of the extracted rules, while the performance of the extracted rules and the effects of different model parameters are evaluated experimentally. The results show that, with the smoothing method, high conversion precision (0.947) and recall (0.84) can be achieved even for rules represented directly by pinyin rather than words. A comparison with a baseline trigram model also shows that our method and the trigram language model complement each other well.

16.
Neural Comput ; 15(1): 183-212, 2003 Jan.
Article in English | MEDLINE | ID: mdl-12590825

ABSTRACT

The sensitivity of a neural network's output to input perturbation is an important issue of both theoretical and practical value. In this article, we propose an approach to quantify the sensitivity of the most popular and general feedforward network: the multilayer perceptron (MLP). The sensitivity measure is defined as the mathematical expectation of the output deviation due to an expected input deviation, taken over all input patterns in a continuous interval. Based on the structural characteristics of the MLP, a bottom-up approach is adopted. A single neuron is considered first, and algorithms with approximately derived analytical expressions, given as functions of the expected input deviation, are provided for computing its sensitivity. A further algorithm then computes the sensitivity of the entire MLP network. Computer simulations are used to verify the derived theoretical formulas, and the agreement between theoretical and experimental results is quite good. The sensitivity measure can be used to evaluate the MLP's performance.
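
For the simplest case, a single linear neuron, the expected squared output deviation under independent zero-mean input perturbations of variance sigma^2 is exactly sigma^2 times the sum of squared weights; the short check below verifies this numerically. The paper's contribution, approximate analytical expressions for sigmoidal neurons composed bottom-up through the MLP, is not reproduced here.

```python
# Numeric check for a single linear neuron: E[(w . dx)^2] = sigma^2 * sum_i w_i^2.
import numpy as np

rng = np.random.default_rng(7)
w = rng.standard_normal(10)
sigma = 0.05

analytic = sigma**2 * np.sum(w**2)                       # closed-form expectation

dx = rng.normal(0.0, sigma, size=(200_000, 10))          # Monte Carlo perturbations
empirical = np.mean((dx @ w) ** 2)

print(f"analytic={analytic:.6f}  empirical={empirical:.6f}")
```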


Subject(s)
Neural Networks, Computer , Sensitivity and Specificity , Evaluation Studies as Topic