Indian J Med Sci ; 2018 SEP; 70(3): 5-12
Article | IMSEAR | ID: sea-196499

ABSTRACT

Introduction: A semi-supervised clustering algorithm is proposed that combines the benefits of supervised and unsupervised learning methods. The approach allows unlabeled data with no known class to be used to improve classification accuracy [2]. The objective function of an unsupervised technique, e.g. K-means clustering, is modified to minimize both the cluster dispersion of the input attributes and a measure of cluster impurity based on the class labels. Minimizing the cluster dispersion of the examples is a form of capacity control that prevents overfitting [4]. For the output labels, impurity measures from decision tree algorithms, such as the Gini index, can be used. A genetic algorithm optimizes the objective function to produce clusters. Experimental results show that using class information improves generalization ability compared to unsupervised methods based only on the input attributes [6]. Training using information from unlabeled data can improve classification accuracy on that data as well.

Genetic Algorithms (GAs) have been widely used in optimization problems for their ability to find good, acceptable solutions within limited time. Clustering ensembles have emerged as another approach for generating more stable and robust partitions from existing clusters [1]. GAs have contributed substantially to finding consensus cluster partitions in clustering ensembles. Web video categorization remains a challenging research area given the popularity of the social web. In this paper, we propose a framework for web video categorization using textual features, video relations and web support [3]. This research work makes three contributions. First, we generalize the traditional Vector Space Model (VSM) into a Semantic VSM (S-VSM) by including the semantic similarity between feature terms [5]. This new model improves clustering quality in terms of compactness (high intra-cluster similarity) and clearness (low inter-cluster similarity). Second, we optimize the clustering ensemble process with a GA using a novel fitness function. We define a new measure, Pre-Paired Percentage (PPP), to be used as the fitness function during the genetic cycle for optimization of the clustering ensemble process [7]. Third, the most important and crucial step of the GA is to define the genetic operators, crossover and mutation. We express these operators through an intelligent clustering ensemble mechanism, which produces more logical offspring solutions [9]. All three contributions stated above have shown remarkable results in their respective areas. Experiments on real-world social-web data have been performed to validate these incremental novelties [8].
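
As an illustration of the modified objective described in the abstract (cluster dispersion of the input attributes plus a label-based impurity term), the following is a minimal Python sketch. The abstract does not give the exact formula, so the weighting scheme, the parameters `alpha` and `beta`, and all function names here are assumptions made for illustration. The sketch only evaluates the objective for a given cluster assignment; the genetic algorithm that searches over assignments is not shown.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def objective(points, labels, assignment, k, alpha=1.0, beta=1.0):
    """Hypothetical semi-supervised clustering objective (lower is better).

    points     -- list of equal-length numeric tuples (input attributes)
    labels     -- class label per point, or None for an unlabeled point
    assignment -- cluster index (0..k-1) per point
    alpha/beta -- assumed weights for the dispersion and impurity terms
    """
    n = len(points)
    n_labeled = sum(1 for lab in labels if lab is not None)
    total_sq_dist = 0.0
    total_impurity = 0.0

    for cluster in range(k):
        members = [i for i in range(n) if assignment[i] == cluster]
        if not members:
            continue  # an empty cluster contributes nothing

        # Centroid of the cluster, computed attribute by attribute.
        dims = len(points[members[0]])
        centroid = [
            sum(points[i][d] for i in members) / len(members)
            for d in range(dims)
        ]

        # Dispersion: squared Euclidean distance of members to the centroid.
        total_sq_dist += sum(
            (points[i][d] - centroid[d]) ** 2
            for i in members
            for d in range(dims)
        )

        # Impurity: Gini index over the labeled members only, weighted by
        # the share of all labeled points that fall in this cluster.
        known = [labels[i] for i in members if labels[i] is not None]
        if n_labeled:
            total_impurity += (len(known) / n_labeled) * gini_impurity(known)

    return alpha * total_sq_dist / n + beta * total_impurity


# Toy data: two well-separated groups, one point unlabeled.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = ["a", None, "b", "b"]

good = objective(points, labels, [0, 0, 1, 1], k=2)   # compact, pure clusters
mixed = objective(points, labels, [0, 1, 0, 1], k=2)  # spread out, impure
print(good, mixed)
```

In this toy example the assignment that groups nearby points and keeps labels pure scores lower than the one that mixes the groups, which is the behaviour the objective is meant to reward. Unlabeled points affect only the dispersion term, which is one way unlabeled data could influence the clustering.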
