ABSTRACT
Using a nonconvex nonsmooth optimization approach, we introduce a model for semisupervised clustering (SSC) with pairwise constraints. In this model, the objective function is a sum of three terms: the first reflects the clustering error for unlabeled data points, the second expresses the error for data points with must-link (ML) constraints, and the third represents the error for data points with cannot-link (CL) constraints. This function is nonconvex and nonsmooth. To minimize it, we introduce an adaptive SSC (A-SSC) algorithm, which combines a nonsmooth optimization method with an incremental approach that involves an auxiliary SSC problem. The algorithm constructs clusters incrementally, starting from one cluster and adding one cluster center at each iteration. The solutions to the auxiliary SSC problem are used as starting points for solving the nonconvex SSC problem. The discrete gradient method (DGM) of nonsmooth optimization is applied to solve the underlying nonsmooth optimization problems. This method does not require subgradient evaluations and uses only function values. The performance of the A-SSC algorithm is evaluated and compared with four benchmark SSC algorithms on one synthetic and 12 real-world datasets. The results demonstrate that the proposed algorithm outperforms the other four algorithms in identifying compact and well-separated clusters while satisfying most constraints.
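The three-term structure of the objective can be illustrated with a minimal sketch. The abstract does not give the exact form of each term, so everything below is an assumption made for illustration: squared Euclidean distances, an ML term that forces both points of a pair onto one shared center, a CL term that forces them onto two different centers, and the hypothetical name `ssc_objective`. The `min` operations are what make a function of this kind nonsmooth and nonconvex in the centers.

```python
import numpy as np

def ssc_objective(centers, X, must_link, cannot_link):
    """Illustrative three-term SSC objective (assumed form, not the paper's)."""
    centers = np.asarray(centers, dtype=float)
    X = np.asarray(X, dtype=float)
    # d[i, j] = squared Euclidean distance from point i to center j
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

    # Points that appear in at least one pairwise constraint
    constrained = {i for pair in list(must_link) + list(cannot_link) for i in pair}
    free = [i for i in range(len(X)) if i not in constrained]

    # Term 1: each unconstrained point is charged its nearest center.
    f_unlabeled = d[free].min(axis=1).sum() if free else 0.0

    # Term 2: both points of an ML pair must share one center.
    f_ml = sum((d[i] + d[j]).min() for i, j in must_link)

    # Term 3: the points of a CL pair must use two different centers
    # (this term is infinite when only one center is available).
    f_cl = 0.0
    for i, j in cannot_link:
        pair_cost = d[i][:, None] + d[j][None, :]  # cost of (center of i, center of j)
        np.fill_diagonal(pair_cost, np.inf)        # forbid the same center
        f_cl += pair_cost.min()

    return (f_unlabeled + f_ml + f_cl) / len(X)
```

Because only function values are computed here, a derivative-free method in the spirit of DGM could in principle be applied to such a function; the sketch does not implement DGM or the incremental A-SSC scheme.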
ABSTRACT
The community detection problem is a form of data clustering in which only the network's topological properties are considered when measuring similarities among nodes. Finding the kernel nodes of communities and expanding each community from its kernel helps in finding optimal communities. Among the existing community detection approaches, the affinity propagation (AP)-based method has shown promising results and does not require any predefined information such as the number of clusters (communities). AP is an exemplar-based clustering method that defines the negative real-valued similarity measure sim(i, k) between data point i and exemplar k as the probability of k being the exemplar of data point i. Intuitively, the value of sim(i, k) should not be identical to that of sim(k, i). In this study, we propose a new version of AP that uses an adaptive similarity matrix, namely affinity propagation with adaptive similarity (APAS), which efficiently captures the leadership probabilities between data points. APAS adaptively transforms the symmetric similarity matrix into an asymmetric one and outperforms the AP method in terms of accuracy. Extensive experiments conducted on artificial and real-world networks demonstrate the effectiveness of our approach.
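The idea that sim(i, k) and sim(k, i) can differ is easy to see on a small graph. The abstract does not describe how APAS builds its adaptive matrix, so the sketch below is not the APAS transformation: it is a simple row normalization, chosen as an assumption purely to show how a symmetric matrix can become an asymmetric one. The function name `asymmetric_similarity` is hypothetical, and the output is left as probabilities rather than converted to AP's negative-valued convention.

```python
import numpy as np

def asymmetric_similarity(adjacency):
    """Row-normalize a symmetric adjacency matrix (illustrative assumption).

    Entry P[i, k] can be read as a crude "probability that node i picks
    node k as its leader"; in general P[i, k] != P[k, i].
    """
    A = np.asarray(adjacency, dtype=float)
    W = A + np.eye(len(A))                    # self-loops: a node may lead itself
    return W / W.sum(axis=1, keepdims=True)   # each row sums to 1

# Star graph: node 0 is the hub, nodes 1..3 are leaves.
star = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
P = asymmetric_similarity(star)
```

On this star graph, a leaf assigns the hub a larger value (`P[1, 0]`) than the hub assigns that leaf (`P[0, 1]`), which matches the intuition that leadership is directional.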