1.
Article in English | MEDLINE | ID: mdl-39028595

ABSTRACT

Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different clusters without human annotations, is a fundamental yet challenging task. However, we observe that the existing methods suffer from the representation collapse problem and tend to encode samples from different classes into the same latent embedding. Consequently, the discriminative capability of nodes is limited, resulting in suboptimal clustering performance. To address this problem, we propose a novel deep graph clustering algorithm termed improved dual correlation reduction network (IDCRN), which improves the discriminative capability of samples. Specifically, by approximating the cross-view feature correlation matrix to an identity matrix, we reduce the redundancy between different dimensions of features, thus improving the discriminative capability of the latent space explicitly. Meanwhile, the cross-view sample correlation matrix is forced to approximate the designed clustering-refined adjacency matrix to guide the learned latent representation to recover the affinity matrix even across views, thus enhancing the discriminative capability of features implicitly. Moreover, we avoid the collapsed representation caused by the oversmoothing issue in graph convolutional networks (GCNs) through an introduced propagation regularization term, enabling IDCRN to capture long-range information with a shallow network structure. Extensive experimental results on six benchmarks have demonstrated the effectiveness and efficiency of IDCRN compared with the existing state-of-the-art deep graph clustering algorithms. The code of IDCRN is released at https://github.com/yueliu1999/IDCRN. In addition, we share a collection of deep graph clustering resources, including papers, code, and datasets, at https://github.com/yueliu1999/Awesome-Deep-Graph-Clustering.
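
The feature-level correlation reduction described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see their repository for that). The function names, the use of cosine similarity between feature columns, and the mean-squared penalty are assumptions made here for illustration only.

```python
import numpy as np

def cross_view_feature_correlation(z1, z2, eps=1e-12):
    """Cosine similarity between the feature dimensions of two views.

    z1, z2: arrays of shape (n_samples, d), one embedding matrix per view.
    Returns a (d, d) matrix whose (i, j) entry compares dimension i of
    view 1 with dimension j of view 2.
    """
    # Normalise every column (feature dimension) to unit length.
    z1n = z1 / (np.linalg.norm(z1, axis=0, keepdims=True) + eps)
    z2n = z2 / (np.linalg.norm(z2, axis=0, keepdims=True) + eps)
    return z1n.T @ z2n

def feature_correlation_loss(z1, z2):
    """Mean squared deviation of the correlation matrix from the identity.

    Assumed form of the penalty: it is zero when matching dimensions agree
    across views and different dimensions are uncorrelated.
    """
    s = cross_view_feature_correlation(z1, z2)
    return float(np.mean((s - np.eye(s.shape[0])) ** 2))
```

With embeddings whose columns are orthonormal and identical in both views the loss vanishes, while duplicated (fully redundant) feature dimensions are penalised, which is the behaviour the abstract attributes to pushing the correlation matrix toward the identity.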

2.
Adv Sci (Weinh) ; : e2404095, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39041896

ABSTRACT

Compositionally complex alloys, including high- and medium-entropy alloys (HEAs/MEAs), have displayed significant potential as efficient electrocatalysts for the oxygen evolution reaction (OER), but their structure-activity relationship remains unclear. In particular, the basic question of which crystal facets are more active, especially considering surface reconstruction, has yet to be answered. This study demonstrates that the lowest-index {100} facets of FeCoNiCr MEAs exhibit the highest activity. The underlying mechanism is uncovered: the low in-plane density of the {100} facets makes them more prone to surface reconstruction and to forming amorphous structures that contain the true active species. These results are validated by experiments on single-crystal and polycrystalline MEAs, as well as by DFT calculations. The discoveries contribute to a fundamental understanding of MEAs in electrocatalysis and offer physics-based strategies for developing electrocatalysts.

3.
Article in English | MEDLINE | ID: mdl-38941209

ABSTRACT

Knowledge graph reasoning (KGR), which aims to deduce new facts from existing facts based on logic rules mined from knowledge graphs (KGs), has become a fast-growing research direction. It has been shown to significantly benefit the use of KGs in many AI applications, such as question answering and recommendation systems. According to the graph types, existing KGR models can be roughly divided into three categories, i.e., static models, temporal models, and multi-modal models. Early works in this domain mainly focus on static KGR, while recent works try to leverage temporal and multi-modal information, which is more practical and closer to real-world settings. However, no survey papers or open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a first survey of knowledge graph reasoning, tracing from static to temporal and then to multi-modal KGs. Concretely, the models are reviewed based on a bi-level taxonomy, i.e., top-level (graph types) and base-level (techniques and scenarios). In addition, the performances, as well as the datasets, are summarized and presented. Moreover, we point out the challenges and potential opportunities to inform readers. The corresponding open-source repository is shared on GitHub at https://github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.

4.
Article in English | MEDLINE | ID: mdl-38889020

ABSTRACT

With the rapid progress in multimedia and sensor technologies, multiview clustering (MVC) has become a prominent research area within machine learning and data mining, experiencing significant advancements over recent decades. MVC is distinguished from single-view clustering by its ability to integrate complementary information from multiple distinct data perspectives and enhance clustering performance. However, the efficacy of MVC methods is predicated on the availability of complete views for all samples, an assumption that frequently fails in practical scenarios where data views are often incomplete. To surmount this challenge, various approaches to incomplete MVC (IMVC) have been proposed, with deep neural networks emerging as a favored technique for their representation learning ability. Despite their promise, previous methods commonly adopt sample-level (e.g., features) or affinity-level (e.g., graphs) guidance, neglecting the discriminative label-level guidance (i.e., pseudo-labels). In this work, we propose a novel deep IMVC method termed pseudo-label propagation for deep IMVC (PLP-IMVC), which integrates high-quality pseudo-labels from the complete subset of incomplete data with deep label propagation networks to obtain improved clustering results. In particular, we first design a local model (PLP-L) that leverages pseudo-labels to their fullest extent. Then, we propose a global model (PLP-G) that exploits manifold regularization to mitigate label noise, promote view-level information fusion, and learn discriminative unified representations. Experimental results across eight public benchmark datasets and three evaluation metrics demonstrate our method's efficacy, showing superior performance compared to 18 advanced baseline methods.

5.
Article in English | MEDLINE | ID: mdl-38717883

ABSTRACT

While humans can excel at image classification tasks by comparing a few images, existing metric-based few-shot classification methods are still not well adapted to novel tasks. Performance declines rapidly when encountering new patterns, as feature embeddings cannot effectively encode discriminative properties. Moreover, existing matching methods inadequately utilize support set samples, focusing only on comparing query samples to category prototypes without exploiting contrastive relationships across categories for discriminative features. In this work, we propose a method where query samples select their most category-representative features for matching, making feature embeddings adaptable and category-related. We introduce a category alignment mechanism (CAM) to align query image features with different categories. CAM ensures features chosen for matching are distinct and strongly correlated to intra-and inter-contrastive relationships within categories, making extracted features highly related to their respective categories. CAM is parameter-free, requires no extra training to adapt to new tasks, and adjusts features for matching when task categories change. We also implement a cross-validation-based feature selection technique for support samples, generating more discriminative category prototypes. We implement two versions of inductive and transductive inference and conduct extensive experiments on six datasets to demonstrate the effectiveness of our algorithm. The results indicate that our method consistently yields performance improvements on benchmark tasks and surpasses the current state-of-the-art methods.

6.
IEEE Trans Image Process ; 33: 2995-3008, 2024.
Article in English | MEDLINE | ID: mdl-38640047

ABSTRACT

Multi-view clustering (MVC) has attracted broad attention due to its capacity to exploit consistent and complementary information across views. This paper focuses on a challenging issue in MVC called the incomplete continual data problem (ICDP). Specifically, most existing algorithms assume that views are available in advance and overlook the scenarios where data observations of views are accumulated over time. Due to privacy considerations or memory limitations, previous views cannot be stored in these situations. Some works have proposed ways to handle this problem, but all of them fail to address incomplete views. Such an incomplete continual data problem (ICDP) in MVC is difficult to solve since incomplete information with continual data increases the difficulty of extracting consistent and complementary knowledge among views. We propose Fast Continual Multi-View Clustering with Incomplete Views (FCMVC-IV) to address this issue. Specifically, the method maintains a scalable consensus coefficient matrix and updates its knowledge with the incoming incomplete view rather than storing and recomputing all the data matrices. Considering that the given views are incomplete, the newly collected view might contain samples that have yet to appear; two indicator matrices and a rotation matrix are developed to match matrices with different dimensions. In addition, we design a three-step iterative algorithm to solve the resultant problem with linear complexity and proven convergence. Comprehensive experiments conducted on various datasets demonstrate the superiority of FCMVC-IV over the competing approaches. The code is publicly available at https://github.com/wanxinhang/FCMVC-IV.

7.
Article in English | MEDLINE | ID: mdl-38557633

ABSTRACT

Multi-view clustering has attracted broad attention due to its capacity to utilize consistent and complementary information among views. Although tremendous progress has been made recently, most existing methods suffer from high complexity, preventing them from being applied to large-scale tasks. Multi-view clustering via matrix factorization is a representative approach to addressing this issue. However, most such methods map the data matrices into a fixed dimension, limiting the model's expressiveness. Moreover, a range of methods rely on a two-step process, i.e., multimodal learning followed by k-means, inevitably causing a suboptimal clustering result. In light of this, we propose a one-step multi-view clustering with diverse representation (OMVCDR) method, which incorporates multi-view learning and k-means into a unified framework. Specifically, we first project the original data matrices into various latent spaces to attain comprehensive information and auto-weight them in a self-supervised manner. Then, we directly use the information matrices under diverse dimensions to obtain consensus discrete clustering labels. Unifying representation learning and clustering boosts the quality of the final results. Furthermore, we develop an efficient optimization algorithm with proven convergence to solve the resultant problem. Comprehensive experiments on various datasets demonstrate the promising clustering performance of our proposed method. The code is publicly available at https://github.com/wanxinhang/OMVCDR.

8.
Article in English | MEDLINE | ID: mdl-38602855

ABSTRACT

Existing multiple kernel clustering (MKC) algorithms have two ubiquitous problems. From the theoretical perspective, most MKC algorithms lack sufficient theoretical analysis, especially the consistency of learned parameters, such as the kernel weights. From the practical perspective, the high complexity makes MKC unable to handle large-scale datasets. This paper tries to address the above two issues. We first make a consistency analysis of an influential MKC method named Simple Multiple Kernel k-Means (SimpleMKKM). Specifically, suppose that $\hat{\gamma}_n$ are the kernel weights learned by SimpleMKKM from the training samples. We also define the expected version of SimpleMKKM and denote its solution as $\gamma^*$. We establish an upper bound of $\|\hat{\gamma}_n - \gamma^*\|_\infty$ in the order of $\tilde{O}(1/\sqrt{n})$, where $n$ is the sample number. Based on this result, we also derive its excess clustering risk calculated by a standard clustering loss function. For the large-scale extension, we replace the eigendecomposition of SimpleMKKM with singular value decomposition (SVD). Consequently, the complexity can be decreased to $O(n)$ such that SimpleMKKM can be implemented on large-scale datasets. We then deduce several theoretical results to verify the approximation ability of the proposed SVD-based method. The results of comprehensive experiments demonstrate the superiority of the proposed method. The code is publicly available at https://github.com/weixuan-liang/SVD-based-SimpleMKKM.
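
The abstract does not say how the SVD is applied, so the following is only a generic sketch of why swapping an eigendecomposition for an SVD can reduce cost. It assumes a kernel matrix that factorises as K = F Fᵀ with a thin factor F of shape (n, r). The factor F, the sizes, and all variable names are hypothetical and are not taken from the paper or its repository. Under that assumption, the top-k eigenvectors of K span the same subspace as the top-k left singular vectors of F, and the thin SVD costs O(n r²), which is linear in n for fixed r.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, k = 200, 10, 3

# Hypothetical low-rank factor of the kernel matrix: K = F F^T.
F = rng.normal(size=(n, r))
K = F @ F.T

# Route 1: eigendecomposition of the full n x n kernel (cubic in n).
eigvals, eigvecs = np.linalg.eigh(K)   # eigenvalues in ascending order
H_eig = eigvecs[:, -k:]                # eigenvectors of the k largest

# Route 2: thin SVD of the n x r factor (linear in n for fixed r).
U, s, _ = np.linalg.svd(F, full_matrices=False)
H_svd = U[:, :k]                       # top-k left singular vectors

# Both routes span the same k-dimensional subspace, so their orthogonal
# projectors coincide even though individual vectors may differ in sign.
same_subspace = np.allclose(H_eig @ H_eig.T, H_svd @ H_svd.T, atol=1e-8)
```

The squared singular values `s[:k] ** 2` also match the k largest eigenvalues of K, which is what makes the two routes interchangeable in a spectral clustering step.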

9.
Article in English | MEDLINE | ID: mdl-38648135

ABSTRACT

Temporal graph learning, which aims to generate high-quality representations for graph-based tasks with dynamic information, has recently garnered increasing attention. In contrast to static graphs, temporal graphs are typically organized as node interaction sequences over continuous time rather than as an adjacency matrix. Most temporal graph learning methods model current interactions by incorporating the historical neighborhood. However, such methods only consider first-order temporal information while disregarding crucial high-order structural information, resulting in suboptimal performance. To address this issue, we propose a self-supervised method called S2T for temporal graph learning, which extracts both temporal and structural information to learn more informative node representations. Notably, the initial node representations combine first-order temporal and high-order structural information differently to calculate two conditional intensities. An alignment loss is then introduced to optimize the node representations, narrowing the gap between the two intensities and making them more informative. Concretely, in addition to modeling temporal information using historical neighbor sequences, we further consider structural knowledge at both local and global levels. At the local level, we generate structural intensity by aggregating features from high-order neighbor sequences. At the global level, a global representation is generated based on all nodes to adjust the structural intensity according to the active statuses of different nodes. Extensive experiments demonstrate that the proposed model S2T achieves up to 10.13% performance improvement compared with state-of-the-art competitors on several datasets.

10.
Article in English | MEDLINE | ID: mdl-38315591

ABSTRACT

Few-shot relation reasoning on knowledge graphs (FS-KGR) is an important and practical problem that aims to infer long-tail relations and has drawn increasing attention in recent years. Among the proposed methods, self-supervised learning (SSL) methods, which effectively extract the hidden essential inductive patterns relying only on the support sets, have achieved promising performance. However, the existing SSL methods simply cut the connections between high-frequency and long-tail relations, ignoring the fact that the two kinds of information can be highly related to each other. Specifically, we observe that relations with similar contextual meanings, called aliasing relations (ARs), may have similar attributes. In other words, the ARs of a target long-tail relation may themselves be high-frequency, and leveraging such attributes can largely improve the reasoning performance. Based on this observation, we propose a novel self-supervised learning model that leverages aliasing relations to assist FS-KGR. Specifically, we propose a graph neural network (GNN)-based AR-assist module to encode the ARs. In addition, we provide two fusion strategies, i.e., simple summation and learnable fusion, to fuse the generated representations, which contain abundant extra information underlying the ARs, into the self-supervised reasoning backbone for performance enhancement. Extensive experiments on three few-shot benchmarks demonstrate that the proposed model achieves state-of-the-art (SOTA) performance compared with other methods in most cases.

11.
Article in English | MEDLINE | ID: mdl-38236668

ABSTRACT

The success of multiview raw data mining relies on the integrity of attributes. However, each view is subject to various kinds of noise and collection failures, so attributes are often only partially available. To make matters worse, the attributes in multiview raw data take multiple forms, which makes it more difficult to explore the structure of the data, especially in multiview clustering tasks. Due to the missing data in some views, the clustering task on incomplete multiview data confronts the following challenges: 1) mining the topology of missing data in multiview settings is an urgent problem to be solved; 2) most approaches do not calibrate the complemented representations with the common information of multiple views; and 3) we discover that the cluster distributions obtained from incomplete views have a cluster distribution unaligned problem (CDUP) in the latent space. To solve the above issues, we propose a deep clustering framework based on subgraph propagation and contrastive calibration (SPCC) for incomplete multiview raw data. First, the global structural graph is reconstructed by propagating the subgraphs generated by the complete data of each view. Then, the missing views are completed and calibrated under the guidance of the global structural graph and contrastive learning between views. In the latent space, we assume that different views have a common cluster representation in the same dimension. However, in the unsupervised condition, the fact that the cluster distributions of different views do not correspond hinders the information completion process from using information from other views. Finally, the complemented cluster distributions for different views are aligned by contrastive learning (CL), thus solving the CDUP in the latent space. Our method achieves advanced performance on six benchmarks, which validates the effectiveness and superiority of our SPCC.

12.
Article in English | MEDLINE | ID: mdl-38215316

ABSTRACT

With the development of various applications, such as recommendation systems and social network analysis, graph data have become ubiquitous in the real world. However, graph data are often partially absent after collection due to copyright restrictions or privacy-protecting policies. Graph absence can be roughly grouped into attribute-incomplete and attribute-missing cases. Specifically, attribute-incomplete indicates that a portion of the attribute vectors of all nodes are incomplete, while attribute-missing indicates that all attribute vectors of partial nodes are missing. Although various graph imputation methods have been proposed, none of them is custom-designed for the common situation where both types of graph absence exist simultaneously. To fill this gap, we develop a novel graph imputation network termed revisiting initializing then refining (RITR), where both attribute-incomplete and attribute-missing samples are completed under the guidance of a novel initializing-then-refining imputation criterion. Specifically, to complete attribute-incomplete samples, we first initialize the incomplete attributes using Gaussian noise before network learning, and then introduce a structure-attribute consistency constraint to refine incomplete values by approximating a structure-attribute correlation matrix to a high-order structure matrix. To complete attribute-missing samples, we first adopt structure embeddings of attribute-missing samples as the embedding initialization, and then refine these initial values by adaptively aggregating the reliable information of attribute-incomplete samples according to a dynamic affinity structure. To the best of our knowledge, this newly designed method is the first end-to-end unsupervised framework dedicated to handling hybrid-absent graphs. Extensive experiments on six datasets have verified that our method consistently outperforms the existing state-of-the-art competitors.
Our source code is available at https://github.com/WxTu/RITR.
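
The "initialize with Gaussian noise" step that this abstract describes for attribute-incomplete samples can be sketched as follows. This is an illustration only and is not taken from the RITR repository. The function name, the boolean `observed_mask` convention, and the `scale` parameter are assumptions introduced here, and the subsequent refinement step is not shown.

```python
import numpy as np

def init_incomplete_attributes(x, observed_mask, rng, scale=1.0):
    """Fill unobserved attribute entries with zero-mean Gaussian noise.

    x:             (n_nodes, n_attrs) attribute matrix; unobserved entries
                   may hold any placeholder value, including NaN.
    observed_mask: boolean array of the same shape, True where observed.
    rng:           a numpy.random.Generator used to draw the noise.
    scale:         standard deviation of the noise (assumed hyperparameter).
    """
    noise = rng.normal(loc=0.0, scale=scale, size=x.shape)
    # Keep observed values untouched; use noise only where data is absent.
    return np.where(observed_mask, x, noise)
```

A caller would pass something like `np.random.default_rng(0)` as `rng` and then hand the filled matrix to the network, which is where the refinement described in the abstract would take over.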

13.
Article in English | MEDLINE | ID: mdl-37971916

ABSTRACT

Inductive link prediction on temporal networks aims to predict the future links associated with node(s) unseen in the historical timestamps. Existing methods generate the predictions mainly by learning node representation from the node/edge attributes as well as the network dynamics or by measuring the distance between nodes on the temporal network structure. However, the attribute information is unavailable in many realistic applications and the structure-aware methods highly rely on nodes' common neighbors, which are difficult to accurately detect, especially in sparse temporal networks. Thus, we propose a distance-aware learning (DEAL) approach for inductive link prediction on temporal networks. Specifically, we first design an adaptive sampling method to extract temporal adaptive walks for nodes, increasing the probability of including the common neighbors between nodes. Then, we design a dual-channel distance measuring component, which simultaneously measures the distance between nodes in the embedding space and on the dynamic graph structure for predicting future inductive edges. Extensive experiments are conducted on three public temporal network datasets, i.e., MathOverflow, AskUbuntu, and StackOverflow. The experimental results validate the superiority of DEAL over the state-of-the-art baselines in terms of accuracy, area under the ROC curve (AUC), and average precision (AP), where the improvements are especially obvious in scenarios with only limited data.

14.
Article in English | MEDLINE | ID: mdl-37956011

ABSTRACT

Stance detection on social media aims to identify if an individual is in support of or against a specific target. Most existing stance detection approaches primarily rely on modeling the contextual semantic information in sentences and neglect to explore the pragmatics dependency information of words, thus degrading performance. Although several single-task learning methods have been proposed to capture richer semantic representation information, they still suffer from semantic sparsity problems caused by short texts on social media. This article proposes a novel multigraph sparse interaction network (MG-SIN) by using multitask learning (MTL) to identify the stances and classify the sentiment polarities of tweets simultaneously. Our basic idea is to explore the pragmatics dependency relationship between tasks at the word level by constructing two types of heterogeneous graphs, including task-specific and task-related graphs (tr-graphs), to boost the learning of task-specific representations. A graph-aware module is proposed to adaptively facilitate information sharing between tasks via a novel sparse interaction mechanism among heterogeneous graphs. Through experiments on two real-world datasets, compared with the state-of-the-art baselines, the extensive results exhibit that MG-SIN achieves competitive improvements of up to 2.1% and 2.42% for the stance detection task, and 5.26% and 3.93% for the sentiment analysis task, respectively.

15.
Article in English | MEDLINE | ID: mdl-37934640

ABSTRACT

Graph anomaly detection (GAD) has gained increasing attention in various attributed graph applications, e.g., social communication and financial fraud transaction networks. Recently, graph contrastive learning (GCL)-based methods have been widely adopted as the mainstream for GAD with remarkable success. However, existing GCL strategies in GAD mainly focus on node-node and node-subgraph contrast and fail to explore subgraph-subgraph level comparison. Furthermore, the different sizes or component node indices of the sampled subgraph pairs may cause the "nonaligned" issue, making it difficult to accurately measure the similarity of subgraph pairs. In this article, we propose a novel subgraph-aligned multiview contrastive approach for graph anomaly detection, named SAMCL, which fills the subgraph-subgraph contrastive-level gap for GAD tasks. Specifically, we first generate the multiview augmented subgraphs by capturing different neighbors of target nodes, forming contrasting subgraph pairs. Then, to enable contrast between nonaligned subgraph pairs, we propose a subgraph-aligned strategy that estimates similarities with the Earth mover's distance (EMD), considering both the node embedding distributions and topology awareness. With the newly established similarity measure for subgraphs, we build an inter-view subgraph-aligned contrastive learning module to better detect changes for nodes with different local subgraphs. Moreover, we conduct intra-view node-subgraph contrastive learning to supply richer information on abnormalities. We also employ a node reconstruction task on the masked subgraph to measure the local change of the target node. Finally, the anomaly score for each node is jointly calculated by these three modules. Extensive experiments conducted on benchmark datasets verify the effectiveness of our approach compared to existing state-of-the-art (SOTA) methods with significant performance gains (up to 6.36% improvement on ACM).
Our code is available at https://github.com/hujingtao/SAMCL.

16.
Article in English | MEDLINE | ID: mdl-37991915

ABSTRACT

Anchor technology is widely employed in multi-view subspace clustering (MVSC) to reduce the computational cost. However, because the sampling operation is performed on each view independently and does not consider the distribution of samples across all views, the produced anchors are usually poorly distinguishable and fail to characterize the whole data. Moreover, it is necessary to fuse multiple separate graphs into one, which makes the final clustering performance heavily dependent on the fusion algorithm adopted. What is worse, existing MVSC methods generate dense bipartite graphs, where each sample is associated with all anchor candidates. We argue that this densely connected mechanism will fail to capture the essential local structures and degrade the discrimination of samples belonging to their respective nearby anchor clusters. To alleviate these issues, we devise a clustering framework named SL-CAUBG. Specifically, we do not use a sampling strategy but instead optimize to generate consensus anchors within all views so as to explore the information between different views. Based on the consensus anchors, we skip the fusion stage and directly construct the unified bipartite graph across views. Most importantly, the l1 norm and Laplacian-rank constraints imposed on the unified bipartite graph make it capture both local and global structures simultaneously. The l1 norm helps eliminate scattered links between anchors and samples by constructing sparse links and guarantees that our graph has a clear anchor-sample affinity relationship. The Laplacian-rank constraint helps extract the global characteristics by measuring the connectivity of the unified bipartite graph. To deal with the nondifferentiable objective function caused by the l1 norm, we adopt an iteratively reweighted method and Newton's method. To handle the nonconvex Laplacian-rank constraint, we equivalently transform it into a convex trace constraint.
We also devise a four-step alternate method with linear complexity to solve the resultant problem. Substantial experiments show the superiority of our SL-CAUBG.

17.
Article in English | MEDLINE | ID: mdl-37819821

ABSTRACT

Multiview spectral clustering, renowned for its spatial learning capability, has garnered significant attention in the data mining field. However, existing methods assume that the optimal consensus adjacency matrix is confined within the space spanned by each view's adjacency matrix. This constraint restricts the feasible domain of the algorithm and hinders the exploration of the optimal consensus adjacency matrix. To address this limitation, we propose a novel and convex strategy, termed the consensus neighbor strategy, for learning the optimal consensus adjacency matrix. This approach constructs the optimal consensus adjacency matrix by capturing the consensus local structure of each sample across all views, thereby expanding the search space and facilitating the discovery of the optimal consensus adjacency matrix. Furthermore, we introduce the concept of a correlation measuring matrix to prevent trivial solutions. We develop an efficient iterative algorithm to solve the resulting optimization problem, benefiting from the convex nature of our model, which ensures convergence to a global optimum. Experimental results on 16 multiview datasets demonstrate that our proposed algorithm surpasses state-of-the-art methods in terms of its robust consensus representation learning capability. The code for this article is available at https://github.com/PhdJiayiTang/Consensus-Neighbor-Strategy.git.

18.
IEEE Trans Image Process ; 32: 5197-5208, 2023.
Article in English | MEDLINE | ID: mdl-37669186

ABSTRACT

Recently, metric-based meta-learning methods have been effectively applied to few-shot image classification. These methods classify images based on the relationship between samples in an embedding space, avoiding over-fitting that can occur when training classifiers with limited samples. However, finding an embedding space with good generalization properties remains a challenge. Our work highlights that having an initial manifold space that preserves sample neighbor relationships can prevent the metric model from reaching a suboptimal solution. We propose a feature learning method that leverages Instance Neighbor Constraints (INC). This theory is thoroughly evaluated and analyzed through experiments, demonstrating its effectiveness in improving the efficiency of learning and the overall performance of the model. We further integrate the INC into an alternate optimization training framework (AOT) that leverages both batch learning and episode learning to better optimize the metric-based model. We conduct extensive experiments on 5-way 1-shot and 5-way 5-shot settings on four popular few-shot image benchmarks: miniImageNet, tieredImageNet, Fewshot-CIFAR100 (FC100), and Caltech-UCSD Birds-200-2011(CUB). Results show that our method achieves consistent performance gains on benchmarks and state-of-the-art performance. Our findings suggest that initializing the embedding space appropriately and leveraging both batch and episode learning can significantly improve few-shot learning performance.

19.
Article in English | MEDLINE | ID: mdl-37738196

ABSTRACT

Multiview clustering has attracted increasing attention for automatically dividing instances into groups without manual annotations. Traditional shallow methods discover the internal structure of data, while deep multiview clustering (DMVC) utilizes neural networks to learn clustering-friendly data embeddings. Although both achieve impressive performance in practical applications, we find that the former heavily relies on the quality of raw features, while the latter ignores the structure information of data. To address the above issue, we propose a novel method termed iterative deep structural graph contrast clustering (IDSGCC) for multiview raw data, consisting of topology learning (TL), representation learning (RL), and graph structure contrastive learning, to achieve better performance. The TL module aims to obtain a structured global graph with constraint structural information and then guides the RL to preserve the structural information. In the RL module, a graph convolutional network (GCN) takes the global structural graph and raw features as inputs to aggregate the samples of the same cluster and keep the samples of different clusters apart. Unlike previous methods performing contrastive learning at the representation level of the samples, in the graph contrastive learning module, we conduct contrastive learning at the graph structure level by imposing a regularization term on the similarity matrix. The credible neighbors of the samples are constructed as positive pairs through the credible graph, and other samples are constructed as negative pairs. The three modules promote each other and finally obtain a clustering-friendly embedding. We also set up an iterative update mechanism to refine the topology and make it more credible. Impressive clustering results are obtained through the iterative mechanism.
Comparative experiments on eight multiview datasets show that our model outperforms the state-of-the-art traditional and deep clustering competitors.

20.
Article in English | MEDLINE | ID: mdl-37738197

ABSTRACT

Recently, graph anomaly detection on attributed networks has attracted growing attention in data mining and machine learning communities. Apart from attribute anomalies, graph anomaly detection also aims at suspicious topological-abnormal nodes that exhibit collective anomalous behavior. Closely connected uncorrelated node groups form uncommonly dense substructures in the network. However, existing methods overlook that the topology anomaly detection performance can be improved by recognizing such a collective pattern. To this end, we propose a new graph anomaly detection framework on attributed networks via substructure awareness (ARISE). Unlike previous algorithms, we focus on the substructures in the graph to discern abnormalities. Specifically, we establish a region proposal module to discover high-density substructures in the network as suspicious regions. The average node-pair similarity can be regarded as the topology anomaly degree of nodes within substructures. Generally, the lower the similarity, the higher the probability that internal nodes are topology anomalies. To distill better embeddings of node attributes, we further introduce a graph contrastive learning scheme, which observes attribute anomalies in the meantime. In this way, ARISE can detect both topology and attribute anomalies. Ultimately, extensive experiments on benchmark datasets show that ARISE greatly improves detection performance (up to 7.30% AUC and 17.46% AUPRC gains) compared to state-of-the-art attributed networks anomaly detection (ANAD) algorithms.
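
The "average node-pair similarity" that this abstract uses as a topology anomaly signal can be sketched in a few lines. This is an illustration, not the ARISE implementation. The function name, the choice of cosine similarity on node feature vectors, and the requirement of at least two nodes with nonzero features are assumptions made here.

```python
import numpy as np

def average_pair_similarity(features, nodes):
    """Mean cosine similarity over all distinct node pairs in a substructure.

    features: (n_nodes, d) array of node feature vectors (assumed nonzero).
    nodes:    indices of the nodes forming the candidate substructure
              (at least two are required).
    """
    x = features[nodes]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)   # unit-length rows
    sim = x @ x.T                                       # pairwise cosines
    m = len(nodes)
    # Average the off-diagonal entries only (self-similarity is excluded).
    return float((sim.sum() - np.trace(sim)) / (m * (m - 1)))
```

Following the abstract, a dense substructure with a low value from this function would be treated as more suspicious, since its densely connected members are mutually dissimilar.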
