Results 1 - 6 of 6
1.
PLoS One ; 19(4): e0290291, 2024.
Article in English | MEDLINE | ID: mdl-38648224

ABSTRACT

As social media becomes a primary news source, distinguishing rumors from non-rumors is growing harder, which invites malicious manipulation that can harm public health or cause financial loss. Traditional models cope poorly when the session structure of comment sections is deliberately disrupted. To address this, we present a novel rumor detection architecture that combines an attention filter, adversarial training, and dual contrastive learning. The attention filter module screens out malicious and uninformative comments, allowing the nodes to enter the GAT graph neural network with richer structural information. The adversarial training module (ADV) simulates malicious comments through perturbation, giving the model a degree of robustness against them; the perturbed comments also serve as hard negative samples for dual contrastive learning (DCL), which learns the differences between comments and contributes to the final objective as an additional loss term. Experimental results show that our AGAD (Attention Graph Adversarial Dual Contrast Learning) model outperforms other state-of-the-art algorithms on a number of rumor detection tasks. The code is available at https://github.com/icezhangGG/AGAD.git.
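
As a rough illustration of the filtering idea in this abstract, the sketch below soft-gates comment-node features with a learned attention score before a single-head graph attention layer over a dense adjacency matrix. This is a minimal reconstruction, not the released AGAD code: the module names, the sigmoid gate, and the dense GAT layer are all assumptions, and the adversarial (ADV) and dual contrastive (DCL) components are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFilter(nn.Module):
    """Down-weights malicious or uninformative comment nodes (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                    # x: (num_nodes, dim)
        gate = torch.sigmoid(self.score(x))  # per-node relevance in (0, 1)
        return x * gate                      # filtered node features

class DenseGATLayer(nn.Module):
    """Single-head graph attention over a dense adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, x, adj):               # adj: (n, n), 1 where an edge exists
        h = self.proj(x)                     # (n, out_dim)
        n = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                           h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs)).squeeze(-1)  # (n, n) raw scores
        e = e.masked_fill(adj == 0, float('-inf'))      # keep real edges only
        alpha = torch.softmax(e, dim=-1)                # attention per neighbor
        return alpha @ h

# Toy usage: 4 comment nodes in a reply tree (self-loops keep softmax defined).
x = torch.randn(4, 16)
adj = torch.eye(4) + torch.tensor([[0, 1, 1, 0], [1, 0, 0, 1],
                                   [1, 0, 0, 0], [0, 1, 0, 0]], dtype=torch.float)
out = DenseGATLayer(16, 8)(AttentionFilter(16)(x), adj)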


Subject(s)
Algorithms , Neural Networks, Computer , Social Media , Humans , Attention , Machine Learning
2.
Sensors (Basel) ; 23(19)2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37836989

ABSTRACT

Linguistic knowledge aids scene text recognition considerably by providing semantic information to refine the character sequence. A purely visual model focuses only on the visual texture of characters without actively learning linguistic information, which leads to poor recognition rates on noisy (e.g., distorted or blurry) images. To address this issue, this study builds on recent advances in the Vision Transformer: our approach, called the Display-Semantic Transformer (DST), constructs a masked language model and a semantic visual interaction module. The model mines deep semantic information from images to assist scene text recognition and improve robustness. The semantic visual interaction module realizes the interaction between semantic information and visual features, so that the visual features are enhanced by the semantic information and the model achieves better recognition. Experimental results show that our model improves average recognition accuracy on six benchmark test sets by nearly 2% over the baseline, while retaining a small parameter count and fast inference speed, attaining a better balance between accuracy and speed.
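
To make the semantic-visual interaction concrete, the following sketch fuses visual features with semantic embeddings through cross-attention and a residual connection. It is a minimal, assumed design for illustration only; the actual DST module, its layer count, and its feature shapes are not specified in the abstract.

import torch
import torch.nn as nn

class SemanticVisualInteraction(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual, semantic):
        # visual:   (batch, seq_len, dim) features from the vision branch
        # semantic: (batch, seq_len, dim) embeddings from the masked LM branch
        fused, _ = self.cross_attn(query=visual, key=semantic, value=semantic)
        return self.norm(visual + fused)  # residual: semantics refine vision

vis = torch.randn(2, 25, 256)   # e.g. 25 character positions
sem = torch.randn(2, 25, 256)
enhanced = SemanticVisualInteraction(256)(vis, sem)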

3.
Sensors (Basel) ; 23(20)2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37896702

ABSTRACT

To address oblique deformation, character adhesion, and character similarity in scene images of Uyghur text, this paper proposes a scene Uyghur recognition model with enhanced visual prediction. First, the content-aware correction network TPS++ performs feature-level rectification of skewed text. Then, ABINet is used as the base recognition network, and the U-Net structure in its vision model is improved to aggregate horizontal features, suppress multiple-activation phenomena, better describe the spatial characteristics of character positions, and alleviate character adhesion. Finally, a visual masking semantic awareness (VMSA) module guides the vision model to consider language information in the visual space by masking the corresponding visual features on the attention map, yielding more accurate visual predictions. This module not only reduces the correction burden on the language model but also uses language information to distinguish similar characters. Ablation experiments verify the effectiveness of the improvements, and the model is compared with common scene text recognition methods and scene Uyghur recognition methods on a self-built scene Uyghur dataset.
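
A hypothetical sketch of the visual-masking step described for VMSA: the attention row for one character position is inverted and used to suppress the corresponding visual region, so the prediction for that character must lean on linguistic context. The exact masking rule, shapes, and function name here are assumptions, not taken from the paper.

import torch

def mask_visual_features(feats, attn, char_idx):
    """feats: (B, N, C) visual features over N spatial positions.
       attn:  (B, T, N) attention map, one row per character position.
       Returns features with the char_idx-th character's region suppressed."""
    mask = 1.0 - attn[:, char_idx, :].unsqueeze(-1)  # (B, N, 1), ~0 where attended
    return feats * mask

B, N, C, T = 2, 64, 256, 25
feats = torch.randn(B, N, C)
attn = torch.softmax(torch.randn(B, T, N), dim=-1)
masked = mask_visual_features(feats, attn, char_idx=0)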

4.
PLoS One ; 18(6): e0286915, 2023.
Article in English | MEDLINE | ID: mdl-37289767

ABSTRACT

Few-shot relation classification identifies the relation between target entity pairs in unstructured natural language text by training on a small number of labeled samples. Recent prototype-network-based studies have focused on enhancing the prototype representation capability of models by incorporating external knowledge. However, most of these works constrain the representation of class prototypes implicitly through complex network structures, such as multi-attention mechanisms, graph neural networks, and contrastive learning, which restricts the model's ability to generalize. In addition, most models trained with a triplet loss disregard intra-class compactness, limiting their ability to handle outlier samples with low semantic similarity. This paper therefore proposes a non-weighted prototype enhancement module that uses the feature-level similarity between prototypes and relation information as a gate to filter and complete features. Meanwhile, we design a class cluster loss that mines difficult positive and negative samples and explicitly constrains both intra-class compactness and inter-class separability, learning a metric space with high discriminability. Extensive experiments on the publicly available FewRel 1.0 and 2.0 datasets demonstrate the effectiveness of the proposed model.
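
The two ideas above lend themselves to a short sketch: a similarity-derived gate that completes prototype features with relation information, and a cluster-style loss with explicit intra-class and inter-class margin terms over mined hard distances. The cosine gate, margin values, and shapes are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def enhance_prototype(proto, rel):
    # proto, rel: (num_classes, dim); gate from feature-level similarity
    gate = torch.sigmoid(F.cosine_similarity(proto, rel, dim=-1)).unsqueeze(-1)
    return gate * proto + (1.0 - gate) * rel  # relation info completes features

def class_cluster_loss(queries, labels, protos, m_intra=0.2, m_inter=0.8):
    # queries: (Q, dim); labels: (Q,); protos: (num_classes, dim)
    d = torch.cdist(queries, protos)                     # (Q, num_classes)
    d_pos = d.gather(1, labels.unsqueeze(1)).squeeze(1)  # distance to own class
    d_neg = d.scatter(1, labels.unsqueeze(1), float('inf')).min(dim=1).values
    # pull hard positives inside m_intra, push hard negatives beyond m_inter
    return (F.relu(d_pos - m_intra) + F.relu(m_inter - d_neg)).mean()

protos = enhance_prototype(torch.randn(5, 64), torch.randn(5, 64))
loss = class_cluster_loss(torch.randn(10, 64), torch.randint(0, 5, (10,)), protos)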


Subject(s)
Knowledge , Language , Learning , Neural Networks, Computer , Semantics
5.
Sensors (Basel) ; 22(23)2022 Nov 23.
Article in English | MEDLINE | ID: mdl-36501774

ABSTRACT

Segmentation-based scene text detection algorithms have advantages in scenes with arbitrarily shaped text and extreme aspect ratios, owing to their pixel-level description and fine-grained post-processing. However, insufficient use of semantic and spatial information in the network limits its classification and localization capabilities, and existing scene text detection methods lose important feature information when extracting features from each network layer. To solve this problem, the Attention-based Dual Feature Fusion Model (ADFM) is proposed. The Bi-directional Feature Fusion Pyramid Module (BFM) first adds stronger semantic information to the higher-resolution feature maps through a top-down process and then reduces the aliasing effects generated by that process through a bottom-up process, enhancing the representation of multi-scale text semantics. Meanwhile, a position-sensitive Spatial Attention Module (SAM) is introduced between the two stages of feature fusion. It operates on the feature map with the highest resolution and strongest semantics produced by the top-down process and weights spatial positions by the relevance of text features, improving the network's sensitivity to text regions. Ablation experiments verify the effectiveness of each ADFM module, and the model is compared with recent scene text detection methods on several publicly available datasets.
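
As a rough sketch of a position-sensitive spatial attention module in the spirit of the SAM described above, the code below derives a per-position weight map from channel-pooled statistics (a CBAM-like design) and re-weights the feature map. The pooling choices and the 7x7 convolution are assumptions for illustration, not the ADFM implementation.

import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                   # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)   # (B, 1, H, W) channel average
        mx, _ = x.max(dim=1, keepdim=True)  # (B, 1, H, W) channel max
        weight = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * weight                   # emphasize likely text positions

feat = torch.randn(2, 256, 32, 32)
out = SpatialAttention()(feat)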


Subject(s)
Algorithms , Semantics
6.
Sensors (Basel) ; 22(24)2022 Dec 18.
Article in English | MEDLINE | ID: mdl-36560350

ABSTRACT

Interest in scene text detection for arbitrary shapes is growing, and detection capability has evolved from horizontal text to text in multiple orientations and arbitrary shapes. Scene text detection nevertheless remains challenging owing to large variation in size and aspect ratio, diversity of shape and orientation, coarse annotations, and other factors. Regression-based methods, inspired by object detection, have inherent limitations in fitting the edges of arbitrarily shaped text. Segmentation-based methods, by contrast, predict at the pixel level and can fit arbitrarily shaped text better; however, inaccurate image text annotations and the distribution characteristics of text pixels, which include many background and misclassified pixels, degrade their performance to some extent. Whether a pixel belongs to a text region depends strongly on the strength of its semantic information and on its position within the text area. Based on these two observations, we propose an innovative and robust scene text detection method that combines position and semantic information. First, we add position information to the images using a position encoding module (PosEM) to help the model learn the implicit feature relationships associated with position. Second, we use a semantic enhancement module (SEM) to strengthen the model's focus on semantic information during feature extraction. Then, to minimize noise from inaccurate annotations and the distribution characteristics of text pixels, we convert the detection results into a probability map that represents the text distribution more reasonably. Finally, we reconstruct and filter text instances with a post-processing algorithm to reduce false positives. Experimental results show that our model improves significantly on the Total-Text, MSRA-TD500, and CTW1500 datasets, outperforming most previous advanced algorithms.
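
One plausible reading of the PosEM step is a CoordConv-style operation that concatenates normalized coordinate channels to the input so that later convolutions can learn position-dependent relationships; the sketch below shows that step under this assumption. It is illustrative only, and the function name and concatenation design are not taken from the paper.

import torch

def add_position_channels(img):
    # img: (B, C, H, W) -> (B, C + 2, H, W) with normalized y/x coordinates
    B, _, H, W = img.shape
    ys = torch.linspace(-1, 1, H, device=img.device).view(1, 1, H, 1).expand(B, 1, H, W)
    xs = torch.linspace(-1, 1, W, device=img.device).view(1, 1, 1, W).expand(B, 1, H, W)
    return torch.cat([img, ys, xs], dim=1)

x = add_position_channels(torch.randn(2, 3, 64, 64))
print(x.shape)  # torch.Size([2, 5, 64, 64])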


Subject(s)
Algorithms , Semantics , Learning , Probability , Upper Extremity