1.
Article in English | MEDLINE | ID: mdl-38980784

ABSTRACT

Most existing few-shot image classification methods employ global pooling to aggregate class-relevant local features in a data-driven manner. Because locating class-relevant regions in complex scenarios is difficult and error-prone, and because local features exhibit large semantic diversity, class-irrelevant information can reduce the robustness of the representations obtained through global pooling. Meanwhile, the scarcity of labeled images exacerbates the difficulty data-hungry deep models face in identifying class-relevant regions. These issues severely limit deep models' few-shot learning ability. In this work, we propose to remove class-irrelevant information by making local features class relevant, thus bypassing the challenge of identifying which local features are class irrelevant. The resulting class-irrelevant feature removal (CIFR) method consists of three phases. First, we employ a masked image modeling strategy to build an understanding of images' internal structures that generalizes well. Second, we design a semantic-complementary feature propagation module to make local features class relevant. Third, we introduce a weighted dense-connected similarity measure, on which a loss function is formulated to fine-tune the entire pipeline, further enhancing the semantic consistency of the class-relevant local features. Visualization results show that CIFR removes class-irrelevant information by making local features related to classes. Comparison results on four benchmark datasets indicate that CIFR yields very promising performance.
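The contrast the abstract draws between plain global pooling and class-relevant aggregation can be sketched as follows. This is a minimal illustration only: the similarity-based weighting below is a generic prototype-weighted pooling, not the paper's actual semantic-complementary propagation module or weighted dense-connected measure.

```python
import numpy as np

def global_avg_pool(feats):
    # feats: (H*W, C) grid of local features. A plain mean mixes
    # class-irrelevant regions into the representation.
    return feats.mean(axis=0)

def weighted_pool(feats, proto):
    # Weight each local feature by its cosine similarity to a class
    # prototype, down-weighting class-irrelevant regions.
    # (Illustrative weighting scheme, not CIFR's exact measure.)
    sims = feats @ proto / (
        np.linalg.norm(feats, axis=1) * np.linalg.norm(proto) + 1e-8
    )
    w = np.exp(sims) / np.exp(sims).sum()  # softmax over locations
    return (w[:, None] * feats).sum(axis=0)

rng = np.random.default_rng(0)
feats = rng.normal(size=(49, 64))  # 7x7 grid of 64-d local features
proto = rng.normal(size=64)        # class prototype from support images
pooled = weighted_pool(feats, proto)
print(pooled.shape)  # (64,)
```

Both aggregators return a single C-dimensional vector; the difference is that the weighted variant lets a class prototype decide how much each spatial location contributes.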

3.
Front Neurorobot ; 16: 1091361, 2022.
Article in English | MEDLINE | ID: mdl-36590083

ABSTRACT

Graph convolution networks (GCNs) have been widely used in the field of skeleton-based human action recognition. However, it remains difficult to improve recognition performance while reducing parameter complexity. In this paper, a novel multi-scale attention spatiotemporal GCN (MSA-STGCN) is proposed for human violence action recognition by learning spatiotemporal features from four skeleton modality variants. First, the original joint data are preprocessed to obtain joint position, bone vector, joint motion, and bone motion data as inputs to the recognition framework. Then, a spatial multi-scale graph convolution network based on an attention mechanism is constructed to extract spatial features from joint nodes, while a temporal graph convolution network using hybrid dilated convolution is designed to enlarge the receptive field of the feature map and capture multi-scale context information. Finally, the specific relationships in the different skeleton data are explored by fusing the multi-stream information related to human joints and bones. To evaluate the proposed MSA-STGCN, a skeleton violence action dataset, Filtered NTU RGB+D, was constructed from NTU RGB+D 120. We conducted experiments on the constructed Filtered NTU RGB+D and Kinetics Skeleton 400 datasets to verify the performance of the proposed recognition framework. The proposed method achieves an accuracy of 95.3% on Filtered NTU RGB+D with 1.21M parameters, and accuracies of 36.2% (Top-1) and 58.5% (Top-5) on Kinetics Skeleton 400. The experimental results on these two skeleton datasets show that the proposed recognition framework can effectively recognize violence actions without increasing the number of parameters.
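The four-stream preprocessing the abstract describes is concrete enough to sketch: bone vectors are each joint's offset from its parent joint, and the motion streams are frame-to-frame differences. The 5-joint parent chain below is a made-up toy topology, not the NTU RGB+D skeleton graph.

```python
import numpy as np

def four_streams(joints, parents):
    """Derive the four input modalities from raw joint coordinates.

    joints:  (T, V, C) array = frames x joints x coordinates.
    parents: length-V list; parents[v] is the parent joint of v
             (the root is its own parent, so its bone vector is zero).
    """
    bones = joints - joints[:, parents, :]        # bone = joint - parent
    joint_motion = np.zeros_like(joints)
    joint_motion[:-1] = joints[1:] - joints[:-1]  # displacement per frame
    bone_motion = np.zeros_like(bones)
    bone_motion[:-1] = bones[1:] - bones[:-1]
    return joints, bones, joint_motion, bone_motion

T, V, C = 8, 5, 3
parents = [0, 0, 1, 2, 3]  # hypothetical kinematic chain
skeleton = np.random.default_rng(1).normal(size=(T, V, C))
streams = four_streams(skeleton, parents)
for s in streams:
    print(s.shape)  # each stream is (8, 5, 3)
```

In a multi-stream setup like the one described, each of these arrays would feed its own GCN branch, with the branch outputs fused for the final classification.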
