Results 1 - 5 of 5
1.
IEEE Trans Image Process; 33: 177-190, 2024.
Article in English | MEDLINE | ID: mdl-38055358

ABSTRACT

Interactive image segmentation (IIS) has been widely used in fields such as medicine and industry. However, some core issues, such as pixel imbalance, remain unresolved. Unlike existing methods based on pre-processing or post-processing, we analyze the cause of pixel imbalance in depth from two perspectives: pixel number and pixel difficulty. Based on this analysis, a novel and unified Click-pixel Cognition Fusion network with Balanced Cut (CCF-BC) is proposed in this paper. On the one hand, the Click-pixel Cognition Fusion (CCF) module, inspired by the human cognition mechanism, is designed to increase the number of click-related (i.e., positive) pixels that are correctly segmented; click and visual information are fully fused through a progressive three-tier interaction strategy. On the other hand, a general loss, the Balanced Normalized Focal Loss (BNFL), is proposed. Its core is a group of control coefficients related to sample gradients that force the network to pay more attention to positive and hard-to-segment pixels during training. As a result, BNFL tends to produce a balanced cut between positive and negative samples in the decision space. Theoretical analysis shows that the commonly used Focal and BCE losses can be regarded as special cases of BNFL. Experimental results on five well-recognized datasets show the superiority of the proposed CCF-BC method over other state-of-the-art methods. The source code is publicly available at https://github.com/lab206/CCF-BC.
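
The abstract describes BNFL only at a high level. As a rough, hypothetical sketch of the loss family it belongs to (focal modulation plus a balancing normalization), the PyTorch snippet below normalizes the focal weights separately over positive and negative pixels so each side contributes equally; the weighting scheme and all names are assumptions, not the paper's exact formulation. With gamma = 0 the per-side terms reduce to class-balanced BCE, loosely echoing the claim that BCE and Focal are special cases.

```python
import torch
import torch.nn.functional as F

def balanced_normalized_focal_loss(logits, targets, gamma=2.0, eps=1e-8):
    """Hypothetical sketch, not the paper's exact BNFL."""
    p = torch.sigmoid(logits)
    # Probability assigned to the true class; low values mark hard pixels.
    pt = torch.where(targets > 0.5, p, 1.0 - p)
    beta = (1.0 - pt).pow(gamma)  # focal modulation weights
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    pos = targets > 0.5
    loss = 0.0
    for mask in (pos, ~pos):
        if mask.any():
            # Normalize weights within each side so positives and negatives
            # contribute equally regardless of their pixel counts (an
            # assumption standing in for the paper's control coefficients).
            w = beta[mask] / (beta[mask].sum() + eps)
            loss = loss + (w.detach() * bce[mask]).sum()
    return loss
```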

2.
Article in English | MEDLINE | ID: mdl-38090866

ABSTRACT

Real-time semantic segmentation plays an important role in autonomous vehicles. However, most real-time semantic segmentation methods fail to obtain satisfactory performance on small objects, such as cars and sign symbols, because large objects usually dominate the segmentation result. To solve this issue, we propose an efficient and effective architecture, termed the small object segmentation network (SOSNet), to improve the segmentation performance on small objects. SOSNet works from two perspectives: methodology and data. For the former, we propose a dual-branch hierarchical decoder (DBHD), which acts as a small-object-sensitive segmentation head. The DBHD consists of a top segmentation head that predicts whether a pixel belongs to a small-object class and a bottom one that estimates the pixel class, so the latent correlation among small objects can be fully explored. For the latter, we propose a small object example mining (SOEM) algorithm that automatically balances examples between small and large objects. The core idea of SOEM is that most of the hard examples of small-object classes are kept for training while most of the easy examples of large-object classes are discarded. Experiments on three commonly used datasets show that the proposed SOSNet architecture greatly improves accuracy over existing real-time semantic segmentation methods while remaining efficient. The code will be available at https://github.com/StuLiu/SOSNet.
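
As an illustration of the mining rule the abstract sketches (keep hard small-object examples, drop easy large-object examples), here is a minimal, hypothetical PyTorch version; the keep ratios, the hardness criterion (per-pixel loss), and all names are assumptions, not the paper's settings.

```python
import torch

def small_object_example_mining(pixel_loss, labels, small_classes,
                                keep_small=0.9, keep_large=0.3):
    """Hypothetical sketch of SOEM-style example selection."""
    small = torch.zeros_like(labels, dtype=torch.bool)
    for c in small_classes:
        small |= labels == c
    keep = torch.zeros_like(small)
    for mask, ratio in ((small, keep_small), (~small, keep_large)):
        losses = pixel_loss[mask]
        if losses.numel() == 0:
            continue
        k = max(1, int(ratio * losses.numel()))
        # Keep only the k hardest (highest-loss) pixels of this group.
        thresh = losses.topk(k).values.min()
        keep |= mask & (pixel_loss >= thresh)
    return pixel_loss[keep].sum() / keep.sum().clamp(min=1)
```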

3.
Article in English | MEDLINE | ID: mdl-37279125

ABSTRACT

Visible-infrared object detection aims to improve detector performance by exploiting the complementarity of visible and infrared images. However, most existing methods use only local intramodality information to enhance the feature representation and ignore the latent long-range dependence between modalities, which leads to unsatisfactory detection performance in complex scenes. To solve this problem, we propose a feature-enhanced long-range attention fusion network (LRAF-Net), which improves detection performance by fusing the long-range dependence of the enhanced visible and infrared features. First, a two-stream CSPDarknet53 network extracts deep features from the visible and infrared images, and a novel data augmentation (DA) method based on asymmetric complementary masks reduces the bias toward a single modality. Then, a cross-feature enhancement (CFE) module improves the intramodality feature representation by exploiting the discrepancy between visible and infrared images. Next, a long-range dependence fusion (LDF) module fuses the enhanced features by associating the positional encodings of the multimodality features. Finally, the fused features are fed into a detection head to obtain the final detection results. Experiments on several public datasets, i.e., VEDAI, FLIR, and LLVIP, show that the proposed method achieves state-of-the-art performance compared with other methods.
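
The asymmetric complementary masking idea can be pictured with a short, hypothetical sketch: regions hidden in the visible image remain visible in the infrared image and vice versa, so the detector cannot rely on a single modality. The grid size and masking probability below are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def asymmetric_complementary_masks(vis, ir, grid=8, p=0.5):
    """Hypothetical sketch of complementary-mask data augmentation."""
    b, _, h, w = vis.shape
    # Coarse random binary mask, upsampled to image resolution.
    cells = (torch.rand(b, 1, grid, grid, device=vis.device) < p).float()
    mask = F.interpolate(cells, size=(h, w), mode="nearest")
    # Complementary: pixels zeroed in one modality survive in the other.
    return vis * mask, ir * (1.0 - mask)
```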

4.
Article in English | MEDLINE | ID: mdl-37018602

ABSTRACT

Hyperspectral image (HSI) classification methods have made great progress in recent years. However, most of these methods rest on the closed-set assumption that the class distributions of the training and testing stages are consistent, so they cannot handle the unknown classes that arise in open-world scenes. In this work, we propose a feature consistency-based prototype network (FCPN) for open-set HSI classification, which consists of three steps. First, a three-layer convolutional network extracts discriminative features, with a contrastive clustering module introduced to enhance their discrimination. Then, the extracted features are used to construct a scalable prototype set. Finally, a prototype-guided open-set module (POSM) is proposed to distinguish known samples from unknown ones. Extensive experiments show that our method achieves remarkable classification performance compared with other state-of-the-art classification techniques.
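
At a high level, prototype-guided open-set recognition assigns each sample to its nearest class prototype and rejects it as unknown when that distance is too large. The sketch below illustrates only that generic rule; the distance metric, the threshold, and all names are assumptions rather than the paper's exact POSM.

```python
import torch

def prototype_open_set_predict(features, prototypes, threshold):
    """Hypothetical nearest-prototype classifier with open-set rejection.

    features:   (N, D) sample embeddings
    prototypes: (C, D) one prototype per known class
    """
    dists = torch.cdist(features, prototypes)  # (N, C) Euclidean distances
    min_dist, pred = dists.min(dim=1)
    pred[min_dist > threshold] = -1            # -1 marks 'unknown'
    return pred
```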

5.
Sensors (Basel); 22(21), 2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36366196

ABSTRACT

Hyperspectral image classification has received a great deal of attention in the remote sensing field. However, most classification methods require a large number of training samples to obtain satisfactory performance, and in real applications it is difficult for users to label sufficient samples. To overcome this problem, this work proposes a novel multi-scale superpixel-guided structural profile method for the classification of hyperspectral images. First, the number of spectral bands of the original image is reduced with an averaging fusion method. Then, multi-scale structural profiles are extracted with the help of a superpixel segmentation method. Finally, the extracted multi-scale structural profiles are fused with an unsupervised feature selection method, followed by a spectral classifier, to obtain the classification results. Experiments on several hyperspectral datasets verify that the proposed method produces outstanding classification results with limited samples compared to other advanced classification methods. On the Salinas dataset, the classification accuracies obtained by the proposed method are higher by 43.25%, 31.34%, and 46.82% in terms of overall accuracy (OA), average accuracy (AA), and Kappa coefficient, respectively, compared to recently proposed deep learning methods.
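
The first two steps of the pipeline (averaging fusion, then superpixel-guided profiles at several scales) can be sketched as follows, assuming scikit-image's SLIC as the superpixel method; the scale values and function names are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from skimage.segmentation import slic

def multiscale_structural_profiles(hsi, scales=(50, 100, 200)):
    """Hypothetical sketch: averaging fusion + multi-scale superpixel profiles."""
    base = hsi.mean(axis=2)                    # averaging fusion over bands
    profiles = []
    for n_segments in scales:
        labels = slic(base, n_segments=n_segments, channel_axis=None,
                      start_label=0)
        prof = np.empty_like(base)
        for s in np.unique(labels):
            region = labels == s
            prof[region] = base[region].mean()  # superpixel-wise averaging
        profiles.append(prof)
    return np.stack(profiles, axis=-1)          # H x W x num_scales
```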
