Results 1 - 17 of 17
1.
Article in English | MEDLINE | ID: mdl-38717882

ABSTRACT

Recently, low-rank tensor regularization has received increasing attention in hyperspectral and multispectral fusion (HMF). However, existing methods often suffer from an inflexible low-rank tensor definition and are highly sensitive to the permutation of tensor modes, which hinders their performance. To tackle this problem, we propose a novel generalized tensor nuclear norm (GTNN)-based approach for HMF. First, we define a novel GTNN by extending the existing third-mode-based tensor nuclear norm (TNN) to an arbitrary mode, conducting the Fourier transform on an arbitrary single mode and then computing the TNN for each mode. In this way, we can not only capture more extensive correlations among the three modes of a tensor but also avoid the adverse effect of the permutation of tensor modes. To utilize the correlations among spectral bands, the high-resolution hyperspectral image (HSI) is approximated as a low-rank spectral basis multiplied by coefficients, and we estimate the spectral basis by conducting singular-value decomposition (SVD) on the HSI. Then, the coefficients are estimated by addressing the proposed GTNN-regularized optimization. Specifically, to exploit the non-local similarities of the HSI, we first cluster the patches of the coefficients into a 3-D tensor, which contains spatial, spectral, and non-local modes. Since the collected tensors contain the strong non-local spatial-spectral similarities of the HSI, the proposed low-rank tensor regularization is imposed on them, which fully models the non-local self-similarities. Fusion experiments on both simulated and real datasets prove the advantages of this approach. The code is available at https://github.com/renweidian/GTNN.
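A minimal numpy sketch of the idea behind GTNN: take the Fourier transform along one chosen mode, sum the nuclear norms of the resulting slices, and average over all three modes so the result no longer depends on how the modes are permuted. The normalization by the number of slices is my assumption, not taken from the paper.

```python
import numpy as np

def mode_k_tnn(X, k):
    """TNN with the Fourier transform taken along mode k: FFT along
    that mode, then sum the nuclear norms of the slices orthogonal
    to it (normalization by slice count is an assumption here)."""
    Xf = np.moveaxis(np.fft.fft(X, axis=k), k, 0)
    return sum(np.linalg.svd(S, compute_uv=False).sum() for S in Xf) / Xf.shape[0]

def gtnn(X):
    """Generalized TNN sketch: average the mode-k TNN over all three
    modes, removing the sensitivity to mode permutation."""
    return np.mean([mode_k_tnn(X, k) for k in range(3)])

X = np.random.rand(8, 8, 8)
# permuting the modes leaves this generalized norm unchanged
assert np.isclose(gtnn(X), gtnn(X.transpose(1, 2, 0)))
```

The assertion holds because permuting the tensor's modes merely permutes which per-mode TNN is which before averaging.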

2.
Article in English | MEDLINE | ID: mdl-38466604

ABSTRACT

Spectral super-resolution has attracted increasing attention as a simpler and cheaper way to obtain hyperspectral images (HSIs). Although many convolutional neural network (CNN)-based approaches have yielded impressive results, most of them ignore the low-rank prior of HSIs, resulting in huge computational and storage costs. In addition, the ability of CNN-based methods to capture the correlation of global information is limited by the receptive field. To surmount these problems, we design a novel low-rank tensor reconstruction network (LTRN) for spectral super-resolution. Specifically, we treat the features of HSIs as 3-D tensors with low-rank properties due to their spectral similarity and spatial sparsity. Then, we combine canonical-polyadic (CP) decomposition with neural networks to design an adaptive low-rank prior learning (ALPL) module that enables feature learning in a 1-D space. This module has two core components: the adaptive vector learning (AVL) module and the multidimensionwise multihead self-attention (MMSA) module. The AVL module is designed to compress an HSI into a 1-D space by using a vector to represent its information. The MMSA module is introduced to improve the ability to capture the long-range dependencies in the row, column, and spectral dimensions, respectively. Finally, our LTRN, mainly cascaded by several ALPL modules and feedforward networks (FFNs), achieves high-quality spectral super-resolution with fewer parameters. To test the effectiveness of our method, we conduct experiments on two datasets: the CAVE dataset and the Harvard dataset. Experimental results show that our LTRN not only is as effective as state-of-the-art methods but also has fewer parameters. The code is available at https://github.com/renweidian/LTRN.
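For readers unfamiliar with the CP model the ALPL module builds on, here is a minimal numpy sketch: a CP decomposition represents a 3-D tensor as a sum of R rank-1 outer products of factor vectors. The actual ALPL module learns such factors with a network; this only illustrates the underlying model.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Reconstruct a 3-D tensor from CP factor matrices A (I x R),
    B (J x R), C (K x R): the sum of R rank-1 outer products."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# a rank-2 toy tensor defined directly by its CP factors
R = 2
A, B, C = (np.random.rand(n, R) for n in (4, 5, 6))
T = cp_reconstruct(A, B, C)
print(T.shape)  # (4, 5, 6)
```

The appeal for low-rank feature learning is that the I*J*K entries are parameterized by only (I+J+K)*R numbers.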

3.
IEEE Trans Image Process ; 33: 177-190, 2024.
Article in English | MEDLINE | ID: mdl-38055358

ABSTRACT

Interactive image segmentation (IIS) has been widely used in various fields, such as medicine and industry. However, some core issues, such as pixel imbalance, remain unresolved so far. Different from existing methods based on pre-processing or post-processing, we analyze the cause of pixel imbalance in depth from the two perspectives of pixel number and pixel difficulty. Based on this, a novel and unified Click-pixel Cognition Fusion network with Balanced Cut (CCF-BC) is proposed in this paper. On the one hand, the Click-pixel Cognition Fusion (CCF) module, inspired by the human cognition mechanism, is designed to increase the number of click-related pixels (namely, positive pixels) being correctly segmented, where the click and visual information are fully fused by using a progressive three-tier interaction strategy. On the other hand, a general loss, Balanced Normalized Focal Loss (BNFL), is proposed. Its core is to use a group of control coefficients related to sample gradients to force the network to pay more attention to positive and hard-to-segment pixels during training. As a result, BNFL always tends to obtain a balanced cut of positive and negative samples in the decision space. Theoretical analysis shows that the commonly used Focal and BCE losses can be regarded as special cases of BNFL. Experimental results on five well-recognized datasets show the superiority of the proposed CCF-BC method over other state-of-the-art methods. The source code is publicly available at https://github.com/lab206/CCF-BC.
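The special-case relationship claimed here can be seen concretely for the better-known pair: binary focal loss with the focusing exponent set to zero reduces exactly to plain BCE. A small numpy sketch (BNFL itself is not reproduced; this only illustrates the Focal/BCE end of the hierarchy):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0):
    """Binary focal loss; with gamma = 0 the modulating factor
    (1 - pt)^gamma becomes 1 and the loss reduces to plain BCE."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)          # prob. of the true class
    return np.mean(-((1 - pt) ** gamma) * np.log(pt))

def bce_loss(p, y):
    """Standard binary cross-entropy."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p)))

p = np.array([0.9, 0.2, 0.6])
y = np.array([1, 0, 1])
assert np.isclose(focal_loss(p, y, gamma=0.0), bce_loss(p, y))
```

With gamma > 0 the modulating factor down-weights easy (high-pt) pixels, which is the same mechanism BNFL generalizes with gradient-related control coefficients.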

4.
Article in English | MEDLINE | ID: mdl-37819819

ABSTRACT

In recent years, deep-learning-based pixel-level unified image fusion methods have received increasing attention due to their practicality and robustness. However, they usually require a complex network to achieve effective fusion, leading to high computational cost. To achieve more efficient and accurate image fusion, a lightweight pixel-level unified image fusion (L-PUIF) network is proposed. Specifically, an information refinement and measurement process is used to extract the gradient and intensity information and enhance the feature extraction capability of the network. In addition, this information is converted into weights that guide the loss function adaptively. Thus, more effective image fusion can be achieved while keeping the network lightweight. Extensive experiments have been conducted on four public image fusion datasets across multimodal fusion, multifocus fusion, and multiexposure fusion. Experimental results show that L-PUIF achieves better fusion efficiency and visual quality than state-of-the-art methods. In addition, the practicability of L-PUIF in high-level computer vision tasks, i.e., object detection and image segmentation, has been verified.

5.
Article in English | MEDLINE | ID: mdl-37738195

ABSTRACT

To obtain a high-resolution hyperspectral image (HR-HSI), fusing a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) is a prominent approach. Numerous approaches based on convolutional neural networks (CNNs) have been presented for hyperspectral image (HSI) and multispectral image (MSI) fusion. Nevertheless, these CNN-based methods may ignore the global relevant features from the input image due to the geometric limitations of convolutional kernels. To obtain more accurate fusion results, we provide a spatial-spectral transformer-based U-net (SSTF-Unet). Our SSTF-Unet can capture the association between distant features and explore the intrinsic information of images. More specifically, we use the spatial transformer block (SATB) and spectral transformer block (SETB) to calculate the spatial and spectral self-attention, respectively. Then, SATB and SETB are connected in parallel to form the spatial-spectral fusion block (SSFB). Inspired by the U-net architecture, we build up our SSTF-Unet through stacking several SSFBs for multiscale spatial-spectral feature fusion. Experimental results on public HSI datasets demonstrate that the designed SSTF-Unet achieves better performance than other existing HSI and MSI fusion approaches.
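A toy numpy sketch of the spectral self-attention computed by a block like SETB: the tokens are spectral bands, each represented by its flattened spatial map, and attention mixes bands according to their affinities. The real block uses learned query/key/value projections and multiple heads, all omitted here.

```python
import numpy as np

def spectral_self_attention(X):
    """Toy spectral self-attention: bands attend to bands. Learned
    projections are omitted; tokens are used directly as Q = K = V."""
    H, W, S = X.shape
    T = X.reshape(-1, S).T                    # S band tokens, length H*W
    scores = T @ T.T / np.sqrt(T.shape[1])    # band-to-band affinities
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)         # softmax over bands
    return (A @ T).T.reshape(H, W, S)

Y = spectral_self_attention(np.random.rand(8, 8, 31))
print(Y.shape)  # (8, 8, 31)
```

The spatial counterpart (SATB) is the transpose of this picture: pixels become the tokens and attention mixes spatial positions instead of bands.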

6.
Natl Sci Rev ; 10(6): nwad130, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37347038

ABSTRACT

This paper reports the background and results of the Surface Defect Detection Competition with Bio-inspired Vision Sensor, and summarizes the champion solutions, current challenges, and future directions.

7.
Article in English | MEDLINE | ID: mdl-37279125

ABSTRACT

Visible-infrared object detection aims to improve the detector performance by fusing the complementarity of visible and infrared images. However, most existing methods only use local intramodality information to enhance the feature representation while ignoring the efficient latent interaction of long-range dependence between different modalities, which leads to unsatisfactory detection performance under complex scenes. To solve these problems, we propose a feature-enhanced long-range attention fusion network (LRAF-Net), which improves detection performance by fusing the long-range dependence of the enhanced visible and infrared features. First, a two-stream CSPDarknet53 network is used to extract the deep features from visible and infrared images, in which a novel data augmentation (DA) method is designed to reduce the bias toward a single modality through asymmetric complementary masks. Then, we propose a cross-feature enhancement (CFE) module to improve the intramodality feature representation by exploiting the discrepancy between visible and infrared images. Next, we propose a long-range dependence fusion (LDF) module to fuse the enhanced features by associating the positional encoding of multimodality features. Finally, the fused features are fed into a detection head to obtain the final detection results. Experiments on several public datasets, i.e., VEDAI, FLIR, and LLVIP, show that the proposed method obtains state-of-the-art performance compared with other methods.

8.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 12650-12666, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37235456

ABSTRACT

Fusing hyperspectral images (HSIs) with multispectral images (MSIs) of higher spatial resolution has become an effective way to sharpen HSIs. Recently, deep convolutional neural networks (CNNs) have achieved promising fusion performance. However, these methods often suffer from the lack of training data and limited generalization ability. To address the above problems, we present a zero-shot learning (ZSL) method for HSI sharpening. Specifically, we first propose a novel method to quantitatively estimate the spectral and spatial responses of imaging sensors with high accuracy. In the training procedure, we spatially subsample the MSI and HSI based on the estimated spatial response and use the downsampled HSI and MSI to infer the original HSI. In this way, we can not only exploit the inherent information in the HSI and MSI, but the trained CNN also generalizes well to the test data. In addition, we apply dimensionality reduction to the HSI, which reduces the model size and storage usage without sacrificing fusion accuracy. Furthermore, we design an imaging-model-based loss function for the CNN, which further boosts the fusion performance. The experimental results show the significantly high efficiency and accuracy of our approach.
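The training-pair construction described above can be sketched with a standard spatial degradation: blur each band with the estimated spatial response (PSF) and then subsample. The uniform 4x4 PSF and the ratio below are placeholders, not the paper's estimated responses.

```python
import numpy as np
from scipy.ndimage import convolve

def spatial_downsample(img, psf, ratio):
    """Blur each band with the (estimated) spatial response, then
    subsample by `ratio` -- the degradation used to build training
    pairs from the observed HSI/MSI themselves."""
    blurred = np.stack([convolve(img[..., b], psf, mode='reflect')
                        for b in range(img.shape[-1])], axis=-1)
    return blurred[::ratio, ::ratio, :]

hsi = np.random.rand(32, 32, 31)    # toy HSI with 31 bands
psf = np.ones((4, 4)) / 16.0        # placeholder for the estimated PSF
lr = spatial_downsample(hsi, psf, 4)
print(lr.shape)  # (8, 8, 31)
```

Training on (downsampled HSI, downsampled MSI) → HSI pairs means the network never needs external ground truth, which is what makes the scheme zero-shot.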

9.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 7939-7954, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015605

ABSTRACT

Recently, fusing a low-resolution hyperspectral image (LR-HSI) with a high-resolution multispectral image (HR-MSI) from different satellites has become an effective way to improve the resolution of an HSI. However, due to different imaging satellites, different illumination, and adjacent imaging times, the LR-HSI and HR-MSI may not satisfy the observation models established by existing works, and the two images are difficult to register. To solve the above problems, we establish new observation models for LR-HSIs and HR-MSIs from different satellites, and a deep-learning-based framework is proposed to solve the key steps in multi-satellite HSI fusion, including image registration, blur kernel learning, and image fusion. Specifically, we first construct a convolutional neural network (CNN), called RegNet, to produce pixel-wise offsets between the LR-HSI and HR-MSI, which are utilized to register the LR-HSI. Next, according to the new observation models, a tiny network, called BKLNet, is built to learn the spectral and spatial blur kernels, where BKLNet and RegNet can be trained jointly. In the fusion part, we further train a FusNet by downsampling the registered data with the learned spatial blur kernel. Extensive experiments demonstrate the superiority of the proposed framework in HSI registration and fusion accuracy.

10.
Article in English | MEDLINE | ID: mdl-37022225

ABSTRACT

Spectral super-resolution, which reconstructs a hyperspectral image (HSI) from a single red-green-blue (RGB) image, has attracted increasing attention. Recently, convolutional neural networks (CNNs) have achieved promising performance. However, they often fail to simultaneously exploit the imaging model of spectral super-resolution and the complex spatial and spectral characteristics of the HSI. To tackle the above problems, we build a novel cross-fusion (CF)-based model-guided network (called SSRNet) for spectral super-resolution. Specifically, based on the imaging model, we unfold spectral super-resolution into an HSI prior learning (HPL) module and an imaging model guiding (IMG) module. Instead of just modeling one kind of image prior, the HPL module is composed of two subnetworks with different structures, which can effectively learn the complex spatial and spectral priors of the HSI, respectively. Furthermore, a CF strategy is used to establish the connection between the two subnetworks, which further improves the learning performance of the CNN. The IMG module amounts to solving a strongly convex optimization problem, which adaptively optimizes and merges the two features learned by the HPL module by exploiting the imaging model. The two modules are alternately connected to achieve optimal HSI reconstruction performance. Experiments on both simulated and real data demonstrate that the proposed method achieves superior spectral reconstruction results with a relatively small model size. The code will be available at https://github.com/renweidian.

11.
IEEE Trans Image Process ; 32: 2267-2278, 2023.
Article in English | MEDLINE | ID: mdl-37067971

ABSTRACT

Camouflaged object detection (COD) aims to discover objects that blend in with the background due to similar colors, textures, etc. Existing deep learning methods do not systematically illustrate the key tasks in COD, which seriously hinders the improvement of its performance. In this paper, we introduce the concept of focus areas that represent some regions containing discernable colors or textures, and develop a two-stage focus scanning network for camouflaged object detection. Specifically, a novel encoder-decoder module is first designed to determine a region where the focus areas may appear. In this process, a multi-layer Swin transformer is deployed to encode global context information between the object and the background, and a novel cross-connection decoder is proposed to fuse cross-layer textures or semantics. Then, we utilize multi-scale dilated convolution to obtain discriminative features at different scales in focus areas. Meanwhile, a dynamic difficulty-aware loss is designed to guide the network to pay more attention to structural details. Extensive experimental results on benchmarks including CAMO, CHAMELEON, COD10K, and NC4K illustrate that the proposed method performs favorably against other state-of-the-art methods.

12.
IEEE Trans Neural Netw Learn Syst ; 32(3): 1124-1135, 2021 Mar.
Article in English | MEDLINE | ID: mdl-32310788

ABSTRACT

Hyperspectral image (HSI) and multispectral image (MSI) fusion, which fuses a low-spatial-resolution HSI (LR-HSI) with a higher-resolution multispectral image (MSI), has become a common scheme to obtain a high-resolution HSI (HR-HSI). This article presents a novel HSI and MSI fusion method (called CNN-Fus), which is based on subspace representation and a convolutional neural network (CNN) denoiser, i.e., a well-trained CNN for gray-image denoising. Our method only needs to train the CNN on the more accessible gray images and can be directly applied to any HSI and MSI datasets without retraining. First, to exploit the high correlations among the spectral bands, we approximate the desired HR-HSI as a low-dimensional subspace multiplied by coefficients, which can not only speed up the algorithm but also lead to more accurate recovery. Since the spectral information mainly exists in the LR-HSI, we learn the subspace from it via singular value decomposition. Due to the powerful learning performance and high speed of CNNs, we use the well-trained CNN for gray-image denoising to regularize the estimation of the coefficients. Specifically, we plug the CNN denoiser into the alternating direction method of multipliers (ADMM) algorithm to estimate the coefficients. Experiments demonstrate that our method has superior performance over state-of-the-art fusion methods.
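The subspace-learning step shared by CNN-Fus and several of the other methods listed here can be sketched in a few lines: unfold the LR-HSI into a bands-by-pixels matrix and keep the leading left singular vectors as the spectral basis. The subspace dimension k = 8 below is an arbitrary illustrative choice.

```python
import numpy as np

def learn_spectral_subspace(lr_hsi, k):
    """Learn a k-dimensional spectral subspace from the LR-HSI via SVD
    of its bands-by-pixels unfolding (k is chosen by the user)."""
    B = lr_hsi.reshape(-1, lr_hsi.shape[-1]).T   # bands x pixels
    U, _, _ = np.linalg.svd(B, full_matrices=False)
    return U[:, :k]                              # spectral basis E

lr = np.random.rand(16, 16, 31)                  # toy LR-HSI, 31 bands
E = learn_spectral_subspace(lr, 8)
print(E.shape)        # (31, 8)

# coefficients of any HSI cube: project its bands onto the basis
coeff = np.tensordot(np.random.rand(64, 64, 31), E, axes=([2], [0]))
print(coeff.shape)    # (64, 64, 8)
```

The HR-HSI is then approximated as `coeff x E^T` band-wise, so the expensive estimation runs in the 8-dimensional coefficient space instead of over all 31 bands.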

13.
IEEE Trans Cybern ; 50(10): 4469-4480, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31794410

ABSTRACT

Combining a high-spatial-resolution multispectral image (HR-MSI) with a low-spatial-resolution hyperspectral image (LR-HSI) has become a common way to enhance the spatial resolution of the HSI. The existing state-of-the-art LR-HSI and HR-MSI fusion methods are mostly based on matrix factorization, where the matrix data representation makes it hard to fully exploit the inherent structures of the 3-D HSI. We propose a nonlocal sparse tensor factorization approach, called NLSTF_SMBF, for the semiblind fusion of HSI and MSI. The proposed method decomposes the HSI into smaller full-band patches (FBPs), which, in turn, are factored as dictionaries of the three HSI modes and a sparse core tensor. This decomposition allows the fusion problem to be solved as the estimation of a sparse core tensor and three dictionaries for each FBP. Similar FBPs are clustered together and assumed to share the same dictionaries to make use of the nonlocal self-similarities of the HSI. For each group, we learn the dictionaries from the observed HR-MSI and LR-HSI. The corresponding sparse core tensor of each FBP is computed via tensor sparse coding. Two distinctive features of NLSTF_SMBF are that: 1) it is blind with respect to the point spread function (PSF) of the hyperspectral sensor and 2) it copes with spatially variant PSFs. The experimental results provide evidence of the advantages of the NLSTF_SMBF method over the existing state-of-the-art methods, namely, in semiblind scenarios.

14.
Article in English | MEDLINE | ID: mdl-31107646

ABSTRACT

Recently, combining a low spatial resolution hyperspectral image (LR-HSI) with a high spatial resolution multispectral image (HR-MSI) into a high spatial resolution hyperspectral image (HR-HSI) has become a popular scheme to enhance the spatial resolution of HSI. We propose a novel subspace-based low tensor multi-rank regularization method for the fusion, which fully exploits the spectral correlations and non-local similarities in the HR-HSI. To make use of the high spectral correlations, the HR-HSI is approximated by a spectral subspace and coefficients. We first learn the spectral subspace from the LR-HSI via singular value decomposition, and then estimate the coefficients via the low tensor multi-rank prior. More specifically, based on the cluster structure learned from the HR-MSI, the patches in the coefficients are grouped. We collect the coefficients in the same cluster into a three-dimensional tensor and impose the low tensor multi-rank prior on these collected tensors, which fully models the non-local self-similarities in the HR-HSI. The coefficient optimization is solved by the alternating direction method of multipliers. Experiments on two public HSI datasets demonstrate the advantages of our method.

15.
IEEE Trans Neural Netw Learn Syst ; 30(9): 2672-2683, 2019 Sep.
Article in English | MEDLINE | ID: mdl-30624229

ABSTRACT

Hyperspectral images (HSIs) with high spectral resolution have only low spatial resolution. In contrast, multispectral images (MSIs) with much lower spectral resolution can be obtained at higher spatial resolution. Therefore, fusing the high-spatial-resolution MSI (HR-MSI) with the low-spatial-resolution HSI of the same scene has become a very popular HSI super-resolution scheme. In this paper, a novel low tensor-train (TT) rank (LTTR)-based HSI super-resolution method is proposed, where an LTTR prior is designed to learn the correlations among the spatial, spectral, and nonlocal modes of the nonlocal similar high-spatial-resolution HSI (HR-HSI) cubes. First, we cluster the HR-MSI cubes into many groups based on their similarities, and the HR-HSI cubes are clustered according to the cluster structure learned from the HR-MSI cubes. The HR-HSI cubes in each group are highly similar to each other and can constitute a 4-D tensor, whose four modes are highly correlated. Therefore, we impose the LTTR constraint on these 4-D tensors, which can effectively learn the correlations among the spatial, spectral, and nonlocal modes because of the well-balanced matricization scheme of the TT rank. We formulate the super-resolution problem as a TT-rank-regularized optimization problem, which is solved via the alternating direction method of multipliers. Experiments on HSI datasets indicate the effectiveness of the LTTR-based method.
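The "well-balanced matricization" behind the TT rank can be made concrete in a few lines: a d-way tensor is reshaped into the matrices that group the first k modes against the remaining d-k, and the TT rank is the vector of ranks of those matrices. A numpy sketch:

```python
import numpy as np

def tt_ranks(X):
    """Ranks of the sequential matricizations that define the
    tensor-train rank: modes 1..k versus modes k+1..d."""
    dims = X.shape
    ranks = []
    for k in range(1, len(dims)):
        M = X.reshape(int(np.prod(dims[:k])), -1)
        ranks.append(np.linalg.matrix_rank(M))
    return ranks

# a separable (rank-1) 4-D tensor has all TT ranks equal to 1
a, b, c, d = (np.random.rand(n) for n in (3, 4, 5, 6))
X = np.einsum('i,j,k,l->ijkl', a, b, c, d)
print(tt_ranks(X))  # [1, 1, 1]
```

Because these matricizations split the modes into two balanced groups rather than isolating a single mode, penalizing their ranks couples the spatial, spectral, and nonlocal modes simultaneously, which is the property the LTTR prior exploits.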

16.
IEEE Trans Neural Netw Learn Syst ; 29(11): 5345-5355, 2018 11.
Article in English | MEDLINE | ID: mdl-29994458

ABSTRACT

Hyperspectral image (HSI) sharpening, which aims at fusing an observable low spatial resolution (LR) HSI (LR-HSI) with a high spatial resolution (HR) multispectral image (HR-MSI) of the same scene to acquire an HR-HSI, has recently attracted much attention. Most recent HSI sharpening approaches are based on image prior modeling, which is usually sensitive to parameter selection and time-consuming. This paper presents a deep HSI sharpening method (named DHSIS) for the fusion of an LR-HSI with an HR-MSI, which directly learns the image priors via deep convolutional neural network-based residual learning. The DHSIS method incorporates the learned deep priors into the LR-HSI and HR-MSI fusion framework. Specifically, we first initialize the HR-HSI from the fusion framework via solving a Sylvester equation. Then, we map the initialized HR-HSI to the reference HR-HSI via deep residual learning to learn the image priors. Finally, the learned image priors are returned to the fusion framework to reconstruct the final HR-HSI. Experimental results demonstrate the superiority of the DHSIS approach over existing state-of-the-art HSI sharpening approaches in terms of reconstruction accuracy and running time.
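The Sylvester-equation initialization mentioned above has the general form A X + X B = C, which scipy solves in closed form. The toy matrices below stand in for the fusion framework's actual operators, which are built from the spectral and spatial degradation models.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Toy Sylvester system A X + X B = C; in the fusion framework, A and B
# would be derived from the spectral/spatial degradation operators.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((4, 3))

X = solve_sylvester(A, B, C)        # closed-form initialization
assert np.allclose(A @ X + X @ B, C)
```

Solving this once gives a reasonable HR-HSI estimate cheaply, which the residual network then refines toward the reference.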

17.
Article in English | MEDLINE | ID: mdl-29994767

ABSTRACT

Fusing a low spatial resolution hyperspectral image (LR-HSI) with a high spatial resolution multispectral image (HR-MSI) to obtain a high spatial resolution hyperspectral image (HR-HSI) has attracted increasing interest in recent years. In this paper, we propose a coupled sparse tensor factorization (CSTF)-based approach for fusing such images. In the proposed CSTF method, we consider an HR-HSI as a three-dimensional tensor and redefine the fusion problem as the estimation of a core tensor and dictionaries of the three modes. The high spatial-spectral correlations in the HR-HSI are modeled by incorporating a regularizer that promotes sparse core tensors. The estimation of the dictionaries and the core tensor is formulated as a coupled tensor factorization of the LR-HSI and the HR-MSI. Experiments on two remotely sensed HSIs demonstrate the superiority of the proposed CSTF algorithm over current state-of-the-art HSI-MSI fusion approaches.
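The "core tensor plus mode dictionaries" model used by CSTF is a Tucker-style factorization: the HR-HSI is the core tensor multiplied along each mode by a dictionary for width, height, and spectrum. A minimal numpy sketch (the toy core below is dense; CSTF's regularizer would drive it sparse):

```python
import numpy as np

def tucker_reconstruct(G, D1, D2, D3):
    """Reconstruct a 3-D tensor from a core G (n1 x n2 x n3) and mode
    dictionaries D1 (W x n1), D2 (H x n2), D3 (S x n3) via mode products."""
    return np.einsum('abc,ia,jb,kc->ijk', G, D1, D2, D3)

G = np.random.rand(5, 5, 4)                     # core (dense toy stand-in)
D1 = np.random.rand(16, 5)                      # width dictionary
D2 = np.random.rand(16, 5)                      # height dictionary
D3 = np.random.rand(8, 4)                       # spectral dictionary
X = tucker_reconstruct(G, D1, D2, D3)
print(X.shape)  # (16, 16, 8)
```

The coupling in CSTF comes from fitting this same core and these same dictionaries to both observations at once: the LR-HSI constrains the spectral dictionary, the HR-MSI constrains the spatial ones.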
