Results 1 - 20 of 24
1.
IEEE Trans Image Process ; 33: 856-866, 2024.
Article in English | MEDLINE | ID: mdl-38231815

ABSTRACT

Unsupervised domain adaptation (UDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Most existing domain adaptation methods are based on convolutional neural networks (CNNs) to learn cross-domain invariant features. Inspired by the success of transformer architectures and their superiority to CNNs, we propose to combine the transformer with UDA to improve generalization. In this paper, we present a novel model named Transferable Vector Quantization Alignment for Unsupervised Domain Adaptation (TransVQA), which integrates a transferable transformer-based feature extractor (Trans), vector quantization domain alignment (VQA), and mutual information weighted maximization confusion matrix (MIMC) of intra-class discrimination into a unified domain adaptation framework. First, TransVQA uses the transformer to extract more accurate features in different domains for classification. Second, based on the vector quantization alignment module, TransVQA uses a two-step method to align the extracted cross-domain features and solve the domain shift problem: global alignment via vector quantization and intra-class local alignment via pseudo-labels. Third, for the intra-class feature discrimination problem caused by the fuzzy alignment of different domains, we use the MIMC module to constrain the target domain output and increase the accuracy of the pseudo-labels. Experiments on several domain adaptation datasets show that TransVQA achieves excellent performance and outperforms existing state-of-the-art methods.
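A minimal sketch of the vector-quantization building block behind the global alignment step: features from both domains are snapped to a shared codebook, and matching assignments across domains indicate aligned clusters. The codebook, shapes, and data here are illustrative, not from the paper:

```python
import numpy as np

def vq_assign(features, codebook):
    """Vector quantization: index of the nearest codeword for each feature."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Shared 2-codeword codebook; toy source- and target-domain features.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
source = np.array([[0.5, -0.2], [9.7, 10.3]])
target = np.array([[0.1, 0.4], [10.2, 9.9]])
src_idx = vq_assign(source, codebook)
tgt_idx = vq_assign(target, codebook)
```

When source and target features fall into the same codewords, the two domains agree at the cluster level, which is the intuition behind quantization-based global alignment.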

2.
IEEE Trans Image Process ; 32: 4416-4431, 2023.
Article in English | MEDLINE | ID: mdl-37527319

ABSTRACT

In recent years, hyperspectral image fusion (HIF) has attracted growing interest as a potential alternative to expensive high-resolution hyperspectral imaging systems; it aims to recover a high-resolution hyperspectral image (HR-HSI) from a pair of images: a low-resolution hyperspectral image (LR-HSI) and a high-spatial-resolution multispectral image (HR-MSI). Traditional model-based methods generally assume that the degeneration in both the spatial and spectral domains is known, while deep learning-based methods assume that paired HR-LR training data exist. However, such assumptions are often invalid in practice. Furthermore, most existing works, either introducing hand-crafted priors or treating HIF as a black-box problem, cannot take full advantage of the physical model. To address these issues, in this paper we propose a deep blind HIF method by unfolding model-based maximum a posteriori (MAP) estimation into a network implementation. Our method works with a Laplace distribution (LD) prior that does not need paired training data. Moreover, we develop an observation module to directly learn the degeneration in the spatial domain from LR-HSI data, addressing the challenge of spatially-varying degradation. We also propose to learn the uncertainty (mean and variance) of the LD models using a novel Swin-Transformer-based denoiser and to estimate the variance of degraded images from residual errors (rather than treating them as global scalars). All parameters of the MAP estimation algorithm and the observation module can be jointly optimized through end-to-end training. Extensive experiments on both synthetic and real datasets show that the proposed method outperforms existing competing methods in terms of both objective evaluation indexes and visual quality.

3.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9325-9338, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37027639

ABSTRACT

Both network pruning and neural architecture search (NAS) can be interpreted as techniques to automate the design and optimization of artificial neural networks. In this paper, we challenge the conventional wisdom of training before pruning by proposing a joint search-and-training approach to learn a compact network directly from scratch. Using pruning as a search strategy, we advocate three new insights for network engineering: 1) to formulate adaptive search as a cold-start strategy to find a compact subnetwork at the coarse scale; 2) to automatically learn the threshold for network pruning; and 3) to offer flexibility to choose between efficiency and robustness. More specifically, we propose an adaptive search algorithm in the cold start by exploiting the randomness and flexibility of filter pruning. The weights associated with the network filters are updated by ThreshNet, a flexible coarse-to-fine pruning method inspired by reinforcement learning. In addition, we introduce a robust pruning strategy leveraging knowledge distillation through a teacher-student network. Extensive experiments on ResNet and VGGNet show that the proposed method achieves a better balance between efficiency and accuracy, with notable advantages over current state-of-the-art pruning methods on several popular datasets, including CIFAR10, CIFAR100, and ImageNet. The code associated with this paper is available at: https://see.xidian.edu.cn/faculty/wsdong/Projects/AST-NP.htm.
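As a rough illustration of the pruning primitive underlying this search, the sketch below scores convolutional filters by their L1 norm and zeroes those below a threshold. In the paper the threshold is learned by ThreshNet; here the median score is used as a stand-in:

```python
import numpy as np

def prune_filters(weights, threshold):
    """Zero out conv filters whose per-filter L1 norm falls below `threshold`.

    `weights` has shape (out_channels, in_channels, k, k).
    """
    scores = np.abs(weights).sum(axis=(1, 2, 3))   # saliency score per filter
    keep = scores >= threshold                     # boolean pruning mask
    return weights * keep[:, None, None, None], keep

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 3, 3, 3))
# Stand-in "learned" threshold: the median saliency keeps half the filters.
pruned, keep = prune_filters(w, np.median(np.abs(w).sum(axis=(1, 2, 3))))
```

A search procedure would repeat this scoring while the surviving weights continue to train, rather than pruning once after full training.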


Subject(s)
Algorithms , Learning , Humans , Neural Networks, Computer
4.
IEEE Trans Pattern Anal Mach Intell ; 45(9): 10778-10794, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37023148

ABSTRACT

Image reconstruction from partial observations has attracted increasing attention. Conventional image reconstruction methods with hand-crafted priors often fail to recover fine image details due to the poor representation capability of the hand-crafted priors. Deep learning methods attack this problem by directly learning mapping functions between the observations and the target images, and can achieve much better results. However, most powerful deep networks lack transparency and are nontrivial to design heuristically. This paper proposes a novel image reconstruction method based on the maximum a posteriori (MAP) estimation framework using a learned Gaussian scale mixture (GSM) prior. Unlike existing unfolding methods that only estimate the image means (i.e., the denoising prior) but neglect the variances, we propose characterizing images by GSM models with means and variances learned through a deep network. Furthermore, to learn the long-range dependencies of images, we develop an enhanced variant based on the Swin Transformer for learning the GSM models. All parameters of the MAP estimator and the deep network are jointly optimized through end-to-end training. Extensive simulation and real-data experimental results on spectral compressive imaging and image super-resolution demonstrate that the proposed method outperforms existing state-of-the-art methods.
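Under a Gaussian prior with known per-pixel mean and variance, the MAP estimate from a noisy observation has a closed Wiener-style shrinkage form, which is the role the learned GSM statistics play inside the unfolded network. In this sketch the means and variances are simply given arrays rather than network outputs:

```python
import numpy as np

def gsm_map(y, prior_mean, prior_var, noise_var):
    """MAP estimate under a Gaussian prior: shrink the noisy observation
    toward the prior mean, more strongly where the prior variance is small."""
    gain = prior_var / (prior_var + noise_var)
    return prior_mean + gain * (y - prior_mean)

y = np.array([4.0, 4.0])         # noisy observations
mean = np.array([0.0, 0.0])      # prior means
var = np.array([100.0, 1e-6])    # uncertain prior vs. confident prior
x_hat = gsm_map(y, mean, var, noise_var=1.0)
```

The first estimate stays near the observation; the second collapses to its prior mean. Estimating the variances (not just the means) is what lets the shrinkage adapt per pixel.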

5.
IEEE Trans Image Process ; 31: 3578-3590, 2022.
Article in English | MEDLINE | ID: mdl-35511851

ABSTRACT

Blind image quality assessment (BIQA), which is capable of precisely and automatically estimating human-perceived image quality with no pristine image for comparison, has attracted extensive attention and has wide applications. Most existing BIQA methods represent image quality with a single quantitative value, which is inconsistent with human cognition. Generally, human beings are good at perceiving image quality in terms of semantic description rather than a quantitative value. Moreover, cognition is a needs-oriented task in which humans extract image contents with local-to-global semantics as they need. A single quality value represents only coarse or holistic image quality and fails to reflect degradation of hierarchical semantics. In this paper, to comply with human cognition, a novel quality caption model is proposed to measure fine-grained image quality in terms of hierarchical semantics degradation. Research on the human visual system indicates that there are hierarchy and reverse-hierarchy correlations between hierarchical semantics. Meanwhile, empirical evidence shows that there are also bi-directional degradation dependencies between them. Thus, a novel bi-directional relationship-based network (BDRNet) is proposed for semantics degradation description, which adaptively explores those correlations and degradation dependencies in a bi-directional manner. Extensive experiments demonstrate that our method outperforms state-of-the-art methods in terms of both evaluation performance and generalization ability.


Subject(s)
Cognition , Semantics , Humans
6.
Article in English | MEDLINE | ID: mdl-37015619

ABSTRACT

Unlike the success of neural architecture search (NAS) in high-level vision tasks, it remains challenging to find computationally efficient and memory-efficient solutions to low-level vision problems such as image restoration through NAS. One of the fundamental barriers to differentiable NAS-based image restoration is the optimization gap between the super-network and the sub-architectures, which causes instability during the search process. In this paper, we present a novel approach to fill this gap in the image denoising application by connecting model-guided design (MoD) with NAS (MoD-NAS). Specifically, we propose to construct a new search space under a model-guided framework and develop more stable and efficient differentiable search strategies. MoD-NAS employs a highly reusable width search strategy and a densely connected search block to automatically select the operations of each layer as well as the network width and depth via gradient descent. During the search process, the proposed MoD-NAS remains stable because of the smoother search space designed under the model-guided framework. Experimental results on several popular datasets show that our MoD-NAS method achieves comparable or even better PSNR performance than current state-of-the-art methods with fewer parameters, fewer FLOPs, and less testing time.

7.
IEEE Trans Image Process ; 31: 458-471, 2022.
Article in English | MEDLINE | ID: mdl-34874856

ABSTRACT

Video quality assessment (VQA) remains a small-sample learning problem due to the costly effort required for manual annotation. Since existing VQA datasets are of limited scale, prior research tries to leverage models pre-trained on ImageNet to mitigate this shortage. Nonetheless, these well-trained models, targeting the image classification task, can be sub-optimal when applied to VQA data from a significantly different domain. In this paper, we make the first attempt to perform self-supervised pre-training for the VQA task built upon a contrastive learning method, aiming to exploit plentiful unlabeled video data to learn feature representations in a simple-yet-effective way. Specifically, we implement this idea by first generating distorted video samples with diverse distortion characteristics and visual contents based on the proposed distortion augmentation strategy. Afterwards, we conduct contrastive learning to capture quality-aware information by maximizing the agreement between feature representations of future frames and their corresponding predictions in the embedding space. In addition, we further introduce a distortion prediction task as an additional learning objective to push the model towards discriminating the different distortion categories of the input video. Solving these prediction tasks jointly with the contrastive learning not only provides stronger surrogate supervision signals, but also learns the shared knowledge among the prediction tasks. Extensive experiments demonstrate that our approach sets a new state-of-the-art in self-supervised learning for the VQA task. Our results also underscore that the learned pre-trained model can significantly benefit existing learning-based VQA models. Source code is available at https://github.com/cpf0079/CSPT.
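The agreement-maximization step can be sketched with an InfoNCE-style loss: each frame representation should match its own prediction (the diagonal of the similarity matrix) against all others in the batch. The temperature and shapes below are illustrative, not the paper's settings:

```python
import numpy as np

def info_nce(queries, keys, temperature=0.1):
    """InfoNCE: the i-th query should match the i-th key among all keys."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature                  # cosine-similarity logits
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))              # agreement on the diagonal

reps = np.eye(4)                            # 4 orthogonal frame representations
loss_aligned = info_nce(reps, reps)         # predictions match their frames
loss_shuffled = info_nce(reps, reps[::-1])  # predictions misaligned
```

The loss is near zero when each representation agrees with its own prediction and grows when the pairing is broken, which is exactly the signal the self-supervised pre-training optimizes.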


Subject(s)
Algorithms , Software
8.
IEEE Trans Image Process ; 30: 5754-5768, 2021.
Article in English | MEDLINE | ID: mdl-33979283

ABSTRACT

The trade-off between spatial and spectral resolution is one of the fundamental issues in hyperspectral images (HSI). Given the challenges of directly acquiring high-resolution hyperspectral images (HR-HSI), a compromise is to fuse a pair of images: one with high resolution (HR) in the spatial domain but low resolution (LR) in the spectral domain, and the other vice versa. Model-based image fusion methods, including pan-sharpening, aim at reconstructing HR-HSI by solving manually designed objective functions. However, such hand-crafted priors often lead to inevitable performance degradation due to the lack of end-to-end optimization. Although several deep learning-based methods have been proposed for hyperspectral pan-sharpening, HR-HSI-related domain knowledge has not been fully exploited, leaving room for further improvement. In this paper, we propose an iterative hyperspectral image super-resolution (HSISR) algorithm based on a deep HSI denoiser to leverage both the domain knowledge likelihood and the deep image prior. By taking the observation matrix of the HSI into account during the end-to-end optimization, we show how to unfold an iterative HSISR algorithm into a novel model-guided deep convolutional network (MoG-DCN). Representing the observation matrix by subnetworks also allows the unfolded deep HSISR network to work with different HSI situations, which enhances the flexibility of MoG-DCN. Extensive experimental results demonstrate that the proposed MoG-DCN outperforms several leading HSISR methods in terms of both implementation cost and visual quality. The code is available at https://see.xidian.edu.cn/faculty/wsdong/Projects/MoG-DCN.htm.

9.
IEEE Trans Image Process ; 30: 3650-3663, 2021.
Article in English | MEDLINE | ID: mdl-33705313

ABSTRACT

Blind image quality assessment (BIQA) is a useful but challenging task. It is a promising idea to design BIQA methods by mimicking the working mechanism of human visual system (HVS). The internal generative mechanism (IGM) indicates that the HVS actively infers the primary content (i.e., meaningful information) of an image for better understanding. Inspired by that, this paper presents a novel BIQA metric by mimicking the active inference process of IGM. Firstly, an active inference module based on the generative adversarial network (GAN) is established to predict the primary content, in which the semantic similarity and the structural dissimilarity (i.e., semantic consistency and structural completeness) are both considered during the optimization. Then, the image quality is measured on the basis of its primary content. Generally, the image quality is highly related to three aspects, i.e., the scene information (content-dependency), the distortion type (distortion-dependency), and the content degradation (degradation-dependency). According to the correlation between the distorted image and its primary content, the three aspects are analyzed and calculated respectively with a multi-stream convolutional neural network (CNN) based quality evaluator. As a result, with the help of the primary content obtained from the active inference and the comprehensive quality degradation measurement from the multi-stream CNN, our method achieves competitive performance on five popular IQA databases. Especially in cross-database evaluations, our method achieves significant improvements.


Subject(s)
Image Processing, Computer-Assisted/methods , Neural Networks, Computer , Algorithms , Databases, Factual
10.
IEEE Trans Pattern Anal Mach Intell ; 41(10): 2305-2318, 2019 Oct.
Article in English | MEDLINE | ID: mdl-30295612

ABSTRACT

Deep neural networks (DNNs) have shown very promising results for various image restoration (IR) tasks. However, the design of network architectures remains a major challenge for achieving further improvements. While most existing DNN-based methods solve IR problems by directly mapping low-quality images to desirable high-quality images, the observation models characterizing the image degradation processes have been largely ignored. In this paper, we first propose a denoising-based IR algorithm whose iterative steps can be computed efficiently. Then, the iterative process is unfolded into a deep neural network composed of multiple denoiser modules interleaved with back-projection (BP) modules that ensure observation consistency. A convolutional neural network (CNN) based denoiser that can exploit the multi-scale redundancies of natural images is proposed. As such, the proposed network not only exploits the powerful denoising ability of DNNs, but also leverages the prior of the observation model. Through end-to-end training, both the denoisers and the BP modules can be jointly optimized. Experimental results on several IR tasks, including image denoising, super-resolution, and deblurring, show that the proposed method leads to very competitive and often state-of-the-art results.
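The unfolded iteration alternates a back-projection (data-consistency) step with a denoising step. Below is a toy plug-and-play version for 1-D inpainting, with a 3-tap moving average standing in for the learned CNN denoiser; the operator, denoiser, and sizes are illustrative, not the paper's:

```python
import numpy as np

def denoiser(x):
    """Stand-in smoothing denoiser (3-tap moving average, edge-padded)."""
    p = np.pad(x, 1, mode="edge")
    return (p[:-2] + p[1:-1] + p[2:]) / 3.0

def plug_and_play(y, mask, iters=50):
    """Alternate back-projection with denoising, for A = sampling mask."""
    x = y.copy()
    for _ in range(iters):
        x = x + mask * (y - mask * x)  # back-projection: restore observed samples
        x = denoiser(x)                # prior step via the denoiser
    return x

true = np.linspace(0.0, 1.0, 32)       # ground-truth 1-D "image"
mask = np.ones_like(true)
mask[[10, 20]] = 0.0                   # two missing samples
recon = plug_and_play(true * mask, mask)
```

The back-projection step keeps the iterate consistent with the observations while the denoiser fills in the missing samples from their neighbours; unfolding fixes the number of such iterations and trains the denoiser end to end.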

11.
IEEE Trans Image Process ; 27(10): 4810-4824, 2018 Oct.
Article in English | MEDLINE | ID: mdl-29969393

ABSTRACT

Recovering the background and foreground parts from video frames has important applications in video surveillance. Under the assumption that the background parts are stationary and the foreground parts are sparse, most existing methods are based on the framework of robust principal component analysis (RPCA), i.e., modeling the background and foreground parts as low-rank and sparse matrices, respectively. However, in realistic complex scenarios, the conventional ℓ1-norm sparse regularizer often fails to characterize well the varying sparsity of the foreground components. Selecting the sparsity regularization parameters adaptively according to the local statistics is critical to the success of the RPCA framework for the background subtraction task. In this paper, we propose to model the sparse component with a Gaussian scale mixture (GSM) model. Compared with the conventional ℓ1 norm, the GSM-based sparse model has the advantage of jointly estimating the variances of the sparse coefficients (and hence the regularization parameters) and the unknown sparse coefficients, leading to significant improvements in estimation accuracy. Moreover, considering that the foreground parts are highly structured, a structured extension of the GSM model is further developed. Specifically, the input frame is divided into many homogeneous regions using superpixel segmentation. By characterizing the set of sparse coefficients in each homogeneous region with the same GSM prior, the local dependencies among the sparse coefficients can be effectively exploited, leading to further improvements in background subtraction. Experimental results on several challenging scenarios show that the proposed method performs much better than most existing background subtraction methods in terms of both performance and speed.
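The RPCA decomposition at the heart of this framework can be sketched with a naive proximal alternation: singular-value thresholding recovers the low-rank background and element-wise soft-thresholding the sparse foreground. Real solvers add a Lagrange-multiplier update, and the paper replaces the fixed sparse penalty with the GSM model; the thresholds and data below are illustrative:

```python
import numpy as np

def soft(x, tau):
    """Element-wise soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def rpca(M, lam, tau=1.0, iters=100):
    """Naive RPCA alternation: L <- SVT(M - S), then S <- soft(M - L)."""
    L, S = np.zeros_like(M), np.zeros_like(M)
    for _ in range(iters):
        U, sig, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U * soft(sig, tau)) @ Vt       # singular-value thresholding
        S = soft(M - L, lam)                # sparse foreground step
    return L, S

background = 0.5 * np.ones((20, 20))        # rank-1 static background
M = background.copy()
M[3, 4] += 10.0                             # one bright foreground pixel
L, S = rpca(M, lam=1.0 / np.sqrt(20))
```

The foreground spike ends up in S while L stays close to the flat background; the GSM extension effectively makes the `lam` threshold vary per coefficient instead of being a single global value.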

12.
Article in English | MEDLINE | ID: mdl-29994530

ABSTRACT

Recovering a high-resolution (HR) image from its low-resolution (LR) version is an ill-posed inverse problem. Learning accurate prior of HR images is of great importance to solve this inverse problem. Existing super-resolution (SR) methods either learn a non-parametric image prior from training data (a large set of LR/HR patch pairs) or estimate a parametric prior from the LR image analytically. Both methods have their limitations: the former lacks flexibility when dealing with different SR settings; while the latter often fails to adapt to spatially varying image structures. In this paper, we propose to take a hybrid approach toward image SR by combining those two lines of ideas - that is, a parametric sparse prior of HR images is learned from the training set as well as the input LR image. By exploiting the strengths of both worlds, we can more accurately recover the sparse codes and therefore HR image patches than conventional sparse coding approaches. Experimental results show that the proposed hybrid SR method significantly outperforms existing model-based SR methods and is highly competitive to current state-of-the-art learning-based SR methods in terms of both subjective and objective image qualities.

13.
IEEE Trans Image Process ; 26(7): 3171-3186, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28278467

ABSTRACT

Recovering an image corrupted by both additive white Gaussian noise (AWGN) and impulse noise is a challenging problem due to the difficulty of accurately modeling the distribution of the mixture noise. Many efforts have been made to first detect the locations of the impulse noise and then recover the clean image with image inpainting techniques from an incomplete image corrupted by AWGN. However, it is quite challenging to accurately detect the locations of the impulse noise when the mixture noise is strong. In this paper, we propose an effective mixture noise removal method based on Laplacian scale mixture (LSM) modeling and nonlocal low-rank regularization. The impulse noise is modeled with LSM distributions, and both the hidden scale parameters and the impulse noise are jointly estimated to adaptively characterize the real noise. To exploit the nonlocal self-similarity and low-rank nature of natural images, a nonlocal low-rank regularization is adopted to regularize the denoising process. Experimental results on synthetic noisy images show that the proposed method outperforms existing mixture noise removal methods.

14.
IEEE Trans Image Process ; 25(5): 2337-52, 2016 May.
Article in English | MEDLINE | ID: mdl-27019486

ABSTRACT

Hyperspectral imaging has many applications, from agriculture and astronomy to surveillance and mineralogy. However, it is often challenging to obtain high-resolution (HR) hyperspectral images with existing hyperspectral imaging techniques due to various hardware limitations. In this paper, we propose a new hyperspectral image super-resolution method from a low-resolution (LR) image and an HR reference image of the same scene. The estimation of the HR hyperspectral image is formulated as a joint estimation of the hyperspectral dictionary and the sparse codes based on prior knowledge of the spatial-spectral sparsity of the hyperspectral image. The hyperspectral dictionary, representing prototype reflectance spectra of the scene, is first learned from the input LR image; specifically, an efficient non-negative dictionary learning algorithm using the block-coordinate descent optimization technique is proposed. Then, the sparse codes of the desired HR hyperspectral image with respect to the learned hyperspectral basis are estimated from the pair of LR and HR reference images. To improve the accuracy of non-negative sparse coding, a clustering-based structured sparse coding method is proposed to exploit the spatial correlation among the learned sparse codes. Experimental results on both public datasets and real LR hyperspectral images suggest that the proposed method substantially outperforms several existing HR hyperspectral image recovery techniques in the literature in terms of both objective quality metrics and computational efficiency.

15.
IEEE Trans Image Process ; 24(11): 4602-13, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26219097

ABSTRACT

The human visual system is highly adaptive in extracting structure information for scene perception, and structural characteristics are widely used in perception-oriented image processing. However, existing structure descriptors mainly describe the luminance contrast of a local region and cannot effectively represent the spatial correlation of structure. In this paper, we introduce a novel structure descriptor based on the orientation selectivity mechanism in the primary visual cortex. Research in cognitive neuroscience indicates that the arrangement of excitatory and inhibitory cortical cells gives rise to orientation selectivity in a local receptive field, within which the primary visual cortex performs visual information extraction for scene understanding. Inspired by the orientation selectivity mechanism, we compute the correlations among pixels in a local region based on the similarities of their preferred orientations. By imitating the arrangement of the excitatory/inhibitory cells, the correlations between a central pixel and its local neighbors are binarized, and the spatial correlation is represented with a set of binary values, named the orientation selectivity-based pattern. Then, taking both the gradient magnitude and the orientation selectivity-based pattern into account, a rotation-invariant structure descriptor is introduced. The proposed structure descriptor is applied to texture classification and reduced-reference image quality assessment, two different application domains, to verify its generality and robustness. Experimental results demonstrate that the orientation selectivity-based structure descriptor is robust to disturbance and can effectively represent the structure degradation caused by different types of distortion.
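A rough analogue of the binarization step: compare each pixel's preferred orientation with its eight neighbours and set a bit when the orientations are similar, LBP-style. The angular tolerance and bit ordering here are hypothetical choices, not the paper's:

```python
import numpy as np

def orientation_pattern(theta, tol=np.pi / 6):
    """8-bit code per pixel: bit b is 1 if neighbour b's orientation is
    within `tol` of the centre pixel's orientation (angles wrapped)."""
    h, w = theta.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    center = theta[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    for bit, (dy, dx) in enumerate(shifts):
        nb = theta[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        diff = np.abs(np.angle(np.exp(1j * (nb - center))))  # wrap to [-pi, pi]
        codes |= ((diff < tol).astype(np.uint8) << bit)
    return codes

uniform = np.full((5, 5), 0.3)    # one coherent orientation everywhere
edge = uniform.copy()
edge[0, :] += np.pi / 2           # top row oriented differently
```

A coherent region yields the all-ones code, while pixels bordering a differently oriented region lose the bits pointing across the boundary, so the code captures spatial correlation of structure rather than luminance contrast alone.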


Subject(s)
Models, Neurological , Visual Cortex/cytology , Visual Cortex/physiology , Visual Perception/physiology , Algorithms , Computer Simulation , Humans , Orientation/physiology
16.
Sensors (Basel) ; 15(2): 4176-92, 2015 Feb 12.
Article in English | MEDLINE | ID: mdl-25686307

ABSTRACT

Compressive sensing-based synthetic aperture radar (SAR) imaging has shown superior capability in high-resolution image formation. However, most such works focus on scenes that can be sparsely represented in fixed spaces. When dealing with complicated scenes, these fixed spaces lack the adaptivity to characterize varied image contents. To solve this problem, a new compressive sensing-based radar imaging approach with adaptive sparse representation is proposed. Specifically, an autoregressive model is introduced to adaptively exploit the structural sparsity of an image. In addition, similarity among pixels is integrated into the autoregressive model to further enhance this capability, and thus an adaptive sparse representation facilitated by a weighted autoregressive model is derived. Since the weighted autoregressive model is inherently determined by the unknown image, we propose a joint optimization scheme that alternates between SAR imaging and updating of the weighted autoregressive model. Finally, experimental results demonstrate the validity and generality of the proposed approach.


Subject(s)
Diagnostic Imaging , Image Processing, Computer-Assisted/methods , Remote Sensing Technology , Algorithms , Humans , Models, Theoretical
17.
IEEE Trans Image Process ; 23(12): 5249-62, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25330492

ABSTRACT

In compressive sensing, the wavelet space is widely used to generate sparse representations of signals (image signals in particular). In this paper, we propose a novel statistical context modeling approach to increase the sparsity of wavelet image representations. It is shown, contrary to a widely held assumption, that high-frequency wavelet coefficients have nonzero-mean distributions when conditioned on local image structures. Removing this bias can make wavelet image representations sparser, i.e., having a greater number of zero and close-to-zero coefficients. The resulting unbiased probability models can significantly improve the performance of existing wavelet-based compressive image reconstruction methods in both PSNR and visual quality. An efficient algorithm is presented to solve the compressive image recovery (CIR) problem using the refined models. Experimental results on both simulated compressive sensing (CS) image data and real CS image data show that the new CIR method significantly outperforms existing CIR methods in both PSNR and visual quality.

18.
IEEE Trans Image Process ; 23(10): 4527-38, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25167553

ABSTRACT

Designing an appropriate regularizer is of great importance for accurate optical flow estimation. Recent works exploiting the nonlocal similarity and the sparsity of the motion field have led to promising flow estimation results. In this paper, we propose to unify these two powerful priors. To this end, we propose an effective flow regularization technique based on joint low-rank and sparse matrix recovery. By grouping similar flow patches into clusters, we effectively regularize the motion field by decomposing each set of similar flow patches into a low-rank component and a sparse component. For better enforcing the low-rank property, instead of using the convex nuclear norm, we use the log det(·) function as the surrogate of rank, which can also be efficiently minimized by iterative singular value thresholding. Experimental results on the Middlebury benchmark show that the performance of the proposed nonlocal sparse and low-rank regularization method is higher than (or comparable to) those of previous approaches that harness these same priors, and is competitive to current state-of-the-art methods.

19.
IEEE Trans Image Process ; 23(8): 3618-32, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24951688

ABSTRACT

Sparsity has been widely exploited for exact reconstruction of a signal from a small number of random measurements. Recent advances have suggested that structured or group sparsity often leads to more powerful signal reconstruction techniques in various compressed sensing (CS) studies. In this paper, we propose a nonlocal low-rank regularization (NLR) approach toward exploiting structured sparsity and explore its application to CS of both photographic and MRI images. We also propose the use of the nonconvex log det(X) as a smooth surrogate function for the rank instead of the convex nuclear norm, and justify the benefit of such a strategy using extensive experiments. To further improve the computational efficiency of the proposed algorithm, we have developed a fast implementation using the alternating direction method of multipliers (ADMM). Experimental results show that the proposed NLR-CS algorithm can significantly outperform existing state-of-the-art CS techniques for image recovery.
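The log det surrogate leads to a reweighted singular-value thresholding step: each singular value gets a threshold inversely proportional to its current size, so small (noise) singular values are wiped out while large ones are barely shrunk, unlike the uniform shrinkage of the nuclear-norm prox. A sketch with an illustrative fixed-point inner loop:

```python
import numpy as np

def logdet_svt(Y, tau=1.0, eps=1e-2, inner=3):
    """Proximal step for the log det rank surrogate via iteratively
    reweighted singular-value thresholding."""
    U, sig, Vt = np.linalg.svd(Y, full_matrices=False)
    s = sig.copy()
    for _ in range(inner):                       # fixed-point reweighting
        s = np.maximum(sig - tau / (s + eps), 0.0)
    return (U * s) @ Vt

Y = np.diag([10.0, 0.5, 0.1])  # one strong component plus two weak ones
X = logdet_svt(Y)
```

The strong component survives almost untouched while both weak components are set to zero, leaving a rank-1 result; plain nuclear-norm thresholding would instead shrink all three by the same amount.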


Subject(s)
Algorithms , Brain/anatomy & histology , Data Compression/methods , Image Interpretation, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Humans , Image Enhancement/methods , Pattern Recognition, Automated/methods , Reproducibility of Results , Sample Size , Sensitivity and Specificity , Signal Processing, Computer-Assisted
20.
IEEE Trans Image Process ; 22(4): 1382-94, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23314773

ABSTRACT

Sparse representation has proven to be a promising approach to image super-resolution, where the low-resolution (LR) image is usually modeled as the down-sampled version of its high-resolution (HR) counterpart after blurring. When the blurring kernel is the Dirac delta function, i.e., the LR image is directly down-sampled from its HR counterpart without blurring, the super-resolution problem becomes an image interpolation problem. In such cases, however, conventional sparse representation models (SRM) become less effective, because the data fidelity term fails to constrain the image local structures. Fortunately, in natural images the many nonlocal patches similar to a given patch can provide a nonlocal constraint on the local structure. In this paper, we incorporate image nonlocal self-similarity into SRM for image interpolation. More specifically, a nonlocal autoregressive model (NARM) is proposed and taken as the data fidelity term in SRM. We show that the NARM-induced sampling matrix is less coherent with the representation dictionary, and consequently makes SRM more effective for image interpolation. Our extensive experimental results demonstrate that the proposed NARM-based image interpolation method can effectively reconstruct edge structures and suppress jaggy/ringing artifacts, achieving the best image interpolation results so far in terms of PSNR as well as perceptual quality metrics such as SSIM and FSIM.


Subject(s)
Algorithms , Image Processing, Computer-Assisted/methods , Animals , Butterflies/anatomy & histology , Face/anatomy & histology , Humans , Models, Statistical , Starfish/anatomy & histology