1.
IEEE Trans Image Process ; 32: 3092-3107, 2023.
Article in English | MEDLINE | ID: mdl-37204945

ABSTRACT

In this paper we propose novel extensions to JPEG 2000 for the coding of discontinuous media, which includes piecewise smooth imagery such as depth maps and optical flows. These extensions use breakpoints to model discontinuity boundary geometry and apply a breakpoint-dependent Discrete Wavelet Transform (BP-DWT) to the input imagery. The highly scalable and accessible coding features provided by the JPEG 2000 compression framework are preserved by our proposed extensions, with the breakpoint and transform components encoded as independent bit streams that can be progressively decoded. Comparative rate-distortion results are provided along with corresponding visual examples which highlight the advantages of using breakpoint representations with accompanying BP-DWT and embedded bit-plane coding. Recently our proposed extensions have been adopted and are in the process of being published as a new Part 17 to the JPEG 2000 family of coding standards.

2.
Article in English | MEDLINE | ID: mdl-31613756

ABSTRACT

This paper proposes graph Laplacian regularization for robust estimation of optical flow. First, we analyze the spectral properties of dense graph Laplacians and show that dense graphs achieve a better trade-off between preserving flow discontinuities and filtering noise, compared with the usual Laplacian. Using this analysis, we then propose a robust optical flow estimation method based on Gaussian graph Laplacians. We revisit the framework of iteratively reweighted least-squares from the perspective of graph edge reweighting, and employ the Welsch loss function to preserve flow discontinuities and handle occlusions. Our experiments using the Middlebury and MPI-Sintel optical flow datasets demonstrate the robustness and the efficiency of our proposed approach.
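The reweighting idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' flow estimator: it only shows how Welsch weights behave inside an iteratively reweighted least-squares loop, applied to the toy problem of estimating a single value from samples containing one outlier. The function names, the scale parameter `c`, and the median initialization are assumptions made for illustration.

```python
import math

def welsch_weight(residual, c):
    """IRLS weight implied by the Welsch loss
    rho(r) = (c^2 / 2) * (1 - exp(-(r / c)^2)), i.e. rho'(r) / r."""
    return math.exp(-(residual / c) ** 2)

def irls_location(samples, c=1.0, iterations=20):
    """Robustly estimate one value from noisy samples (hypothetical helper)."""
    # Start from the (upper) median so a single outlier cannot dominate.
    estimate = sorted(samples)[len(samples) // 2]
    for _ in range(iterations):
        # Large residuals receive weights close to zero.
        weights = [welsch_weight(s - estimate, c) for s in samples]
        total = sum(weights)
        if total == 0.0:
            break
        # Weighted least-squares update for a constant model.
        estimate = sum(w * s for w, s in zip(weights, samples)) / total
    return estimate

samples = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]  # last value is an outlier
robust = irls_location(samples, c=1.0)
mean = sum(samples) / len(samples)
```

In this toy setting the plain mean is pulled toward the outlier while the reweighted estimate stays near the inliers. In a graph formulation, the same weights would presumably be applied to edges between neighboring flow vectors instead of to individual samples.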

3.
IEEE Trans Image Process ; 28(9): 4313-4327, 2019 Sep.
Article in English | MEDLINE | ID: mdl-30908217

ABSTRACT

In this paper, we are interested in the compression of image sets or video with considerable changes in illumination. We develop a framework to decompose frames into illumination fields and texture in order to achieve sparser representations of frames, which is beneficial for compression. Illumination variations or contrast ratio factors among frames are described by a full resolution multiplicative field. First, we propose a Lifting-based Illumination Adaptive Transform (LIAT) framework which incorporates illumination compensation into temporal wavelet transforms. We estimate a full resolution illumination field, taking heed of its spatial sparsity by a rate-distortion (R-D) driven framework. An affine mesh model is also developed as a point of comparison. We find the operational coding cost of the subband frames by modeling a typical t + 2D wavelet video coding system. While our general findings on R-D optimization are applicable to a range of coding frameworks, in this paper, we report results based on employing JPEG 2000 coding tools. The experimental results highlight the benefits of the proposed R-D driven illumination estimation and compensation in comparison with alternative scalable coding methods and non-scalable coding schemes of AVC and HEVC employing weighted prediction.

4.
IEEE Trans Image Process ; 28(7): 3205-3218, 2019 Jul.
Article in English | MEDLINE | ID: mdl-30676962

ABSTRACT

We present a compression scheme for multiview imagery that facilitates high scalability and accessibility of the compressed content. Our scheme relies upon constructing at a single base view, a disparity model for a group of views, and then utilizing this base-anchored model to infer disparity at all views belonging to the group. We employ a hierarchical disparity-compensated inter-view transform where the corresponding analysis and synthesis filters are applied along the geometric flows defined by the base-anchored disparity model. The output of this inter-view transform along with the disparity information is subjected to spatial wavelet transforms and embedded block-based coding. Rate-distortion results reveal superior performance to the x.265 anchor chosen by the JPEG Pleno standards activity for the coding of multiview imagery captured by high-density camera arrays.

5.
IEEE Trans Image Process ; 28(1): 343-355, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30176592

ABSTRACT

We address the problem of decoding joint photographic experts group (JPEG)-encoded images with fewer visual artifacts. We view the decoding task as an ill-posed inverse problem and find a regularized solution using a convex, graph Laplacian-regularized model. Since the resulting problem is non-smooth and entails non-local regularization, we use fast high-dimensional Gaussian filtering techniques with the proximal gradient descent method to solve our convex problem efficiently. Our patch-based "coefficient graph" is better suited than the traditional pixel-based ones for regularizing smooth non-stationary signals such as natural images and relates directly to classic non-local means de-noising of images. We also extend our graph along the temporal dimension to handle the decoding of M-JPEG-encoded video. Despite the minimalistic nature of our convex problem, it produces decoded images with similar quality to other more complex, state-of-the-art methods while being up to five times faster. We also expound on the relationship between our method and the classic ANCE method, reinterpreting ANCE from a graph-based regularization perspective.

6.
IEEE Trans Image Process ; 25(3): 1095-108, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26742132

ABSTRACT

This paper proposes a new method of calculating a matching metric for motion estimation. The proposed method splits the information in the source images into multiple scale and orientation subbands, reduces the subband values to a binary representation via an adaptive thresholding algorithm, and uses mutual information to model the similarity of corresponding square windows in each image. A moving window strategy is applied to recover a dense estimated motion field whose properties are explored. The proposed matching metric is a sum of mutual information scores across space, scale, and orientation. This facilitates the exploitation of information diversity in the source images. Experimental comparisons are performed amongst several related approaches, revealing that the proposed matching metric is better able to exploit information diversity, generating more accurate motion fields.
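The core of the matching metric described above, mutual information between binarized windows, can be sketched as follows. This is an illustrative sketch only. The abstract does not specify the adaptive thresholding rule, so `binarize` takes a caller-supplied threshold as a placeholder, and the subband decomposition and the summation across space, scale, and orientation are omitted. All function names are hypothetical.

```python
import math
from collections import Counter

def binarize(window, threshold):
    """Reduce subband values to a 1-bit representation (placeholder rule)."""
    return [1 if v > threshold else 0 for v in window]

def mutual_information(a, b):
    """Mutual information in bits between two equal-length symbol sequences."""
    n = len(a)
    joint = Counter(zip(a, b))  # joint histogram of co-located symbols
    pa = Counter(a)             # marginal histogram of the first window
    pb = Counter(b)             # marginal histogram of the second window
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        mi += pxy * math.log2(pxy / ((pa[x] / n) * (pb[y] / n)))
    return mi

ref = [0, 1, 0, 1, 1, 0, 0, 1]
same = mutual_information(ref, ref)                            # identical windows
unrelated = mutual_information(ref, [0, 0, 1, 1, 0, 0, 1, 1])  # independent pattern
bits = binarize([0.2, 0.8, 0.1, 0.9], 0.5)
```

A window compared with itself scores the full entropy of its binary pattern, while a statistically independent window scores zero, which is what makes the score usable for ranking candidate displacements.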

7.
IEEE Trans Image Process ; 23(9): 3802-15, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24968173

ABSTRACT

In this paper, we propose the use of "motion hints" to produce interframe predictions. A motion hint is a loose and global description of motion that can be communicated using metadata; it describes a continuous and invertible motion model over multiple frames, spatially overlapping other motion hints. A motion hint provides a reasonably accurate description of motion but only a loose description of where it is applicable; it is the task of the client to identify the exact locations where this motion model is applicable. The focus of this paper is a probabilistic multiscale approach to identifying these locations of applicability; the method is robust to noise, quantization, and contrast changes. The proposed approach employs the Laplacian pyramid; it generates motion hint probabilities from observations at each scale of the pyramid. These probabilities are then combined across the scales of the pyramid starting from the coarsest scale. The computational cost of the approach is reasonable, and only the neighborhood of a pixel is employed to determine a motion hint probability, which makes parallel implementation feasible. This paper also elaborates on how motion hint probabilities are exploited in generating interframe predictions. The scheme of this paper is applicable to closed-loop prediction, but it is more useful in open-loop prediction scenarios, such as using prediction in conjunction with remote browsing of surveillance footage, communicated by a JPEG2000 Interactive Protocol (JPIP) server. We show that the interframe predictions obtained using the proposed approach are good both visually and in terms of PSNR.

8.
IEEE Trans Image Process ; 23(5): 2222-34, 2014 May.
Article in English | MEDLINE | ID: mdl-24686283

ABSTRACT

We present a noniterative multiresolution motion estimation strategy, involving block-based comparisons in each detail band of a Laplacian pyramid. A novel matching score is developed and analyzed. The proposed matching score is based on a class of nonlinear transformations of Laplacian detail bands, yielding 1-bit or 2-bit representations. The matching score is evaluated in a dense full-search motion estimation setting, with synthetic video frames and an optical flow data set. Together with a strategy for combining the matching scores across resolutions, the proposed method is shown to produce smoother and more robust estimates than mean square error (MSE) in each detail band and combined. It tolerates more nontranslational motion, such as rotation, validating the analysis, while providing much better localization of the motion discontinuities. We also provide an efficient implementation of the motion estimation strategy and show that the computational complexity of the approach is closely related to the traditional MSE block-based full-search motion estimation procedure.

9.
IEEE Trans Image Process ; 20(9): 2650-63, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21411403

ABSTRACT

In a recent work, the authors proposed a novel paradigm for interactive video streaming and coined the term JPEG2000-Based Scalable Interactive Video (JSIV) for it. In this work, we investigate JSIV when motion compensation is employed to improve prediction, something that was intentionally left out in our earlier treatment. JSIV relies on three concepts: storing the video sequence as independent JPEG2000 frames to provide quality and spatial resolution scalability, prediction and conditional replenishment of code-blocks to exploit inter-frame redundancy, and loosely coupled server and client policies in which a server optimally selects the number of quality layers for each code-block transmitted and a client makes the most of the received (distorted) frames. In JSIV, the server transmission problem is optimally solved using Lagrangian-style rate-distortion optimization. The flexibility of JSIV enables us to employ a wide variety of frame prediction arrangements, including hierarchical B-frames. JSIV provides considerably better interactivity compared with existing schemes and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. Experimental results show that JSIV's performance is inferior to that of SVC in conventional streaming applications while JSIV performs better in interactive browsing applications.
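The Lagrangian-style selection of quality layers mentioned above can be sketched in a few lines. This is a simplified illustration and not the JSIV server policy itself: it assumes each code-block comes with a precomputed table of (rate, distortion) pairs indexed by the number of quality layers sent, with index 0 meaning nothing is transmitted and the client falls back on its prediction. The table layout, the function name, and the multiplier values are assumptions.

```python
def select_layers(code_blocks, lam):
    """For each code-block, choose the number of quality layers that
    minimizes the Lagrangian cost D + lam * R (hypothetical helper)."""
    choices = []
    for options in code_blocks:
        # options[q] = (rate, distortion) when q quality layers are sent.
        best_q = min(
            range(len(options)),
            key=lambda q: options[q][1] + lam * options[q][0],
        )
        choices.append(best_q)
    return choices

# Two hypothetical code-blocks: the first is poorly predicted, so sending
# layers removes a lot of distortion; the second is already well predicted.
blocks = [
    [(0, 100.0), (10, 40.0), (20, 30.0)],
    [(0, 5.0), (10, 4.0), (20, 3.5)],
]
generous = select_layers(blocks, 0.01)  # rate is cheap
moderate = select_layers(blocks, 0.5)
tight = select_layers(blocks, 2.0)      # rate is expensive
```

Raising the multiplier makes rate more expensive, so fewer layers are sent and well-predicted code-blocks are skipped first. In practice the multiplier would be adjusted until the total rate meets the available budget.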

10.
IEEE Trans Image Process ; 20(5): 1435-49, 2011 May.
Article in English | MEDLINE | ID: mdl-21095869

ABSTRACT

We propose a novel paradigm for interactive video streaming and we coin the term JPEG2000-based scalable interactive video (JSIV) for it. JSIV utilizes JPEG2000 to independently compress the original video sequence frames and provide for quality and spatial resolution scalability. To exploit interframe redundancy, JSIV utilizes prediction and conditional replenishment of code-blocks aided by a server policy that optimally selects the number of quality layers for each code-block transmitted and a client policy that makes the most of the received (distorted) frames. It is also possible for JSIV to employ motion compensation; however, we leave this topic to future work. To optimally solve the server transmission problem, a Lagrangian-style rate-distortion optimization procedure is employed. In JSIV, a wide variety of frame prediction arrangements can be employed including hierarchical B-frames of the scalable video coding (SVC) extension of the H.264/AVC standard. JSIV provides considerably better interactivity compared to existing schemes and can adapt immediately to interactive changes in client interests, such as forward or backward playback and zooming into individual frames. Experimental results for surveillance footage, which does not suffer from the absence of motion compensation, show that JSIV's performance is comparable to that of SVC in some usage scenarios while JSIV performs better in others.


Subject(s)
Algorithms , Image Enhancement/methods , Image Processing, Computer-Assisted/methods , Videotape Recording/methods , Computer Graphics , Data Compression/methods