1.
Entropy (Basel) ; 23(7), 2021 Jul 10.
Article in English | MEDLINE | ID: mdl-34356422

ABSTRACT

The impact of JPEG compression on deep learning (DL) in image classification is revisited. Given an underlying deep neural network (DNN) pre-trained with pristine ImageNet images, it is demonstrated that, if one can select for any original image, among its many JPEG compressed versions including the original itself, a suitable version as an input to the underlying DNN, then the classification accuracy of the underlying DNN can be improved significantly, while the size in bits of the selected input is, on average, reduced dramatically in comparison with the original image. This contrasts with the conventional understanding that JPEG compression generally degrades the classification accuracy of DL. Specifically, for each original image, consider its 10 JPEG compressed versions with quality factor (QF) values from {100, 90, 80, 70, 60, 50, 40, 30, 20, 10}. Under the assumption that the ground-truth label of the original image is known at the time of selecting an input, but unknown to the underlying DNN, we present a selector called the Highest Rank Selector (HRS). It is shown that, among all possible selectors, HRS is optimal in the sense of achieving the highest Top-k accuracy on any set of images for any k. When the underlying DNN is Inception V3 or ResNet-50 V2, HRS improves, on average, the Top-1 and Top-5 classification accuracy on the whole ImageNet validation dataset by 5.6% and 1.9%, respectively, while reducing the input size in bits dramatically: the compression ratio (CR) between the size of the original images and the size of the inputs selected by HRS is 8 for the whole ImageNet validation dataset. When the ground-truth label of the original image is unknown at the time of selection, we further propose a new convolutional neural network (CNN) topology which is based on the underlying DNN and takes the original image and its 10 JPEG compressed versions as 11 parallel inputs. It is demonstrated that the proposed topology, even when partially trained, can consistently improve the Top-1 accuracy of Inception V3 and ResNet-50 V2 by approximately 0.4%, and their Top-5 accuracy by 0.32% and 0.2%, respectively. Other selectors that do not require the ground-truth label of the original image are also presented. They maintain the Top-1 accuracy, the Top-5 accuracy, or both the Top-1 and Top-5 accuracy of the underlying DNN, while achieving CRs of 8.8, 3.3, and 3.1, respectively.
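For concreteness, a minimal sketch of a selector in the spirit of HRS is given below: for each image, among its compressed versions, pick the one whose DNN output ranks the ground-truth label highest, breaking ties toward the smaller file. The helpers `scores_fn` and `size_bits` are assumptions standing in for the DNN forward pass and the bitstream size, not components defined in the paper.

```python
def rank_of_label(scores, label):
    """Rank of the ground-truth label in the DNN score vector (1 = top)."""
    return 1 + sum(1 for s in scores if s > scores[label])

def highest_rank_selector(versions, label, scores_fn, size_bits):
    """Pick the version whose output ranks the label highest; break ties
    toward the version with the smaller size in bits."""
    best_key, best_version = None, None
    for v in versions:
        key = (rank_of_label(scores_fn(v), label), size_bits(v))
        if best_key is None or key < best_key:
            best_key, best_version = key, v
    return best_version
```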

2.
Article in English | MEDLINE | ID: mdl-33026988

ABSTRACT

This paper revisits the problem of rate distortion optimization (RDO) with a focus on inter-picture dependence. A joint RDO framework that incorporates the Lagrange multiplier as one of the parameters to be optimized is proposed, and simplification strategies are demonstrated for practical applications. To make the problem tractable, we consider an approach in which the prediction residuals of pictures in a video sequence are assumed to be emitted from a finite set of sources. Consequently, the RDO problem is formulated as finding optimal coding parameters for a finite number of sources, regardless of the length of the video sequence. Specifically, when a hierarchical prediction structure is used, the prediction residuals of pictures at the same prediction layer are assumed to be emitted from a common source. Following this approach, we propose an iterative algorithm to alternately optimize the selection of quantization parameters (QPs) and the corresponding Lagrange multipliers. Based on the results of the iterative algorithm, we further propose two practical algorithms to compute QPs and Lagrange multipliers for random access (RA) hierarchical video coding: the first uses a fixed formula to compute the QPs and Lagrange multipliers, and the second adaptively adjusts both. Experimental results show that these three algorithms, integrated into the HM 16.20 reference software of HEVC, achieve considerable RD improvements over the standard HM 16.20 encoder in the common RA test configuration.
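The following sketch illustrates the alternating structure of such an iterative optimization under stated assumptions: `rd_points[layer][qp]` holds (rate, distortion) measurements from trial encodings, QPs are chosen to minimize D + lambda*R for fixed multipliers, and each multiplier is then reset to the negative local RD slope. The slope-based update is a common heuristic, not necessarily the paper's exact rule, and the HM integration is omitted.

```python
def alternate_qp_lambda(rd_points, lambdas, iters=10):
    """rd_points[layer][qp] = (rate, distortion) from trial encodings;
    lambdas[layer] holds the current Lagrange multiplier per layer."""
    qps = {layer: min(pts) for layer, pts in rd_points.items()}
    for _ in range(iters):
        # Step 1: with the multipliers fixed, pick the QP minimizing D + lambda * R.
        for layer, pts in rd_points.items():
            qps[layer] = min(pts, key=lambda q: pts[q][1] + lambdas[layer] * pts[q][0])
        # Step 2: with the QPs fixed, reset lambda to the negative local RD slope.
        for layer, pts in rd_points.items():
            qs = sorted(pts)
            i = qs.index(qps[layer])
            lo, hi = qs[max(i - 1, 0)], qs[min(i + 1, len(qs) - 1)]
            d_rate = pts[hi][0] - pts[lo][0]
            if d_rate != 0:
                lambdas[layer] = -(pts[hi][1] - pts[lo][1]) / d_rate
    return qps, lambdas
```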

3.
Sheng Li Xue Bao ; 68(2): 141-7, 2016 Apr 25.
Article in Chinese | MEDLINE | ID: mdl-27108900

ABSTRACT

To study the pathological mechanisms of Niemann-Pick disease type C1, we observed changes in the activation of glial cells in the olfactory bulb of Npc1 mutant (Npc1(-/-)) mice. Genomic DNA was extracted from mouse tails for genotyping by PCR. Immunofluorescent histochemistry was performed to examine the activation of microglia and astrocytes in the olfactory bulb of Npc1(-/-) mice on postnatal day 30. NeuN, phosphorylated neurofilament (NF), doublecortin (DCX), CD68 and GFAP were detected by Western blot. The results showed that Npc1 gene mutation strongly increased the activation of astrocytes and microglia in the olfactory bulb, together with increased protein levels of CD68 and GFAP. Furthermore, the expression of phosphorylated NF was significantly increased in the olfactory bulb of Npc1(-/-) mice compared with that in Npc1(+/+) mice, whereas DCX expression was significantly reduced. These results suggest that early changes occur in the olfactory bulb of Npc1(-/-) mice.


Subject(s)
Neuroglia , Niemann-Pick Disease, Type C , Olfactory Bulb , Animals , Astrocytes , Axons , Doublecortin Protein , Genotype , Mice , Mice, Knockout , Microglia , Phosphorylation
4.
IEEE Trans Image Process ; 24(3): 886-900, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25532182

ABSTRACT

Recently, a new probability model dubbed the Laplacian transparent composite model (LPTCM) was developed for DCT coefficients; it can identify outlier coefficients in addition to providing superior modeling accuracy. In this paper, we explore its applications to image compression. To this end, we propose an efficient nonpredictive image compression system in which quantization (including both hard-decision quantization (HDQ) and soft-decision quantization (SDQ)) and entropy coding are completely redesigned based on the LPTCM. When tested over standard test images, the proposed system achieves overall coding results that are among the best and similar to those of H.264 or HEVC intra (predictive) coding, in terms of rate versus visual quality. In terms of rate versus objective quality, it significantly outperforms baseline JPEG by more than 4.3 dB in PSNR on average, with a moderate increase in complexity; it outperforms ECEB, the state-of-the-art nonpredictive image coding scheme, by 0.75 dB when SDQ is OFF (i.e., the HDQ case), with the same level of computational complexity, and by 1 dB when SDQ is ON, at the cost of a slight increase in complexity. In comparison with H.264 intracoding, our system provides an overall gain of about 0.4 dB with dramatically reduced computational complexity; in comparison with HEVC intracoding, it offers comparable coding performance in the high-rate region or for complicated images, but with less than 5% of the HEVC intracoding complexity. In addition, the proposed system offers multiresolution capability, which, together with its comparatively high coding efficiency and low complexity, makes it a good alternative for real-time image processing applications.
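As a purely illustrative sketch of coefficient handling in such a system, the snippet below splits DCT coefficients into an outlier tail and a main body by a threshold and quantizes the two parts with separate step sizes; the threshold and step sizes are placeholders rather than LPTCM-derived values, and the redesigned entropy coding is not modeled.

```python
import numpy as np

def split_and_quantize(dct_coeffs, tail_threshold, q_body, q_tail):
    """Quantize body and tail (outlier) coefficients with separate step sizes."""
    coeffs = np.asarray(dct_coeffs, dtype=float)
    is_tail = np.abs(coeffs) > tail_threshold
    levels = np.where(is_tail,
                      np.round(coeffs / q_tail),
                      np.round(coeffs / q_body)).astype(int)
    return levels, is_tail

def dequantize(levels, is_tail, q_body, q_tail):
    """Reconstruct coefficients from levels and the body/tail indicator."""
    return np.where(is_tail, levels * q_tail, levels * q_body).astype(float)
```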

5.
IEEE Trans Image Process ; 23(11): 4799-811, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25248184

ABSTRACT

Quantization table design is revisited for image/video coding where soft decision quantization (SDQ) is considered. Unlike conventional approaches, in which quantization table design is bundled with a specific encoding method, we assume optimal SDQ encoding and design a quantization table for the purpose of reconstruction. Under this assumption, we model transform coefficients across different frequencies as independently distributed random sources and apply the Shannon lower bound to approximate the rate distortion function of each source. We then show that a quantization table can be optimized so that the resulting distortion complies with a prescribed behavior. Guided by this design principle, we propose an efficient statistical-model-based algorithm using the Laplacian model to design quantization tables for DCT-based image coding. When applied to standard JPEG encoding, it provides a performance gain of more than 1.5 dB in PSNR with almost no extra computational burden. Compared with the state-of-the-art JPEG quantization table optimizer, the proposed algorithm offers an average 0.5-dB gain in PSNR with computational complexity reduced by a factor of more than 2000 when SDQ is OFF, and a performance gain of 0.2 dB or more with the complexity reduced by 85% when SDQ is ON. Significant compression performance improvement is also observed when the algorithm is applied to other image coding systems proposed in the literature.
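A hedged sketch of statistical-model-based table design is given below. It is not the paper's exact rule: per-frequency coefficient variances are estimated from training blocks, distortion is allocated by reverse water-filling, and each distortion target is mapped to a uniform step size via the standard high-rate approximation D ≈ q²/12.

```python
import numpy as np

def design_quant_table(dct_blocks, theta):
    """Hedged sketch: dct_blocks is an (N, 8, 8) array of DCT coefficients
    from training images; theta is the reverse water-filling level."""
    variances = np.var(dct_blocks.reshape(-1, 64), axis=0)
    d_alloc = np.minimum(theta, variances)      # per-frequency distortion targets
    steps = np.sqrt(12.0 * d_alloc)             # high-rate map D ~ q^2 / 12
    return np.clip(np.round(steps), 1, 255).reshape(8, 8).astype(int)
```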

6.
IEEE Trans Image Process ; 23(3): 1303-16, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24723528

ABSTRACT

The distributions of discrete cosine transform (DCT) coefficients of images are revisited on a per-image basis. To better handle the heavy-tail phenomenon commonly seen in DCT coefficients, a new model dubbed a transparent composite model (TCM) is proposed and justified in terms of both modeling accuracy and an additional data reduction capability. Given a sequence of DCT coefficients, a TCM first separates the tail of the sequence from its main body. A uniform distribution is then used to model the DCT coefficients in the heavy tail, whereas a different parametric distribution is used to model the data in the main body. The separation boundary and other parameters of the TCM can be estimated via maximum likelihood estimation. Efficient online algorithms are proposed for parameter estimation, and their convergence is also proved. Experimental results based on the Kullback-Leibler divergence and the χ² test show that, for real-valued continuous AC coefficients, the TCM based on the truncated Laplacian offers the best tradeoff between modeling accuracy and complexity. For discrete or integer DCT coefficients, the discrete TCM based on truncated geometric distributions (GMTCM) models the AC coefficients more accurately than pure Laplacian models and generalized Gaussian models in the majority of cases, while having simplicity and practicality similar to those of pure Laplacian models. In addition, it is demonstrated that the GMTCM also exhibits a good capability of data reduction or feature extraction: the DCT coefficients in the heavy tail identified by the GMTCM are truly outliers, and these outliers form an outlier image revealing some unique global features of the image. Overall, the modeling performance and the data reduction feature of the GMTCM make it a desirable choice for modeling discrete or integer DCT coefficients in real-world image and video applications, as summarized in a few of our further studies on quantization design, entropy coding design, and image understanding and management.
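A simplified fitting procedure in the spirit of a composite model is sketched below for the magnitudes of AC coefficients: grid-search the body/tail boundary, model the body with an exponential (Laplacian magnitude) truncated to the body range and the tail with a uniform density, and keep the boundary with the highest log-likelihood. The exponential scale is approximated by the body sample mean instead of the exact truncated-distribution MLE, so this is a sketch rather than the paper's estimator.

```python
import numpy as np

def fit_composite_model(ac_coeffs, num_candidates=64):
    """Return (log-likelihood, boundary, scale) for the best boundary found."""
    y = np.abs(np.asarray(ac_coeffs, dtype=float))
    upper = y.max() + 1e-9                       # support limit for the uniform tail
    best = None
    for b in np.linspace(np.median(y), upper, num_candidates, endpoint=False):
        body, tail = y[y <= b], y[y > b]
        if body.size == 0 or tail.size == 0:
            continue
        p = body.size / y.size                   # body weight
        lam = body.mean() + 1e-9                 # approximate exponential scale
        ll_body = body.size * np.log(p) + np.sum(
            -body / lam - np.log(lam) - np.log1p(-np.exp(-b / lam)))
        ll_tail = tail.size * (np.log(1.0 - p) - np.log(upper - b))
        ll = ll_body + ll_tail
        if best is None or ll > best[0]:
            best = (ll, b, lam)
    return best
```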


Subject(s)
Algorithms , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Numerical Analysis, Computer-Assisted
7.
IEEE Trans Image Process ; 18(1): 63-74, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19095519

ABSTRACT

To maximize rate distortion performance while remaining faithful to the JPEG syntax, the joint optimization of the Huffman tables, quantization step sizes, and DCT indices of a JPEG encoder is investigated. Given Huffman tables and quantization step sizes, an efficient graph-based algorithm is first proposed to find the optimal DCT indices in the form of run-size pairs. Based on this graph-based algorithm, an iterative algorithm is then presented to jointly optimize run-length coding, Huffman coding, and quantization table selection. The proposed iterative algorithm not only produces a compressed bitstream completely compatible with existing JPEG and MPEG decoders, but is also computationally efficient. Furthermore, when tested over standard test images, it achieves the best JPEG compression results, to the extent that its PSNR performance even exceeds the quoted results of some state-of-the-art wavelet-based image coders, such as Shapiro's embedded zerotree wavelet algorithm, at the common bit rates under comparison. Both the graph-based algorithm and the iterative algorithm can be applied to application areas such as web image acceleration, digital camera image compression, MPEG frame optimization, and transcoding.
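The alternating structure of such a joint optimization can be sketched as below. The three callables stand in for the paper's components (the graph-based run-size search, the Huffman table refit, and the quantization table update); only the iteration skeleton is shown, and the convergence-based stopping rule is an assumption.

```python
def joint_jpeg_optimize(blocks, qtable, htables, lam,
                        optimize_runsize, rebuild_huffman, update_qtable,
                        max_iters=20, tol=1e-4):
    """Alternating-minimization skeleton; the three callables are assumed to
    return (run-size decisions, RD cost), new Huffman tables, and a new
    quantization table, respectively."""
    prev_cost = float("inf")
    cost = prev_cost
    for _ in range(max_iters):
        # 1) Optimal run-size pairs for the fixed tables (graph search in the paper).
        runsize, cost = optimize_runsize(blocks, qtable, htables, lam)
        # 2) Refit Huffman tables to the statistics of the chosen run-size pairs.
        htables = rebuild_huffman(runsize)
        # 3) Update the quantization table given the current coding decisions.
        qtable = update_qtable(blocks, runsize, lam)
        if prev_cost - cost < tol:               # stop when the RD cost converges
            break
        prev_cost = cost
    return qtable, htables, cost
```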


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Reproducibility of Results , Sensitivity and Specificity
8.
IEEE Trans Image Process ; 18(1): 75-89, 2009 Jan.
Article in English | MEDLINE | ID: mdl-19095520

ABSTRACT

This paper proposes a design framework for down-sampling compressed images/video with an arbitrary ratio in the discrete cosine transform (DCT) domain. In this framework, we first derive a set of DCT-domain down-sampling methods that can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain, and show that the set contains a wide range of methods with various complexities and visual qualities. Then, for a preselected spatial-domain down-sampling method, we formulate an optimization problem for finding an LTDS that approximates the given spatial-domain method, trading off visual quality against complexity. By modeling the LTDS as a multiple-layer network, a so-called structural learning with forgetting algorithm is then applied to solve the optimization problem. The proposed framework has been applied to discover optimal LTDSs corresponding to a spatial down-sampling method with Butterworth low-pass filtering and bicubic interpolation. Experimental results show that the resulting LTDS achieves a significant reduction in complexity when compared with other methods in the literature of similar visual quality.
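For a whole-image (non-blockwise) simplification, the LTDS corresponding to a spatial down-sampling map y = L x R can be written in closed form: if C_n denotes the orthonormal DCT-II matrix of size n, then Y = A X B with A = C_m1 L C_n1ᵀ and B = C_n2 R C_m2ᵀ. The sketch below computes these matrices; the paper works with block DCTs and learns approximations for a complexity/quality trade-off, which is not reproduced here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def ltds_from_spatial(L, R):
    """DCT-domain equivalent (A, B) of the spatial map y = L @ x @ R, so that
    Y = A @ X @ B acts directly on the whole-image DCT coefficients X."""
    m1, n1 = L.shape
    n2, m2 = R.shape
    A = dct_matrix(m1) @ L @ dct_matrix(n1).T
    B = dct_matrix(n2) @ R @ dct_matrix(m2).T
    return A, B
```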


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Signal Processing, Computer-Assisted , Video Recording/methods , Reproducibility of Results , Sensitivity and Specificity
9.
IEEE Trans Image Process ; 16(7): 1774-84, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17605376

ABSTRACT

Rate distortion (RD) optimization for H.264 interframe coding with complete baseline decoding compatibility is investigated on a frame basis. Using soft decision quantization (SDQ) rather than the standard hard decision quantization, we first establish a general framework in which motion estimation, quantization, and entropy coding (in H.264) for the current frame can be jointly designed to minimize a true RD cost given previously coded reference frames. We then propose three RD optimization algorithms: a graph-based algorithm for near-optimal SDQ in H.264 baseline encoding given motion estimation and quantization step sizes; an algorithm for near-optimal residual coding in H.264 baseline encoding given motion estimation; and an iterative overall algorithm to optimize H.264 baseline encoding for each individual frame given previously coded reference frames, with each algorithm embedded in the next in the indicated order. The graph-based algorithm for near-optimal SDQ is the core; given motion estimation and quantization step sizes, it is guaranteed to perform optimal SDQ if the weak adjacent-block dependency utilized in the context adaptive variable length coding of H.264 is ignored for optimization. The proposed algorithms have been implemented based on the reference encoder JM82 of H.264 with complete compatibility with the baseline profile. Experiments show that, for a set of typical video test sequences, the graph-based algorithm for near-optimal SDQ, the algorithm for near-optimal residual coding, and the overall algorithm achieve, on average, 6%, 8%, and 12% rate reduction, respectively, at the same PSNR (ranging from 30 to 38 dB) when compared with the RD optimization method implemented in the H.264 reference software.
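A toy illustration of soft decision quantization for a single block is given below: each coefficient chooses among a few candidate levels (rounded, truncated toward zero, or zero) the one minimizing D + λR under a crude rate proxy. The real graph-based algorithm optimizes over the whole block to capture run-length and context-coding costs, which this per-coefficient sketch deliberately ignores.

```python
import numpy as np

def sdq_block(coeffs, qstep, lam):
    """Per-coefficient SDQ toy: returns the chosen quantization levels."""
    coeffs = np.asarray(coeffs, dtype=float)
    levels = np.zeros(coeffs.shape, dtype=int)
    for idx, c in enumerate(coeffs):
        candidates = {int(np.round(c / qstep)), int(c / qstep), 0}
        def cost(level):
            dist = (c - level * qstep) ** 2
            rate = 0.0 if level == 0 else 1.0 + np.log2(1.0 + abs(level))  # crude proxy
            return dist + lam * rate
        levels[idx] = min(candidates, key=cost)
    return levels
```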


Subject(s)
Algorithms , Data Compression/methods , Data Compression/standards , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Video Recording/methods , Artifacts , Computer Graphics/standards , Documentation/standards , Internationality , Multimedia/standards , Numerical Analysis, Computer-Assisted
10.
IEEE Trans Image Process ; 15(6): 1680-9, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16764291

ABSTRACT

We propose the concept of a quality-aware image, in which certain extracted features of the original (high-quality) image are embedded into the image data as invisible hidden messages. When a distorted version of such an image is received, users can decode the hidden messages and use them to provide an objective measure of the quality of the distorted image. To demonstrate the idea, we build a practical quality-aware image encoding, decoding, and quality analysis system, which employs: 1) a novel reduced-reference image quality assessment algorithm based on a statistical model of natural images; and 2) a previously developed quantization watermarking-based data hiding technique in the wavelet transform domain.
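A skeleton of the encode/assess pipeline is sketched below with the feature extractor, message embedding/decoding routines, and comparison metric supplied as callables; none of these stand for the paper's specific statistical features or watermarking scheme.

```python
def encode_quality_aware(image, extract_features, embed_message):
    """Embed the original image's features into the image itself."""
    features = extract_features(image)
    return embed_message(image, features)

def assess_quality(distorted, extract_features, decode_message, compare):
    """Recover the embedded reference features and compare against the
    features of the received (possibly distorted) image."""
    reference_features = decode_message(distorted)
    distorted_features = extract_features(distorted)
    return compare(reference_features, distorted_features)
```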


Subject(s)
Algorithms , Data Compression/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Signal Processing, Computer-Assisted , Quality Control