Results 1 - 14 of 14
1.
IEEE Trans Image Process ; 33: 1614-1626, 2024.
Article in English | MEDLINE | ID: mdl-38358876

ABSTRACT

We present a systematic approach for training and testing structural texture similarity metrics (STSIMs) so that they can be used to exploit texture redundancy for structurally lossless image compression. The training and testing are based on a set of image distortions that reflect the characteristics of the perturbations present in natural texture images. We conduct empirical studies to determine the perceived similarity scale across all pairs of original and distorted textures. We then introduce a data-driven approach for training the Mahalanobis formulation of STSIM based on the resulting annotated texture pairs. Experimental results demonstrate that training results in significant improvements in metric performance. We also show that the performance of the trained STSIM metrics is competitive with state-of-the-art metrics based on convolutional neural networks, at substantially lower computational cost.
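The Mahalanobis formulation mentioned above can be illustrated with a minimal sketch: a quadratic-form distance between two precomputed feature vectors. The feature values and weight matrix M below are hypothetical placeholders, not a trained metric.

```python
def mahalanobis_distance(f1, f2, M):
    """d = sqrt((f1 - f2)^T M (f1 - f2)) for feature vectors f1, f2."""
    d = [a - b for a, b in zip(f1, f2)]
    n = len(d)
    # quadratic form d^T M d
    q = sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))
    return q ** 0.5

# With the identity weight matrix, the distance reduces to Euclidean.
M = [[1.0, 0.0], [0.0, 1.0]]
print(mahalanobis_distance([0.0, 0.0], [3.0, 4.0], M))  # 5.0
```

Training, in this formulation, amounts to learning M from the annotated texture pairs so that metric distances track perceived similarity.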

2.
IEEE Trans Image Process ; 31: 5529-5542, 2022.
Article in English | MEDLINE | ID: mdl-35951565

ABSTRACT

We present a pattern-based approach for reconstructing a K-level image from cutsets, dense samples taken along a family of lines or curves in two- or three-dimensional space, which break the image into blocks, each of which is typically reconstructed independently of the others. The pattern-based approach utilizes statistics of human segmentations to generate a codebook of patterns, each of which represents a pair consisting of a block boundary specification and the corresponding pattern in the block interior. We develop the approach for rectangular cutset topologies and show that it can be extended to general periodic sampling topologies. We also show that, for bilevel cutset reconstruction, the pattern-based approach can be combined with the previously proposed cutset-MRF approach to substantially reduce the size of the codebook with a slight increase in reconstruction error. In addition, we present an algorithm for segmenting the cutset samples of an original grayscale or color image, followed by reconstruction of the full segmentation field via the pattern-based approach. Experimental results show that the proposed approaches outperform the cutset-MRF approaches in terms of both reconstruction error rate and perceptual quality. Moreover, this is accomplished without any side information about the structure of the block interior. Systematic comparisons of the performance of different sampling topologies are also provided.


Subjects
Algorithms , Image Processing, Computer-Assisted , Humans
3.
IEEE Trans Image Process ; 30: 3610-3622, 2021.
Article in English | MEDLINE | ID: mdl-33646950

ABSTRACT

We propose objective, image-based techniques for quantitative evaluation of facial skin gloss that is consistent with human judgments. We use polarization photography to obtain separate images of surface and subsurface reflections, and rely on psychophysical studies to uncover and separate the influence of the two components on skin gloss perception. We capture images of facial skin at two levels, macro-scale (whole face) and meso-scale (skin patch), before and after cleansing. To generate a broad range of skin appearances for each subject, we apply photometric image transformations to the surface and subsurface reflection images. We then use linear regression to link statistics of the surface and subsurface reflections to the perceived gloss obtained in our empirical studies. The focus of this paper is on within-subject gloss perception, that is, on visual differences among images of the same subject. Our analysis shows that the contrast of the surface reflection has a strong positive influence on skin gloss perception, while the darkness of the subsurface reflection (skin tone) has a weaker positive effect on perceived gloss. We show that a regression model based on the concatenation of statistics from the two reflection images can successfully predict relative gloss differences.
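The regression step can be sketched with an ordinary least-squares fit linking one surface-reflection statistic (contrast) to subjective gloss scores. The single-predictor setup and data values are illustrative assumptions; the paper's model concatenates statistics from both reflection images.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

contrast = [0.1, 0.2, 0.3, 0.4]  # surface-reflection contrast (made up)
gloss = [1.0, 2.0, 3.0, 4.0]     # subjective gloss scores (made up)
slope, intercept = fit_line(contrast, gloss)
print(slope, intercept)
```

A positive fitted slope would correspond to the reported finding that surface-reflection contrast increases perceived gloss.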


Assuntos
Processamento de Imagem Assistida por Computador/métodos , Pele/diagnóstico por imagem , Face/diagnóstico por imagem , Feminino , Humanos , Masculino , Fotografação , Propriedades de Superfície
4.
IEEE Trans Image Process ; 30: 1527-1541, 2021.
Article in English | MEDLINE | ID: mdl-33360989

ABSTRACT

We consider lossy compression of a broad class of bilevel images that satisfy the smoothness criterion, namely, images in which the black and white regions are separated by smooth or piecewise smooth boundaries, and especially lossy compression of complex bilevel images in this class. We propose a new hierarchical compression approach that extends the previously proposed fixed-grid lossy cutset coding (LCC) technique by adapting the grid size to local image detail. LCC was claimed to have the best rate-distortion performance of any lossy compression technique in the given image class, but cannot take advantage of detail variations across an image. The key advantages of hierarchical LCC (HLCC) are that, by adapting to local detail, it provides constant quality controlled by a single parameter (distortion threshold), independent of image content, as well as better overall visual quality and rate-distortion performance over a wider range of bitrates. We also introduce several other enhancements of LCC that improve reconstruction accuracy and perceptual quality. These include the use of multiple connection bits that provide structural information by specifying which black (or white) runs on the boundary of a block must be connected, a boundary presmoothing step, stricter connectivity constraints, and more elaborate probability estimation for arithmetic coding. We also propose a progressive variation that refines the image reconstruction as more bits are transmitted, with very small additional overhead. Experimental results with a wide variety of bilevel images in the given class, especially complex ones, confirm that the proposed techniques provide substantially better visual quality and rate-distortion performance than existing lossy bilevel compression techniques, at bitrates lower than lossless compression with the JBIG or JBIG2 standards.

5.
J Opt Soc Am A Opt Image Sci Vis ; 32(2): 329-42, 2015 Feb 01.
Article in English | MEDLINE | ID: mdl-26366606

ABSTRACT

The development and testing of objective texture similarity metrics that agree with human judgments of texture similarity require, in general, extensive subjective tests. The effectiveness and efficiency of such tests depend on a careful analysis of the abilities of human perception and the application requirements. The focus of this paper is on defining performance requirements and testing procedures for objective texture similarity metrics. We identify three operating domains for evaluating the performance of a similarity metric: the ability to retrieve "identical" textures; the top of the similarity scale, where a monotonic relationship between metric values and subjective scores is desired; and the ability to distinguish between perceptually similar and dissimilar textures. Each domain has different performance goals and requires different testing procedures. For the third domain, we propose ViSiProG, a new Visual Similarity by Progressive Grouping procedure for conducting subjective experiments that organizes a texture database into clusters of visually similar images. The grouping is based on visual blending and greatly simplifies labeling image pairs as similar or dissimilar. ViSiProG collects subjective data in an efficient and effective manner, so that a relatively large database of textures can be accommodated. Experimental results and comparisons with structural texture similarity metrics demonstrate both the effectiveness of the proposed subjective testing procedure and the performance of the metrics.

6.
IEEE Trans Image Process ; 23(4): 1652-65, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24808337

ABSTRACT

An effective, low complexity method for lossy compression of scenic bilevel images, called lossy cutset coding, is proposed based on a Markov random field model. It operates by losslessly encoding pixels in a square grid of lines, which is a cutset with respect to a Markov random field model, and preserves key structural information, such as borders between black and white regions. Relying on the Markov random field model, the decoder takes a MAP approach to reconstructing the interior of each grid block from the pixels on its boundary, thereby creating a piecewise smooth image that is consistent with the encoded grid pixels. The MAP rule, which reduces to finding the block interiors with fewest black-white transitions, is directly implementable for the most commonly occurring block boundaries, thereby avoiding the need for brute force or iterative solutions. Experimental results demonstrate that the new method is computationally simple, outperforms the current lossy compression technique most suited to scenic bilevel images, and provides substantially lower rates than lossless techniques, e.g., JBIG, with little loss in perceived image quality.
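A toy one-dimensional version of the MAP fill rule conveys the idea: between two known boundary pixels, the fewest-transition interior is constant when the endpoints match and contains a single transition otherwise. The helper below is a hypothetical illustration, not the paper's two-dimensional block reconstruction.

```python
def fill_run(left, right, length):
    """Fill `length` interior pixels between boundary values left/right
    with the fewest black-white transitions."""
    if left == right:
        # matching endpoints: a constant run has zero transitions
        return [left] * length
    # differing endpoints: one transition, placed at the midpoint by convention
    half = length // 2
    return [left] * half + [right] * (length - half)

print(fill_run(0, 0, 4))  # [0, 0, 0, 0]
print(fill_run(0, 1, 4))  # [0, 0, 1, 1]
```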

7.
IEEE Trans Image Process ; 22(7): 2545-58, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23481854

ABSTRACT

We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that, according to human judgment, are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform the peak signal-to-noise ratio (PSNR), the structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.
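The idea of relying on local statistics rather than point-by-point comparison can be demonstrated in one dimension: a shifted signal differs everywhere pointwise yet has identical sliding-window statistics. A real STSIM operates on steerable-filter subbands; the plain windows below are a simplification.

```python
def window_stats(x, w):
    """Mean and variance over sliding windows of length w."""
    out = []
    for i in range(len(x) - w + 1):
        win = x[i:i + w]
        m = sum(win) / w
        v = sum((s - m) ** 2 for s in win) / w
        out.append((m, v))
    return out

a = [1, 2, 1, 2, 1, 2]
b = [2, 1, 2, 1, 2, 1]  # shifted copy: every pixel differs from a
# yet the local statistics are identical, so a statistics-based
# metric judges the two textures essentially the same
assert window_stats(a, 2) == window_stats(b, 2)
```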


Subjects
Algorithms , Image Processing, Computer-Assisted/methods , Models, Theoretical , Databases, Factual , Humans , Surface Properties , Visual Perception
8.
IEEE Trans Image Process ; 18(3): 495-508, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19150799

ABSTRACT

Most full-reference fidelity/quality metrics compare the original image to a distorted image at the same resolution, assuming a fixed viewing condition. However, in many applications, such as video streaming, due to the diversity of channel capacities and display devices, the viewing distance and the spatiotemporal resolution of the displayed signal may be adapted in order to optimize the perceived signal quality. For example, in low-bitrate coding applications, an observer may prefer to reduce the resolution or increase the viewing distance to reduce the visibility of the compression artifacts. The tradeoff between resolution/viewing conditions and the visibility of compression artifacts requires new approaches for the evaluation of image quality that account for both image distortions and image size. In order to better understand such tradeoffs, we conducted subjective tests using two representative still image coders, JPEG and JPEG 2000. Our results indicate that an observer would indeed prefer a lower spatial resolution (at a fixed viewing distance) in order to reduce the visibility of the compression artifacts, but not all the way to the point where the artifacts are completely invisible. Moreover, the observer is willing to accept more artifacts as the image size decreases. The subjective test results we report can be used to select viewing conditions for coding applications. They also set the stage for the development of novel fidelity metrics. The focus of this paper is on still images, but it is expected that similar tradeoffs apply to video.


Subjects
Algorithms , Artifacts , Image Enhancement/methods , Pattern Recognition, Visual/physiology , Task Performance and Analysis , Adult , Female , Humans , Male , Reproducibility of Results , Sensitivity and Specificity , Young Adult
9.
IEEE Trans Image Process ; 17(9): 1663-71, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18713672

ABSTRACT

Demand for multimedia services, such as video streaming over wireless networks, has grown dramatically in recent years. The downlink transmission of multiple video sequences to multiple users over a shared resource-limited wireless channel, however, is a daunting task. Among the many challenges in this area are the time-varying channel conditions, limited available resources, such as bandwidth and power, and the different transmission requirements of different video content. This work takes into account the time-varying nature of the wireless channels, as well as the importance of individual video packets, to develop a cross-layer resource allocation and packet scheduling scheme for multiuser video streaming over lossy wireless packet access networks. Assuming that accurate channel feedback is not available at the scheduler, random channel losses combined with complex error concealment at the receiver make it impossible for the scheduler to determine the actual distortion of the sequence at the receiver. Therefore, the objective of the optimization is to minimize the expected distortion of the received sequence, where the expectation is calculated at the scheduler with respect to the packet loss probability in the channel. The expected distortion is used to order the packets in the transmission queue of each user, and then gradients of the expected distortion are used to efficiently allocate resources across users. Simulations show that the proposed scheme performs significantly better than a conventional content-independent scheme for video transmission.
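The scheduling criterion can be sketched as follows. With loss probability p known at the scheduler, a packet's expected distortion mixes the decoded and concealed outcomes, and packets whose presence matters most are sent first. All numbers below are illustrative, not from the paper.

```python
def expected_distortion(p_loss, d_received, d_concealed):
    """Expectation over the channel: packet arrives with prob 1 - p_loss."""
    return (1 - p_loss) * d_received + p_loss * d_concealed

# packets described as (loss prob, distortion if received, if concealed)
packets = [(0.1, 1.0, 9.0), (0.1, 1.0, 3.0)]

# transmit first the packet with the highest expected distortion,
# i.e., the one whose loss would hurt the reconstruction most
order = sorted(packets, key=lambda pk: expected_distortion(*pk), reverse=True)
print(order[0])  # (0.1, 1.0, 9.0)
```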


Assuntos
Redes de Comunicação de Computadores , Compressão de Dados/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Processamento de Sinais Assistido por Computador , Telemetria/métodos , Gravação em Vídeo/métodos , Algoritmos , Reprodutibilidade dos Testes , Sensibilidade e Especificidade
10.
IEEE Trans Image Process ; 17(8): 1261-73, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18632337

ABSTRACT

Perceptual image quality metrics have explicitly accounted for human visual system (HVS) sensitivity to subband noise by estimating just noticeable distortion (JND) thresholds. A recently proposed class of quality metrics, known as structural similarity metrics (SSIM), models perception implicitly by taking into account the fact that the HVS is adapted for extracting structural information from images. We evaluate SSIM metrics and compare their performance to traditional approaches in the context of realistic distortions that arise from compression and error concealment in video compression/transmission applications. In order to better explore this space of distortions, we propose models for simulating typical distortions encountered in such applications. We compare specific SSIM implementations both in the image space and the wavelet domain; these include the complex wavelet SSIM (CWSSIM), a translation-insensitive SSIM implementation. We also propose a perceptually weighted multiscale variant of CWSSIM, which introduces a viewing distance dependence and provides a natural way to unify the structural similarity approach with the traditional JND-based perceptual approaches.


Assuntos
Artefatos , Compressão de Dados/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Reconhecimento Automatizado de Padrão/métodos , Técnica de Subtração , Gravação em Vídeo/métodos , Algoritmos , Análise Numérica Assistida por Computador , Controle de Qualidade , Reprodutibilidade dos Testes , Sensibilidade e Especificidade
11.
IEEE Trans Image Process ; 15(2): 289-99, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16479799

ABSTRACT

Characterizing the video quality seen by an end-user is a critical component of any video transmission system. In packet-based communication systems, such as wireless channels or the Internet, packet delivery is not guaranteed. Therefore, from the point of view of the transmitter, the distortion at the receiver is a random variable. Traditional approaches have primarily focused on minimizing the expected value of the end-to-end distortion. This paper explores the benefits of accounting for not only the mean, but also the variance of the end-to-end distortion when allocating limited source and channel resources. By accounting for the variance of the distortion, the proposed approach increases the reliability of the system by making it more likely that what the end-user sees closely resembles the mean end-to-end distortion calculated at the transmitter. Experimental results demonstrate that variance-aware resource allocation can help limit error propagation and is more robust to channel mismatch than approaches whose goal is to strictly minimize the expected distortion.
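A minimal sketch of the variance-aware criterion, assuming a scalar tradeoff weight lam (the paper's actual allocation operates on source and channel coding parameters): penalizing variance can prefer a slightly worse mean when it buys a much lower spread. Numbers are made up.

```python
def cost(mean_d, var_d, lam):
    """Variance-aware objective: E[D] + lam * Var[D]."""
    return mean_d + lam * var_d

# two hypothetical allocations, described by (E[D], Var[D])
options = {"A": (10.0, 25.0), "B": (11.0, 4.0)}

# lam = 0.2: A -> 10 + 5 = 15.0, B -> 11 + 0.8 = 11.8,
# so the lower-variance option B wins despite its higher mean
best = min(options, key=lambda k: cost(*options[k], lam=0.2))
print(best)  # B
```

Setting lam = 0 recovers the traditional minimum-expected-distortion rule, which would pick A instead.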


Assuntos
Algoritmos , Compressão de Dados/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Internet , Multimídia , Processamento de Sinais Assistido por Computador , Gravação em Vídeo/métodos , Gráficos por Computador , Telecomunicações
12.
IEEE Trans Image Process ; 15(1): 40-53, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435535

ABSTRACT

The problem of application-layer error control for real-time video transmission over lossy packet networks is commonly addressed via joint source-channel coding (JSCC), where source coding and forward error correction (FEC) are jointly designed to compensate for packet losses. In this paper, we consider hybrid application-layer error correction consisting of FEC and retransmissions. The study is carried out in an integrated joint source-channel coding (IJSCC) framework, where error resilient source coding, channel coding, and error concealment are jointly considered in order to achieve the best video delivery quality. We first show the advantage of the proposed IJSCC framework as compared to a sequential JSCC approach, where error resilient source coding and channel coding are not fully integrated. In the IJSCC framework, we also study the performance of different error control scenarios, such as pure FEC, pure retransmission, and their combination. Pure FEC and application-layer retransmissions are shown to each achieve optimal results depending on the packet loss rates and the round-trip time. A hybrid of FEC and retransmissions is shown to outperform each component individually due to its greater flexibility.


Assuntos
Algoritmos , Redes de Comunicação de Computadores , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Armazenamento e Recuperação da Informação/métodos , Processamento de Sinais Assistido por Computador , Gravação em Vídeo/métodos , Gráficos por Computador , Sistemas Computacionais , Análise Numérica Assistida por Computador
13.
IEEE Trans Image Process ; 14(10): 1524-36, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16238058

ABSTRACT

We propose a new approach for image segmentation that is based on low-level features for color and texture. It is aimed at segmentation of natural scenes, in which the color and texture of each segment do not typically exhibit uniform statistical characteristics. The proposed approach combines knowledge of human perception with an understanding of signal characteristics in order to segment natural scenes into perceptually/semantically uniform regions. The proposed approach is based on two types of spatially adaptive low-level features. The first describes the local color composition in terms of spatially adaptive dominant colors, and the second describes the spatial characteristics of the grayscale component of the texture. Together, they provide a simple and effective characterization of texture that the proposed algorithm uses to obtain robust and, at the same time, accurate and precise segmentations. The resulting segmentations convey semantic information that can be used for content-based retrieval. The performance of the proposed algorithms is demonstrated in the domain of photographic images, including low-resolution, degraded, and compressed images.


Assuntos
Algoritmos , Inteligência Artificial , Cor , Colorimetria/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Reconhecimento Automatizado de Padrão/métodos , Retroalimentação , Imageamento Tridimensional/métodos , Armazenamento e Recuperação da Informação/métodos , Análise Numérica Assistida por Computador , Processamento de Sinais Assistido por Computador , Percepção Visual
14.
IEEE Trans Image Process ; 12(10): 1181-93, 2003.
Article in English | MEDLINE | ID: mdl-18237886

ABSTRACT

In this paper, we present a new shape-coding approach, which decouples the shape information into two independent signal data sets: the skeleton and the boundary distance from the skeleton. The major benefit of this approach is that it allows for a more flexible tradeoff between approximation error and bit budget. Curves of arbitrary order can be utilized for approximating both the skeleton and distance signals. For a given bit budget for a video frame, we solve the problem of choosing the number and location of the control points for all skeleton and distance signals of all boundaries within a frame, so that the overall distortion is minimized. An operational rate-distortion (ORD) optimal approach using Lagrangian relaxation and a four-dimensional directed acyclic graph (DAG) shortest path algorithm is developed for solving the problem. To reduce the computational complexity from O(N^5) to O(N^3), where N is the number of admissible control points for a skeleton, a suboptimal greedy-trellis search algorithm is proposed and compared with the optimal algorithm. In addition, an even more efficient algorithm with computational complexity O(N^2) that finds an ORD optimal solution using a relaxed distortion criterion is also proposed and compared with the optimal solution. Experimental results demonstrate that our proposed approaches outperform existing ORD optimal approaches, which do not follow the same decomposition of the source data.
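The Lagrangian shortest-path idea can be sketched on a tiny DAG: edges carry (distortion, rate) pairs, the edge weight is distortion + lam * rate, and the cheapest path corresponds to a choice of control points. The graph below is a made-up example, not the paper's four-dimensional formulation.

```python
def shortest_path(edges, n, lam):
    """Shortest path 0 -> n-1 in a DAG with nodes in topological index
    order; edges[(u, v)] = (distortion, rate), weight = d + lam * r."""
    INF = float("inf")
    best = [INF] * n
    best[0] = 0.0
    for u in range(n):  # relax outgoing edges in topological order
        for (a, b), (d, r) in edges.items():
            if a == u and best[u] + d + lam * r < best[b]:
                best[b] = best[u] + d + lam * r
    return best[n - 1]

edges = {(0, 1): (1.0, 2.0), (1, 2): (1.0, 2.0), (0, 2): (5.0, 2.0)}
print(shortest_path(edges, 3, lam=1.0))  # 6.0 via 0 -> 1 -> 2
```

Sweeping lam trades rate against distortion: lam = 0 minimizes distortion alone, while large lam favors the cheaper-rate path.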
