Results 1 - 20 of 40
1.
Article in English | MEDLINE | ID: mdl-37030810

ABSTRACT

Video watermarking embeds a message into a cover video in an imperceptible manner, which can be retrieved even if the video undergoes certain modifications or distortions. Traditional watermarking methods are often manually designed for particular types of distortions and thus cannot simultaneously handle a broad spectrum of distortions. To this end, we propose a robust deep learning-based solution for video watermarking that is end-to-end trainable. Our model consists of a novel multiscale design where the watermarks are distributed across multiple spatial-temporal scales. Extensive evaluations on a wide variety of distortions show that our method outperforms traditional video watermarking methods as well as deep image watermarking models by a large margin. We further demonstrate the practicality of our method on a realistic video-editing application.
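As a rough illustration of the end-to-end training setup described in this abstract, the sketch below (PyTorch, with hypothetical module names and a plain additive-noise stand-in for the distortion layer; the paper's actual multiscale spatial-temporal design is more elaborate) embeds a bit string into a video, applies a differentiable distortion, and trains the encoder and decoder jointly.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, msg_len=32):
        super().__init__()
        self.net = nn.Conv3d(3 + msg_len, 3, kernel_size=3, padding=1)

    def forward(self, video, msg):
        # video: (B, 3, T, H, W); msg: (B, msg_len) in {0, 1}
        b, _, t, h, w = video.shape
        m = msg.view(b, -1, 1, 1, 1).expand(b, msg.shape[1], t, h, w)
        return video + 0.01 * self.net(torch.cat([video, m], dim=1))  # small residual watermark

class Decoder(nn.Module):
    def __init__(self, msg_len=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, msg_len))

    def forward(self, video):
        return self.net(video)  # logits for each message bit

enc, dec = Encoder(), Decoder()
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-4)

video = torch.rand(2, 3, 8, 64, 64)                   # toy batch of short clips
msg = torch.randint(0, 2, (2, 32)).float()
marked = enc(video, msg)
distorted = marked + 0.02 * torch.randn_like(marked)  # stand-in for a differentiable distortion
loss = nn.functional.mse_loss(marked, video) + \
       nn.functional.binary_cross_entropy_with_logits(dec(distorted), msg)
loss.backward()
opt.step()
```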

2.
Annu Rev Vis Sci ; 7: 571-604, 2021 09 15.
Article in English | MEDLINE | ID: mdl-34524880

ABSTRACT

The first mobile camera phone was sold only 20 years ago, when taking pictures with one's phone was an oddity, and sharing pictures online was unheard of. Today, the smartphone is more camera than phone. How did this happen? This transformation was enabled by advances in computational photography-the science and engineering of making great images from small-form-factor, mobile cameras. Modern algorithmic and computing advances, including machine learning, have changed the rules of photography, bringing to it new modes of capture, postprocessing, storage, and sharing. In this review, we give a brief history of mobile computational photography and describe some of the key technological components, including burst photography, noise reduction, and super-resolution. At each step, we can draw naive parallels to the human visual system.


Subjects
Cell Phone; Photography; Humans; Smartphone
3.
IEEE Trans Image Process ; 30: 6673-6685, 2021.
Article in English | MEDLINE | ID: mdl-34264828

ABSTRACT

Could we compress images via standard codecs while avoiding visible artifacts? The answer is obvious - this is doable as long as the bit budget is generous enough. What if the allocated bit-rate for compression is insufficient? Then unfortunately, artifacts are a fact of life. Many attempts were made over the years to fight this phenomenon, with varying degrees of success. In this work we aim to break the unholy connection between bit-rate and image quality, and propose a way to circumvent compression artifacts by pre-editing the incoming image and modifying its content to fit the given bits. We design this editing operation as a learned convolutional neural network, and formulate an optimization problem for its training. Our loss takes into account a proximity between the original image and the edited one, a bit-budget penalty over the proposed image, and a no-reference image quality measure for forcing the outcome to be visually pleasing. The proposed approach is demonstrated on the popular JPEG compression, showing savings in bits and/or improvements in visual quality, obtained with intricate editing effects.
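The three-term training objective described above can be sketched as follows (a minimal illustration; `rate_proxy` and `quality_score` are hypothetical stand-ins for the paper's differentiable bit-budget penalty and no-reference quality measure, and the weights are arbitrary):

```python
import torch

def preedit_loss(original, edited, rate_proxy, quality_score, lam=0.1, mu=0.05):
    # proximity between the original image and its edited version
    proximity = torch.mean((original - edited) ** 2)
    # differentiable estimate of the bits the codec would spend on the edited image
    bit_penalty = rate_proxy(edited)
    # no-reference quality score (higher is better), keeps the result visually pleasing
    quality = quality_score(edited)
    return proximity + lam * bit_penalty - mu * quality
```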

4.
IEEE Trans Image Process ; 30: 5944-5955, 2021.
Article in English | MEDLINE | ID: mdl-34166193

ABSTRACT

This work considers noise removal from images, focusing on the well-known K-SVD denoising algorithm. This sparsity-based method was proposed in 2006, and for a short while it was considered state-of-the-art. However, over the years it has been surpassed by other methods, including the recent deep-learning-based newcomers. The question we address in this paper is whether K-SVD was brought to its peak in its original conception, or whether it can be made competitive again. The approach we take in answering this question is to redesign the algorithm to operate in a supervised manner. More specifically, we propose an end-to-end deep architecture with the exact K-SVD computational path, and train it for optimized denoising. Our work shows how to overcome difficulties arising in turning the K-SVD scheme into a differentiable, and thus learnable, machine. With a small number of parameters to learn and while preserving the original K-SVD essence, the proposed architecture is shown to outperform the classical K-SVD algorithm substantially and to get closer to recent state-of-the-art learning-based denoising methods. In a broader context, this work touches on themes around the design of deep-learning solutions for image processing tasks, while building a bridge between classic methods and novel deep-learning-based ones.
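One common way to make a sparse-coding pipeline differentiable is to unroll its iterative solver into a fixed number of network layers. The sketch below shows this idea in its simplest (ISTA-style) form; it is only a stand-in for the paper's architecture, which follows the exact K-SVD computational path rather than this generic pursuit.

```python
import torch

def soft_threshold(z, t):
    return torch.sign(z) * torch.clamp(torch.abs(z) - t, min=0.0)

def unrolled_ista(y, D, lam=0.1, n_iters=10):
    # y: (batch, m) noisy patches; D: (m, k) dictionary, learnable end to end
    L = torch.linalg.matrix_norm(D, ord=2) ** 2          # Lipschitz constant of the gradient
    z = torch.zeros(y.shape[0], D.shape[1], device=y.device)
    for _ in range(n_iters):
        grad = (z @ D.T - y) @ D                          # D^T (D z - y)
        z = soft_threshold(z - grad / L, lam / L)         # proximal gradient step
    return z @ D.T                                        # denoised patches

D = torch.randn(64, 256, requires_grad=True)              # dictionary trained by backprop
patches = torch.randn(128, 64)
recon = unrolled_ista(patches, D)
```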

5.
Article in English | MEDLINE | ID: mdl-30640613

ABSTRACT

In this work, we broadly connect kernel-based filtering (e.g., the bilateral filter and nonlocal means, but also many more) with general variational formulations of Bayesian regularized least squares, and the related concept of proximal operators. Variational/Bayesian/proximal formulations often result in optimization problems that do not have closed-form solutions, and therefore typically require global iterative solutions. Our main contribution here is to establish how one can approximate the solution of the resulting global optimization problems using locally adaptive filters with specific kernels. Our results are valid for small regularization strength (i.e., weak noise), but the approach is powerful enough to be useful for a wide range of applications because we expose how to derive a "kernelized" solution to these problems that approximates the global solution in one shot, using only local operations. As another side benefit in the reverse direction, given a local data-adaptive filter constructed with a particular choice of kernel, we enable the interpretation of such filters in the variational/Bayesian/proximal framework.
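For readers unfamiliar with the terminology, the object being approximated is the proximal operator of a regularizer R with strength λ; in the notation assumed here (not spelled out in the abstract), the claim amounts to

```latex
\[
\operatorname{prox}_{\lambda R}(y) \;=\; \arg\min_{x}\ \tfrac{1}{2}\|x - y\|_2^2 + \lambda R(x)
\;\approx\; W\,y, \qquad W = D^{-1} K, \quad D = \operatorname{diag}(K\mathbf{1}),
\]
```

i.e., for small λ the global minimizer is approximated in one shot by a row-normalized, data-adaptive kernel filter W whose kernel K is matched to the regularizer; the precise correspondence between R and K is what the paper develops.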

6.
IEEE Trans Med Imaging ; 37(9): 1978-1988, 2018 09.
Article in English | MEDLINE | ID: mdl-29990154

ABSTRACT

Optical coherence tomography (OCT) has revolutionized diagnosis and prognosis of ophthalmic diseases by visualization and measurement of retinal layers. To speed up the quantitative analysis of disease biomarkers, an increasing number of automatic segmentation algorithms have been proposed to estimate the boundary locations of retinal layers. While the performance of these algorithms has significantly improved in recent years, a critical question to ask is how far we are from a theoretical limit to OCT segmentation performance. In this paper, we present the Cramér-Rao lower bounds (CRLBs) for the problem of OCT layer segmentation. In deriving the CRLBs, we address the important problem of defining statistical models that best represent the intensity distribution in each layer of the retina. Additionally, we calculate the bounds under an optimal affine bias, reflecting the use of prior knowledge in many segmentation algorithms. Experiments using in vivo images of human retina from a commercial spectral domain OCT system are presented, showing potential for improvement of automated segmentation accuracy. Our general mathematical model can be easily adapted for virtually any OCT system. Furthermore, the statistical models of signal and noise developed in this paper can be utilized for future improvements of OCT image denoising, reconstruction, and many other applications.
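For reference, the generic form of the Cramér-Rao bound in the presence of an estimator bias b(θ) is the identity the paper specializes (the retinal-layer intensity models and the optimal affine bias are the paper's contributions, not reproduced here):

```latex
\[
\operatorname{Cov}(\hat{\theta}) \;\succeq\;
\Bigl(I + \tfrac{\partial b}{\partial \theta}\Bigr)\,\mathcal{I}(\theta)^{-1}\,
\Bigl(I + \tfrac{\partial b}{\partial \theta}\Bigr)^{\!\top},
\]
```

where I(θ) is the Fisher information matrix of the boundary-location parameters computed under the chosen intensity model.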


Subjects
Image Processing, Computer-Assisted/methods; Models, Statistical; Retina/diagnostic imaging; Tomography, Optical Coherence/methods; Algorithms; Humans
7.
Article in English | MEDLINE | ID: mdl-29994025

ABSTRACT

Automatically learned quality assessment for images has recently become a hot topic due to its usefulness in a wide variety of applications such as evaluating image capture pipelines, storage techniques and sharing media. Despite the subjective nature of this problem, most existing methods only predict the mean opinion score provided by datasets such as AVA [1] and TID2013 [2]. Our approach differs from others in that we predict the distribution of human opinion scores using a convolutional neural network. Our architecture also has the advantage of being significantly simpler than other methods with comparable performance. Our proposed approach relies on the success (and retraining) of proven, state-of-the-art deep object recognition networks. Our resulting network can be used to not only score images reliably and with high correlation to human perception, but also to assist with adaptation and optimization of photo editing/enhancement algorithms in a photographic pipeline. All this is done without need for a "golden" reference image, consequently allowing for single-image, semantic- and perceptually-aware, no-reference quality assessment.
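A minimal sketch of the core idea, predicting a distribution over discrete opinion scores rather than a single mean score, might look as follows (the backbone, feature dimension, and the squared earth-mover's-distance loss below are illustrative assumptions, not necessarily the paper's exact choices):

```python
import torch
import torch.nn as nn

class ScoreHead(nn.Module):
    """Maps backbone features to a probability distribution over the 1..10 rating buckets."""
    def __init__(self, feat_dim=1280, n_buckets=10):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_buckets)

    def forward(self, feats):
        return torch.softmax(self.fc(feats), dim=-1)

def emd_loss(p_pred, p_true):
    # squared earth-mover's distance between two distributions over ordered buckets
    cdf_diff = torch.cumsum(p_pred, dim=-1) - torch.cumsum(p_true, dim=-1)
    return torch.mean(torch.sqrt(torch.mean(cdf_diff ** 2, dim=-1)))

def mean_score(p):
    buckets = torch.arange(1, p.shape[-1] + 1, dtype=p.dtype)
    return (p * buckets).sum(dim=-1)
```

The mean of the predicted distribution then serves as the quality score, while its spread conveys how much human raters would be expected to disagree.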

8.
IEEE Trans Image Process ; 26(9): 4229-4242, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28541202

ABSTRACT

Pedestrian detection in thermal infrared images poses unique challenges because of the low resolution and noisy nature of the image. Here, we propose a mid-level attribute in the form of a multidimensional template, or tensor, using local steering kernels (LSKs) as low-level descriptors for detecting pedestrians in far infrared images. LSK is specifically designed to deal with intrinsic image noise and pixel level uncertainty by capturing local image geometry succinctly instead of collecting local orientation statistics (e.g., the histograms in histograms of oriented gradients). In order to learn the LSK tensor, we introduce a new image similarity kernel following the popular maximum margin framework of support vector machines, facilitating a relatively short and simple training phase for building a rigid pedestrian detector. Tensor representation has several advantages, and indeed, LSK templates allow exact acceleration of the sluggish but de facto sliding window-based detection methodology with the multichannel discrete Fourier transform, enabling very fast and efficient pedestrian localization. The experimental studies on publicly available thermal infrared images justify our proposals and model assumptions. In addition, the proposed work also involves the release of our in-house annotations of pedestrians in more than 17,000 frames of the OSU color-thermal database for the purpose of sharing with the research community.

9.
IEEE Trans Image Process ; 26(5): 2338-2351, 2017 May.
Article in English | MEDLINE | ID: mdl-28287968

ABSTRACT

Style transfer is a process of migrating a style from a given image to the content of another, synthesizing a new image, which is an artistic mixture of the two. Recent work on this problem adopting convolutional neural-networks (CNN) ignited a renewed interest in this field, due to the very impressive results obtained. There exists an alternative path toward handling the style transfer task, via the generalization of texture synthesis algorithms. This approach has been proposed over the years, but its results are typically less impressive compared with the CNN ones. In this paper, we propose a novel style transfer algorithm that extends the texture synthesis work of Kwatra et al. (2005), while aiming to get stylized images that are closer in quality to the CNN ones. We modify Kwatra's algorithm in several key ways in order to achieve the desired transfer, with emphasis on a consistent way for keeping the content intact in selected regions, while producing hallucinated and rich style in others. The results obtained are visually pleasing and diverse, shown to be competitive with the recent CNN style transfer algorithms. The proposed algorithm is fast and flexible, being able to process any pair of content + style images.

10.
IEEE Trans Pattern Anal Mach Intell ; 38(3): 546-62, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27046497

ABSTRACT

One-shot, generic object detection involves searching for a single query object in a larger target image. Relevant approaches have benefited from features that typically model the local similarity patterns. In this paper, we combine local similarity (encoded by local descriptors) with a global context (i.e., a graph structure) of pairwise affinities among the local descriptors, embedding the query descriptors into a low-dimensional but discriminative subspace. Unlike principal components, which preserve the global structure of the feature space, we instead seek a linear approximation to the Laplacian eigenmap that permits a locality-preserving embedding of high-dimensional region descriptors. Our second contribution is an accelerated but exact computation of matrix cosine similarity as the decision rule for detection, obviating the computationally expensive sliding-window search. We leverage the power of the Fourier transform combined with the integral image to achieve superior runtime efficiency, which allows us to test multiple hypotheses (for pose estimation) within a reasonably short time. Our approach to one-shot detection is training-free, and experiments on standard data sets confirm the efficacy of our model. Moreover, the low computational cost of the proposed (codebook-free) object detector facilitates rather straightforward query detection in large data sets, including movie videos.
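The decision rule mentioned above, matrix cosine similarity, is (in the usual notation for a query feature matrix F_Q and a target-window feature matrix F_T)

```latex
\[
\rho(F_Q, F_T) \;=\; \frac{\langle F_Q, F_T\rangle_F}{\|F_Q\|_F\,\|F_T\|_F}
\;=\; \frac{\operatorname{tr}\bigl(F_Q^{\top} F_T\bigr)}{\|F_Q\|_F\,\|F_T\|_F},
\]
```

and the claimed acceleration comes from evaluating the numerator at every target position at once with FFT-based correlation, with integral images supplying the per-window normalizing denominators, rather than sliding a window explicitly.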

11.
Sci Rep ; 5: 12303, 2015 Jul 23.
Article in English | MEDLINE | ID: mdl-26201867

ABSTRACT

The increasing interest in nanoscience in many research fields like physics, chemistry, and biology, including the environmental fate of the produced nano-objects, requires instrumental improvements to address the sub-micrometric analysis challenges. The originality of our approach is to use both the super-resolution concept and multivariate curve resolution (MCR-ALS) algorithm in confocal Raman imaging to surmount its instrumental limits and to characterize chemical components of atmospheric aerosols at the level of the individual particles. We demonstrate the possibility to go beyond the diffraction limit with this algorithmic approach. Indeed, the spatial resolution is improved by 65% to achieve 200 nm for the considered far-field spectrophotometer. A multivariate curve resolution method is then coupled with super-resolution in order to explore the heterogeneous structure of submicron particles for describing physical and chemical processes that may occur in the atmosphere. The proposed methodology provides new tools for sub-micron characterization of heterogeneous samples using far-field (i.e. conventional) Raman imaging spectrometer.

12.
IEEE Trans Image Process ; 23(12): 5136-51, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25312932

ABSTRACT

Any image can be represented as a function defined on a weighted graph, in which the underlying structure of the image is encoded in kernel similarity and associated Laplacian matrices. In this paper, we develop an iterative graph-based framework for image restoration based on a new definition of the normalized graph Laplacian. We propose a cost function, which consists of a new data fidelity term and a regularization term derived from the specific definition of the normalized graph Laplacian. The normalizing coefficients used in the definition of the Laplacian and associated regularization term are obtained using fast symmetry-preserving matrix balancing. This results in some desired spectral properties for the normalized Laplacian, such as being symmetric, positive semidefinite, and returning the zero vector when applied to a constant image. Our algorithm comprises outer and inner iterations, where in each outer iteration, the similarity weights are recomputed using the previous estimate and the updated objective function is minimized using inner conjugate gradient iterations. This procedure improves the performance of the algorithm for image deblurring, where we do not have access to a good initial estimate of the underlying image. In addition, the specific form of the cost function allows us to carry out a spectral analysis of the solutions of the corresponding linear equations. Moreover, the proposed approach is general in the sense that we have shown its effectiveness for different restoration problems, including deblurring, denoising, and sharpening. Experimental results verify the effectiveness of the proposed algorithm on both synthetic and real examples.
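A minimal sketch of the symmetry-preserving matrix balancing step (a Sinkhorn-Knopp-style iteration; the paper's exact routine and stopping rule may differ) and of the resulting normalized Laplacian:

```python
import numpy as np

def symmetric_balance(K, n_iters=200, tol=1e-8):
    """Find d > 0 so that diag(d) @ K @ diag(d) has (approximately) unit row sums.

    K is a symmetric, nonnegative affinity matrix with strictly positive row sums.
    The damped square-root update is a standard way to stabilize the symmetric
    Sinkhorn iteration.
    """
    d = np.ones(K.shape[0])
    for _ in range(n_iters):
        d_new = np.sqrt(d / (K @ d))
        if np.max(np.abs(d_new - d)) < tol:
            return d_new
        d = d_new
    return d

def normalized_laplacian(K):
    d = symmetric_balance(K)
    W = np.diag(d) @ K @ np.diag(d)   # symmetric, (approximately) doubly stochastic filter
    return np.eye(K.shape[0]) - W     # symmetric, PSD, and maps a constant image to zero
```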


Subjects
Algorithms; Image Processing, Computer-Assisted/methods; Computer Simulation; Humans; Motion (Physics)
13.
IEEE Trans Image Process ; 23(10): 4460-73, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25148666

ABSTRACT

In this paper, we introduce a new image editing tool based on the spectrum of a global filter computed from image affinities. Recently, it has been shown that the global filter derived from a fully connected graph representing the image can be approximated using the Nyström extension. This filter is computed by approximating the leading eigenvectors of the filter. These orthonormal eigenfunctions are highly expressive of the coarse and fine details in the underlying image, where each eigenvector can be interpreted as one scale of a data-dependent multiscale image decomposition. In this filtering scheme, each eigenvalue can boost or suppress the corresponding signal component in each scale. Our analysis shows that the mapping of the eigenvalues by an appropriate polynomial function endows the filter with a number of important capabilities, such as edge-aware sharpening, denoising, tone manipulation, and abstraction, to name a few. Furthermore, the edits can be easily propagated across the image.
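The eigenvalue-remapping idea can be sketched in a few lines (illustrative only: here W is a small symmetric filter matrix whose leading eigenpairs are computed exactly, whereas the paper approximates them with the Nyström extension, and the sharpening map f is just one example of the polynomial remappings discussed):

```python
import numpy as np

def spectral_edit(W, image_vec, f, k=50):
    """Apply the filter with its leading eigenvalues remapped by f.

    W: (n, n) symmetric filter matrix; image_vec: flattened image of length n.
    Components outside the span of the k leading eigenvectors are discarded.
    """
    vals, vecs = np.linalg.eigh(W)            # eigenvalues in ascending order
    vals, vecs = vals[-k:], vecs[:, -k:]      # keep the k leading eigenpairs
    coeffs = vecs.T @ image_vec               # per-scale components of the image
    return vecs @ (f(vals) * coeffs)          # boost/suppress each scale, reconstruct

sharpen = lambda lam: 1.0 + 0.5 * (1.0 - lam)  # example: gently boost the detail scales
```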

14.
IEEE Trans Image Process ; 23(2): 755-68, 2014 Feb.
Article in English | MEDLINE | ID: mdl-26270916

ABSTRACT

Most existing state-of-the-art image denoising algorithms are based on exploiting similarity between a relatively modest number of patches. These patch-based methods are strictly dependent on patch matching, and their performance is hamstrung by the ability to reliably find sufficiently similar patches. As the number of patches grows, a point of diminishing returns is reached where the performance improvement due to more patches is offset by the lower likelihood of finding sufficiently close matches. The net effect is that while patch-based methods, such as BM3D, are excellent overall, they are ultimately limited in how well they can do on (larger) images with increasing complexity. In this paper, we address these shortcomings by developing a paradigm for truly global filtering where each pixel is estimated from all pixels in the image. Our objectives in this paper are two-fold. First, we give a statistical analysis of our proposed global filter, based on a spectral decomposition of its corresponding operator, and we study the effect of truncation of this spectral decomposition. Second, we derive an approximation to the spectral (principal) components using the Nyström extension. Using these, we demonstrate that this global filter can be implemented efficiently by sampling a fairly small percentage of the pixels in the image. Experiments illustrate that our strategy can effectively globalize any existing denoising filters to estimate each pixel using all pixels in the image, hence improving upon the best patch-based methods.
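The Nyström approximation referenced in the abstract can be sketched as follows (a generic version with a placeholder Gaussian kernel and a user-supplied pixel sample; the paper's sampling strategy and affinity definition may differ):

```python
import numpy as np

def rbf_affinity(X, Y, h=0.2):
    # Gaussian affinities between pixel feature vectors (placeholder kernel)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * h * h))

def nystrom_filter_eigs(features, sample_idx, k=20, kernel=rbf_affinity):
    """Approximate the k leading eigenpairs of the full n x n affinity matrix
    from a small pixel sample, without ever forming the full matrix."""
    A = kernel(features[sample_idx], features[sample_idx])  # (m, m) sampled block
    B = kernel(features[sample_idx], features)               # (m, n) sample-to-all affinities
    vals, U = np.linalg.eigh(A)
    vals, U = vals[-k:], U[:, -k:]                           # leading eigenpairs of the sampled block
    V = (B.T @ U) / vals                                     # Nystrom extension to every pixel
    return vals, V   # approximate leading eigenvalues/eigenvectors of the global filter
```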

15.
IEEE Trans Image Process ; 22(12): 4879-91, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23974627

ABSTRACT

Estimating the amount of blur in a given image is important for computer vision applications. More specifically, the spatially varying defocus point-spread-functions (PSFs) over an image reveal geometric information of the scene, and their estimate can also be used to recover an all-in-focus image. A PSF for a defocus blur can be specified by a single parameter indicating its scale. Most existing algorithms can only select an optimal blur from a finite set of candidate PSFs for each pixel. Some of those methods require a coded aperture filter inserted in the camera. In this paper, we present an algorithm estimating a defocus scale map from a single image, which is applicable to conventional cameras. This method is capable of measuring the probability of local defocus scale in the continuous domain. It also takes smoothness and color edge information into consideration to generate a coherent blur map indicating the amount of blur at each pixel. Simulated and real data experiments illustrate excellent performance and its successful applications in foreground/background segmentation.

16.
J Vis ; 13(4)2013 Mar 11.
Article in English | MEDLINE | ID: mdl-23479475

ABSTRACT

The human visual system possesses the remarkable ability to pick out salient objects in images. Even more impressive is its ability to do the very same in the presence of disturbances. In particular, the ability persists despite the presence of noise, poor weather, and other impediments to perfect vision. Meanwhile, noise can significantly degrade the accuracy of automated computational saliency detection algorithms. In this article, we set out to remedy this shortcoming. Existing computational saliency models generally assume that the given image is clean, and a fundamental and explicit treatment of saliency in noisy images is missing from the literature. Here we propose a novel and statistically sound method for estimating saliency based on a nonparametric regression framework, investigate the stability of saliency models for noisy images, and analyze how state-of-the-art computational models respond to noisy visual stimuli. The proposed model of saliency at a pixel of interest is a data-dependent weighted average of dissimilarities between a center patch around that pixel and other patches. To further improve accuracy in predicting human fixations and stability to noise, we incorporate a global and multiscale approach by extending the local analysis window to the entire input image, and even further to multiple scaled copies of the image. Our method consistently outperforms six other state-of-the-art models (Bruce & Tsotsos, 2009; Garcia-Diaz, Fdez-Vidal, Pardo, & Dosil, 2012; Goferman, Zelnik-Manor, & Tal, 2010; Hou & Zhang, 2007; Seo & Milanfar, 2009; Zhang, Tong, & Marks, 2008) in both noise-free and noisy cases.


Subjects
Models, Biological; Perceptual Masking/physiology; Reaction Time/physiology; Visual Perception/physiology; Fixation, Ocular/physiology; Humans; Regression Analysis; Sensory Thresholds/physiology
17.
IEEE Trans Image Process ; 22(4): 1470-85, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23221828

ABSTRACT

Spatial domain image filters (e.g., the bilateral filter, non-local means, and the locally adaptive regression kernel) have achieved great success in denoising. Their overall performance, however, has not generally surpassed the leading transform-domain filters (such as BM3D). One important reason is that spatial domain filters lack an efficient way to adaptively fine-tune their denoising strength, something that is relatively easy to do in transform-domain methods with shrinkage operators. In the pixel domain, the smoothing strength is usually controlled globally by, for example, tuning a regularization parameter. In this paper, we propose spatially adaptive iterative filtering (SAIF), a new strategy to control the denoising strength locally for any spatial domain method. (Saif is the Middle Eastern/Arabic word for sword; the acronym seems appropriate for what the algorithm does by precisely tuning the value of the iteration number.) This approach is capable of filtering local image content iteratively using the given base filter, and the type of iteration and the iteration number are automatically optimized with respect to estimated risk (i.e., mean-squared error). In exploiting the estimated local signal-to-noise ratio, we also present a new risk estimator that is different from the often-employed SURE method, and exceeds its performance in many cases. Experiments illustrate that our strategy can significantly relax the base algorithm's sensitivity to its tuning (smoothing) parameters, and effectively boost the performance of several existing denoising filters to generate state-of-the-art results under both simulated and practical conditions.
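For context, the "often-employed SURE method" the abstract contrasts with can be written down compactly; the sketch below is the standard Monte-Carlo SURE estimate of the risk of a black-box denoiser under white Gaussian noise (it is the baseline, not the paper's new estimator):

```python
import numpy as np

def monte_carlo_sure(y, denoise, sigma, eps=1e-3):
    """Unbiased estimate of the per-pixel MSE of denoise(y) for y = x + N(0, sigma^2)."""
    n = y.size
    fy = denoise(y)
    b = np.random.choice([-1.0, 1.0], size=y.shape)       # random probe direction
    div = np.sum(b * (denoise(y + eps * b) - fy)) / eps    # Monte-Carlo divergence estimate
    return np.sum((y - fy) ** 2) / n - sigma ** 2 + 2.0 * sigma ** 2 * div / n
```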

18.
IEEE Trans Pattern Anal Mach Intell ; 35(1): 157-70, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23154324

ABSTRACT

To correct geometric distortion and reduce space- and time-varying blur, a new approach is proposed in this paper, capable of restoring a single high-quality image from a given image sequence distorted by atmospheric turbulence. This approach reduces the space- and time-varying deblurring problem to a shift-invariant one. It first registers each frame to suppress geometric deformation through B-spline-based nonrigid registration. Next, a temporal regression process is carried out to produce an image from the registered frames, which can be viewed as being convolved with a space-invariant, near-diffraction-limited blur. Finally, a blind deconvolution algorithm is implemented to deblur the fused image, generating the final output. Experiments using real data illustrate that this approach can effectively alleviate blur and distortions, recover details of the scene, and significantly improve visual quality.


Subjects
Algorithms; Artifacts; Artificial Intelligence; Atmosphere; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods
19.
IEEE Trans Image Process ; 21(4): 1687-700, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22084050

ABSTRACT

Blind deconvolution, which comprises simultaneous blur and image estimations, is a strongly ill-posed problem. It is by now well known that if multiple images of the same scene are acquired, this multichannel (MC) blind deconvolution problem is better posed and allows blur estimation directly from the degraded images. We improve the MC idea by adding robustness to noise and stability in the case of large blurs or if the blur size is vastly overestimated. We formulate blind deconvolution as an ℓ1-regularized optimization problem and seek a solution by alternately optimizing with respect to the image and with respect to blurs. Each optimization step is converted to a constrained problem by variable splitting and then is addressed with an augmented Lagrangian method, which permits simple and fast implementation in the Fourier domain. The rapid convergence of the proposed method is illustrated on synthetically blurred data. Applicability is also demonstrated on the deconvolution of real photos taken by a digital camera.


Subjects
Algorithms; Artifacts; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity
20.
IEEE Trans Image Process ; 21(4): 1635-49, 2012 Apr.
Article in English | MEDLINE | ID: mdl-22020683

ABSTRACT

In this paper, we propose a denoising method motivated by our previous analysis of the performance bounds for image denoising. Insights from that study are used here to derive a high-performance practical denoising algorithm. We propose a patch-based Wiener filter that exploits patch redundancy for image denoising. Our framework uses both geometrically and photometrically similar patches to estimate the different filter parameters. We describe how these parameters can be accurately estimated directly from the input noisy image. Our denoising approach, designed for near-optimal performance (in the mean-squared error sense), has a sound statistical foundation that is analyzed in detail. The performance of our approach is experimentally verified on a variety of images and noise levels. The results presented here demonstrate that our proposed method is on par with or exceeds the current state of the art, both visually and quantitatively.
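At its core, the estimator is a patch-level Wiener filter; in generic notation (with the paper's contribution being how the patch mean μ, the patch covariance C, and the noise level σ² are estimated from geometrically and photometrically similar patches of the noisy image itself), each noisy patch y is restored as

```latex
\[
\hat{x} \;=\; \mu + C\,\bigl(C + \sigma^{2} I\bigr)^{-1}\,(y - \mu).
\]
```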


Subjects
Algorithms; Artifacts; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Photography/methods; Signal Processing, Computer-Assisted; Signal-To-Noise Ratio; Reproducibility of Results; Sensitivity and Specificity