Results 1 - 20 of 24
1.
Front Neurosci ; 17: 1208882, 2023.
Article in English | MEDLINE | ID: mdl-37483357

ABSTRACT

We show that classical hue cancellation experiments lead to human-like opponent curves even if the task is done by trivial (identity) artificial networks. Specifically, human-like opponent spectral sensitivities always emerge in artificial networks as long as (i) the retina converts the input radiation into any tristimulus-like representation, and (ii) the post-retinal network solves the standard hue cancellation task, e.g. the network looks for the weights of the cancelling lights so that every monochromatic stimulus plus the weighted cancelling lights matches a grey reference in the (arbitrary) color representation used by the network. In fact, the specific cancellation lights (and not the network architecture) are key to obtaining human-like curves: results show that the classical choice of the lights is the one that leads to the best (most human-like) result, and any other choice leads to progressively different spectral sensitivities. We show this in two ways: through artificial psychophysics using a range of networks with different architectures and a range of cancellation lights, and through a change-of-basis theoretical analogy of the experiments. This suggests that the opponent curves of the classical experiment are just a by-product of the front-end photoreceptors and of a very specific experimental choice, but they do not inform us about the downstream color representation. In fact, the architecture of the post-retinal network (signal recombination or internal color space) seems irrelevant for the emergence of the curves in the classical experiment. This result in artificial networks questions the conventional interpretation of the classical result in humans by Jameson and Hurvich.
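
A minimal sketch of the hue cancellation task described above, under simplifying assumptions (a fixed tristimulus representation and a plain least-squares solution; the function and variable names are illustrative, not the paper's networks): for each monochromatic stimulus, find the weights of the cancelling lights so that the stimulus plus the weighted lights matches the grey reference.

    import numpy as np

    def cancellation_weights(stimulus, cancel_lights, grey):
        """stimulus, grey: (3,) tristimulus vectors; cancel_lights: (3, n_lights).
        Solve min_w || stimulus + cancel_lights @ w - grey ||^2."""
        w, *_ = np.linalg.lstsq(cancel_lights, grey - stimulus, rcond=None)
        return w

Plotting the optimal weights as a function of the wavelength of the monochromatic stimulus would then yield the opponent-like spectral curves discussed in the abstract.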

2.
ArXiv ; 2023 Oct 05.
Article in English | MEDLINE | ID: mdl-36994156

ABSTRACT

In the 1950s, Barlow and Attneave hypothesised a link between biological vision and information maximisation. Following Shannon, information was defined using the probability of natural images. A number of physiological and psychophysical phenomena have been derived ever since from principles like info-max, efficient coding, or optimal denoising. However, it remains unclear how this link is expressed in mathematical terms from image probability. First, classical derivations relied on strong assumptions about the probability models and about the behaviour of the sensors. Moreover, the direct evaluation of the hypothesis was limited by the inability of the classical image models to deliver accurate estimates of the probability. In this work we directly evaluate image probabilities using an advanced generative model for natural images, and we analyse how probability-related factors can be combined to predict human perception via the sensitivity of state-of-the-art subjective image quality metrics. We use information theory and regression analysis to find a combination of just two probability-related factors that achieves 0.8 correlation with subjective metrics. This probability-based sensitivity is psychophysically validated by reproducing the basic trends of the Contrast Sensitivity Function and its suprathreshold variation, as well as the trends of the Weber law and masking.
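
A hedged sketch of the regression step mentioned above: combine two probability-related factors to predict the sensitivity given by a subjective quality metric. The choice of factors (log-probability of the original and of the distorted image) and the linear model are illustrative assumptions, not the paper's selected combination.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    def fit_sensitivity(logp_original, logp_distorted, metric_sensitivity):
        """All inputs are 1-D arrays over a set of image/distortion pairs."""
        factors = np.column_stack([logp_original, logp_distorted])
        reg = LinearRegression().fit(factors, metric_sensitivity)
        r = np.corrcoef(reg.predict(factors), metric_sensitivity)[0, 1]
        return reg, r   # fitted model and Pearson correlation with the metric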

3.
Entropy (Basel) ; 24(12)2022 Nov 25.
Article in English | MEDLINE | ID: mdl-36554129

ABSTRACT

Recent studies proposed the use of Total Correlation to describe functional connectivity among brain regions as a multivariate alternative to conventional pairwise measures such as correlation or mutual information. In this work, we build on this idea to infer a large-scale (whole-brain) connectivity network based on Total Correlation and show the possibility of using this kind of network as a biomarker of brain alterations. In particular, this work uses Correlation Explanation (CorEx) to estimate Total Correlation. First, we verify that CorEx estimates of Total Correlation and the associated clustering results are reliable when compared to ground-truth values. Second, the large-scale connectivity network inferred from extensive open fMRI datasets is consistent with existing neuroscience studies but, interestingly, can capture additional relations beyond pairwise interactions between regions. Finally, we show how connectivity graphs based on Total Correlation can also be an effective tool to aid in the discovery of brain diseases.
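
Total Correlation is the multivariate redundancy measure behind the connectivity networks above: TC(X) = sum_i H(X_i) - H(X). Under a Gaussian assumption it has a closed form, which gives a quick sanity check for estimators such as CorEx (a minimal sketch; the paper itself applies CorEx to fMRI time series):

    import numpy as np

    def total_correlation_gaussian(X):
        """X: (n_samples, n_regions). Total Correlation in nats under a Gaussian model:
        0.5 * (sum of log marginal variances - log determinant of the covariance)."""
        cov = np.cov(X, rowvar=False)
        return 0.5 * (np.sum(np.log(np.diag(cov))) - np.linalg.slogdet(cov)[1])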

4.
J Vis ; 22(8): 2, 2022 07 11.
Article in English | MEDLINE | ID: mdl-35833884

ABSTRACT

Visual illusions expand our understanding of the visual system by imposing constraints on models in two different ways: (i) illusions experienced by humans should induce equivalent illusions in the model, and (ii) illusions synthesized from the model should also be compelling for human viewers. These constraints are alternative strategies for finding good vision models. Following the first strategy, recent studies have shown that artificial neural network architectures also have human-like illusory percepts when stimulated with classical hand-crafted stimuli designed to fool humans. In this work we focus on the second (less explored) strategy: we propose a framework to synthesize new visual illusions using the optimization abilities of current automatic differentiation techniques. The proposed framework can be used with classical vision models as well as with more recent artificial neural network architectures. This framework, validated by psychophysical experiments, can be used to study the difference between a vision model and actual human perception and to optimize the vision model to decrease this difference.
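
A hedged sketch of the synthesis-by-optimization idea above, assuming a differentiable vision model (here called model, a hypothetical stand-in) that maps an image to a perceived image: keep a test patch physically fixed and let gradient ascent modify only its surround, so that the model's percept of the patch drifts away from its physical value, i.e. an induced illusion.

    import torch

    def synthesize_illusion(model, image, patch_mask, steps=200, lr=1e-2):
        """image: (1, C, H, W) tensor; patch_mask: same shape, 1 inside the test patch."""
        image = image.clone().requires_grad_(True)
        optimizer = torch.optim.Adam([image], lr=lr)
        target = image.detach() * patch_mask            # physical value of the patch
        for _ in range(steps):
            optimizer.zero_grad()
            percept = model(image) * patch_mask         # model's response at the patch
            loss = -((percept - target) ** 2).mean()    # maximize the perceived deviation
            loss.backward()
            image.grad.mul_(1 - patch_mask)             # only the surround may change
            optimizer.step()
        return image.detach()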


Subject(s)
Illusions; Hand; Humans; Vision, Ocular; Visual Perception
5.
J Vis ; 22(6): 8, 2022 05 03.
Article in English | MEDLINE | ID: mdl-35587354

ABSTRACT

Three decades ago, Atick et al. suggested that human frequency sensitivity may emerge from the enhancement required for a more efficient analysis of retinal images. Here we reassess the relevance of low-level vision tasks in the explanation of the contrast sensitivity functions (CSFs) in light of 1) the current trend of using artificial neural networks for studying vision, and 2) the current knowledge of retinal image representations. As a first contribution, we show that a very popular type of convolutional neural network (CNN), the autoencoder, may develop human-like CSFs in the spatiotemporal and chromatic dimensions when trained to perform some basic low-level vision tasks (like retinal noise and optical blur removal), but not others (like chromatic adaptation or pure reconstruction after simple bottlenecks). As an illustrative example, the best CNN (in the considered set of simple architectures for enhancement of the retinal signal) reproduces the CSFs with a root mean square error of 11% of the maximum sensitivity. As a second contribution, we provide experimental evidence that, for some functional goals (at low abstraction level), deeper CNNs that are better at reaching the quantitative goal are actually worse at replicating human-like phenomena (such as the CSFs). This low-level result (for the explored networks) is not necessarily in contradiction with other works that report advantages of deeper nets in modeling higher-level vision goals. However, in line with a growing body of literature, our results suggest another word of caution about CNNs in vision science, because the use of simplified units or unrealistic architectures in goal optimization may be a limitation for the modeling and understanding of human vision.
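
A hedged sketch of the kind of low-level task mentioned above: a small convolutional autoencoder trained to remove retinal-like noise from images. The architecture and the noise level are illustrative assumptions, not the networks evaluated in the paper.

    import torch
    import torch.nn as nn

    class TinyAutoencoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 3, 3, padding=1))

        def forward(self, x):
            return self.net(x)

    def train_step(model, optimizer, clean_batch):
        noisy = clean_batch + 0.05 * torch.randn_like(clean_batch)  # assumed noise level
        loss = ((model(noisy) - clean_batch) ** 2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

The CSF of such a model can then be read out by probing the trained network with sinusoidal gratings of varying frequency and contrast.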


Subject(s)
Contrast Sensitivity; Neural Networks, Computer; Humans; Retina; Vision, Ocular
6.
Entropy (Basel) ; 24(10)2022 Oct 10.
Article in English | MEDLINE | ID: mdl-37420462

ABSTRACT

Biological neural networks for color vision (also known as color appearance models) consist of a cascade of linear + nonlinear layers that modify the linear measurements at the retinal photoreceptors, leading to an internal (nonlinear) representation of color that correlates with psychophysical experience. The basic layers of these networks include: (1) chromatic adaptation (normalization of the mean and covariance of the color manifold); (2) change to opponent color channels (a PCA-like rotation in the color space); and (3) saturating nonlinearities to obtain perceptually Euclidean color representations (similar to dimension-wise equalization). The Efficient Coding Hypothesis argues that these transforms should emerge from information-theoretic goals. If this hypothesis holds in color vision, the question is: what is the coding gain due to the different layers of the color appearance networks? In this work, a representative family of color appearance models is analyzed in terms of how the redundancy among the chromatic components is modified along the network and how much information is transferred from the input data to the noisy response. The proposed analysis is performed using data and methods that were not available before: (1) new colorimetrically calibrated scenes under different CIE illuminations for the proper evaluation of chromatic adaptation; and (2) new statistical tools to estimate (multivariate) information-theoretic quantities between multidimensional sets based on Gaussianization. The results confirm that the efficient coding hypothesis holds for current color vision models and identify the psychophysical mechanisms critically responsible for gains in information transfer: opponent channels and their nonlinear nature are more important than chromatic adaptation at the retina.
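
A minimal numpy sketch of the three canonical layers listed above, applied to cone-like (LMS) samples. It is illustrative only: the centering, PCA rotation and tanh saturation stand in for the (more elaborate) chromatic adaptation, opponent and nonlinear stages of actual color appearance models.

    import numpy as np

    def color_appearance_sketch(lms):
        """lms: (n_samples, 3) cone-like responses."""
        x = lms - lms.mean(axis=0)                       # (1) adaptation: discount the mean
        _, eigvec = np.linalg.eigh(np.cov(x, rowvar=False))
        x = x @ eigvec                                   # (2) opponent-like PCA rotation
        x = np.tanh(x / x.std(axis=0))                   # (3) saturating, equalizing nonlinearity
        return x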

7.
J Math Neurosci ; 10(1): 18, 2020 Nov 11.
Article in English | MEDLINE | ID: mdl-33175257

ABSTRACT

How much visual information about the retinal images can be extracted from the different layers of the visual pathway? This question depends on the complexity of the visual input, the set of transforms applied to this multivariate input, and the noise of the sensors in the considered layer. Separate subsystems (e.g. opponent channels, spatial filters, nonlinearities of the texture sensors) have been suggested to be organized for optimal information transmission. However, the efficiency of these different layers has not been measured when they operate together on colorimetrically calibrated natural images and using multivariate information-theoretic units over the joint spatio-chromatic array of responses. In this work, we present a statistical tool to address this question in an appropriate (multivariate) way. Specifically, we propose an empirical estimate of the information transmitted by the system based on a recent Gaussianization technique. The total correlation measured using the proposed estimator is consistent with predictions based on the analytical Jacobian of a standard spatio-chromatic model of the retina-cortex pathway. If the noise at a certain representation is proportional to the dynamic range of the response, and one assumes sensors of equivalent noise level, then the transmitted information shows the following trends: (1) progressively deeper representations are better in terms of the amount of captured information, (2) the transmitted information up to the cortical representation follows the probability of natural scenes over the chromatic and achromatic dimensions of the stimulus space, (3) the contribution of spatial transforms to capturing visual information is substantially greater than the contribution of chromatic transforms, and (4) nonlinearities of the responses contribute substantially to the transmitted information, but less than the linear transforms.
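
One way to make the question above operational, consistent with the Gaussianization-based estimator described in the abstract, is the standard identity that reduces the mutual information between two multidimensional sets to three total correlation estimates:

    I(\mathbf{x};\mathbf{y}) \;=\; T([\mathbf{x},\mathbf{y}]) \;-\; T(\mathbf{x}) \;-\; T(\mathbf{y}),
    \qquad
    T(\mathbf{z}) \;=\; \sum_i H(z_i) \;-\; H(\mathbf{z})

Each total correlation T(.) can be estimated with Gaussianization-type transforms that only involve univariate operations, which is what makes the multivariate estimate tractable.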

8.
Sci Rep ; 10(1): 16277, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33004868

ABSTRACT

The responses of visual neurons, as well as visual perception phenomena in general, are highly nonlinear functions of the visual input, while most vision models are grounded on the notion of a linear receptive field (RF). The linear RF has a number of inherent problems: it changes with the input, it presupposes a set of basis functions for the visual system, and it conflicts with recent studies on dendritic computations. Here we propose to model the RF in a nonlinear manner, introducing the intrinsically nonlinear receptive field (INRF). Apart from being more physiologically plausible and embodying the efficient representation principle, the INRF has a key property of wide-ranging implications: for several vision science phenomena where a linear RF must vary with the input in order to predict responses, the INRF can remain constant under different stimuli. We also prove that Artificial Neural Networks with INRF modules instead of linear filters have a remarkably improved performance and better emulate basic human perception. Our results suggest a change of paradigm for vision science as well as for artificial intelligence.
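
A heavily hedged sketch of a receptive field that is nonlinear in the spirit described above: the surround is passed through a pointwise nonlinearity before being summed, unlike a linear RF. The kernels m, w, g, the mixing constant and the nonlinearity are illustrative assumptions, not the paper's fitted INRF formulation.

    import numpy as np

    def nonlinear_rf(x, m, w, g, lam=1.0, sigma=np.tanh):
        """x: (n,) input; m, w, g: (n, n) weight matrices."""
        linear_part = m @ x
        nonlinear_part = w @ sigma(x - g @ x)   # nonlinearity applied inside the summation
        return linear_part - lam * nonlinear_part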


Subject(s)
Visual Perception; Animals; Artificial Intelligence; Humans; Models, Biological; Neural Networks, Computer; Nonlinear Dynamics; Retinal Neurons/physiology; Vision, Ocular/physiology; Visual Perception/physiology
9.
J Neurophysiol ; 123(6): 2249-2268, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32159407

ABSTRACT

In this paper, we study the communication efficiency of a psychophysically tuned cascade of Wilson-Cowan and divisive normalization layers that simulate the retina-V1 pathway. This is the first analysis of Wilson-Cowan networks in terms of multivariate total correlation. The parameters of the cortical model have been derived through the relation between the steady state of the Wilson-Cowan model and the divisive normalization model. The communication efficiency has been analyzed in two ways: First, we provide an analytical expression for the reduction of the total correlation among the responses of a V1-like population after the application of the Wilson-Cowan interaction. Second, we empirically study the efficiency with visual stimuli and statistical tools that were not available before: 1) we use a recent, radiometrically calibrated, set of natural scenes, and 2) we use a recent technique to estimate the multivariate total correlation in bits from sets of visual responses, which only involves univariate operations, thus giving better estimates of the redundancy. The theoretical and the empirical results show that, although this cascade of layers was not optimized for statistical independence in any way, the redundancy between the responses gets substantially reduced along the neural pathway. Specifically, we show that 1) the efficiency of a Wilson-Cowan network is similar to that of its equivalent divisive normalization model; 2) while initial layers (Von Kries adaptation and Weber-like brightness) contribute to univariate equalization, the bigger contributions to the reduction in total correlation come from the computation of nonlinear local contrast and the application of local oriented filters; and 3) psychophysically tuned models are more efficient (reduce more total correlation) in the more populated regions of the luminance-contrast plane. These results are an alternative confirmation of the efficient coding hypothesis for Wilson-Cowan systems and, from an applied perspective, they suggest that neural field models could be an option in image coding to perform image compression. NEW & NOTEWORTHY: The Wilson-Cowan interaction is analyzed in total correlation terms for the first time. Theoretical and empirical results show that this psychophysically tuned interaction achieves the greatest efficiency in the most frequent region of the image space. This is an original confirmation of the efficient coding hypothesis and suggests that neural field models can be an alternative to divisive normalization in image compression.
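
A hedged sketch of Wilson-Cowan-like dynamics in the generic form dx/dt = e - alpha*x - W*f(x); the decay, interaction kernel and saturation here are illustrative assumptions, not the psychophysically tuned parameters of the paper.

    import numpy as np

    def wilson_cowan_steady_state(drive, W, alpha=1.0, dt=0.1, steps=1000, f=np.tanh):
        """Integrate dx/dt = drive - alpha*x - W @ f(x) to an approximate steady state
        (assumes parameters for which the dynamics converge)."""
        x = np.zeros_like(drive)
        for _ in range(steps):
            x = x + dt * (drive - alpha * x - W @ f(x))
        return x

The steady state of such a system is what the abstract relates to the divisive normalization model in order to set the cortical parameters.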


Subject(s)
Models, Biological; Nerve Net/physiology; Neural Networks, Computer; Retina/physiology; Visual Cortex/physiology; Visual Pathways/physiology; Visual Perception/physiology; Humans
10.
Front Neurosci ; 13: 8, 2019.
Article in English | MEDLINE | ID: mdl-30894796

ABSTRACT

Subjective image quality databases are a major source of raw data on how the visual system works in naturalistic environments. These databases describe the sensitivity of many observers to a wide range of distortions of different nature and intensity seen on top of a variety of natural images. Data of this kind seems to open a number of possibilities for the vision scientist to check the models in realistic scenarios. However, while these natural databases are great benchmarks for models developed in some other way (e.g., by using the well-controlled artificial stimuli of traditional psychophysics), they should be carefully used when trying to fit vision models. Given the high dimensionality of the image space, it is very likely that some basic phenomena are under-represented in the database. Therefore, a model fitted on these large-scale natural databases will not reproduce these under-represented basic phenomena that could otherwise be easily illustrated with well selected artificial stimuli. In this work we study a specific example of the above statement. A standard cortical model using wavelets and divisive normalization tuned to reproduce subjective opinion on a large image quality dataset fails to reproduce basic cross-masking. Here we outline a solution for this problem by using artificial stimuli and by proposing a modification that makes the model easier to tune. Then, we show that the modified model is still competitive in the large-scale database. Our simulations with these artificial stimuli show that when using steerable wavelets, the conventional unit norm Gaussian kernels in divisive normalization should be multiplied by high-pass filters to reproduce basic trends in masking. Basic visual phenomena may be misrepresented in large natural image datasets but this can be solved with model-interpretable stimuli. This is an additional argument in praise of artifice in line with Rust and Movshon (2005).

11.
PLoS One ; 12(6): e0178345, 2017.
Article in English | MEDLINE | ID: mdl-28640816

ABSTRACT

Neurons at primary visual cortex (V1) in humans and other species are edge filters organized in orientation maps. In these maps, neurons with similar orientation preference are clustered together in iso-orientation domains. These maps have two fundamental properties: (1) retinotopy, i.e. correspondence between displacements at the image space and displacements at the cortical surface, and (2) a trade-off between good coverage of the visual field with all orientations and continuity of iso-orientation domains in the cortical space. There is an active debate on the origin of these locally continuous maps. While most of the existing descriptions take purely geometric/mechanistic approaches which disregard the network function, a clear exception to this trend in the literature is the original approach of Hyvärinen and Hoyer based on infomax and Topographic Independent Component Analysis (TICA). Although TICA successfully addresses a number of other properties of V1 simple and complex cells, in this work we question the validity of the orientation maps obtained from TICA. We argue that the maps predicted by TICA can be analyzed in the retinal space, and when doing so, it is apparent that they lack the required continuity and retinotopy. Here we show that in the orientation maps reported in the TICA literature it is easy to find examples of violation of the continuity between similarly tuned mechanisms in the retinal space, which suggest a random scrambling incompatible with the maps in primates. The new experiments in the retinal space presented here confirm this guess: TICA basis vectors actually follow a random salt-and-pepper organization back in the image space. Therefore, the interesting clusters found in the TICA topology cannot be interpreted as the actual cortical orientation maps found in cats, primates or humans. In conclusion, Topographic ICA does not reproduce cortical orientation maps.


Subject(s)
Brain Mapping; Orientation; Statistics as Topic; Visual Cortex/physiology; Animals; Humans; Retina/physiology
12.
Front Hum Neurosci ; 9: 557, 2015.
Article in English | MEDLINE | ID: mdl-26528165

ABSTRACT

When adapted to a particular scene, our senses may fool us: colors are misinterpreted, certain spatial patterns seem to fade out, and static objects appear to move in reverse. A mere empirical description of the mechanisms tuned to color, texture, and motion may tell us where these visual illusions come from. However, such empirical models of gain control do not explain why these mechanisms work in this apparently dysfunctional manner. Current normative explanations of aftereffects based on scene statistics derive gain changes by (1) invoking decorrelation and linear manifold matching/equalization, or (2) using nonlinear divisive normalization obtained from parametric scene models. These principled approaches have different drawbacks: the first is not compatible with the known saturation nonlinearities in the sensors, and it cannot fully accomplish information maximization due to its linear nature. In the second, the gain change is almost determined a priori by the assumed parametric image model linked to divisive normalization. In this study we show that both the response changes that lead to aftereffects and the nonlinear behavior can be simultaneously derived from a single statistical framework: the Sequential Principal Curves Analysis (SPCA). As opposed to mechanistic models, SPCA is not intended to describe how physiological sensors work, but it is focused on explaining why they behave as they do. Nonparametric SPCA has two key advantages as a normative model of adaptation: (i) it is better than linear techniques as it is a flexible equalization that can be tuned for more sensible criteria than plain decorrelation (either full information maximization or error minimization); and (ii) it makes no a priori functional assumption regarding the nonlinearity, so the saturations emerge directly from the scene data and the goal (and not from the assumed function). It turns out that the optimal responses derived from these more sensible criteria and SPCA are consistent with dysfunctional behaviors such as aftereffects.

13.
Int J Neural Syst ; 24(7): 1440007, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25164247

ABSTRACT

This paper presents a new framework for manifold learning based on a sequence of principal polynomials that capture the possibly nonlinear nature of the data. The proposed Principal Polynomial Analysis (PPA) generalizes PCA by modeling the directions of maximal variance by means of curves, instead of straight lines. Contrary to previous approaches, PPA reduces to performing simple univariate regressions, which makes it computationally feasible and robust. Moreover, PPA shows a number of interesting analytical properties. First, PPA is a volume-preserving map, which in turn guarantees the existence of the inverse. Second, such an inverse can be obtained in closed form. Invertibility is an important advantage over other learning methods, because it makes it possible to understand the identified features in the input domain, where the data have physical meaning. Moreover, it allows evaluating the performance of dimensionality reduction in sensible (input-domain) units. Volume preservation also allows an easy computation of information-theoretic quantities, such as the reduction in multi-information after the transform. Third, the analytical nature of PPA leads to a clear geometrical interpretation of the manifold: it allows the computation of Frenet-Serret frames (local features) and of generalized curvatures at any point of the space. And fourth, the analytical Jacobian allows the computation of the metric induced by the data, thus generalizing the Mahalanobis distance. These properties are demonstrated theoretically and illustrated experimentally. The performance of PPA is evaluated in dimensionality and redundancy reduction, on both synthetic and real datasets from the UCI repository.
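
A hedged sketch of a single PPA-style step consistent with the description above: take a leading principal direction, fit a univariate polynomial regression that predicts the orthogonal residual from the projection, and remove it. The polynomial degree and the choice of the leading direction are illustrative assumptions.

    import numpy as np

    def ppa_step(X, degree=3):
        """X: (n_samples, n_dims). Returns the curvilinear coordinate and the deflated residual."""
        X = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        t = X @ Vt[0]                          # projection on the leading direction
        R = X - np.outer(t, Vt[0])             # orthogonal residual
        B = np.vander(t, degree + 1)           # univariate polynomial features of t
        coef, *_ = np.linalg.lstsq(B, R, rcond=None)
        return t, R - B @ coef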


Subject(s)
Artificial Intelligence; Models, Statistical; Algorithms; Neural Networks, Computer; Nonlinear Dynamics; Principal Component Analysis/methods; Regression Analysis
14.
PLoS One ; 9(2): e86481, 2014.
Article in English | MEDLINE | ID: mdl-24533049

ABSTRACT

Independent component and canonical correlation analysis are two general-purpose statistical methods with wide applicability. In neuroscience, independent component analysis of chromatic natural images explains the spatio-chromatic structure of primary cortical receptive fields in terms of properties of the visual environment. Canonical correlation analysis similarly explains chromatic adaptation to different illuminations. But, as we show in this paper, neither of the two methods generalizes well to explain both spatio-chromatic processing and adaptation at the same time. We propose a statistical method which combines the desirable properties of independent component and canonical correlation analysis: it finds independent components in each data set which, across the two data sets, are related to each other via linear or higher-order correlations. The new method is as widely applicable as canonical correlation analysis, and it also applies to more than two data sets. We call it higher-order canonical correlation analysis. When applied to chromatic natural images, we found that it provides a single (unified) statistical framework which accounts for both spatio-chromatic processing and adaptation. Filters with spatio-chromatic tuning properties as in the primary visual cortex emerged, and corresponding-colors psychophysics was reproduced reasonably well. We used the new method to make a theory-driven, testable prediction on how the neural response to colored patterns should change when the illumination changes. We predict shifts in the responses which are comparable to the shifts reported for chromatic contrast habituation.


Subject(s)
Color; Image Processing, Computer-Assisted; Models, Statistical; Algorithms; Artificial Intelligence; Color Perception/physiology; Computer Simulation; Humans; Light; Neurosciences/methods; Photic Stimulation/methods; Probability; Psychophysics; Visual Cortex/physiology
15.
Neural Comput ; 24(10): 2751-88, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22845821

ABSTRACT

Mechanisms of human color vision are characterized by two phenomenological aspects: the system is nonlinear and adaptive to changing environments. Conventional attempts to derive these features from statistics use separate arguments for each aspect. The few statistical explanations that do consider both phenomena simultaneously follow parametric formulations based on empirical models. Therefore, it may be argued that the behavior does not come directly from the color statistics but from the convenient functional form adopted. In addition, the whole statistical analysis is often based on simplified databases that disregard relevant physical effects in the input signal, for instance by assuming flat Lambertian surfaces. In this work, we address the simultaneous statistical explanation of the nonlinear behavior of achromatic and chromatic mechanisms in a fixed adaptation state and the change of such behavior (i.e., adaptation) under the change of observation conditions. Both phenomena emerge directly from the samples through a single data-driven method: the sequential principal curves analysis (SPCA) with local metric. SPCA is a new manifold learning technique to derive a set of sensors adapted to the manifold using different optimality criteria. Here sequential refers to the fact that sensors (curvilinear dimensions) are designed one after the other, and not to the particular (possibly iterative) method to draw a single principal curve. Moreover, in order to reproduce the empirical adaptation reported under D65 and A illuminations, a new database of colorimetrically calibrated images of natural objects under these illuminants was gathered, thus overcoming the limitations of available databases. The results obtained by applying SPCA show that the psychophysical behavior on color discrimination thresholds, discount of the illuminant, and corresponding pairs in asymmetric color matching emerge directly from realistic data regularities, assuming no a priori functional form. These results provide stronger evidence for the hypothesis of a statistically driven organization of color sensors. Moreover, the obtained results suggest that the nonuniform resolution of color sensors at this low abstraction level may be guided by an error-minimization strategy rather than by an information-maximization goal.


Subject(s)
Adaptation, Physiological; Color Perception/physiology; Color Vision/physiology; Models, Biological; Nonlinear Dynamics; Computer Simulation; Humans; Learning; Photic Stimulation; Principal Component Analysis; Psychophysics
16.
IEEE Trans Neural Netw ; 22(4): 537-49, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21349790

ABSTRACT

Most signal processing problems involve the challenging task of multidimensional probability density function (PDF) estimation. In this paper, we propose a solution to this problem by using a family of rotation-based iterative Gaussianization (RBIG) transforms. The general framework consists of the sequential application of a univariate marginal Gaussianization transform followed by an orthonormal transform. The proposed procedure looks for differentiable transforms to a known PDF so that the unknown PDF can be estimated at any point of the original domain. In particular, we aim at a zero-mean unit-covariance Gaussian for convenience. RBIG is formally similar to classical iterative projection pursuit (PP) algorithms. However, we show that, unlike in PP methods, the particular class of rotations used has no special qualitative relevance in this context, since looking for interestingness is not a critical issue for PDF estimation. The key difference is that our approach focuses on the univariate part (marginal Gaussianization) of the problem rather than on the multivariate part (rotation). This difference implies that one may select the most convenient rotation suited to each practical application. The differentiability, invertibility, and convergence of RBIG are theoretically and experimentally analyzed. Relation to other methods, such as radial Gaussianization, one-class support vector domain description, and deep neural networks, is also pointed out. The practical performance of RBIG is successfully illustrated in a number of multidimensional problems such as image synthesis, classification, denoising, and multi-information estimation.
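
A minimal sketch of one RBIG-style iteration as described above: marginal Gaussianization through the empirical CDF followed by an orthonormal rotation (here the PCA axes; the paper argues that the particular choice of rotation is not critical).

    import numpy as np
    from scipy.stats import norm

    def marginal_gaussianization(X):
        """Map each column of X to an approximately standard Gaussian via its empirical CDF."""
        ranks = np.argsort(np.argsort(X, axis=0), axis=0)
        return norm.ppf((ranks + 0.5) / X.shape[0])

    def rbig_iteration(X):
        G = marginal_gaussianization(X)
        _, _, Vt = np.linalg.svd(G - G.mean(axis=0), full_matrices=False)
        return G @ Vt.T                        # orthonormal rotation

Iterating these two steps drives the data toward a multivariate Gaussian; the PDF in the original domain then follows from the change-of-variables formula applied to the accumulated (differentiable) transform.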


Subject(s)
Neural Networks, Computer; Normal Distribution; Principal Component Analysis; Algorithms; Computer Simulation; Humans; Rotation; Wavelet Analysis
17.
Neural Comput ; 22(12): 3179-206, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20858127

ABSTRACT

The conventional approach in computational neuroscience in favor of the efficient coding hypothesis goes from image statistics to perception. It has been argued that the behavior of the early stages of biological visual processing (e.g., spatial frequency analyzers and their nonlinearities) may be obtained from image samples and the efficient coding hypothesis using no psychophysical or physiological information. In this work we address the same issue in the opposite direction: from perception to image statistics. We show that psychophysically fitted image representation in V1 has appealing statistical properties, for example, approximate PDF factorization and substantial mutual information reduction, even though no statistical information is used to fit the V1 model. These results are complementary evidence in favor of the efficient coding hypothesis.


Subject(s)
Models, Neurological; Neurons/physiology; Visual Cortex/physiology; Visual Perception/physiology
18.
J Opt Soc Am A Opt Image Sci Vis ; 27(4): 852-64, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20360827

ABSTRACT

Structural similarity metrics and information-theory-based metrics have been proposed as completely different alternatives to the traditional metrics based on error visibility and human vision models. Three basic criticisms were raised against the traditional error visibility approach: (1) it is based on near-threshold performance, (2) its geometric meaning may be limited, and (3) stationary pooling strategies may not be statistically justified. These criticisms and the good performance of structural and information-theory-based metrics have popularized the idea of their superiority over the error visibility approach. In this work we experimentally or analytically show that the above criticisms do not apply to error visibility metrics that use a general enough divisive normalization masking model. Therefore, the traditional divisive normalization metric is not intrinsically inferior to the newer approaches. In fact, experiments on a number of databases including a wide range of distortions show that divisive normalization is fairly competitive with the newer approaches, robust, and easy to interpret in linear terms. These results suggest that, despite the criticisms of the traditional error visibility approach, divisive normalization masking models should be considered in the image quality discussion.

19.
Network ; 17(1): 85-102, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16613796

ABSTRACT

It has been argued that the aim of non-linearities in different visual and auditory mechanisms may be to remove the relations between the coefficients of the signal after global linear ICA-like stages. Specifically, in Schwartz and Simoncelli (2001), it was shown that masking effects are reproduced by fitting the parameters of a particular non-linearity in order to remove the dependencies between the energy of wavelet coefficients. In this work, we present a different result that supports the same efficient encoding hypothesis. However, this result is more general because, instead of assuming any specific functional form for the non-linearity, we show that by using an unconstrained approach, masking-like behavior emerges directly from natural images. This result is an additional indication that Barlow's efficient encoding hypothesis may explain not only the shape of receptive fields of V1 sensors but also their non-linear behavior.


Subject(s)
Models, Neurological; Neural Networks, Computer; Nonlinear Dynamics; Visual Cortex/physiology; Visual Fields/physiology; Visual Perception/physiology; Humans; Photic Stimulation/methods
20.
IEEE Trans Image Process ; 15(1): 68-80, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16435537

ABSTRACT

Image compression systems commonly operate by transforming the input signal into a new representation whose elements are independently quantized. The success of such a system depends on two properties of the representation. First, the coding rate is minimized only if the elements of the representation are statistically independent. Second, the perceived coding distortion is minimized only if the errors in a reconstructed image arising from quantization of the different elements of the representation are perceptually independent. We argue that linear transforms cannot achieve either of these goals and propose, instead, an adaptive nonlinear image representation in which each coefficient of a linear transform is divided by a weighted sum of coefficient amplitudes in a generalized neighborhood. We then show that the divisive operation greatly reduces both the statistical and the perceptual redundancy amongst representation elements. We develop an efficient method of inverting this transformation, and we demonstrate through simulations that the dual reduction in dependency can greatly improve the visual quality of compressed images.
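
A minimal sketch of the divisive operation described above: each coefficient of the linear transform is divided by a weighted sum of neighbouring coefficient amplitudes. The exponent, semisaturation constant and weights are illustrative assumptions, not the values used in the paper.

    import numpy as np

    def divisive_normalization(c, W, b=0.1, g=2.0):
        """c: (n,) linear transform coefficients; W: (n, n) neighbourhood weights."""
        a = np.abs(c) ** g
        return np.sign(c) * a / (b + W @ a)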


Subject(s)
Algorithms; Data Compression/methods; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Signal Processing, Computer-Assisted; Computer Graphics; Computer Simulation; Models, Statistical; Nonlinear Dynamics; Numerical Analysis, Computer-Assisted