Results 1 - 13 of 13
1.
IEEE Trans Pattern Anal Mach Intell ; 46(4): 2041-2053, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38039177

ABSTRACT

Converging evidence indicates that deep neural network models that are trained on large datasets are biased toward color and texture information. Humans, on the other hand, can easily recognize objects and scenes from images as well as from bounding contours. Mid-level vision is characterized by the recombination and organization of simple primary features into more complex ones by a set of so-called Gestalt grouping rules. While described qualitatively in the human literature, a computational implementation of these perceptual grouping rules is so far missing. In this article, we contribute a novel set of algorithms for the detection of contour-based cues in complex scenes. We use the medial axis transform (MAT) to locally score contours according to these grouping rules. We demonstrate the benefit of these cues for scene categorization in two ways: (i) Both human observers and CNN models categorize scenes most accurately when perceptual grouping information is emphasized. (ii) Weighting the contours with these measures boosts performance of a CNN model significantly compared to the use of unweighted contours. Our work suggests that, even though these measures are computed directly from contours in the image, current CNN models do not appear to extract or utilize these grouping cues.

2.
PLoS One ; 17(1): e0260266, 2022.
Article in English | MEDLINE | ID: mdl-35061699

ABSTRACT

Human observers can rapidly perceive complex real-world scenes. Grouping visual elements into meaningful units is an integral part of this process. Yet, so far, the neural underpinnings of perceptual grouping have only been studied with simple lab stimuli. We here uncover the neural mechanisms of one important perceptual grouping cue, local parallelism. Using a new, image-computable algorithm for detecting local symmetry in line drawings and photographs, we manipulated the local parallelism content of real-world scenes. We decoded scene categories from patterns of brain activity obtained via functional magnetic resonance imaging (fMRI) in 38 human observers while they viewed the manipulated scenes. Decoding was significantly more accurate for scenes containing strong local parallelism compared to weak local parallelism in the parahippocampal place area (PPA), indicating a central role of parallelism in scene perception. To investigate the origin of the parallelism signal we performed a model-based fMRI analysis of the public BOLD5000 dataset, looking for voxels whose activation time course matches that of the locally parallel content of the 4916 photographs viewed by the participants in the experiment. We found a strong relationship with average local symmetry in visual areas V1-4, PPA, and retrosplenial cortex (RSC). Notably, the parallelism-related signal peaked first in V4, suggesting V4 as the site for extracting parallelism from the visual input. We conclude that local parallelism is a perceptual grouping cue that influences neuronal activity throughout the visual hierarchy, presumably starting at V4. Parallelism plays a key role in the representation of scene categories in PPA.


Subject(s)
Brain Mapping
3.
PLoS One ; 16(11): e0258376, 2021.
Article in English | MEDLINE | ID: mdl-34748556

ABSTRACT

We often take people's ability to understand and produce line drawings for granted. But where should we draw lines, and why? We address psychological principles that underlie efficient representations of complex information in line drawings. First, 58 participants with varying degrees of artistic experience produced multiple drawings of a small set of scenes by tracing contours on a digital tablet. Second, 37 independent observers ranked the drawings by how representative they are of the original photograph. Matching contours between drawings of the same scene revealed that the most consistently drawn contours tend to be drawn earlier. We generated half-images with the most- versus least-consistently drawn contours and asked 25 observers to categorize the quickly presented scenes. Observers performed significantly better for the most compared to the least consistent half-images. The most consistently drawn contours were more likely to depict occlusion boundaries, whereas the least consistently drawn contours frequently depicted surface normals.


Subject(s)
Form Perception/physiology , Pattern Recognition, Visual/physiology , Vision, Ocular/physiology , Visual Perception/physiology , Adolescent , Adult , Computer Graphics , Female , Humans , Linear Models , Male , Middle Aged , Photography , Psychophysics/standards , Young Adult
4.
Vision (Basel) ; 4(2), 2020 May 26.
Article in English | MEDLINE | ID: mdl-32466442

ABSTRACT

The early visual system is composed of spatial frequency-tuned channels that break an image into its individual frequency components. Therefore, researchers commonly filter images for spatial frequencies to arrive at conclusions about the differential importance of high versus low spatial frequency image content. Here, we show how simple decisions about the filtering of the images, and how they are displayed on the screen, can result in drastically different behavioral outcomes. We show that jointly normalizing the contrast of the stimuli is critical in order to draw accurate conclusions about the influence of the different spatial frequencies, as images of the real world naturally have higher contrast energy at low than high spatial frequencies. Furthermore, the specific choice of filter shape can result in contradictory results about whether high or low spatial frequencies are more useful for understanding image content. Finally, we show that the manner in which the high spatial frequency content is displayed on the screen influences how recognizable an image is. Previous findings that make claims about the visual system's use of certain spatial frequency bands should be revisited, especially if their methods sections do not make clear what filtering choices were made.

5.
J Vis ; 19(6): 2, 2019 06 03.
Article in English | MEDLINE | ID: mdl-31166580

ABSTRACT

People are able to perceive the 3D shape of illuminated surfaces using image shading cues. Theories about how we accomplish this often assume that the human visual system estimates a single lighting direction and interprets shading cues in accord with that estimate. In natural scenes, however, lighting can be much more complex than this, with multiple nearby light sources. Here we show that the human visual system can successfully judge 3D surface shape even when the lighting direction varies from place to place over a surface, provided the scale at which these lighting changes occur is similar to, or larger than, the size of the shape features being judged. Furthermore, we show that despite being able to accommodate rapid changes in lighting direction when judging shape, observers are generally unable to detect these changes. We conclude that, rather than relying on a single estimated illumination direction, the human visual system can accommodate illumination that varies substantially and rapidly across a surface.


Subject(s)
Form Perception/physiology , Lighting , Female , Humans , Imaging, Three-Dimensional , Male , Photic Stimulation/methods , Young Adult
6.
Atten Percept Psychophys ; 81(1): 35-46, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30191476

ABSTRACT

Our research has previously shown that scene categories can be predicted from observers' eye movements when they view photographs of real-world scenes. The time course of category predictions reveals the differential influences of bottom-up and top-down information. Here we used these known differences to determine to what extent image features at different representational levels contribute toward guiding gaze in a category-specific manner. Participants viewed grayscale photographs and line drawings of real-world scenes while their gaze was tracked. Scene categories could be predicted from fixation density at all times over a 2-s time course in both photographs and line drawings. We replicated the shape of the prediction curve found previously, with an initial steep decrease in prediction accuracy from 300 to 500 ms, representing the contribution of bottom-up information, followed by a steady increase, representing top-down knowledge of category-specific information. We then computed the low-level features (luminance contrasts and orientation statistics), mid-level features (local symmetry and contour junctions), and Deep Gaze II output from the images, and used that information as a reference in our category predictions in order to assess their respective contributions to category-specific guidance of gaze. We observed that, as expected, low-level salience contributes mostly to the initial bottom-up peak of gaze guidance. Conversely, the mid-level features that describe scene structure (i.e., local symmetry and junctions) split their contributions between bottom-up and top-down attentional guidance, with symmetry contributing to both bottom-up and top-down guidance, while junctions play a more prominent role in the top-down guidance of gaze.


Subject(s)
Attention/physiology , Eye Movements/physiology , Fixation, Ocular/physiology , Orientation, Spatial/physiology , Photic Stimulation/methods , Adolescent , Female , Humans , Male , Vision, Ocular/physiology , Visual Perception/physiology , Young Adult
7.
Cognition ; 182: 307-317, 2019 01.
Article in English | MEDLINE | ID: mdl-30415132

ABSTRACT

People are able to rapidly categorize briefly flashed images of real-world environments, even when they are reduced to line drawings. This setting allows for the study of time-limited perceptual grouping processes in the human visual system that are applicable to line drawings. Previous work (Wilder, Dickinson, Jepson, & Walther, 2018) showed that standard local features of individual contours, or junctions between contours, do not account for this rapid classification ability but, rather, the relative placement of these contours appeared to be important. Here we provide strong support for this observation by demonstrating that local ribbon symmetry between neighboring pairs of contours facilitates the categorization of complex real-world environments. To this end, we introduce a novel computational approach, based on the medial axis transform, for measuring the degree of local ribbon symmetry in a line drawing. We use this measure to separate the contour pixels for a given scene into the most ribbon symmetric half and the least ribbon symmetric half. We then show human observers the resulting half-images in a rapid-categorization experiment. Our results demonstrate that local ribbon symmetry facilitates the categorization of complex real-world environments. This is the first study of the role of local symmetry in inter-contour grouping for human scene classification. We conclude that local ribbon symmetry appears to play an important role in jump-starting the grouping of image content into meaningful units, even in flashed presentations.


Subject(s)
Concept Formation/physiology , Pattern Recognition, Visual/physiology , Psychomotor Performance/physiology , Space Perception/physiology , Adolescent , Adult , Female , Humans , Male , Young Adult
8.
J Vis ; 18(8): 1, 2018 08 01.
Article in English | MEDLINE | ID: mdl-30073270

ABSTRACT

Photographs and line drawings of natural scenes are easily classified even when the image is only briefly visible to the observer. Contour junctions and points of high curvature have been shown to be important for perceptual organization (Attneave, 1954; Biederman, 1987) and have been proposed to be influential in rapid scene classification (Walther & Shen, 2014). Here, we manipulate the junctions in images, either randomly translating them, or selectively removing or maintaining them. Observers were better at classifying images when the contours were randomly translated (disrupting the junctions) than when the junctions were randomly shifted (partially disrupting contour information). Moreover, observers were better at classifying a scene when shown only segments between junctions, than when shown only the junctions, with the middle segments removed. These results suggest that categorizing line drawings of real-world scenes does not solely rely on junction statistics. The spatial locations of the junctions are important, as well as their relationships with one another. Furthermore, the segments between junctions appear to facilitate scene classification, possibly due to their involvement in symmetry relationships with other contour segments.


Subject(s)
Pattern Recognition, Visual/physiology , Spatial Processing/physiology , Visual Perception/physiology , Adolescent , Adult , Female , Humans , Male , Young Adult
9.
J Vis ; 18(8): 9, 2018 08 01.
Article in English | MEDLINE | ID: mdl-30140891

ABSTRACT

Classification image analysis is a powerful technique for elucidating linear detection and discrimination mechanisms, but it has primarily been applied to contrast detection. Here we report a novel classification image methodology for identifying linear mechanisms underlying shape discrimination. Although prior attempts to apply classification image methods to shape perception have been confined to simple radial shapes, the method proposed here can be applied to general 2-D (planar) shapes of arbitrary complexity, including natural shapes. Critical to the method is the projection of each target shape onto a Fourier descriptor (FD) basis set, which allows the essential perceptual features of each shape to be represented by a relatively small number of coefficients. We demonstrate that under this projection natural shapes are low pass, following a relatively steep power law. To efficiently identify the observer's classification template, we employ a yes/no paradigm and match the spectral density of the stimulus noise in FD space to the power law density of the target shape. The proposed method generates linear template models for animal shape detection that are predictive of human judgments. These templates are found to be biased away from the ideal, overly weighting lower frequencies. This low-pass bias suggests that higher frequency shape processing relies on nonlinear mechanisms.


Subject(s)
Classification , Form Perception/physiology , Visual Perception/physiology , Humans , Judgment , Psychometrics
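As a rough illustration of the Fourier descriptor (FD) projection mentioned in the abstract above, the sketch below uses the common complex-coordinate form, in which each contour sample (x, y) is treated as the complex number x + iy and the closed contour is passed through a discrete Fourier transform. This formulation, and the function names `fourier_descriptors` and `reconstruct`, are assumptions made for the example; the abstract does not specify the authors' exact basis or normalization.

```python
import cmath

def fourier_descriptors(points):
    """Discrete Fourier coefficients of a closed contour given as (x, y) samples."""
    n = len(points)
    z = [complex(x, y) for x, y in points]
    return [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]

def reconstruct(coeffs, keep):
    """Rebuild the contour from the mean term plus the `keep` lowest
    positive and negative frequencies (a simple low-pass approximation)."""
    n = len(coeffs)
    kept = [k for k in range(n) if min(k, n - k) <= keep]
    return [
        sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in kept)
        for t in range(n)
    ]
```

Keeping only a few low-frequency coefficients gives a smoothed version of the shape, which is one way to picture the claim that a relatively small number of coefficients can carry a shape's essential features.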
10.
Vision Res ; 126: 220-231, 2016 09.
Article in English | MEDLINE | ID: mdl-26505685

ABSTRACT

The detection of contours in noise has been extensively studied, but the detection of closed contours, such as the boundaries of whole objects, has received relatively little attention. Closed contours pose substantial challenges not present in the simple (open) case, because they form the outlines of whole shapes and thus take on a range of potentially important configural properties. In this paper we consider the detection of closed contours in noise as a probabilistic decision problem. Previous work on open contours suggests that contour complexity, quantified as the negative log probability (Description Length, DL) of the contour under a suitably chosen statistical model, impairs contour detectability; more complex (statistically surprising) contours are harder to detect. In this study we extended this result to closed contours, developing a suitable probabilistic model of whole shapes that gives rise to several distinct though interrelated measures of shape complexity. We asked subjects to detect either natural shapes (Exp. 1) or experimentally manipulated shapes (Exp. 2) embedded in noise fields. We found systematic effects of global shape complexity on detection performance, demonstrating how aspects of global shape and form influence the basic process of object detection.


Subject(s)
Form Perception/physiology , Pattern Recognition, Visual/physiology , Perceptual Closure , Bayes Theorem , Discrimination, Psychological , Gestalt Theory , Humans , Perceptual Masking/physiology , Photic Stimulation/methods
11.
J Vis ; 15(6): 6, 2015.
Article in English | MEDLINE | ID: mdl-26024453

ABSTRACT

It is well known that "smooth" chains of oriented elements (contours) are more easily detected amid background noise than more undulating (i.e., "less smooth") chains. Here, we develop a Bayesian framework for contour detection and show that it predicts that contour detection performance should decrease with the contour's complexity, quantified as the description length (DL; i.e., the negative logarithm of probability integrated along the contour). We tested this prediction in two experiments in which subjects were asked to detect simple open contours amid pixel noise. In Experiment 1, we demonstrate a consistent decline in performance with increasingly complex contours, as predicted by the Bayesian model. In Experiment 2, we confirmed that this effect is due to integrated complexity along the contour, and does not seem to depend on local stretches of linear structure. The results corroborate the probabilistic model of contours, and show how contour detection can be understood as a special case of a more general process: the identification of organized patterns in the environment.


Subject(s)
Form Perception/physiology , Pattern Recognition, Visual/physiology , Adult , Bayes Theorem , Humans , Models, Theoretical , Probability
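The description length (DL) in the abstract above is the negative log probability accumulated along a contour. The sketch below is one minimal way to compute such a quantity for a polyline. It assumes, purely for illustration, that turning angles between successive segments follow a von Mises distribution centered on zero (straight continuation); the concentration `kappa` and all function names are hypothetical and not taken from the abstract. The result is in nats.

```python
import math

def bessel_i0(x, terms=30):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def turning_angles(points):
    """Signed turning angle (radians) at each interior vertex of a polyline."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a = math.atan2(y1 - y0, x1 - x0)
        b = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading difference into [-pi, pi).
        angles.append((b - a + math.pi) % (2 * math.pi) - math.pi)
    return angles

def description_length(points, kappa=4.0):
    """Sum of -log p(turning angle) under a zero-centered von Mises density."""
    log_norm = math.log(2 * math.pi * bessel_i0(kappa))
    return sum(log_norm - kappa * math.cos(a) for a in turning_angles(points))
```

Under this assumed model a straight polyline receives a lower DL than a zigzag with the same number of vertices, matching the abstract's notion that smoother contours are less complex.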
12.
Cognition ; 119(3): 325-40, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21440250

ABSTRACT

This paper investigates the classification of shapes into broad natural categories such as animal or leaf. We asked whether such coarse classifications can be achieved by a simple statistical classification of the shape skeleton. We surveyed databases of natural shapes, extracting shape skeletons and tabulating their parameters within each class, seeking shape statistics that effectively discriminated the classes. We conducted two experiments in which human subjects were asked to classify novel shapes into the same natural classes. We compared subjects' classifications to those of a naive Bayesian classifier based on the natural shape statistics, and found good agreement. We conclude that human superordinate shape classifications can be well understood as involving a simple statistical classification of the shape skeleton that has been "tuned" to the natural statistics of shape.


Subject(s)
Data Interpretation, Statistical , Form Perception/physiology , Algorithms , Animals , Bayes Theorem , Databases, Factual , Female , Humans , Language , Male , Models, Neurological , Photic Stimulation , Young Adult
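The abstract above describes a naive Bayesian classifier over shape-skeleton statistics. A minimal sketch of that kind of classifier follows. It assumes Gaussian likelihoods per feature and takes arbitrary numeric feature vectors; the Gaussian choice and the function names `fit` and `classify` are assumptions for illustration, not the authors' implementation, and the skeleton parameters themselves are not specified here.

```python
import math
from collections import defaultdict

def fit(samples):
    """samples: list of (label, feature_vector) pairs.
    Returns {label: (prior, means, variances)}."""
    by_class = defaultdict(list)
    for label, features in samples:
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model

def classify(model, features):
    """Return the label with the highest log posterior, treating
    features as independent Gaussians within each class."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(features, means, variances)
        )
    return max(model, key=log_posterior)
```

In use, each shape would first be reduced to a vector of skeleton measurements, the model would be fitted on labeled examples from each natural class, and `classify` would then assign a novel shape to the class with the highest posterior.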
13.
Vision Res ; 49(9): 1017-31, 2009 May.
Article in English | MEDLINE | ID: mdl-18649913

ABSTRACT

Visual attention and saccades are typically studied in artificial situations, with stimuli presented to the steadily fixating eye, or saccades made along specified paths. By contrast, in real-world tasks saccadic patterns are constrained only by the demands of the motivating task. We studied attention during pauses between saccades made to perform three free-viewing tasks: counting dots, pointing to the same dots with a visible cursor, or simply looking at the dots using a freely-chosen path. Attention was assessed by the ability to identify the orientation of a briefly-presented Gabor probe. All primary tasks produced losses in identification performance, with counting producing the largest losses, followed by pointing and then looking-only. Looking-only resulted in a 37% increase in contrast thresholds in the orientation task. Counting produced more severe losses that were not overcome by increasing Gabor contrast. Detection or localization of the Gabor, unlike identification, were largely unaffected by any of the primary tasks. Taken together, these results show that attention is required to control saccades, even with freely-chosen paths, but the attentional demands of saccades are less than those attached to tasks such as counting, which have a significant cognitive load. Counting proved to be a highly demanding task that either exhausted momentary processing capacity (e.g., working memory or executive functions), or, alternatively, encouraged a strategy of filtering out all signals irrelevant to counting itself. The fact that the attentional demands of saccades (as well as those of detection/localization) are relatively modest makes it possible to continually adjust both the spatial and temporal pattern of saccades so as to re-allocate attentional resources as needed to handle the complex and multifaceted demands of real-world environments.


Subject(s)
Attention/physiology , Pattern Recognition, Visual/physiology , Saccades/physiology , Contrast Sensitivity/physiology , Eye Movement Measurements , Humans , Orientation , Photic Stimulation/methods , Psychomotor Performance , Psychophysics , Sensory Thresholds/physiology