Search | VHL Regional Portal

1.

How schema knowledge influences memory in older adults: Filling in the gaps, or leading memory astray?

Ramey, Michelle M; Yonelinas, Andrew P; Henderson, John M.

Cognition ; 250: 105826, 2024 Jun 13.

Article in English | MEDLINE | ID: mdl-38875942

ABSTRACT

Age-related declines in episodic memory do not affect all types of mnemonic information equally: when to-be-remembered information is in line with one's prior knowledge, or schema-congruent, older adults often show no impairments. There are two major accounts of this effect: One proposes that schemas compensate for memory failures in aging, and the other proposes that schemas instead actively impair older adults' otherwise intact memory for incongruent information. However, the evidence thus far is inconclusive, likely due to methodological constraints in teasing apart these complex underlying dynamics. We developed a paradigm that separately examines the contributions of underlying memory and schema knowledge to a final memory decision, allowing these dynamics to be examined directly. In the present study, healthy older and younger adults first searched for target objects in congruent or incongruent locations within scenes. In a subsequent test, participants indicated where in each scene the target had been located previously, and provided confidence-based recognition memory judgments that indexed underlying memory, in terms of recollection and familiarity, for the background scenes. We found that age-related increases in schema effects on target location spatial recall were predicted and statistically mediated by age-related increases in underlying memory failures, specifically within recollection. We also found that, relative to younger adults, older adults had poorer spatial memory precision within recollected scenes but slightly better precision within familiar scenes-and age increases in schema bias were primarily exhibited within recollected scenes. Interestingly, however, there were also slight age-related increases in schema effects that could not be explained by memory deficits alone, outlining a role for active schema influences as well. Together, these findings support the account that age-related schema effects on memory are compensatory in that they are driven primarily by underlying memory failures, and further suggest that age-related deficits in memory precision may also drive schema effects.

2.

The role of local meaning in infants' fixations of natural scenes.

Oakes, Lisa M; Hayes, Taylor R; Klotz, Shannon M; Pomaranski, Katherine I; Henderson, John M.

Infancy ; 29(2): 284-298, 2024.

Article in English | MEDLINE | ID: mdl-38183667

ABSTRACT

As infants view visual scenes every day, they must shift their eye gaze and visual attention from location to location, sampling information to process and learn. Like adults, infants' gaze when viewing natural scenes (i.e., photographs of everyday scenes) is influenced by the physical features of the scene image and a general bias to look more centrally in a scene. However, it is unknown how infants' gaze while viewing such scenes is influenced by the semantic content of the scenes. Here, we tested the relative influence of local meaning, controlling for physical salience and center bias, on the eye gaze of 4- to 12-month-old infants (N = 92) as they viewed natural scenes. Overall, infants were more likely to fixate scene regions rated as higher in meaning, indicating that, like adults, the semantic content, or local meaning, of scenes influences where they look. More importantly, the effect of meaning on infant attention increased with age, providing the first evidence for an age-related increase in the impact of local meaning on infants' eye movements while viewing natural scenes.

Subject(s)

Fixation, Ocular , Visual Perception , Adult , Infant , Humans , Eye Movements , Learning , Semantics

3.

Spatiotemporal jump detection during continuous film viewing: Insights from a flicker paradigm.

Upadhyayula, Aditya; Henderson, John M.

Atten Percept Psychophys ; 86(2): 559-566, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38172463

ABSTRACT

We investigated how sensitive visual processing is to spatiotemporal disruptions in ongoing visual events. Prior work has demonstrated that participants often miss spatiotemporal disruptions in videos presented in the form of scene edits or disruptions during saccades. Here, we asked whether this phenomenon generalizes to spatiotemporal disruptions that are not tied to saccades. In two flicker paradigm experiments, participants were instructed to identify spatiotemporal disruptions created when videos either jumped forward or backward in time. Participants often missed the jumps, and forward jumps were reported less frequently compared with backward jumps, demonstrating that a flicker paradigm produces effects similar to a saccade contingent disruption paradigm. These results suggest that difficulty detecting spatiotemporal disruptions is a general phenomenon that extends beyond trans-saccadic events.

Subject(s)

Saccades , Visual Perception , Humans

4.

Transformers bridge vision and language to estimate and understand scene meaning.

Hayes, Taylor R; Henderson, John M.

Res Sq ; 2023 May 29.

Article in English | MEDLINE | ID: mdl-37398443

ABSTRACT

Humans rapidly process and understand real-world scenes with ease. Our stored semantic knowledge gained from experience is thought to be central to this ability by organizing perceptual information into meaningful units to efficiently guide our attention in scenes. However, the role stored semantic representations play in scene guidance remains difficult to study and poorly understood. Here, we apply a state-of-the-art multimodal transformer trained on billions of image-text pairs to help advance our understanding of the role semantic representations play in scene understanding. We demonstrate across multiple studies that this transformer-based approach can be used to automatically estimate local scene meaning in indoor and outdoor scenes, predict where people look in these scenes, detect changes in local semantic content, and provide a human-interpretable account of why one scene region is more meaningful than another. Taken together, these findings highlight how multimodal transformers can advance our understanding of the role scene semantics play in scene understanding by serving as a representational framework that bridges vision and language.

5.

Objects are selected for attention based upon meaning during passive scene viewing.

Peacock, Candace E; Hall, Elizabeth H; Henderson, John M.

Psychon Bull Rev ; 30(5): 1874-1886, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37095319

ABSTRACT

While object meaning has been demonstrated to guide attention during active scene viewing and object salience guides attention during passive viewing, it is unknown whether object meaning predicts attention in passive viewing tasks and whether attention during passive viewing is more strongly related to meaning or salience. To answer this question, we used a mixed modeling approach where we computed the average meaning and physical salience of objects in scenes while statistically controlling for the roles of object size and eccentricity. Using eye-movement data from aesthetic judgment and memorization tasks, we then tested whether fixations are more likely to land on high-meaning objects than low-meaning objects while controlling for object salience, size, and eccentricity. The results demonstrated that fixations are more likely to be directed to high meaning objects than low meaning objects regardless of these other factors. Further analyses revealed that fixation durations were positively associated with object meaning irrespective of the other object properties. Overall, these findings provide the first evidence that objects are, in part, selected by meaning for attentional selection during passive scene viewing.

Subject(s)

Fixation, Ocular , Visual Perception , Humans , Photic Stimulation/methods , Eye Movements , Judgment

6.

Spatiotemporal jump detection during continuous film viewing.

Upadhyayula, Aditya; Henderson, John M.

J Vis ; 23(2): 13, 2023 02 01.

Article in English | MEDLINE | ID: mdl-36848067

ABSTRACT

Prior research on film viewing has demonstrated that participants frequently fail to notice spatiotemporal disruptions, such as scene edits in the movies. Whether such insensitivity to spatiotemporal disruptions extends beyond scene edits in film viewing is not well understood. Across three experiments, we created spatiotemporal disruptions by presenting participants with minute long movie clips, and occasionally jumping the movie clips ahead or backward in time. Participants were instructed to press a button when they noticed any disruptions while watching the clips. The results from experiments 1 and 2 indicate that participants failed to notice the disruptions in continuity about 10% to 30% of the time depending on the magnitude of the jump. In addition, detection rates were lower by approximately 10% when the videos jumped ahead in time compared to the backward jumps across all jump magnitudes, suggesting a role of knowledge about the future affects jump detection. An additional analysis used optic flow similarity during these disruptions. Our findings suggest that insensitivity to spatiotemporal disruptions during film viewing is influenced by knowledge about future states.

Subject(s)

Motion Pictures , Optic Flow , Humans

7.

Searching for meaning: Local scene semantics guide attention during natural visual search in scenes.

Peacock, Candace E; Singh, Praveena; Hayes, Taylor R; Rehrig, Gwendolyn; Henderson, John M.

Q J Exp Psychol (Hove) ; 76(3): 632-648, 2023 Mar.

Article in English | MEDLINE | ID: mdl-35510885

ABSTRACT

Models of visual search in scenes include image salience as a source of attentional guidance. However, because scene meaning is correlated with image salience, it could be that the salience predictor in these models is driven by meaning. To test this proposal, we generated meaning maps that represented the spatial distribution of semantic informativeness in scenes, and salience maps which represented the spatial distribution of conspicuous image features and tested their influence on fixation densities from two object search tasks in real-world scenes. The results showed that meaning accounted for significantly greater variance in fixation densities than image salience, both overall and in early attention across both studies. Here, meaning explained 58% and 63% of the theoretical ceiling of variance in attention across both studies, respectively. Furthermore, both studies demonstrated that fast initial saccades were not more likely to be directed to higher salience regions than slower initial saccades, and initial saccades of all latencies were directed to regions containing higher meaning than salience. Together, these results demonstrated that even though meaning was task-neutral, the visual system still selected meaningful over salient scene regions for attention during search.

Subject(s)

Semantics , Visual Perception , Humans , Saccades , Fixation, Ocular

8.

Visual attention during seeing for speaking in healthy aging.

Rehrig, Gwendolyn; Hayes, Taylor R; Henderson, John M; Ferreira, Fernanda.

Psychol Aging ; 38(1): 49-66, 2023 Feb.

Article in English | MEDLINE | ID: mdl-36395016

ABSTRACT

As we age, we accumulate a wealth of information about the surrounding world. Evidence from visual search suggests that older adults retain intact knowledge for where objects tend to occur in everyday environments (semantic information) that allows them to successfully locate objects in scenes, but may overrely on semantic guidance. We investigated age differences in the allocation of attention to semantically informative and visually salient information in a task in which the eye movements of younger (N = 30, aged 18-24) and older (N = 30, aged 66-82) adults were tracked as they described real-world scenes. We measured the semantic information in scenes based on "meaning map" ratings from a norming sample of young and older adults, and image salience as graph-based visual saliency. Logistic mixed-effects modeling was used to determine whether, controlling for center bias, fixated scene locations differed in semantic informativeness and visual salience from locations that were not fixated, and whether these effects differed for young and older adults. Semantic informativeness predicted fixated locations well overall, as did image salience, although unique variance in the model was better explained by semantic informativeness than image salience. Older adults were less likely to fixate informative locations in scenes than young adults were, though the locations older adults' fixated were independently predicted well by informativeness. These results suggest young and older adults both use semantic information to guide attention in scenes and that older adults do not overrely on semantic information across the board. (PsycInfo Database Record (c) 2023 APA, all rights reserved).

Subject(s)

Healthy Aging , Visual Perception , Humans , Aged , Aging , Eye Movements , Semantics , Fixation, Ocular

9.

Working memory control predicts fixation duration in scene-viewing.

Loh, Zoe; Hall, Elizabeth H; Cronin, Deborah; Henderson, John M.

Psychol Res ; 87(4): 1143-1154, 2023 Jun.

Article in English | MEDLINE | ID: mdl-35879564

ABSTRACT

When viewing scenes, observers differ in how long they linger at each fixation location and how far they move their eyes between fixations. What factors drive these differences in eye-movement behaviors? Previous work suggests individual differences in working memory capacity may influence fixation durations and saccade amplitudes. In the present study, participants (N = 98) performed two scene-viewing tasks, aesthetic judgment and memorization, while viewing 100 photographs of real-world scenes. Working memory capacity, working memory processing ability, and fluid intelligence were assessed with an operation span task, a memory updating task, and Raven's Advanced Progressive Matrices, respectively. Across participants, we found significant effects of task on both fixation durations and saccade amplitudes. At the level of each individual participant, we also found a significant relationship between memory updating task performance and participants' fixation duration distributions. However, we found no effect of fluid intelligence and no effect of working memory capacity on fixation duration or saccade amplitude distributions, inconsistent with previous findings. These results suggest that the ability to flexibly maintain and update working memory is strongly related to fixation duration behavior.

Subject(s)

Fixation, Ocular , Memory, Short-Term , Humans , Eye Movements , Saccades , Cognition

10.

Scene inversion reveals distinct patterns of attention to semantically interpreted and uninterpreted features.

Hayes, Taylor R; Henderson, John M.

Cognition ; 229: 105231, 2022 12.

Article in English | MEDLINE | ID: mdl-35908295

ABSTRACT

Semantic guidance theories propose that attention in real-world scenes is strongly associated with semantically informative scene regions. That is, we look where there are recognizable and informative objects that help us make sense of our visual environment. In contrast, image guidance theories propose that local differences in semantically uninterpreted image features such as luminance, color, and edge orientation primarily determine where we look in scenes. While it is clear that both semantic guidance and image guidance play a role in where we look in scenes, the degree of their relative contributions and how they interact with each other remains poorly understood. In the current study, we presented real-world scenes in upright and inverted orientations and used general linear mixed effects models to understand how semantic guidance, image guidance, and observer center bias were associated with fixation location and fixation duration. We observed distinct patterns of change under inversion. Semantic guidance was severely disrupted by scene inversion, while image guidance was mildly impaired and observer center bias was enhanced. In addition, we found that fixation durations for semantically rich regions decreased when viewing inverted scenes relative to upright scene viewing, while fixation durations for image salience and center bias were unaffected by inversion. Together these results provide important new constraints on theories and computational models of attention in real-world scenes.

Subject(s)

Fixation, Ocular , Semantics , Humans , Visual Perception

11.

Eye movements dissociate between perceiving, sensing, and unconscious change detection in scenes.

Ramey, Michelle M; Henderson, John M; Yonelinas, Andrew P.

Psychon Bull Rev ; 29(6): 2122-2132, 2022 Dec.

Article in English | MEDLINE | ID: mdl-35653039

ABSTRACT

Detecting visual changes can be based on perceiving, whereby one can identify a specific detail that has changed, on sensing, whereby one knows that there is a change but is unable to identify what changed, or on unconscious change detection, whereby one is unaware of any change even though the change influences one's behavior. Prior work has indicated that the processes underlying these different types of change detection are functionally and neurally distinct, but the attentional mechanisms that are related to these different types of change detection remain largely unknown. In the current experiment, we examined eye movements during a change detection task in globally manipulated scenes, and participants indicated their change detection confidence on a scale that allowed us to isolate perceiving, sensing, and unconscious change detection. For perceiving-based change detection, but not sensing-based or unconscious change detection, participants were more likely to preferentially revisit highly changed scene regions across the first and second presentation of the scene (i.e., resampling). This increase in resampling started within 250 ms of the test scene onset, suggesting that the effect began within the first two fixations. In addition, changed scenes were related to more clustered (i.e., less dispersed) eye movements than unchanged scenes, particularly when the subjects were highly confident that no change had occurred - providing evidence for change detection outside of conscious awareness. The results indicate that perceiving, sensing, and unconscious change detection responses are related to partially distinct patterns of eye movements.

Subject(s)

Attention , Eye Movements , Humans , Attention/physiology , Consciousness

12.

Episodic memory processes modulate how schema knowledge is used in spatial memory decisions.

Ramey, Michelle M; Henderson, John M; Yonelinas, Andrew P.

Cognition ; 225: 105111, 2022 08.

Article in English | MEDLINE | ID: mdl-35487103

ABSTRACT

Schema knowledge can dramatically affect how we encode and retrieve memories. Current models propose that schema information is combined with episodic memory at retrieval to influence memory decisions, but it is not known how the strength or type of episodic memory (i.e., unconscious memory versus familiarity versus recollection) influences the extent to which schema information is incorporated into memory decisions. To address this question, we had participants search for target objects in semantically expected (i.e., congruent) locations or in unusual (i.e., incongruent) locations within scenes. In a subsequent test, participants indicated where in each scene the target had been located previously, then provided confidence-based recognition memory judgments that indexed recollection, familiarity strength, and unconscious memory for the scenes. In both an initial online study (n = 133) and replication (n = 59), target location recall was more accurate for targets that had been located in schema-congruent rather than incongruent locations; importantly, this effect was strongest for new scenes, decreased with unconscious memory, decreased further with familiarity strength, and was eliminated entirely for recollected scenes. Moreover, when participants recollected an incongruent scene but did not correctly remember the target location, they were still biased away from congruent regions-suggesting that detrimental schema bias was suppressed in the presence of recollection even when precise target location information was not remembered. The results indicate that episodic memory modulates how schemas are used: Schema knowledge contributes to spatial memory judgments primarily when episodic memory fails to provide precise information, and recollection can override schema bias completely.

Subject(s)

Memory, Episodic , Humans , Knowledge , Mental Recall , Recognition, Psychology , Spatial Memory

13.

Look at what I can do: Object affordances guide visual attention while speakers describe potential actions.

Rehrig, Gwendolyn; Barker, Madison; Peacock, Candace E; Hayes, Taylor R; Henderson, John M; Ferreira, Fernanda.

Atten Percept Psychophys ; 84(5): 1583-1610, 2022 Jul.

Article in English | MEDLINE | ID: mdl-35484443

ABSTRACT

As we act on the world around us, our eyes seek out objects we plan to interact with. A growing body of evidence suggests that overt visual attention selects objects in the environment that could be interacted with, even when the task precludes physical interaction. In previous work, objects that afford grasping interactions influenced attention when static scenes depicted reachable spaces, and attention was otherwise better explained by general informativeness. Because grasping is but one of many object interactions, previous work may have downplayed the influence of object affordances on attention. The current study investigated the relationship between overt visual attention and object affordances versus broadly construed semantic information in scenes as speakers describe or memorize scenes. In addition to meaning and grasp maps-which capture informativeness and grasping object affordances in scenes, respectively-we introduce interact maps, which capture affordances more broadly. In a mixed-effects analysis of 5 eyetracking experiments, we found that meaning predicted fixated locations in a general description task and during scene memorization. Grasp maps marginally predicted fixated locations during action description for scenes that depicted reachable spaces only. Interact maps predicted fixated regions in description experiments alone. Our findings suggest observers allocate attention to scene regions that could be readily interacted with when talking about the scene, while general informativeness preferentially guides attention when the task does not encourage careful consideration of objects in the scene. The current study suggests that the influence of object affordances on visual attention in scenes is mediated by task demands.

Subject(s)

Eye Movements , Visual Perception , Hand Strength , Humans , Pattern Recognition, Visual , Semantics

14.

Meaning maps detect the removal of local semantic scene content but deep saliency models do not.

Hayes, Taylor R; Henderson, John M.

Atten Percept Psychophys ; 84(3): 647-654, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35138579

ABSTRACT

Meaning mapping uses human raters to estimate different semantic features in scenes, and has been a useful tool in demonstrating the important role semantics play in guiding attention. However, recent work has argued that meaning maps do not capture semantic content, but like deep learning models of scene attention, represent only semantically-neutral image features. In the present study, we directly tested this hypothesis using a diffeomorphic image transformation that is designed to remove the meaning of an image region while preserving its image features. Specifically, we tested whether meaning maps and three state-of-the-art deep learning models were sensitive to the loss of semantic content in this critical diffeomorphed scene region. The results were clear: meaning maps generated by human raters showed a large decrease in the diffeomorphed scene regions, while all three deep saliency models showed a moderate increase in the diffeomorphed scene regions. These results demonstrate that meaning maps reflect local semantic content in scenes while deep saliency models do something else. We conclude the meaning mapping approach is an effective tool for estimating semantic content in scenes.

Subject(s)

Semantics , Visual Perception , Attention , Eye Movements , Humans

15.

Linking patterns of infant eye movements to a neural network model of the ventral stream using representational similarity analysis.

Kiat, John E; Luck, Steven J; Beckner, Aaron G; Hayes, Taylor R; Pomaranski, Katherine I; Henderson, John M; Oakes, Lisa M.

Dev Sci ; 25(1): e13155, 2022 01.

Article in English | MEDLINE | ID: mdl-34240787

ABSTRACT

Little is known about the development of higher-level areas of visual cortex during infancy, and even less is known about how the development of visually guided behavior is related to the different levels of the cortical processing hierarchy. As a first step toward filling these gaps, we used representational similarity analysis (RSA) to assess links between gaze patterns and a neural network model that captures key properties of the ventral visual processing stream. We recorded the eye movements of 4- to 12-month-old infants (N = 54) as they viewed photographs of scenes. For each infant, we calculated the similarity of the gaze patterns for each pair of photographs. We also analyzed the images using a convolutional neural network model in which the successive layers correspond approximately to the sequence of areas along the ventral stream. For each layer of the network, we calculated the similarity of the activation patterns for each pair of photographs, which was then compared with the infant gaze data. We found that the network layers corresponding to lower-level areas of visual cortex accounted for gaze patterns better in younger infants than in older infants, whereas the network layers corresponding to higher-level areas of visual cortex accounted for gaze patterns better in older infants than in younger infants. Thus, between 4 and 12 months, gaze becomes increasingly controlled by more abstract, higher-level representations. These results also demonstrate the feasibility of using RSA to link infant gaze behavior to neural network models. A video abstract of this article can be viewed at https://youtu.be/K5mF2Rw98Is.

Subject(s)

Eye Movements , Visual Cortex , Aged , Humans , Infant , Neural Networks, Computer , Visual Cortex/physiology , Visual Perception/physiology

16.

Rapid Extraction of the Spatial Distribution of Physical Saliency and Semantic Informativeness from Natural Scenes in the Human Brain.

Kiat, John E; Hayes, Taylor R; Henderson, John M; Luck, Steven J.

J Neurosci ; 42(1): 97-108, 2022 01 05.

Article in English | MEDLINE | ID: mdl-34750229

ABSTRACT

Physically salient objects are thought to attract attention in natural scenes. However, research has shown that meaning maps, which capture the spatial distribution of semantically informative scene features, trump physical saliency in predicting the pattern of eye moments in natural scene viewing. Meaning maps even predict the fastest eye movements, suggesting that the brain extracts the spatial distribution of potentially meaningful scene regions very rapidly. To test this hypothesis, we applied representational similarity analysis to ERP data. The ERPs were obtained from human participants (N = 32, male and female) who viewed a series of 50 different natural scenes while performing a modified 1-back task. For each scene, we obtained a physical saliency map from a computational model and a meaning map from crowd-sourced ratings. We then used representational similarity analysis to assess the extent to which the representational geometry of physical saliency maps and meaning maps can predict the representational geometry of the neural response (the ERP scalp distribution) at each moment in time following scene onset. We found that a link between physical saliency and the ERPs emerged first (â¼78 ms after stimulus onset), with a link to semantic informativeness emerging soon afterward (â¼87 ms after stimulus onset). These findings are in line with previous evidence indicating that saliency is computed rapidly, while also indicating that information related to the spatial distribution of semantically informative scene elements is computed shortly thereafter, early enough to potentially exert an influence on eye movements.SIGNIFICANCE STATEMENT Attention may be attracted by physically salient objects, such as flashing lights, but humans must also be able to direct their attention to meaningful parts of scenes. Understanding how we direct attention to meaningful scene regions will be important for developing treatments for disorders of attention and for designing roadways, cockpits, and computer user interfaces. Information about saliency appears to be extracted rapidly by the brain, but little is known about the mechanisms that determine the locations of meaningful information. To address this gap, we showed people photographs of real-world scenes and measured brain activity. We found that information related to the locations of meaningful scene elements was extracted rapidly, shortly after the emergence of saliency-related information.

Subject(s)

Attention/physiology , Brain Mapping/methods , Brain/physiology , Models, Neurological , Visual Perception/physiology , Adolescent , Adult , Evoked Potentials/physiology , Female , Humans , Male , Photic Stimulation , Semantics , Young Adult

17.

Meaning and expected surfaces combine to guide attention during visual search in scenes.

Peacock, Candace E; Cronin, Deborah A; Hayes, Taylor R; Henderson, John M.

J Vis ; 21(11): 1, 2021 10 05.

Article in English | MEDLINE | ID: mdl-34609475

ABSTRACT

How do spatial constraints and meaningful scene regions interact to control overt attention during visual search for objects in real-world scenes? To answer this question, we combined novel surface maps of the likely locations of target objects with maps of the spatial distribution of scene semantic content. The surface maps captured likely target surfaces as continuous probabilities. Meaning was represented by meaning maps highlighting the distribution of semantic content in local scene regions. Attention was indexed by eye movements during the search for target objects that varied in the likelihood they would appear on specific surfaces. The interaction between surface maps and meaning maps was analyzed to test whether fixations were directed to meaningful scene regions on target-related surfaces. Overall, meaningful scene regions were more likely to be fixated if they appeared on target-related surfaces than if they appeared on target-unrelated surfaces. These findings suggest that the visual system prioritizes meaningful scene regions on target-related surfaces during visual search in scenes.

Subject(s)

Attention , Visual Perception , Eye Movements , Humans , Pattern Recognition, Visual , Probability , Semantics

18.

Deep saliency models learn low-, mid-, and high-level features to predict scene attention.

Hayes, Taylor R; Henderson, John M.

Sci Rep ; 11(1): 18434, 2021 09 16.

Article in English | MEDLINE | ID: mdl-34531484

ABSTRACT

Deep saliency models represent the current state-of-the-art for predicting where humans look in real-world scenes. However, for deep saliency models to inform cognitive theories of attention, we need to know how deep saliency models prioritize different scene features to predict where people look. Here we open the black box of three prominent deep saliency models (MSI-Net, DeepGaze II, and SAM-ResNet) using an approach that models the association between attention, deep saliency model output, and low-, mid-, and high-level scene features. Specifically, we measured the association between each deep saliency model and low-level image saliency, mid-level contour symmetry and junctions, and high-level meaning by applying a mixed effects modeling approach to a large eye movement dataset. We found that all three deep saliency models were most strongly associated with high-level and low-level features, but exhibited qualitatively different feature weightings and interaction patterns. These findings suggest that prominent deep saliency models are primarily learning image features associated with high-level scene meaning and low-level image saliency and highlight the importance of moving beyond simply benchmarking performance.

Subject(s)

Attention , Eye Movements , Models, Neurological , Visual Perception/physiology , Humans , Machine Learning

19.

Developmental changes in natural scene viewing in infancy.

Pomaranski, Katherine I; Hayes, Taylor R; Kwon, Mee-Kyoung; Henderson, John M; Oakes, Lisa M.

Dev Psychol ; 57(7): 1025-1041, 2021 Jul.

Article in English | MEDLINE | ID: mdl-34435820

ABSTRACT

We extend decades of research on infants' visual processing by examining their eye gaze during viewing of natural scenes. We examined the eye movements of a racially diverse group of 4- to 12-month-old infants (N = 54; 27 boys; 24 infants were White and not Hispanic, 30 infants were African American, Asian American, mixed race and/or Hispanic) as they viewed images selected from the MIT Saliency Benchmark Project. In general, across this age range infants' fixation distributions became more consistent and more adult-like, suggesting that infants' fixations in natural scenes become increasingly more systematic. Evaluation of infants' fixation patterns with saliency maps generated by different models of physical salience revealed that although over this age range there was an increase in the correlations between infants' fixations and saliency, the amount of variance accounted for by salience actually decreased. At the youngest age, the amount of variance accounted for by salience was very similar to the consistency between infants' fixations, suggesting that the systematicity in these youngest infants' fixations was explained by their attention to physically salient regions. By 12 months, in contrast, the consistency between infants was greater than the variance accounted for by salience, suggesting that the systematicity in older infants' fixations reflected more than their attention to physically salient regions. Together these results show that infants' fixations when viewing natural scenes becomes more systematic and predictable, and that predictability is due to their attention to features other than physical salience. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

Subject(s)

Fixation, Ocular , Visual Perception , Adult , Aged , Humans , Infant , Male , Cognition , Eye Movements

20.

Looking for Semantic Similarity: What a Vector-Space Model of Semantics Can Tell Us About Attention in Real-World Scenes.

Hayes, Taylor R; Henderson, John M.

Psychol Sci ; 32(8): 1262-1270, 2021 08.

Article in English | MEDLINE | ID: mdl-34252325

ABSTRACT

The visual world contains more information than we can perceive and understand in any given moment. Therefore, we must prioritize important scene regions for detailed analysis. Semantic knowledge gained through experience is theorized to play a central role in determining attentional priority in real-world scenes but is poorly understood. Here, we examined the relationship between object semantics and attention by combining a vector-space model of semantics with eye movements in scenes. In this approach, the vector-space semantic model served as the basis for a concept map, an index of the spatial distribution of the semantic similarity of objects across a given scene. The results showed a strong positive relationship between the semantic similarity of a scene region and viewers' focus of attention; specifically, greater attention was given to more semantically related scene regions. We conclude that object semantics play a critical role in guiding attention through real-world scenes.

Subject(s)

Semantics , Visual Perception , Eye Movements , Humans , Pattern Recognition, Visual , Space Simulation

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL