1.
Elife; 13: 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38661128

ABSTRACT

Primates can recognize objects despite 3D geometric variations such as in-depth rotations. The computational mechanisms that give rise to such invariances are yet to be fully understood. A curious case of partial invariance occurs in the macaque face-patch AL and in fully connected layers of deep convolutional networks in which neurons respond similarly to mirror-symmetric views (e.g. left and right profiles). Why does this tuning develop? Here, we propose a simple learning-driven explanation for mirror-symmetric viewpoint tuning. We show that mirror-symmetric viewpoint tuning for faces emerges in the fully connected layers of convolutional deep neural networks trained on object recognition tasks, even when the training dataset does not include faces. First, using 3D objects rendered from multiple views as test stimuli, we demonstrate that mirror-symmetric viewpoint tuning in convolutional neural network models is not unique to faces: it emerges for multiple object categories with bilateral symmetry. Second, we show why this invariance emerges in the models. Learning to discriminate among bilaterally symmetric object categories induces reflection-equivariant intermediate representations. AL-like mirror-symmetric tuning is achieved when such equivariant responses are spatially pooled by downstream units with sufficiently large receptive fields. These results explain how mirror-symmetric viewpoint tuning can emerge in neural networks, providing a theory of how they might emerge in the primate brain. Our theory predicts that mirror-symmetric viewpoint tuning can emerge as a consequence of exposure to bilaterally symmetric objects beyond the category of faces, and that it can generalize beyond previously experienced object categories.
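The key mechanism described here, reflection-equivariant intermediate features becoming mirror-invariant once they are spatially pooled by units with large receptive fields, can be illustrated with a short Python sketch. This is not the authors' code: a bilaterally symmetric filter is used as a stand-in assumption for the reflection equivariance that the paper reports emerging from training.

# Minimal sketch: an exactly reflection-equivariant feature map (obtained here
# with a left-right symmetric filter) becomes mirror-invariant after global
# spatial pooling, mimicking a downstream unit with a large receptive field.
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)
image = rng.random((32, 32))             # arbitrary test "stimulus"
mirrored = image[:, ::-1]                # its left-right mirror image

kernel = rng.random((5, 5))
kernel = (kernel + kernel[:, ::-1]) / 2  # enforce bilateral symmetry of the filter

fmap = correlate2d(image, kernel, mode='valid')          # intermediate feature map
fmap_mirror = correlate2d(mirrored, kernel, mode='valid')

# Equivariance: filtering the mirrored image yields the mirrored feature map.
assert np.allclose(fmap_mirror, fmap[:, ::-1])

# Global average pooling discards spatial position, so the two responses match.
print(np.isclose(fmap.mean(), fmap_mirror.mean()))       # True: mirror-symmetric tuning

Because pooling removes the spatial index along which the two feature maps differ, the original and mirrored stimuli become indistinguishable at the pooled stage, which is the AL-like mirror-symmetric viewpoint tuning described above.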


Subject(s)
Neural Networks, Computer; Animals; Brain/physiology; Neurons/physiology; Macaca; Models, Neurological; Macaca mulatta
2.
bioRxiv; 2023 Jul 06.
Article in English | MEDLINE | ID: mdl-36711779

ABSTRACT

This record is the bioRxiv preprint of item 1 above; its abstract is identical to the one shown there.

3.
Eur J Neurosci; 54(7): 6445-6462, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34480766

ABSTRACT

What do we perceive in a glance of an object? If we are questioned about it, will our perception be affected? How does the task demand influence visual processing in the brain and, consequently, our behaviour? To address these questions, we conducted an object categorisation experiment with three tasks, one at the superordinate level ('animate/inanimate') and two at the basic levels ('face/body' and 'animal/human face') along with a passive task in which participants were not required to categorise objects. To control bottom-up information and eliminate the effect of sensory-driven dissimilarity, we used a particular set of animal face images as the identical target stimuli across all tasks. We then investigated the impact of top-down task demands on behaviour and brain representations. Behavioural results demonstrated a superordinate advantage in the reaction time, while the accuracy was similar for all categorisation levels. The event-related potentials (ERPs) for all categorisation levels were highly similar except for about 170 ms and after 300 ms from stimulus onset. In these time windows, the animal/human face categorisation, which required fine-scale discrimination, elicited a differential ERP response. Similarly, decoding analysis over all electrodes showed the highest peak value of task decoding around 170 ms, followed by a few significant timepoints, generally after 300 ms. Moreover, brain responses revealed task-related neural modulation during categorisation tasks compared with the passive task. Overall, these findings demonstrate different task-related effects on the behavioural response and brain representations. The early and late components of neural modulation could be linked to perceptual and top-down processing of object categories, respectively.
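To make "decoding analysis over all electrodes" at each latency concrete, the following is a minimal, hypothetical time-resolved decoding sketch in Python; it is not the authors' pipeline, and the synthetic data, electrode count, epoch length, and classifier choice are all placeholder assumptions.

# Hypothetical time-resolved decoding: classify the task label at every time
# point using all electrodes as features. With this random synthetic data the
# accuracy curve stays near chance; real recordings would show the peaks
# around 170 ms and after 300 ms reported above.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_electrodes, n_times = 200, 64, 300
X = rng.standard_normal((n_trials, n_electrodes, n_times))  # trials x electrodes x time
y = rng.integers(0, 2, n_trials)                             # two task conditions

accuracy = np.empty(n_times)
for t in range(n_times):
    clf = LinearDiscriminantAnalysis()
    accuracy[t] = cross_val_score(clf, X[:, :, t], y, cv=5).mean()

print(f"peak decoding accuracy {accuracy.max():.2f} at sample {accuracy.argmax()}")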


Subject(s)
Brain; Visual Perception; Electroencephalography; Evoked Potentials; Humans; Pattern Recognition, Visual; Photic Stimulation
4.
Sci Rep; 6: 25025, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27113635

ABSTRACT

Converging reports indicate that face images are processed through specialized neural networks in the brain, i.e., face patches in monkeys and the fusiform face area (FFA) in humans. These studies were designed to find out how faces are processed in the visual system compared with other objects. Yet the underlying mechanism of face processing has not been fully revealed. Here, we show that a hierarchical computational model, inspired by electrophysiological evidence on face processing in primates, is able to generate representational properties similar to those observed in monkey face patches (the posterior, middle and anterior patches). Since a central goal of sensory neuroscience is linking neural responses with behavioral outputs, we test whether the proposed model, which is designed to account for neural responses in monkey face patches, can also predict well-documented behavioral face phenomena observed in humans. We show that the proposed model reproduces several cognitive face effects, such as the composite face effect and canonical face views. Our model provides insights into the underlying computations that transfer visual information from posterior to anterior face patches.


Subject(s)
Facial Recognition/physiology; Models, Theoretical; Animals; Cerebral Cortex/physiology; Haplorhini; Humans; Photic Stimulation
5.
Article in English | MEDLINE | ID: mdl-25100986

ABSTRACT

Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanisms have been under intense investigation. Computational modeling is a valuable tool for understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that building sparse representations of objects, by extracting more informative visual features in a feedforward sweep, can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied along different dimensions and levels, rendered from 3D planes. Comparing several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed remarkably well, suggesting that the models are still far from matching human invariant object recognition. Taken together, we suggest that learning sparse, informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is of little help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex.
