Search | Portal Regional da BVS

Multiple visual objects are represented differently in the human brain and convolutional neural networks.

Mocz, Viola; Jeong, Su Keun; Chun, Marvin; Xu, Yaoda.

Sci Rep ; 13(1): 9088, 2023 06 05.

Article in English | MEDLINE | ID: mdl-37277406

ABSTRACT

Objects in the real world usually appear with other objects. To form object representations independent of whether or not other objects are encoded concurrently, in the primate brain, responses to an object pair are well approximated by the average responses to each constituent object shown alone. This is found at the single unit level in the slope of response amplitudes of macaque IT neurons to paired and single objects, and at the population level in fMRI voxel response patterns in human ventral object processing regions (e.g., LO). Here, we compare how the human brain and convolutional neural networks (CNNs) represent paired objects. In human LO, we show that averaging exists in both single fMRI voxels and voxel population responses. However, in the higher layers of five CNNs pretrained for object classification varying in architecture, depth and recurrent processing, slope distribution across units and, consequently, averaging at the population level both deviated significantly from the brain data. Object representations thus interact with each other in CNNs when objects are shown together and differ from when objects are shown individually. Such distortions could significantly limit CNNs' ability to generalize object representations formed in different contexts.

Subjects

Brain; Pattern Recognition, Visual; Animals; Humans; Pattern Recognition, Visual/physiology; Brain/diagnostic imaging; Brain/physiology; Neural Networks, Computer; Brain Mapping; Magnetic Resonance Imaging; Macaca; Visual Perception
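
The averaging account in the abstract above can be illustrated for a single unit (a neuron, voxel, or CNN unit). The following is a minimal sketch, assuming the slope is obtained by regressing the response to each object pair on the sum of the responses to its two objects shown alone, so that a slope near 0.5 indicates averaging and a slope near 1.0 indicates summation. The abstract does not spell out the fitting procedure; the function name `regression_slope` and all response values are hypothetical.

```python
def regression_slope(x, y):
    """Ordinary least-squares slope of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical responses of one unit to four object pairs (made-up values).
single_a = [1.0, 2.0, 4.0, 3.0]   # first object of each pair shown alone
single_b = [3.0, 2.0, 0.0, 5.0]   # second object of each pair shown alone
summed = [a + b for a, b in zip(single_a, single_b)]

# Two idealized units: one averages its inputs, the other sums them.
pair_averaging = [(a + b) / 2 for a, b in zip(single_a, single_b)]
pair_summing = [a + b for a, b in zip(single_a, single_b)]

slope_averaging = regression_slope(summed, pair_averaging)
slope_summing = regression_slope(summed, pair_summing)
print(slope_averaging, slope_summing)
```

Collecting one such slope per unit gives a slope distribution, which is the kind of quantity the abstract compares between brain data and CNN layers.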

Representing Multiple Visual Objects in the Human Brain and Convolutional Neural Networks.

Mocz, Viola; Jeong, Su Keun; Chun, Marvin; Xu, Yaoda.

bioRxiv ; 2023 Mar 01.

Article in English | MEDLINE | ID: mdl-36909506

ABSTRACT

Objects in the real world often appear with other objects. To recover the identity of an object whether or not other objects are encoded concurrently, in primate object-processing regions, neural responses to an object pair have been shown to be well approximated by the average responses to each constituent object shown alone, indicating the whole is equal to the average of its parts. This is present at the single unit level in the slope of response amplitudes of macaque IT neurons to paired and single objects, and at the population level in response patterns of fMRI voxels in human ventral object processing regions (e.g., LO). Here we show that averaging exists in both single fMRI voxels and voxel population responses in human LO, with better averaging in single voxels leading to better averaging in fMRI response patterns, demonstrating a close correspondence of averaging at the fMRI unit and population levels. To understand if a similar averaging mechanism exists in convolutional neural networks (CNNs) pretrained for object classification, we examined five CNNs with varying architecture, depth and the presence/absence of recurrent processing. We observed averaging at the CNN unit level but rarely at the population level, and the CNN unit response distributions in most cases did not resemble human LO or macaque IT responses. The whole is thus not equal to the average of its parts in CNNs, potentially rendering the individual objects in a pair less accessible in CNNs during visual processing than they are in the human brain.
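
Population-level averaging, as described in the abstract above, can be sketched as a comparison between the response pattern evoked by an object pair and the average of the two single-object patterns. This is only an illustration, assuming Pearson correlation across voxels (or units) as the similarity measure, which the abstract does not specify. The helper names and pattern values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length response patterns."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pattern(p, q):
    """Element-wise average of two single-object patterns."""
    return [(a + b) / 2 for a, b in zip(p, q)]


# Hypothetical four-voxel patterns for two objects shown alone.
pattern_a = [0.2, 1.0, -0.5, 0.8]
pattern_b = [1.0, 0.0, 0.5, -0.2]
predicted = average_pattern(pattern_a, pattern_b)

# Case 1: the pair pattern equals the average of its parts.
pair_averaging = [0.6, 0.5, 0.0, 0.3]
# Case 2: the pair pattern is dominated by object A (no averaging).
pair_dominated = list(pattern_a)

r_averaging = pearson(pair_averaging, predicted)
r_dominated = pearson(pair_dominated, predicted)
print(r_averaging, r_dominated)
```

A pair pattern that follows the averaging rule correlates almost perfectly with the predicted average, whereas a pattern dominated by one object correlates less well.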

Decision-making from temporally accumulated conflicting evidence: The more the merrier.

Mocz, Viola; Xu, Yaoda.

J Vis ; 23(1): 3, 2023 01 03.

Article in English | MEDLINE | ID: mdl-36598454

ABSTRACT

How do humans evaluate temporally accumulated discrete pieces of evidence and arrive at a decision despite the presence of conflicting evidence? In the present study, we showed human participants a sequential presentation of objects drawn from two novel object categories and asked them to decide whether a given presentation contained more objects from one or the other category. We found that both a more disparate ratio and greater numerosity of objects improved both reaction time (RT) and accuracy. The effect of numerosity was separate from that of ratio: with a fixed object ratio, sequences with more total objects had lower RT and lower error rates than those with fewer total objects. We replicated these results across three experiments. Additionally, even with the total presentation duration equated and with the motor response assignment varied from trial to trial, an effect of numerosity was still found in RT. The same RT benefit was also present when objects were shown simultaneously, rather than sequentially. Together, these results showed that, for comparative numerosity judgment involving sequential displays, there was a benefit of numerosity, such that showing more objects independent of the object ratio and the total presentation time led to faster decision performance.

Subjects

Judgment; Humans; Judgment/physiology; Reaction Time/physiology

Predicting Identity-Preserving Object Transformations in Human Posterior Parietal Cortex and Convolutional Neural Networks.

Mocz, Viola; Vaziri-Pashkam, Maryam; Chun, Marvin; Xu, Yaoda.

J Cogn Neurosci ; 34(12): 2406-2435, 2022 11 01.

Article in English | MEDLINE | ID: mdl-36122358

ABSTRACT

Previous research shows that, within human occipito-temporal cortex (OTC), we can use a general linear mapping function to link visual object responses across nonidentity feature changes, including Euclidean features (e.g., position and size) and non-Euclidean features (e.g., image statistics and spatial frequency). Although the learned mapping is capable of predicting responses of objects not included in training, these predictions are better for categories included than those not included in training. These findings demonstrate a near-orthogonal representation of object identity and nonidentity features throughout human OTC. Here, we extended these findings to examine the mapping across both Euclidean and non-Euclidean feature changes in human posterior parietal cortex (PPC), including functionally defined regions in inferior and superior intraparietal sulcus. We additionally examined responses in five convolutional neural networks (CNNs) pretrained for object classification, as CNNs are considered the current best models of the primate ventral visual system. We separately compared results from PPC and CNNs with those of OTC. We found that a linear mapping function could successfully link object responses in different states of nonidentity transformations in human PPC and CNNs for both Euclidean and non-Euclidean features. Overall, we found that object identity and nonidentity features are represented in a near-orthogonal, rather than completely orthogonal, manner in PPC and CNNs, just as they are in OTC. Meanwhile, some differences existed among OTC, PPC, and CNNs. These results demonstrate the similarities and differences in how visual object information across an identity-preserving image transformation may be represented in OTC, PPC, and CNNs.

Subjects

Brain Mapping; Magnetic Resonance Imaging; Animals; Humans; Magnetic Resonance Imaging/methods; Parietal Lobe/physiology; Neural Networks, Computer; Temporal Lobe
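
The linear mapping idea in the abstract above can be sketched as follows: fit a matrix that maps response patterns in one transformation state (e.g., one size) onto patterns in another state using a set of training objects, then use it to predict the pattern of a held-out object. This is a minimal least-squares sketch under stated assumptions, not the authors' pipeline. The ridge term, the helper names, the two-feature patterns, and the "true" mapping used to generate the toy data are all hypothetical.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, b):
    """Solve a @ x = b (a square) by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_linear_map(source, target, ridge=0.0):
    """Least-squares W with source @ W ~= target; ridge is an optional,
    assumed regularizer for cases with few training objects."""
    st = transpose(source)
    gram = matmul(st, source)
    for i in range(len(gram)):
        gram[i][i] += ridge
    return solve(gram, matmul(st, target))


# Toy data: rows are training objects, columns are features (e.g., voxels).
true_map = [[2.0, 0.0], [1.0, -1.0]]          # hypothetical ground truth
state_1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
state_2 = matmul(state_1, true_map)           # same objects, other state

w = fit_linear_map(state_1, state_2)

# Predict the state-2 pattern of an object not used in training.
held_out_state_1 = [[3.0, 2.0]]
predicted = matmul(held_out_state_1, w)[0]
print(predicted)
```

In this noiseless toy case the held-out prediction matches the generating mapping. With real data, prediction quality for trained versus untrained categories is what the abstract uses to argue for a near-orthogonal, rather than completely orthogonal, representation.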

Predicting Identity-Preserving Object Transformations across the Human Ventral Visual Stream.

Mocz, Viola; Vaziri-Pashkam, Maryam; Chun, Marvin M; Xu, Yaoda.

J Neurosci ; 41(35): 7403-7419, 2021 09 01.

Article in English | MEDLINE | ID: mdl-34253629

ABSTRACT

In everyday life, we have no trouble categorizing objects varying in position, size, and orientation. Previous fMRI research shows that higher-level object processing regions in the human lateral occipital cortex may link object responses from different affine states (i.e., size and viewpoint) through a general linear mapping function capable of predicting responses to novel objects. In this study, we extended this approach to examine the mapping for both Euclidean (e.g., position and size) and non-Euclidean (e.g., image statistics and spatial frequency) transformations across the human ventral visual processing hierarchy, including areas V1, V2, V3, V4, ventral occipitotemporal cortex, and lateral occipitotemporal cortex. The predicted pattern generated from a linear mapping function could capture a significant amount of the changes associated with the transformations throughout the ventral visual stream. The derived linear mapping functions were not category independent as performance was better for the categories included than those not included in training and better between two similar versus two dissimilar categories in both lower and higher visual regions. Consistent with object representations being stronger in higher than in lower visual regions, pattern selectivity and object category representational structure were somewhat better preserved in the predicted patterns in higher than in lower visual regions. There were no notable differences between Euclidean and non-Euclidean transformations. 
These findings demonstrate a near-orthogonal representation of object identity and these nonidentity features throughout the human ventral visual processing pathway with these nonidentity features largely untangled from the identity features early in visual processing.

SIGNIFICANCE STATEMENT: Presently we still do not fully understand how object identity and nonidentity (e.g., position, size) information are simultaneously represented in the primate ventral visual system to form invariant representations. Previous work suggests that the human lateral occipital cortex may be linking different affine states of object representations through general linear mapping functions. Here, we show that across the entire human ventral processing pathway, we could link object responses in different states of nonidentity transformations through linear mapping functions for both Euclidean and non-Euclidean transformations. These mapping functions are not identity independent, suggesting that object identity and nonidentity features are represented in a near rather than a completely orthogonal manner.

Subjects

Brain Mapping; Occipital Lobe/physiology; Pattern Recognition, Visual/physiology; Temporal Lobe/physiology; Visual Cortex/physiology; Visual Pathways/physiology; Adolescent; Adult; Animals; Facial Recognition/physiology; Female; Household Articles; Humans; Magnetic Resonance Imaging; Male; Young Adult
