Search | BVS Regional Portal

Modeling guidance and recognition in categorical search: bridging human and computer object detection.

Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris.

J Vis ; 13(3): 30, 2013 Oct 08.

Article in English | MEDLINE | ID: mdl-24105460

ABSTRACT

Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.

Subjects

Image Processing, Computer-Assisted; Pattern Recognition, Visual/physiology; Eye Movements/physiology; Humans; Reaction Time
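The guidance-then-recognition scheme described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the object names, the probability values, the function names, and the 0.5 decision threshold are all hypothetical. Guidance picks the object with the highest target probability computed on blurred input; recognition then judges that same object on unblurred input.

```python
from typing import Dict, Tuple


def first_fixated(blurred_probs: Dict[str, float]) -> str:
    """Guidance: the object with the highest detector probability is fixated first."""
    return max(blurred_probs, key=blurred_probs.get)


def simulate_trial(
    blurred_probs: Dict[str, float],
    unblurred_probs: Dict[str, float],
    threshold: float = 0.5,  # assumed decision criterion, not from the paper
) -> Tuple[str, bool]:
    """Return the first-fixated object and whether it is judged to be a target."""
    obj = first_fixated(blurred_probs)
    judged_target = unblurred_probs[obj] >= threshold
    return obj, judged_target


# Hypothetical target-absent display with four objects (all values made up).
blurred = {"bear_like_dog": 0.62, "backpack": 0.35, "lamp": 0.10, "fork": 0.04}
unblurred = {"bear_like_dog": 0.30, "backpack": 0.12, "lamp": 0.02, "fork": 0.01}

fixated, is_target = simulate_trial(blurred, unblurred)
print(fixated, is_target)
```

In this made-up display the target-similar distractor attracts the first fixation but is then correctly rejected once it is evaluated without blur.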

Multi-voxel pattern analysis of selective representation of visual working memory in ventral temporal and occipital regions.

Han, Xufeng; Berg, Alexander C; Oh, Hwamee; Samaras, Dimitris; Leung, Hoi-Chung.

Neuroimage ; 73: 8-15, 2013 Jun.

Article in English | MEDLINE | ID: mdl-23380167

ABSTRACT

While previous results from univariate analysis showed that the activity level of the parahippocampal gyrus (PHG) but not the fusiform gyrus (FG) reflects selective maintenance of the cued picture category, present results from multi-voxel pattern analysis (MVPA) showed that the spatial response patterns of both regions can be used to differentiate the selected picture category in working memory. The ventral temporal and occipital areas including the PHG and FG have been shown to be specialized in perceiving and processing different kinds of visual information, though their role in the representation of visual working memory remains unclear. To test whether the PHG and FG show spatial response patterns that reflect selective maintenance of task-relevant visual working memory in comparison with other posterior association regions, we reanalyzed data from a previous fMRI study of visual working memory with a cue inserted during the delay period of a delayed recognition task. Classification of FG and PHG activation patterns for the selected category (face or scene) during the cue phase was well above chance using classifiers trained with fMRI data from the cue or probe phase. Classification of activity in other temporal and occipital regions for the cued picture category during the cue phase was relatively less consistent even though classification of their activity during the probe recognition was comparable with the FG and PHG. In sum, these findings suggest that the FG and PHG carry information relevant to the cued visual category, and their spatial activation patterns during selective maintenance seem to match those during visual recognition.

Subjects

Memory, Short-Term/physiology; Occipital Lobe/physiology; Pattern Recognition, Physiological/physiology; Temporal Lobe/physiology; Cerebral Cortex/physiology; Cues; Face; Humans; Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Parahippocampal Gyrus/physiology; Photic Stimulation; Reproducibility of Results; Support Vector Machine; Visual Perception/physiology
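The cross-phase classification idea in the abstract above (train on probe-phase patterns, test on cue-phase patterns) can be illustrated with a toy example. Everything here is an assumption made for illustration: the data are synthetic, the voxel count and noise level are arbitrary, the cue-phase signal is assumed to be a weaker copy of the probe-phase pattern, and a nearest-centroid rule stands in for the classifier used in the study.

```python
import random


def make_patterns(rng, template, sign, n_trials, noise_sd=1.0):
    """Simulate trials: a signed category template plus Gaussian voxel noise."""
    return [
        [sign * t + rng.gauss(0.0, noise_sd) for t in template]
        for _ in range(n_trials)
    ]


def centroid(patterns):
    """Mean pattern across trials, voxel by voxel."""
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(pattern, centroids):
    """Assign the label of the nearest class centroid."""
    return min(centroids, key=lambda label: sq_dist(pattern, centroids[label]))


rng = random.Random(0)
n_voxels = 100
template = [rng.choice([-1.0, 1.0]) for _ in range(n_voxels)]

# Train on synthetic "probe phase" patterns.
train = {
    "face": make_patterns(rng, template, +1, 20),
    "scene": make_patterns(rng, template, -1, 20),
}
centroids = {label: centroid(p) for label, p in train.items()}

# Test on synthetic "cue phase" patterns with half the signal amplitude (assumed).
weak = [0.5 * t for t in template]
test_set = [("face", p) for p in make_patterns(rng, weak, +1, 20)]
test_set += [("scene", p) for p in make_patterns(rng, weak, -1, 20)]

accuracy = sum(classify(p, centroids) == label for label, p in test_set) / len(test_set)
print(f"cross-phase accuracy: {accuracy:.2f}")
```

Above-chance accuracy in such a setup indicates that the spatial pattern separating the categories is shared between the two phases, which is the logic behind the comparison reported in the abstract.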

Babytalk: understanding and generating simple image descriptions.

Kulkarni, Girish; Premraj, Visruth; Ordonez, Vicente; Dhar, Sagnik; Li, Siming; Choi, Yejin; Berg, Alexander C; Berg, Tamara L.

IEEE Trans Pattern Anal Mach Intell ; 35(12): 2891-903, 2013 Dec.

Article in English | MEDLINE | ID: mdl-22848128

ABSTRACT

We present a system to automatically generate natural language descriptions from images. This system consists of two parts. The first part, content planning, smooths the output of computer vision-based detection and recognition algorithms with statistics mined from large pools of visually descriptive text to determine the best content words to use to describe an image. The second step, surface realization, chooses words to construct natural language sentences based on the predicted content and general statistics from natural language. We present multiple approaches for the surface realization step and evaluate each using automatic measures of similarity to human generated reference descriptions. We also collect forced choice human evaluations between descriptions from the proposed generation system and descriptions from competing approaches. The proposed system is very effective at producing relevant sentences for images. It also generates descriptions that are notably more true to the specific image content than previous work.

Subjects

Algorithms; Humans
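The two stages named in the abstract above, content planning and surface realization, can be sketched in a very reduced form. This is an illustrative stand-in, not the system described in the paper: the scores, the text-derived priors, the product rule used for smoothing, and the fixed sentence template are all assumptions.

```python
def pick_attribute(obj, vision_scores, text_prior):
    """Content planning: smooth vision scores with a text-derived prior (product rule assumed)."""
    return max(
        vision_scores,
        key=lambda attr: vision_scores[attr] * text_prior.get((attr, obj), 0.0),
    )


def plan_content(subject, obj, attribute_scores, text_prior, preposition_scores):
    """Choose one attribute per object and one preposition linking them."""
    return {
        "subject": (pick_attribute(subject, attribute_scores[subject], text_prior), subject),
        "object": (pick_attribute(obj, attribute_scores[obj], text_prior), obj),
        "preposition": max(preposition_scores, key=preposition_scores.get),
    }


def realize(plan):
    """Surface realization: fill a fixed template (the simplest possible strategy)."""
    s_attr, s_noun = plan["subject"]
    o_attr, o_noun = plan["object"]
    return f"The {s_attr} {s_noun} is {plan['preposition']} the {o_attr} {o_noun}."


# Hypothetical attribute-classifier outputs and text statistics (made-up numbers).
attribute_scores = {
    "dog": {"brown": 0.55, "green": 0.50},
    "sofa": {"red": 0.40, "wooden": 0.45},
}
text_prior = {
    ("brown", "dog"): 0.30, ("green", "dog"): 0.01,
    ("red", "sofa"): 0.20, ("wooden", "sofa"): 0.05,
}
preposition_scores = {"on": 0.6, "under": 0.1}

plan = plan_content("dog", "sofa", attribute_scores, text_prior, preposition_scores)
sentence = realize(plan)
print(sentence)  # The brown dog is on the red sofa.
```

Note that the vision scores alone would label the sofa "wooden"; the text prior tips the choice to "red", which is the kind of smoothing the content-planning stage is described as performing.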

Efficient classification for additive kernel SVMs.

Maji, Subhransu; Berg, Alexander C; Malik, Jitendra.

IEEE Trans Pattern Anal Mach Intell ; 35(1): 66-77, 2013 Jan.

Article in English | MEDLINE | ID: mdl-22392703

ABSTRACT

We show that a class of nonlinear kernel SVMs admits approximate classifiers with runtime and memory complexity that is independent of the number of support vectors. This class of kernels, which we refer to as additive kernels, includes widely used kernels for histogram-based image comparison like intersection and chi-squared kernels. Additive kernel SVMs can offer significant improvements in accuracy over linear SVMs on a wide variety of tasks while having the same runtime, making them practical for large-scale recognition or real-time detection tasks. We present experiments on a variety of datasets, including the INRIA person, Daimler-Chrysler pedestrians, UIUC Cars, Caltech-101, MNIST, and USPS digits, to demonstrate the effectiveness of our method for efficient evaluation of SVMs with additive kernels. Since its introduction, our method has become integral to various state-of-the-art systems for PASCAL VOC object detection/image classification, ImageNet Challenge, TRECVID, etc. The techniques we propose can also be applied to settings where evaluation of weighted additive kernels is required, which include kernelized versions of PCA, LDA, regression, k-means, as well as speeding up the inner loop of SVM classifier training algorithms.

Subjects

Algorithms; Artificial Intelligence; Decision Support Techniques; Image Interpretation, Computer-Assisted/methods; Models, Theoretical; Pattern Recognition, Automated/methods; Support Vector Machine; Computer Simulation
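Why an additive kernel permits evaluation independent of the number of support vectors can be seen with the intersection kernel named in the abstract above. The decision function b + Σ_i c_i Σ_d min(x_d, s_id) can be regrouped as b + Σ_d h_d(x_d), where each h_d is a one-dimensional function that can be tabulated once. The sketch below is an assumption-laden illustration, not the paper's code: the support vectors and coefficients are random, features are assumed to lie in [0, 1], and a uniform lookup table with linear interpolation is one possible approximation scheme.

```python
import random


def exact_decision(x, support_vectors, coeffs, bias):
    """Exact intersection-kernel output; cost grows with the number of support vectors."""
    return bias + sum(
        c * sum(min(xd, sd) for xd, sd in zip(x, sv))
        for sv, c in zip(support_vectors, coeffs)
    )


def build_tables(support_vectors, coeffs, n_bins):
    """Tabulate h_d(v) = sum_i c_i * min(v, s_id) on a uniform grid over [0, 1]."""
    n_dims = len(support_vectors[0])
    grid = [k / n_bins for k in range(n_bins + 1)]
    return [
        [sum(c * min(v, sv[d]) for sv, c in zip(support_vectors, coeffs)) for v in grid]
        for d in range(n_dims)
    ]


def approx_decision(x, tables, bias):
    """Approximate output by linear interpolation; cost depends only on the dimension."""
    n_bins = len(tables[0]) - 1
    total = bias
    for xd, table in zip(x, tables):
        pos = xd * n_bins
        k = min(int(pos), n_bins - 1)  # clamp so that xd == 1.0 stays in range
        frac = pos - k
        total += (1.0 - frac) * table[k] + frac * table[k + 1]
    return total


rng = random.Random(0)
n_sv, n_dims, n_bins = 50, 5, 1000
support_vectors = [[rng.random() for _ in range(n_dims)] for _ in range(n_sv)]
coeffs = [rng.uniform(-1.0, 1.0) for _ in range(n_sv)]  # stands in for alpha_i * y_i
bias = 0.1

tables = build_tables(support_vectors, coeffs, n_bins)
queries = [[rng.random() for _ in range(n_dims)] for _ in range(20)]
max_err = max(
    abs(exact_decision(q, support_vectors, coeffs, bias) - approx_decision(q, tables, bias))
    for q in queries
)
print(f"max absolute error over queries: {max_err:.5f}")
```

Once the tables are built, each query costs one interpolation per feature dimension regardless of how many support vectors the model has, and memory is fixed at `n_dims * (n_bins + 1)` values. A finer grid reduces the interpolation error at the cost of larger tables.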

Describable Visual Attributes for Face Verification and Image Search.

Kumar, Neeraj; Berg, Alexander C; Belhumeur, Peter N; Nayar, Shree K.

IEEE Trans Pattern Anal Mach Intell ; 33(10): 1962-77, 2011 Oct.

Article in English | MEDLINE | ID: mdl-21383395

ABSTRACT

We introduce the use of describable visual attributes for face verification and image search. Describable visual attributes are labels that can be given to an image to describe its appearance. This paper focuses on images of faces and the attributes used to describe them, although the concepts also apply to other domains. Examples of face attributes include gender, age, jaw shape, nose size, etc. The advantages of an attribute-based representation for vision tasks are manifold: They can be composed to create descriptions at various levels of specificity; they are generalizable, as they can be learned once and then applied to recognize new objects or categories without any further training; and they are efficient, possibly requiring exponentially fewer attributes (and training data) than explicitly naming each category. We show how one can create and label large data sets of real-world images to train classifiers which measure the presence, absence, or degree to which an attribute is expressed in images. These classifiers can then automatically label new images. We demonstrate the current effectiveness--and explore the future potential--of using attributes for face verification and image search via human and computational experiments. Finally, we introduce two new face data sets, named FaceTracer and PubFig, with labeled attributes and identities, respectively.

Subjects

Biometric Identification/methods; Face/anatomy & histology; Image Processing, Computer-Assisted/methods; Algorithms; Female; Humans; Male
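Attribute-based image search, as described in the abstract above, can be illustrated once each image carries per-attribute classifier scores. The sketch below is hypothetical throughout: the image names, attribute names, and scores are made up, scores are assumed to be probabilities in [0, 1], and combining them by product is an assumed ranking rule rather than the one used in the paper.

```python
def search(images, required, excluded=(), top_k=3):
    """Rank images by how well their attribute scores match the query."""

    def score(attrs):
        s = 1.0
        for name in required:
            s *= attrs[name]          # attribute should be present
        for name in excluded:
            s *= 1.0 - attrs[name]    # attribute should be absent
        return s

    ranked = sorted(images, key=lambda img: score(images[img]), reverse=True)
    return ranked[:top_k]


# Hypothetical attribute-classifier outputs for three face images.
images = {
    "img_a": {"smiling": 0.9, "male": 0.2, "eyeglasses": 0.1},
    "img_b": {"smiling": 0.8, "male": 0.9, "eyeglasses": 0.7},
    "img_c": {"smiling": 0.1, "male": 0.1, "eyeglasses": 0.9},
}

results = search(images, required=["smiling"], excluded=["male"], top_k=2)
print(results)  # ['img_a', 'img_c']
```

Because the attribute classifiers are trained once and then applied to any new image, a query of this kind needs no per-query training, which is one of the advantages the abstract attributes to the representation.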
