Search | BVS Regional Portal

Noulas, A; Englebienne, G; Krose, B J A.

IEEE Trans Pattern Anal Mach Intell ; 34(1): 79-93, 2012 Jan.

Article in English | MEDLINE | ID: mdl-21383401

ABSTRACT

We present a novel probabilistic framework that fuses information coming from the audio and video modality to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that is an extension of a factorial Hidden Markov Model (fHMM) and models the people appearing in an audiovisual recording as multimodal entities that generate observations in the audio stream, the video stream, and the joint audiovisual space. The framework is very robust to different contexts, makes no assumptions about the location of the recording equipment, and does not require labeled training data as it acquires the model parameters using the Expectation Maximization (EM) algorithm. We apply the proposed model to two meeting videos and a news broadcast video, all of which come from publicly available data sets. The results acquired in speaker diarization are in favor of the proposed multimodal framework, which outperforms the single modality analysis results and improves over the state-of-the-art audio-based speaker diarization.

Spatial interactions in rapid pattern discrimination.

Kröse, B J; Burbeck, C A.

Spat Vis ; 4(4): 211-22, 1989.

Article in English | MEDLINE | ID: mdl-2486815

ABSTRACT

We measured reaction times (RTs) for identification of a target among distracters under stabilized image conditions in which the positions of the target and the distracters were constant within a single experimental session. Under these conditions, the observer need not search for the target because its position is known. We nevertheless found that the presence of even a single distracter could elevate RTs. The magnitude of this effect depended on the distance of the distracter from the target and, for some observers, the distance of the distracter from the fovea. When we added not one but six background elements in a ring around the target, RT increased even more. If, apart from these neighboring distracters, the target was surrounded by more distracters located beyond the nearest neighbors, RT was, in general, not increased further. These findings suggest that adding background elements in a search task can elevate RTs in ways that are not dependent on the positional uncertainty of the target.

Subjects

Pattern Recognition, Visual/physiology; Reaction Time/physiology; Female; Humans; Learning; Male; Retina/physiology

The control and speed of shifts of attention.

Kröse, B J; Julesz, B.

Vision Res ; 29(11): 1607-19, 1989.

Article in English | MEDLINE | ID: mdl-2635484

ABSTRACT

We measured the detectability of a target pattern in a display consisting of 12 elements in a circle around the central fixation point. The display was presented briefly and followed after a variable amount of time by a mask. We found that presenting a pre-cue, designating the target position, facilitated target detectability. Attention is directed to the cued location. When the observer has to detect a (second) target among the non-cued elements, performance for locations close to the cue is not significantly different from performance for locations further away. This suggests that there is no "scan-path" or proximity effect. We also found that the identification of the cued element delayed the detectability of the subsequent target by more than 160 msec. In another series of experiments we studied the control of attentional shifts. We found that, for short mask delays (100, 160, and 260 msec), the observer is unable to selectively process elements which are not physically cued but only verbally defined by their position relative to the cue. When we increase the positional uncertainty of the target by increasing the number of physical cues, performance drops until it reaches an asymptote with 5 elements. We infer that, even though the target is very similar to the background, a parallel mechanism, used for the extraction of stimulus features, designates prospective target locations which may be subsequently checked by a (slow) attentional process.

Subjects

Attention/physiology; Visual Perception/physiology; Contrast Sensitivity/physiology; Cues; Form Perception/physiology; Humans; Pattern Recognition, Visual/physiology

Form discrimination: features or invariants?

Smets, G J; Stappers, P J; Kröse, B J.

Percept Mot Skills ; 67(1): 311-7, 1988 Aug.

Article in English | MEDLINE | ID: mdl-3211686

ABSTRACT

Does form discrimination rely on feature analysis, as the indirect theory of perception supposes, or on affordances (behavioural meanings specified by invariant patterns), as direct theory states? Subjects were to indicate the position of a target in a perspective rendering of a plane, displayed for 100 msec. in a large screen projection. In one of the conditions the target disrupted the plane, in the other it did not. Although targets of the two conditions shared the same features, the disruptive targets were discriminated more often than the nondisruptive targets. This result supports the direct approach to perception which states that a perceiver discriminates behaviourally relevant patterns rather than geometrical properties.

Subjects

Discrimination, Psychological; Form Perception; Information Theory; Pattern Recognition, Visual; Adolescent; Adult; Humans

Local structure analyzers as determinants of preattentive pattern discrimination.

Kröse, B J.

Biol Cybern ; 55(5): 289-98, 1987.

Article in English | MEDLINE | ID: mdl-3828403

ABSTRACT

Contemporary literature suggests that preattentive texture or pattern discrimination is induced by differences between local structure features or "textons." This paper presents a model for the description of such local structure features based on the computation of local autocorrelations within the image. By means of this structure model a measure of structure dissimilarity is introduced. Experiments have been carried out to test a hypothesized relation between the detectability of a target pattern in a field of background patterns and the value of the structure dissimilarity measure. The experimental results show that it seems justified to relate, in a quantitative way, the detectability of the target pattern to the value of the structure dissimilarity measure.

Subjects

Form Perception; Pattern Recognition, Visual; Humans; Mathematics; Models, Psychological
