1.
J Acoust Soc Am ; 133(3): 1515-24, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23464022

ABSTRACT

Time delay estimation (TDE) is a fundamental component of speaker localization and tracking algorithms. Most of the existing systems are based on the generalized cross-correlation method assuming Gaussianity of the source. It has been shown that the distribution of speech, captured with far-field microphones, is highly varying, depending on the noise and reverberation conditions. Thus, the performance of TDE is expected to fluctuate depending on the underlying assumption for the speech distribution, being also subject to multi-path reflections and competitive background noise. This paper investigates the effect upon TDE when modeling the source signal with different speech-based distributions. An information theoretical TDE method indirectly encapsulating higher order statistics (HOS) formed the basis of this work. The underlying assumption of a Gaussian-distributed source has been replaced by that of a generalized Gaussian distribution, which allows evaluating the problem under a larger set of speech-shaped distributions, ranging from Gaussian to Laplacian and Gamma. Closed forms of the univariate and multivariate entropy expressions of the generalized Gaussian distribution are derived to evaluate the TDE. The results indicate that TDE based on the specific criterion is independent of the underlying assumption for the distribution of the source, for the same covariance matrix.
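
For orientation, the generalized cross-correlation baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration with the common PHAT weighting, not the information-theoretic method studied in the paper; the function name `gcc_phat_delay`, its parameters, and the choice of PHAT weighting are assumptions made for this example.

```python
import numpy as np

def gcc_phat_delay(x, y, fs, max_delay=None):
    """Estimate the delay of y relative to x, in seconds (hypothetical helper).

    Uses generalized cross-correlation with PHAT weighting. A positive
    result means y lags x.
    """
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)                   # cross-power spectrum
    cross /= np.abs(cross) + 1e-12           # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)            # correlation as a function of lag
    max_shift = n // 2 if max_delay is None else min(int(fs * max_delay), n // 2)
    # Reorder so lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift   # lag (in samples) of the peak
    return shift / fs
```

Calling `gcc_phat_delay(x, y, fs=16000)` on two microphone signals returns the lag of the correlation peak in seconds; in reverberant rooms that peak can be displaced by reflections, which is the kind of degradation the abstract refers to.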


Subjects
Acoustics; Models, Statistical; Sound; Speech Acoustics; Speech Production Measurement; Acoustics/instrumentation; Algorithms; Equipment Design; Humans; Motion; Multivariate Analysis; Noise; Normal Distribution; Signal Processing, Computer-Assisted; Signal-To-Noise Ratio; Sound Spectrography; Time Factors; Transducers; Vibration
2.
IEEE Trans Syst Man Cybern B Cybern ; 39(1): 7-15, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19150757

ABSTRACT

We propose a system for detecting the active speaker in cluttered and reverberant environments where more than one person speaks and moves. Rather than using only audio information, the system utilizes audiovisual information from multiple acoustic and video sensors that feed separate audio and video tracking modules. The audio module operates using a particle filter (PF) and an information-theoretic framework to provide accurate acoustic source location under reverberant conditions. The video subsystem combines in 3-D a number of 2-D trackers based on a variation of Stauffer's adaptive background algorithm with spatiotemporal adaptation of the learning parameters and a Kalman tracker in a feedback configuration. Extensive experiments show that gains are to be expected when fusion of the separate modalities is performed to detect the active speaker.
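
The abstract does not give the state vector, motion model, or likelihood used by the audio module, so the following is only a generic bootstrap particle filter on a one-dimensional position, meant to show the predict, weight, and resample cycle. The function name `particle_filter_1d`, the random-walk motion model, and the Gaussian likelihood are all assumptions for illustration.

```python
import numpy as np

def particle_filter_1d(observations, n_particles=500,
                       process_std=0.1, obs_std=0.5, seed=0):
    """Track a scalar position from noisy observations (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    # Initialise particles around the first observation
    particles = rng.normal(observations[0], obs_std, n_particles)
    estimates = []
    for z in observations:
        # Predict: assumed random-walk motion model
        particles = particles + rng.normal(0.0, process_std, n_particles)
        # Weight: assumed Gaussian observation likelihood (log domain for stability)
        log_w = -0.5 * ((z - particles) / obs_std) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Estimate: weighted mean of the particle cloud
        estimates.append(float(np.sum(w * particles)))
        # Resample: multinomial resampling proportional to the weights
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(estimates)
```

In an acoustic localizer the observation `z` would typically be replaced by a localization cue such as a time-delay estimate, and the state would be a 2-D or 3-D position.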

3.
IEEE Trans Syst Man Cybern B Cybern ; 38(3): 799-807, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18558543

ABSTRACT

We propose a system for detecting the active speaker in cluttered and reverberant environments where more than one person speaks and moves. Rather than using only audio information, the system utilizes audiovisual information from multiple acoustic and video sensors that feed separate audio and video tracking modules. The audio module operates using a particle filter (PF) and an information-theoretic framework to provide accurate acoustic source location under reverberant conditions. The video subsystem combines in 3-D a number of 2-D trackers based on a variation of Stauffer's adaptive background algorithm with spatiotemporal adaptation of the learning parameters and a Kalman tracker in a feedback configuration. Extensive experiments show that gains are to be expected when fusion of the separate modalities is performed to detect the active speaker.


Subjects
Artificial Intelligence; Biometry/methods; Environment; Image Interpretation, Computer-Assisted/methods; Sound Spectrography/methods; Speech Recognition Software; Algorithms
4.
J Acoust Soc Am ; 114(2): 833-41, 2003 Aug.
Article in English | MEDLINE | ID: mdl-12942966

ABSTRACT

A theoretical framework is established for the robustness of multichannel sound equalization in reverberant environments. Using results from statistical room acoustics, a closed-form expression is derived that predicts the degradation in performance of an equalization system as the sound source moves from its nominal position inside the enclosure. The presented analysis also provides means of identifying the performance bounds that can be expected when using such a system in an actual room. Using extensive computer simulations, the effects of physical parameters, such as the relative positions of the source and the receivers, and of different design parameters are investigated. Based on the conditions imposed by these parameters, it is shown that, depending on the array geometry and the exact form of the equalizers, slight performance gains can be expected as the number of receivers is increased.


Subjects
Acoustics; Environment; Models, Theoretical