
Sparse periodicity-based auditory features explain human performance in a spatial multitalker auditory scene analysis task.

Josupeit, Angela; Schoenmaker, Esther; van de Par, Steven; Hohmann, Volker.

Eur J Neurosci ; 51(5): 1353-1363, 2020 03.

Article in English | MEDLINE | ID: mdl-29855099

ABSTRACT

Human listeners robustly decode speech information from a talker of interest that is embedded in a mixture of spatially distributed interferers. A relevant question is which time-frequency segments of the speech are predominantly used by a listener to solve such a complex auditory scene analysis task. A recent psychoacoustic study investigated the relevance of low signal-to-noise ratio (SNR) components of a target signal on speech intelligibility in a spatial multitalker situation. For this, a three-talker stimulus was manipulated in the spectro-temporal domain such that target speech time-frequency units below a variable SNR threshold (SNRcrit) were discarded while keeping the interferers unchanged. The psychoacoustic data indicate that only target components at and above a local SNR of about 0 dB contribute to intelligibility. This study applies an auditory scene analysis "glimpsing" model to the same manipulated stimuli. Model data are found to be similar to the human data, supporting the notion of "glimpsing," that is, that salient speech-related information is predominantly used by the auditory system to decode speech embedded in a mixture of sounds, at least for the tested conditions of three overlapping speech signals. This implies that perceptually relevant auditory information is sparse and may be processed with low computational effort, which is relevant for neurophysiological research of scene analysis and novelty processing in the auditory system.
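
The stimulus manipulation described in this abstract (discarding target time-frequency units whose local SNR falls below SNRcrit while leaving the interferers unchanged) can be illustrated with a minimal sketch. It assumes the target and the interferers are already available as per-unit power values; the function names, the `eps` guard, and the toy numbers are hypothetical and are not taken from the study, whose analysis filterbank is not specified here.

```python
import math

def local_snr_db(target_power, interferer_power, eps=1e-12):
    """Local SNR of one time-frequency unit in dB (eps avoids log of zero)."""
    return 10.0 * math.log10((target_power + eps) / (interferer_power + eps))

def prune_target(target, interferer, snr_crit_db):
    """Zero every target unit whose local SNR is below snr_crit_db.

    target, interferer: nested lists of power values indexed [frame][channel].
    The interferer values are only read, never modified.
    """
    pruned = []
    for t_row, i_row in zip(target, interferer):
        pruned.append([t if local_snr_db(t, i) >= snr_crit_db else 0.0
                       for t, i in zip(t_row, i_row)])
    return pruned

# Toy 2 x 2 example (hypothetical values): local SNRs are about +6, -3, 0, -20 dB.
target = [[4.0, 0.5], [1.0, 0.01]]
interferer = [[1.0, 1.0], [1.0, 1.0]]
kept = prune_target(target, interferer, snr_crit_db=0.0)
```

With SNRcrit set to 0 dB, only the units at or above 0 dB local SNR survive in `kept`, which mirrors the "at and above" criterion reported for the human data.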

Subject(s)

Speech Perception, Acoustic Stimulation, Auditory Threshold, Humans, Perceptual Masking, Psychoacoustics, Signal-To-Noise Ratio, Sound, Speech Intelligibility

Modeling speech localization, talker identification, and word recognition in a multi-talker setting.

Josupeit, Angela; Hohmann, Volker.

J Acoust Soc Am ; 142(1): 35, 2017 07.

Article in English | MEDLINE | ID: mdl-28764452

ABSTRACT

This study introduces a model for solving three different auditory tasks in a multi-talker setting: target localization, target identification, and word recognition. The model was used to simulate psychoacoustic data from a call-sign-based listening test involving multiple spatially separated talkers [Brungart and Simpson (2007). Percept. Psychophys. 69(1), 79-91]. The main characteristics of the model are (i) the extraction of salient auditory features ("glimpses") from the multi-talker signal and (ii) the use of a classification method that finds the best target hypothesis by comparing feature templates from clean target signals to the glimpses derived from the multi-talker mixture. The four features used were periodicity, periodic energy, and periodicity-based interaural time and level differences. The model results widely exceeded probability of chance for all subtasks and conditions, and generally coincided strongly with the subject data. This indicates that, despite their sparsity, glimpses provide sufficient information about a complex auditory scene. This also suggests that complex source superposition models may not be needed for auditory scene analysis. Instead, simple models of clean speech may be sufficient to decode even complex multi-talker scenes.
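
The classification idea in this abstract (choosing the target hypothesis whose clean-speech template best matches the sparse glimpses) might be sketched as a nearest-template search. The mean squared distance, the dictionary layout keyed by (frame, channel), and all names and values below are assumptions made for illustration; the abstract does not state which distance measure or data structures the model uses.

```python
def template_distance(glimpses, template):
    """Mean squared difference, evaluated only at the sparse glimpse positions."""
    shared = [key for key in glimpses if key in template]
    if not shared:
        return float("inf")  # no overlap: this hypothesis cannot be scored
    return sum((glimpses[k] - template[k]) ** 2 for k in shared) / len(shared)

def best_hypothesis(glimpses, templates):
    """Return the name of the template closest to the glimpses."""
    return min(templates,
               key=lambda name: template_distance(glimpses, templates[name]))

# Hypothetical feature values (e.g. periodicity in Hz) at (frame, channel) keys.
glimpses = {(0, 1): 100.0, (2, 3): 120.0}
templates = {
    "talker_a": {(0, 1): 101.0, (1, 2): 110.0, (2, 3): 119.0},
    "talker_b": {(0, 1): 150.0, (1, 2): 140.0, (2, 3): 90.0},
}
winner = best_hypothesis(glimpses, templates)
```

Because the distance is evaluated only where glimpses exist, the templates can be dense models of clean speech while the evidence from the mixture stays sparse.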

Modeling of speech localization in a multi-talker mixture using periodicity and energy-based auditory features.

Josupeit, Angela; Kopco, Norbert; Hohmann, Volker.

J Acoust Soc Am ; 139(5): 2911, 2016 05.

Article in English | MEDLINE | ID: mdl-27250183

ABSTRACT

A recent study showed that human listeners are able to localize a short speech target simultaneously masked by four speech tokens in reverberation [Kopco, Best, and Carlile (2010). J. Acoust. Soc. Am. 127, 1450-1457]. Here, an auditory model for solving this task is introduced. The model has three processing stages: (1) extraction of the instantaneous interaural time difference (ITD) information, (2) selection of target-related ITD information ("glimpses") using a template-matching procedure based on periodicity, spectral energy, or both, and (3) target location estimation. The model performance was compared to the human data, and to the performance of a modified model using an ideal binary mask (IBM) at stage (2). The IBM-based model performed similarly to the subjects, indicating that the binaural model is able to accurately estimate source locations. Template matching using spectral energy and using a combination of spectral energy and periodicity achieved good results, while using periodicity alone led to poor results. Particularly, the glimpses extracted from the initial portion of the signal were critical for good performance. Simulation data show that the auditory features investigated here are sufficient to explain human performance in this challenging listening condition and thus may be used in models of auditory scene analysis.
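
Stages (2) and (3) of the model described here, selecting target-related ITD samples and then estimating a location, can be sketched as follows. The abstract does not say how the final estimate is formed, so the median-plus-nearest-candidate rule, the candidate ITD table, and every name and number below are assumptions for illustration only.

```python
from statistics import median

def estimate_location(itds_us, selected, candidate_itds_us):
    """Pick the candidate location whose nominal ITD is closest to the
    median of the selected ("glimpsed") ITD samples.

    itds_us: per-unit ITD estimates in microseconds (stage 1 output).
    selected: booleans marking units kept by the selection stage (stage 2).
    candidate_itds_us: hypothetical mapping from location label to nominal ITD.
    """
    picked = [itd for itd, keep in zip(itds_us, selected) if keep]
    if not picked:
        return None  # nothing selected, so no estimate is possible
    center = median(picked)
    return min(candidate_itds_us,
               key=lambda loc: abs(candidate_itds_us[loc] - center))

# Hypothetical values: three selected units cluster near +250 microseconds.
candidates = {"-30deg": -250.0, "0deg": 0.0, "+30deg": 250.0}
itds = [240.0, -260.0, 10.0, 255.0, 230.0]
mask = [True, False, False, True, True]
location = estimate_location(itds, mask, candidates)
```

Swapping `mask` for an ideal binary mask corresponds to the modified model that the abstract uses as a reference for stage (2).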

Subject(s)

Cues, Noise/adverse effects, Perceptual Masking, Periodicity, Sound Localization, Speech Acoustics, Speech Perception, Acoustic Stimulation, Acoustics, Auditory Pathways/physiology, Computer Simulation, Female, Humans, Male, Models, Psychological, Time Factors

Lateralization of stimuli with alternating interaural time differences: The role of monaural envelope cues.

Reed, Darrin K; Dietz, Mathias; Josupeit, Angela; van de Par, Steven.

J Acoust Soc Am ; 139(1): 30-40, 2016 Jan.

Article in English | MEDLINE | ID: mdl-26827002

ABSTRACT

A temporally acute binaural system can help to resolve inherent fluctuations in binaural information that are often present in complex auditory scenes. Using a broadband noise stimulus that rapidly alternates between two different values of interaural time difference (ITD), the ability of the binaural system to hear the lateral position resulting from one of the ITD values was investigated. Results show that listeners are able to accurately lateralize brief noise tokens of only 3-7 ms in duration. In two subsequent experiments, the role of an amplitude modulation (AM) imposed on the ITD-switching stimulus used in the first experiment was tested. For wideband stimuli, the temporal position of the ITD target relative to the phase of the AM did not influence absolute lateralization or detection performance. When the stimuli were narrowband, however, detection of the ITD target was best when temporally positioned in the rising portion of the AM. These experiments illustrate that the auditory system is capable of making accurate lateral estimates of very brief moments of ITD information. Furthermore, for these instantaneous changes in ITD information, the stimulus bandwidth can influence the role of envelope cues for the readout of binaural information.

Release from masking of low-frequency complex tones by high-frequency complex tone cue bands.

Josupeit, Angela; Hohmann, Volker; van de Par, Steven.

J Acoust Soc Am ; 132(6): EL450-5, 2012 Dec.

Article in English | MEDLINE | ID: mdl-23231207

ABSTRACT

This study investigated the influence of high-frequency cue bands on the detection and discrimination of low-frequency target bands presented in a 3000-Hz low-pass noise masker. Target and cue bands were complex tones with 80-Hz spacing. The cue band consisted of 60 components starting at 4000 Hz; targets consisted of four components starting at different frequencies (500, 700, 1000, 1200, and 1500 Hz). Targets were presented with different durations within the 500-ms masker; target and cue bands had a common on- and offset. Presentation of the high-frequency complex tone significantly enhanced both the discrimination and detection thresholds by 2-3 dB.
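
The complex-tone stimuli described in this abstract can be illustrated with a short sketch that lists the component frequencies and synthesizes a tone. It assumes that "starting at" refers to the lowest component, and the sample rate, equal amplitudes, sine starting phases, and 50-ms duration are hypothetical choices that the abstract does not specify.

```python
import math

def component_freqs(start_hz, n_components, spacing_hz=80.0):
    """Frequencies of a complex tone with constant component spacing."""
    return [start_hz + k * spacing_hz for k in range(n_components)]

def complex_tone(freqs_hz, duration_s, sample_rate_hz=44100):
    """Sum of equal-amplitude sines in sine phase, scaled to stay within [-1, 1]."""
    n_samples = int(round(duration_s * sample_rate_hz))
    scale = 1.0 / len(freqs_hz)
    return [scale * sum(math.sin(2.0 * math.pi * f * i / sample_rate_hz)
                        for f in freqs_hz)
            for i in range(n_samples)]

cue_freqs = component_freqs(4000.0, 60)    # 60 components from 4000 Hz
target_freqs = component_freqs(500.0, 4)   # four components from 500 Hz
target = complex_tone(target_freqs, duration_s=0.05)
```

Under these assumptions the cue band spans 4000 to 8720 Hz, well above the 3000-Hz low-pass masker, while the 500-Hz target occupies 500 to 740 Hz inside the masked region.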

Subject(s)

Cues, Discrimination, Psychological, Noise/adverse effects, Perceptual Masking, Pitch Perception, Acoustic Stimulation, Adult, Audiometry, Auditory Threshold, Humans, Middle Aged, Psychoacoustics, Signal Detection, Psychological, Time Factors
