Results 1 - 20 of 32
1.
J Acoust Soc Am ; 142(1): 249, 2017 Jul.
Article in English | MEDLINE | ID: mdl-28764474

ABSTRACT

This paper describes the effect of two types of temporal permutations of the speech waveform on speech intelligibility. Using an overlap-add procedure with triangular windows, for frame lengths from 1/8 to 2048 ms, the temporal order of the speech samples within each frame was subjected to one of two permutations: time-reversal or randomization. For both permutations, speech intelligibility tests show, as expected, 100% intelligibility for the very short frame lengths containing only a few speech samples. Intelligibility drops to essentially zero as frame lengths increase toward about 1 ms. Interestingly, for the time-reversal condition only, intelligibility recovers to essentially 100% for frame lengths in the 4-32 ms range, dropping again to zero for frame lengths exceeding about 100 ms. Tests in Japanese and in English show essentially similar results. The data are interpreted along the lines of a previous paper by Kazama and the present authors [J. Acoust. Soc. Am. 127(3), 1432-1439 (2010)]. As in that previous paper, the loss of temporal envelope correlation shows a pattern very similar to that of the intelligibility data, again illustrating the importance of preserving narrow-band envelopes for speech intelligibility.
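The frame-wise permutation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the function name, the 50% hop, and the edge handling are my assumptions.

```python
import numpy as np

def permute_frames(x, frame_len, mode="reverse", rng=None):
    """Reorder samples within overlapping triangular-windowed frames."""
    hop = frame_len // 2  # 50% overlap: triangular windows sum to roughly unity
    win = np.bartlett(frame_len)
    y = np.zeros(len(x))
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * win
        if mode == "reverse":
            frame = frame[::-1]  # time-reverse within the frame
        else:
            rng = rng or np.random.default_rng()
            frame = rng.permutation(frame)  # randomize within the frame
        y[start:start + frame_len] += frame  # overlap-add
    return y
```

With the within-frame reordering disabled, the overlap-add approximately reconstructs the input (up to edge effects), so any degradation comes from the permutation itself.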

2.
J Speech Lang Hear Res ; 56(5): 1364-72, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23838985

ABSTRACT

PURPOSE: In this explorative study, the authors investigated the relationship between auditory and cognitive abilities and self-reported hearing disability. METHOD: Thirty-two adults with mild to moderate hearing loss completed the Amsterdam Inventory for Auditory Disability and Handicap (AIADH; Kramer, Kapteyn, Festen, & Tobi, 1996) and performed the Text Reception Threshold (TRT; Zekveld, George, Kramer, Goverts, & Houtgast, 2007) test as well as tests of spatial working memory (SWM) and visual sustained attention. Regression analyses examined the predictive value of age, hearing thresholds (pure-tone averages [PTAs]), speech perception in noise (speech reception thresholds in noise [SRTNs]), and the cognitive tests for the 5 AIADH factors. RESULTS: Besides the variance explained by age, PTA, and SRTN, cognitive abilities were related to each hearing factor. The reported difficulties with sound detection and speech perception in quiet were less severe for participants with higher age, lower PTAs, and better TRTs. Fewer sound localization and speech perception in noise problems were reported by participants with better SRTNs and smaller SWM. Fewer sound discrimination difficulties were reported by subjects with better SRTNs and TRTs and smaller SWM. CONCLUSIONS: The results suggest a general role of the ability to read partly masked text in subjective hearing. Large working memory was associated with more reported hearing difficulties. This study shows that besides auditory variables and age, cognitive abilities are related to self-reported hearing disability.


Subject(s)
Cognition , Hearing Loss, Sensorineural/physiopathology , Hearing Loss, Sensorineural/psychology , Persons With Hearing Impairments/psychology , Speech Perception/physiology , Acoustic Stimulation/methods , Aged , Aged, 80 and over , Attention/physiology , Auditory Threshold/physiology , Female , Humans , Male , Memory, Short-Term/physiology , Middle Aged , Noise , Perceptual Masking/physiology , Reading , Regression Analysis , Self Report , Speech Reception Threshold Test/methods
3.
Int J Audiol ; 52(5): 305-21, 2013 May.
Article in English | MEDLINE | ID: mdl-23570289

ABSTRACT

OBJECTIVE: This paper describes the composition and international multi-centre evaluation of a battery of tests termed the preliminary auditory profile. It includes measures of loudness perception, listening effort, speech perception, spectral and temporal resolution, spatial hearing, self-reported disability and handicap, and cognition. Clinical applicability and comparability across different centres are investigated. DESIGN: Headphone tests were conducted in five centres divided over four countries. Effects of test-retest, ear, and centre were investigated. Results for normally-hearing (NH) and hearing-impaired (HI) listeners are presented. STUDY SAMPLE: Thirty NH listeners aged 19-39 years, and 72 HI listeners aged 22-91 years with a broad range of hearing losses were included. RESULTS: Test-retest reliability was generally good and there were very few right/left ear effects. Results of all tests were comparable across centres for NH listeners after baseline correction to account for necessary differences between test materials. For HI listeners, results were comparable across centres for the language-independent tests. CONCLUSIONS: The auditory profile forms a clinical test battery that is applicable in four different languages. Even after baseline correction, differences between test materials have to be taken into account when interpreting results of language-dependent tests in HI listeners.


Subject(s)
Audiometry/methods , Auditory Perception , Hearing Disorders/diagnosis , Persons With Hearing Impairments/psychology , Acoustic Stimulation , Adult , Aged , Aged, 80 and over , Audiometry, Pure-Tone , Case-Control Studies , Cognition , Disability Evaluation , Europe , Hearing Disorders/psychology , Humans , Language , Loudness Perception , Middle Aged , Noise/adverse effects , Observer Variation , Perceptual Masking , Predictive Value of Tests , Reproducibility of Results , Sound Localization , Sound Spectrography , Speech Perception , Speech Reception Threshold Test , Time Factors , Time Perception , Young Adult
4.
J Speech Lang Hear Res ; 54(6): 1702-8, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22180022

ABSTRACT

PURPOSE: Researchers have used the distortion-sensitivity approach in the psychoacoustical domain to investigate the role of auditory processing abilities in speech perception in noise (van Schijndel, Houtgast, & Festen, 2001; Goverts & Houtgast, 2010). In this study, the authors examined the potential applicability of the distortion-sensitivity approach for investigating the role of linguistic abilities in speech understanding in noise. METHOD: The authors applied the distortion-sensitivity approach by measuring the processing of visually presented masked text in a condition with manipulated syntactic, lexical, and semantic cues and while using the Text Reception Threshold (George et al., 2007; Kramer, Zekveld, & Houtgast, 2009; Zekveld, George, Kramer, Goverts, & Houtgast, 2007) method. Two groups that differed in linguistic abilities were studied: 13 native and 10 non-native speakers of Dutch, all typically hearing university students. RESULTS: As expected, the non-native subjects showed substantially reduced performance. The results of the distortion-sensitivity approach yielded differentiated results on the use of specific linguistic cues in the 2 groups. CONCLUSION: The results show the potential value of the distortion-sensitivity approach in studying the role of linguistic abilities in speech understanding in noise of individuals with hearing impairment.


Subject(s)
Language Tests , Linguistics , Perceptual Distortion/physiology , Psychoacoustics , Speech Perception/physiology , Acoustic Stimulation/methods , Adult , Auditory Threshold/physiology , Hearing Loss/diagnosis , Hearing Loss/physiopathology , Humans , Middle Aged , Multilingualism , Noise , Perceptual Masking/physiology , Photic Stimulation/methods , Reading , Young Adult
5.
J Acoust Soc Am ; 127(5): 3073-84, 2010 May.
Article in English | MEDLINE | ID: mdl-21117756

ABSTRACT

Reduced binaural performance of hearing-impaired listeners may be caused not only by raised hearing thresholds (reduced audibility), but also by supra-threshold deficits in the coding of signal cues. The present study investigated this issue using comparisons of the binaural intelligibility level difference (BILD): the improvement in speech-reception threshold scores for N0Sπ relative to N0S0 presentation conditions. The question was which types of supra-threshold deficits play a role in reducing the BILDs of hearing-impaired subjects. BILDs were measured for 25 mildly to moderately sensorineural hearing-impaired listeners, under conditions in which optimal audibility was assured. All stimuli were bandpass filtered (250-4000 Hz). A distortion-sensitivity approach was used to assess the sensitivity of the subjects' BILDs to external stimulus perturbations in the phase, frequency, time, and intensity domains. The underlying assumption of this approach is that an auditory coding deficit in a signal cue in a particular domain will result in a low sensitivity to external perturbations applied in that domain. Compared to reference data for listeners with normal BILDs, the distortion-sensitivity data for a subgroup of eight listeners with reduced BILDs suggest that these reductions were caused by coding deficits in the phase and time domains.
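The two presentation conditions compared above are simple to construct: the noise is identical in both ears (N0), and the speech is either in phase (S0) or polarity-inverted in one ear (Sπ). A minimal sketch follows; it is illustrative only, omitting the study's bandpass filtering and level calibration, and the function name is mine.

```python
import numpy as np

def dichotic_pair(speech, noise, antiphasic=True):
    """Return a (2, n) stereo pair: N0Spi if antiphasic, else N0S0."""
    left = noise + speech
    right = noise + (-speech if antiphasic else speech)  # Spi inverts one ear
    return np.stack([left, right])
```

The BILD is then the difference between the speech-reception thresholds measured with the antiphasic and the diotic pair.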


Subject(s)
Auditory Threshold , Cues , Hearing Loss, Sensorineural/psychology , Persons With Hearing Impairments/psychology , Signal Detection, Psychological , Speech Intelligibility , Acoustic Stimulation , Adult , Aged , Audiometry, Pure-Tone , Case-Control Studies , Female , Hearing Loss, Sensorineural/physiopathology , Humans , Male , Middle Aged , Psychoacoustics , Sound Spectrography , Speech Acoustics , Speech Reception Threshold Test , Time Factors
6.
J Speech Lang Hear Res ; 53(6): 1429-39, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20689027

ABSTRACT

PURPOSE: The Speech Transmission Index (STI; Houtgast, Steeneken, & Plomp, 1980; Steeneken & Houtgast, 1980) is commonly used to quantify the adverse effects of reverberation and stationary noise on speech intelligibility for normal-hearing listeners. Duquesnoy and Plomp (1980) showed that the STI can be applied for presbycusic listeners, relating speech reception thresholds (SRTs) in various reverberant conditions to a fixed, subject-dependent STI value. The current study aims to extend their results to a wider range of hearing-impaired listeners. METHOD: A reverberant analogue of the SRT is presented, the speech reception reverberation threshold (SRRT), which determines the amount of reverberation a listener can sustain while still understanding 50% of the presented sentences. SRT measurements were performed and evaluated in terms of the STI for 5 normal-hearing participants and 36 randomly selected hearing-impaired participants. RESULTS: Differences in STI between reverberant and noisy conditions are small, equivalent to a change in speech-to-noise ratio of < 1.3 dB. CONCLUSION: The STI appears to be a convenient single number for quantifying speech reception of hearing-impaired listeners in noise and/or reverberation, regardless of the nature of the hearing loss. In future research, the SRRT may be applied to further investigate the supposed importance of cognitive processing in reverberant listening conditions.


Subject(s)
Acoustic Stimulation/methods , Auditory Threshold/physiology , Noise , Presbycusis/physiopathology , Speech Intelligibility/physiology , Speech Perception/physiology , Adult , Audiometry, Pure-Tone , Hearing/physiology , Humans , Middle Aged
7.
J Acoust Soc Am ; 127(3): 1432-9, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20329843

ABSTRACT

This paper investigates the significance of the magnitude or the phase in the short-term Fourier spectrum for speech intelligibility as a function of the time-window length. For a wide range of window lengths (1/16-2048 ms), two hybrid signals were obtained by a cross-wise combination of the magnitude and phase spectra of speech and white noise. Speech intelligibility data showed the significance of the phase spectrum for longer windows (>256 ms) and for very short windows (<4 ms), and that of the magnitude spectrum for medium-range window lengths. The hybrid signals used in the intelligibility test were analyzed in terms of the preservation of the original narrow-band speech envelopes. Correlations between the narrow-band envelopes of the original speech and the hybrid signals show a similar pattern as a function of window length. This result illustrates the importance of the preservation of narrow-band envelopes for speech intelligibility. The observed significance of the phase spectrum in recovering the narrow-band envelopes for the long windows and for the very short windows is discussed.
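The cross-wise combination can be sketched frame by frame: take the magnitude spectrum of one signal and the phase spectrum of the other, then invert the transform. This is a simplified illustration with non-overlapping rectangular frames; the paper's actual windowing and overlap are not reproduced here, and the function name is mine.

```python
import numpy as np

def hybrid(mag_src, phase_src, frame_len):
    """Combine the magnitude spectrum of one signal with the phase of another."""
    n = min(len(mag_src), len(phase_src)) // frame_len * frame_len
    out = np.zeros(n)
    for s in range(0, n, frame_len):
        A = np.fft.rfft(mag_src[s:s + frame_len])
        B = np.fft.rfft(phase_src[s:s + frame_len])
        # keep |A| but impose the phase angles of B
        out[s:s + frame_len] = np.fft.irfft(np.abs(A) * np.exp(1j * np.angle(B)), frame_len)
    return out
```

Feeding speech and white noise in as the two sources, in both orders, yields the two hybrid signals used in the intelligibility test.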


Subject(s)
Fourier Analysis , Models, Theoretical , Phonetics , Speech Intelligibility/physiology , Speech Perception/physiology , Adult , Humans , Middle Aged , Noise , Young Adult
8.
Scand J Psychol ; 50(5): 507-15, 2009 Oct.
Article in English | MEDLINE | ID: mdl-19778398

ABSTRACT

The ability to comprehend speech in noise is influenced by bottom-up auditory and top-down cognitive capacities. Separate examination of these capacities is relevant for various purposes. Speech-Reception-Threshold (SRT) tests measure an individual's ability to comprehend speech. This paper addresses the value of the Text-Reception-Threshold (TRT) test (a visual parallel of the SRT test) for assessing the cognitive capacities allocated during speech comprehension. We conducted a secondary data analysis including 87 normally-hearing adults (aged 18 to 78 years). Correlation coefficients between age, TRT, working memory (Spatial Span; SSP), and SRT were examined. The TRT and SRT correlated significantly (r = 0.30), supporting the value of the TRT in explaining inter-individual differences in SRTs. The relations between age and TRT and between SSP and TRT were non-significant. The results indicate that the current TRT test does not fully cover the cognitive aspects relevant to speech comprehension. Adaptation of the test is required before clinical implementation can be considered.


Subject(s)
Cognition/physiology , Comprehension/physiology , Speech Perception/physiology , Speech/physiology , Visual Perception/physiology , Acoustic Stimulation , Adolescent , Adult , Age Factors , Aged , Attention/physiology , Auditory Threshold/physiology , Female , Hearing/physiology , Humans , Male , Memory, Short-Term/physiology , Middle Aged , Neuropsychological Tests , Noise , Perceptual Masking/physiology , Photic Stimulation , Speech Discrimination Tests
9.
Ear Hear ; 30(2): 262-72, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19194286

ABSTRACT

OBJECTIVE: The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al., 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when partly incorrect, automatically generated subtitles are presented. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. DESIGN: To investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years), and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity, and the audiovisual benefit and listening effort. RESULTS: The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study.
Higher ASR accuracy levels resulted in more benefit obtained from the subtitles. Speech comprehension improved even for relatively low ASR accuracy levels; for example, participants obtained about 2 dB SNR audiovisual benefit for ASR accuracies around 74%. Delaying the presentation of the text reduced the benefit and increased the listening effort. Participants with relatively low unimodal speech comprehension obtained greater benefit from the subtitles than participants with better unimodal speech comprehension. We observed an age-related decline in the working-memory capacity of the listeners with normal hearing. A higher age and a lower working memory capacity were associated with increased effort required to use the subtitles to improve speech comprehension. CONCLUSIONS: Participants were able to use partly incorrect and delayed subtitles to increase their comprehension of speech in noise, regardless of age and hearing loss. This supports the further development and evaluation of an assistive listening system that displays automatically recognized speech to aid speech comprehension by listeners with hearing impairment.


Subject(s)
Deafness/rehabilitation , Hearing Aids , Hearing , Memory, Short-Term , Speech Perception , Speech Recognition Software , Adolescent , Adult , Age Factors , Aged , Auditory Threshold , Communication Aids for Disabled , Female , Humans , Linguistics , Male , Middle Aged , Noise , Reading , Young Adult
10.
Trends Amplif ; 13(1): 44-68, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19126551

ABSTRACT

This study examined the subjective benefit obtained from automatically generated captions during telephone-speech comprehension in the presence of babble noise. Short stories were presented by telephone, either with or without captions generated offline by an automatic speech recognition (ASR) system. To simulate online ASR, the word accuracy (WA) level of the captions was 60% or 70%, and the text was presented with a delay relative to the speech. After each test, the hearing-impaired participants (n = 20) completed the NASA Task Load Index and several rating scales evaluating the support from the captions. Participants indicated that using the erroneous text in speech comprehension was difficult, and the reported task load did not differ between the audio + text and audio-only conditions. In a follow-up experiment (n = 10), the perceived benefit of presenting captions increased when WA levels were raised to 80% and 90% and the text delay was eliminated. However, in general, the task load did not decrease when captions were presented. These results suggest that the extra effort required to process the text may have been compensated for by less effort required to comprehend the speech. Future research should aim at reducing the complexity of the task to increase the willingness of hearing-impaired persons to use an assistive communication system that automatically provides captions. The current results underline the need to obtain both objective and subjective measures of benefit when evaluating assistive communication systems.


Subject(s)
Communication Aids for Disabled , Correction of Hearing Impairment , Hearing Loss, Mixed Conductive-Sensorineural/rehabilitation , Hearing Loss, Sensorineural/rehabilitation , Speech Perception , Speech Recognition Software , Telephone , Visual Perception , Adult , Aged , Aged, 80 and over , Cognition , Comprehension , Computer Systems , Female , Humans , Male , Memory , Middle Aged , Noise/adverse effects , Perceptual Masking , Speech Reception Threshold Test , Surveys and Questionnaires , Time Factors
11.
J Acoust Soc Am ; 124(2): 1269-77, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18681613

ABSTRACT

Listening conditions in everyday life typically include a combination of reverberation and nonstationary background noise. It is well known that sentence intelligibility is adversely affected by these factors. To assess their combined effects, an approach is introduced which combines two methods of predicting speech intelligibility, the extended speech intelligibility index (ESII) and the speech transmission index. First, the effects of reverberation on nonstationary noise (i.e., reduction of masker modulations) and on speech modulations are evaluated separately. Subsequently, the ESII is applied to predict the speech reception threshold (SRT) in the masker with reduced modulations. To validate this approach, SRTs were measured for ten normal-hearing listeners, in various combinations of nonstationary noise and artificially created reverberation. After taking the characteristics of the speech corpus into account, results show that the approach accurately predicts SRTs in nonstationary noise and reverberation for normal-hearing listeners. Furthermore, it is shown that, when reverberation is present, the benefit from masker fluctuations may be substantially reduced.


Subject(s)
Noise/adverse effects , Perceptual Masking , Speech Acoustics , Speech Intelligibility , Speech Perception , Adult , Humans , Reproducibility of Results , Sound Spectrography , Speech Reception Threshold Test , Time Factors , Vibration
12.
J Speech Lang Hear Res ; 51(6): 1588-98, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18695020

ABSTRACT

PURPOSE: Sensitivity to sinusoidal amplitude modulations (SAMs) is reduced when other modulated maskers are presented simultaneously at a distant carrier frequency (also referred to as modulation detection interference [MDI]). This article describes the results of experiments in which the onset difference between masker and target was varied as a parameter. METHOD: Carrier frequencies were 1 kHz (target: 625 ms, 8 Hz SAM) and 2 kHz (masker: 625 ms, 8 Hz SAM; modulation depth = 1), presented at 25 dB SL for listeners with impaired hearing (n = 8) and at 25 dB SL and 50 dB SL for listeners with normal hearing (n = 6). The masker was delayed by 0, 125, 250, 500, 625, or 750 ms relative to the target. RESULTS: Sensitivity to SAMs was reduced in both groups by simultaneous presentation of a modulated masker. Reducing the temporal overlap (i.e., increasing the onset delay between masker and target) increased the sensitivity to SAMs in the presence of modulated maskers. CONCLUSION: The gradual reduction in MDI with increasing asynchrony between masker and target suggests that MDI is not solely related to perceptual grouping. Reduced sensitivity to SAMs due to prior stimulation with SAM stimuli (forward masking) and deficits in across-channel integration are other factors that may play a role.


Subject(s)
Hearing Loss, Sensorineural/epidemiology , Hearing , Perceptual Masking , Signal Detection, Psychological , Speech Perception , Aged , Female , Hearing Loss, Sensorineural/diagnosis , Humans , Male , Middle Aged , Sensitivity and Specificity , Severity of Illness Index , Time Factors
13.
Ear Hear ; 29(6): 838-52, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18633325

ABSTRACT

OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT is defined as the speech-to-noise ratio at which 50% of the test sentences are reproduced correctly. In the auditory-alone SRT tests, the test sentences were presented only auditorily; in the audiovisual SRT test, the ASR output of each test sentence was also presented textually. The ASR system was used in two recognition modes: recognition of spoken words (word output), or recognition of speech sounds or phones (phone output). The benefit obtained from the ASR output was defined as the difference between the auditory-alone and the audiovisual SRT. We also examined the readability of unimodally displayed ASR output (i.e., the percentage of sentences in which ASR errors were identified and accurately corrected). In experiment 1, the readability and benefit obtained from ASR word output (n = 14) was compared with the benefit obtained from ASR phone output (n = 10). In experiment 2, the effect of presenting an indication of the ASR confidence level was examined (n = 14). The effect of delaying the presentation of the text relative to the speech (up to 6 sec) was examined in experiment 3 (n = 24). The ASR accuracy level was varied systematically in each experiment. RESULTS: Mean readability scores ranged from 0 to 46%, depending on ASR accuracy. Speech comprehension improved when the ASR output was displayed. For example, when the ASR output corresponded to readability scores of only about 20% correct, the text improved the SRT by about 3 dB SNR in the audiovisual SRT test. This improvement corresponds to an increase in speech comprehension of about 35% in critical conditions. Equally readable phone and word output provides similar benefit in speech comprehension. 
For equal ASR accuracies, both the readability and the benefit from the word output generally exceeded the benefits from the phone output. Presenting information about the ASR confidence level did not influence either the readability or the benefit obtained from the word output. Delaying the text relative to the speech moderately decreased the benefit. CONCLUSIONS: The present study indicates that speech comprehension improves considerably by textual ASR output with moderate accuracies. The study shows that this improvement depends on the readability of the ASR output. Word output has better accuracy and readability than phone output. Listeners are therefore better able to use the ASR word output than phone output to improve speech comprehension. The ability of older listeners and listeners with hearing impairments to use ASR output in speech comprehension requires further study.


Subject(s)
Communication Aids for Disabled , Noise , Speech Perception , Speech Recognition Software , Speech , Acoustic Stimulation , Adolescent , Adult , Deafness/rehabilitation , Female , Humans , Male , Middle Aged , Phonetics , Photic Stimulation , Reading , Speech Reception Threshold Test , Young Adult
14.
Int J Audiol ; 47(6): 287-95, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18569101

ABSTRACT

It is generally recognized that poor results on speech-in-noise tests by hearing-impaired persons cannot be fully explained by elevated pure-tone hearing thresholds. Plomp, among others, has shown that an additional factor has to be taken into account, often referred to in general terms as distortion. In an attempt to identify auditory and cognitive functions that may underlie this distortion, various studies on this topic originating from Plomp's research group are reviewed, as well as other relevant studies that provide quantitative data on the correlations between various types of auditory or cognitive tests (the predictor tests) and speech-in-noise tests. The predictor variables considered include, besides the pure-tone audiogram, measures of spectral and temporal resolution, the intensity difference limen, age, and some cognitive aspects. The results indicate that, by and large, these variables fall short of fully explaining the variance observed in the speech-in-noise tests. This strongly suggests that the predictor variables considered so far do not cover all sources of variance relevant to speech reception in noise.


Subject(s)
Cognition Disorders/diagnosis , Hearing Loss/diagnosis , Noise , Speech Reception Threshold Test , Age Factors , Audiometry, Pure-Tone , Cognition Disorders/complications , Hearing Loss/complications , Humans , Predictive Value of Tests , Research Design
15.
Ear Hear ; 29(1): 99-111, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18091101

ABSTRACT

OBJECTIVE: The aim of this study was to examine the support obtained from degraded visual information in the comprehension of speech in noise. DESIGN: We presented sentences auditorily (speech reception threshold test), visually (text reception threshold test), and audiovisually. Presenting speech in noise and masked written text enabled the quantification and systematic variation of the amount of information presented in both modalities. Eighteen persons with normal hearing (aged 19 to 31 yr) participated. For half of them the text was masked by a bar pattern, and for the other half by random dots. The text was presented simultaneously with or delayed relative to the speech. Using an adaptive procedure, the amount of information required for a correct reproduction of 50% of the sentences was determined for both the unimodal and the audiovisual stimuli. Bimodal support was defined as the difference between the observed bimodal performance and that predicted by an independent channels model. Nonparametric tests were used to evaluate the bimodal support and the effect of delaying the text. RESULTS: Masked text substantially supported the comprehension of speech in noise; the bimodal support ranged from 15% to 25% correct. A negative effect of delaying the text was observed in some conditions for the participants presented with text masked by the bar pattern. CONCLUSIONS: The ability of participants to reproduce bimodally presented sentences exceeds the performance predicted by an independent channels model. This indicates that a relatively small amount of visual information can substantially augment speech comprehension in noise, which supports the use of visual information to improve speech comprehension by listeners with hearing impairment, even if the visual information is incomplete.
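The independent channels baseline against which bimodal support is defined is commonly computed as the probability that at least one modality alone suffices; a sketch follows (a common formulation, assumed to match the paper's model):

```python
def independent_channels(p_aud, p_vis):
    """Predicted bimodal proportion correct if the auditory and visual
    channels succeed independently: 1 - (1 - p_aud) * (1 - p_vis)."""
    return 1.0 - (1.0 - p_aud) * (1.0 - p_vis)

def bimodal_support(p_observed, p_aud, p_vis):
    """Observed bimodal score minus the independent-channels prediction."""
    return p_observed - independent_channels(p_aud, p_vis)
```

For example, with 50% correct in each modality alone the model predicts 75% bimodally, so an observed 90% would correspond to 15% bimodal support, the lower end of the range reported above.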


Subject(s)
Auditory Perception/physiology , Noise , Perceptual Masking , Reading , Speech Perception , Visual Perception/physiology , Adult , Auditory Threshold , Female , Humans , Male , Noise/adverse effects , Perceptual Masking/physiology
16.
J Acoust Soc Am ; 124(6): 3937-46, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19206818

ABSTRACT

A new concept is proposed that relates to intelligibility of speech in noise. The concept combines traditional estimations of signal-to-noise ratios (S/N) with elements from the modulation transfer function model, which results in the definition of the signal-to-noise ratio in the modulation domain: the (SN)mod. It is argued that this (SN)mod, quantifying the strength of speech modulations relative to a floor of spurious modulations arising from the speech-noise interaction, is the key factor in relation to speech intelligibility. It is shown that, by using a specific test signal, the strength of these spurious modulations can be measured, allowing an estimation of the (SN)mod for various conditions of additive noise, noise suppression, and amplitude compression. By relating these results to intelligibility data for these same conditions, the relevance of the (SN)mod as the key factor underlying speech intelligibility is clearly illustrated. For instance, it is shown that the commonly observed limited effect of noise suppression on speech intelligibility is correctly "predicted" by the (SN)mod, whereas traditional measures such as the speech transmission index, considering only the changes in the speech modulations, fall short in this respect. It is argued that the (SN)mod may provide a relevant tool in the design of successful noise-suppression systems.


Subject(s)
Noise/adverse effects , Signal Detection, Psychological , Speech Intelligibility , Speech Perception , Algorithms , Cues , Humans , Models, Biological , Pattern Recognition, Physiological , Perceptual Masking , Reproducibility of Results , Signal Processing, Computer-Assisted , Sound Spectrography , Time Factors
17.
J Speech Lang Hear Res ; 50(3): 576-84, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17538101

ABSTRACT

PURPOSE: In this study, the authors aimed to develop a visual analogue of the widely used Speech Reception Threshold (SRT; R. Plomp & A. M. Mimpen, 1979b) test. The Text Reception Threshold (TRT) test, in which visually presented sentences are masked by a bar pattern, enables the quantification of modality-aspecific variance in speech-in-noise comprehension to obtain more insight into interindividual differences in this ability. METHOD: Using an adaptive procedure similar to the SRT test, the TRT test determines the percentage of unmasked text needed to read 50% of sentences correctly. SRTs in stationary noise (SRT(STAT)), modulated noise (SRT(MOD)), and TRTs were determined for 34 participants with normal hearing, aged 19 to 78 years. RESULTS: The results indicate that about 30% of the variance in SRT(STAT) and SRT(MOD) is shared with variance in TRT, which reflects the shared involvement of a modality-aspecific cognitive or linguistic ability in forming meaningful wholes of fragments of sentences. CONCLUSION: The TRT test, a visual analogue of the SRT test, has been developed to measure the variance in speech-in-noise comprehension associated with modality-aspecific cognitive skills. In future research, normative data of the TRT test should be developed. It would also be interesting to measure TRTs of individuals experiencing difficulties understanding speech.
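The adaptive logic shared by the SRT and TRT tests can be sketched as a simple 1-up/1-down staircase, which converges on the 50%-correct point: the task is made harder after a correct response and easier after an error. The start value, step sizes, trial count, and reversal-averaging rule below are assumptions for illustration, not the published protocol.

```python
def run_adaptive_track(subject, n_trials=40, start=80.0, step=6.0, min_step=2.0):
    """Minimal 1-up/1-down adaptive track over 'percent unmasked text'
    (0-100). `subject(pct)` returns True if the sentence was reported
    correctly at that percentage. Converges on the 50%-correct threshold."""
    level = start
    reversals = []
    last_dir = None
    for _ in range(n_trials):
        correct = subject(level)
        direction = -1 if correct else +1  # harder after a correct response
        if last_dir is not None and direction != last_dir:
            reversals.append(level)
            step = max(min_step, step / 2)  # shrink the step at reversals
        last_dir = direction
        level = min(100.0, max(0.0, level + direction * step))
    # Threshold estimate: mean level over the later reversals (assumed rule).
    tail = reversals[2:] if len(reversals) > 2 else reversals
    return sum(tail) / len(tail) if tail else level
```

Running this against a simulated reader with a logistic psychometric function centered at some true threshold recovers that threshold to within a few percentage points.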


Subject(s)
Speech Perception , Speech Reception Threshold Test , Visual Perception , Adult , Aged , Female , Humans , Male , Middle Aged
18.
J Acoust Soc Am ; 121(4): 2362-75, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17471748

ABSTRACT

Speech reception thresholds (SRTs) for sentences were determined in stationary and modulated background noise for two age-matched groups of normal-hearing (N = 13) and hearing-impaired listeners (N = 21). Correlations were studied between the SRT in noise and measures of auditory and nonauditory performance, after which stepwise regression analyses were performed within both groups separately. Auditory measures included the pure-tone audiogram and tests of spectral and temporal acuity. Nonauditory factors were assessed by measuring the text reception threshold (TRT), a visual analogue of the SRT, in which partially masked sentences were adaptively presented. Results indicate that, for the normal-hearing group, the variance in speech reception is mainly associated with nonauditory factors, both in stationary and in modulated noise. For the hearing-impaired group, speech reception in stationary noise is mainly related to the audiogram, even when audibility effects are accounted for. In modulated noise, both auditory (temporal acuity) and nonauditory factors (TRT) contribute to explaining interindividual differences in speech reception. Age was not a significant factor in the results. It is concluded that, under some conditions, nonauditory factors are relevant for the perception of speech in noise. Further evaluation of nonauditory factors might enable adapting the expectations from auditory rehabilitation in clinical settings.
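The flavor of the stepwise regression analysis can be illustrated with a toy forward-selection routine: at each step, the predictor that most increases R² is added. This is a generic sketch of the analysis style only; the study's actual entry/exit criteria (e.g. F-to-enter thresholds) and predictor set are not reproduced, and the minimal-gain stopping rule here is an assumption.

```python
import numpy as np

def forward_stepwise(X, y, names, max_vars=3, min_gain=0.01):
    """Toy forward stepwise regression: greedily add the predictor column of
    X that most increases R^2 (ordinary least squares with an intercept).
    Returns the chosen predictor names and the final R^2."""
    n = len(y)

    def r_squared(cols):
        A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        total = (y - y.mean()) @ (y - y.mean())
        return 1.0 - (resid @ resid) / total

    chosen, remaining, current = [], list(range(X.shape[1])), 0.0
    while remaining and len(chosen) < max_vars:
        best = max(remaining, key=lambda c: r_squared(chosen + [c]))
        gain = r_squared(chosen + [best]) - current
        if gain < min_gain:  # assumed stopping rule
            break
        chosen.append(best)
        remaining.remove(best)
        current += gain
    return [names[c] for c in chosen], current
```

On synthetic data where one predictor dominates the outcome, that predictor is selected first, mirroring how the audiogram dominated the stationary-noise results for the hearing-impaired group.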


Subject(s)
Aging/physiology , Noise/adverse effects , Speech Perception/physiology , Age Factors , Aged , Aged, 80 and over , Audiometry, Pure-Tone , Female , Hearing Loss, Sensorineural/epidemiology , Humans , Male , Middle Aged , Speech Reception Threshold Test
19.
Int J Audiol ; 46(3): 134-44, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17365067

ABSTRACT

The objective of the study was to examine the ability to understand digits in different types of noise. Adaptive speech-in-noise tests were developed that measure the speech reception threshold (SRTn), i.e., the signal-to-noise ratio that corresponds to 50% intelligibility. Digits were presented in continuous noise, 16-Hz interrupted noise, and 32-Hz interrupted noise. The standard Dutch triplet SRTn test in continuous noise was also included. Results for forty-two ears of normal-hearing and hearing-impaired adult participants are presented. The ratio between the standard deviation in SRTn values between subjects and the measurement error determines the efficiency of the tests. A high efficiency could be achieved by using triplets instead of digits, or by using 16-Hz interrupted noise instead of continuous noise, because either choice resulted in a large spread in SRTn values. The simple calculation method of averaging presentation levels was highly efficient. The digit SRTn test in 16-Hz interrupted noise was very efficient in discriminating between normal-hearing and hearing-impaired listeners, and might be used to screen for hearing loss as measured by pure-tone audiometry.
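The two quantities at the heart of this abstract can be sketched directly: the "simple calculation" of estimating an SRTn by averaging the presentation levels of an adaptive track, and an efficiency index comparing the between-subject spread in SRTn with the measurement error. The burn-in of four trials and the unbiased-variance formula are assumptions for the sketch, not the study's exact computation.

```python
def srtn_from_levels(presentation_levels, skip=4):
    """Estimate the SRTn by averaging the presentation levels (dB SNR) of an
    adaptive track, discarding the first `skip` trials before convergence
    (`skip` is an assumed burn-in)."""
    tail = presentation_levels[skip:]
    return sum(tail) / len(tail)

def efficiency_index(subject_srtns, measurement_error_sd):
    """Illustrative efficiency measure: ratio of the between-subject standard
    deviation in SRTn to the measurement error. A larger ratio means the
    test separates listeners more reliably."""
    n = len(subject_srtns)
    mean = sum(subject_srtns) / n
    between_sd = (sum((x - mean) ** 2 for x in subject_srtns) / (n - 1)) ** 0.5
    return between_sd / measurement_error_sd
```

For example, a track that descends and then hovers around -7 dB SNR yields an SRTn near -7 dB, and widening the spread of thresholds across listeners (as 16-Hz interrupted noise did) raises the efficiency index for a fixed measurement error.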


Subject(s)
Audiometry, Pure-Tone/methods , Hearing Loss, Sensorineural/diagnosis , Noise/adverse effects , Speech Perception/physiology , Adult , Auditory Threshold/physiology , Female , Humans , Male , Severity of Illness Index , Speech Reception Threshold Test
20.
J Acoust Soc Am ; 122(5): 2865-71, 2007 Nov.
Article in English | MEDLINE | ID: mdl-18189576

ABSTRACT

A wavelet representation of speech was used to display the instantaneous amplitude and phase within 14 octave frequency bands, representing the envelope and the carrier within each band. Adding stationary noise alters the wavelet pattern, which can be understood as a combination of three simultaneously occurring subeffects: two effects on the wavelet levels (one systematic and one stochastic) and one effect on the wavelet phases. Specific types of signal processing were applied to speech, which allowed each effect to be either included or excluded. The impact of each effect (and of combinations) on speech intelligibility was measured with CVC words. It appeared that the systematic level effect (i.e., the increase of each speech wavelet intensity with the mean noise intensity) has the most degrading effect on speech intelligibility, which is in accordance with measures such as the modulation transfer function and the speech transmission index. However, the introduction of stochastic level fluctuations and the disturbance of the carrier phase also contribute substantially to reduced intelligibility in noise. It is argued that these stochastic effects are responsible for the limited success of spectral subtraction as a means to improve speech intelligibility. The results can provide clues for effective noise suppression with respect to intelligibility.
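The band-wise split into instantaneous amplitude (envelope) and phase (carrier) can be sketched via the analytic signal. The FFT brick-wall band filter below is an assumption standing in for the paper's wavelet filterbank; the envelope/phase decomposition itself is the standard analytic-signal construction.

```python
import numpy as np

def analytic(x):
    """Analytic signal via the FFT: zero out the negative frequencies."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def band_envelope_phase(x, fs, lo, hi):
    """One band of the decomposition: band-limit the signal with an ideal
    FFT brick-wall filter (an assumption; the paper used wavelets), then
    split the band signal into instantaneous amplitude (envelope) and
    instantaneous phase (carrier) via the analytic signal."""
    n = len(x)
    freqs = np.abs(np.fft.fftfreq(n, 1.0 / fs))
    X = np.fft.fft(x)
    X[(freqs < lo) | (freqs >= hi)] = 0.0
    band = np.real(np.fft.ifft(X))
    z = analytic(band)
    return np.abs(z), np.angle(z)  # envelope, carrier phase
```

For a pure tone inside the band, the recovered envelope is constant at the tone's amplitude, as it should be; for speech, the envelope carries the slow modulations that the paper's level effects act upon, while the phase carries the carrier that the third effect disturbs.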


Subject(s)
Noise , Speech Intelligibility , Humans , Psychoacoustics , Stochastic Processes