Results 1 - 20 of 63
1.
Ear Hear; 2024 May 31.
Article in English | MEDLINE | ID: mdl-38816900

ABSTRACT

OBJECTIVES: This study aimed to determine the speech-to-background ratios (SBRs) at which normal-hearing (NH) and hearing-impaired (HI) listeners can recognize both speech and environmental sounds when the two types of signals are mixed. Also examined were the effect of individual sounds on speech recognition and environmental sound recognition (ESR), and the impact of divided versus selective attention on these tasks. DESIGN: In Experiment 1 (divided attention), 11 NH and 10 HI listeners heard sentences mixed with environmental sounds at various SBRs and performed speech recognition and ESR tasks concurrently in each trial. In Experiment 2 (selective attention), 20 NH listeners performed these tasks in separate trials. Psychometric functions were generated for each task, listener group, and environmental sound. The range over which speech recognition and ESR were both high was determined, as was the optimal SBR for balancing recognition with ESR, defined as the point of intersection between each pair of normalized psychometric functions. RESULTS: The NH listeners achieved greater than 95% accuracy on concurrent speech recognition and ESR over an SBR range of approximately 20 dB or greater. The optimal SBR for maximizing both speech recognition and ESR for NH listeners was approximately +12 dB. For the HI listeners, the range over which 95% performance was observed on both tasks was far smaller (span of 1 dB), with an optimal value of +5 dB. Acoustic analyses indicated that the speech and environmental sound stimuli were similarly audible, regardless of the hearing status of the listener, but that the speech fluctuated more than the environmental sounds. Divided versus selective attention conditions produced differences in performance that were statistically significant yet only modest in magnitude. 
In all conditions and for both listener groups, recognition was higher for environmental sounds than for speech when presented at equal intensities (i.e., 0 dB SBR), indicating that the environmental sounds were more effective maskers of speech than the converse. Each of the 25 environmental sounds used in this study (with one exception) had a span of SBRs over which speech recognition and ESR were both higher than 95%. These ranges tended to overlap substantially. CONCLUSIONS: A range of SBRs exists over which speech and environmental sounds can be simultaneously recognized with high accuracy by NH and HI listeners, but this range is larger for NH listeners. The single optimal SBR for jointly maximizing speech recognition and ESR also differs between NH and HI listeners. The greater masking effectiveness of the environmental sounds relative to the speech may be related to the lower degree of fluctuation present in the environmental sounds, as well as possibly to task differences between speech recognition and ESR (open versus closed set). The observed differences between the NH and HI results may be related to the HI listeners' smaller fluctuating-masker benefit. As noise-reduction systems become increasingly effective, the current results could guide the design of future systems that provide listeners with highly intelligible speech without depriving them of access to important environmental sounds.
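The optimal SBR defined above (the intersection of the two normalized psychometric functions) can be located numerically. Below is a minimal sketch assuming logistic psychometric functions; the midpoints and slope are hypothetical, since the abstract does not report the fitted parameters:

```python
import math

def logistic(x, midpoint, slope):
    """Generic logistic psychometric function, range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - midpoint)))

def optimal_sbr(speech_mid, esr_mid, slope=0.5, lo=-40.0, hi=40.0):
    """Find the SBR where the rising speech-recognition function crosses
    the falling ESR function (both normalized to [0, 1]), by bisection."""
    def diff(x):
        speech = logistic(x, speech_mid, slope)   # improves with SBR
        esr = 1.0 - logistic(x, esr_mid, slope)   # degrades with SBR
        return speech - esr
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if diff(lo) * diff(mid) <= 0:
            hi = mid   # crossing lies in the lower half
        else:
            lo = mid   # crossing lies in the upper half
    return (lo + hi) / 2.0
```

By symmetry, the intersection of a rising and a falling logistic with equal slopes lies midway between their midpoints.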

2.
J Acoust Soc Am; 153(5): 2751, 2023 05 01.
Article in English | MEDLINE | ID: mdl-37133814

ABSTRACT

Recent years have brought considerable advances to our ability to increase intelligibility through deep-learning-based noise reduction, especially for hearing-impaired (HI) listeners. In this study, intelligibility improvements resulting from a current algorithm are assessed. These benefits are compared to those resulting from the initial demonstration of deep-learning-based noise reduction for HI listeners ten years ago in Healy, Yoho, Wang, and Wang [(2013). J. Acoust. Soc. Am. 134, 3029-3038]. The stimuli and procedures were broadly similar across studies. However, whereas the initial study involved highly matched training and test conditions, as well as non-causal operation, precluding real-world use, the current attentive recurrent network employed different noise types, talkers, and speech corpora for training versus test, as required for generalization, and it was fully causal, as required for real-time operation. Significant intelligibility benefit was observed in every condition, averaging 51 percentage points across conditions for HI listeners. Further, benefit was comparable to that obtained in the initial demonstration, despite the considerable additional demands placed on the current algorithm. The retention of large benefit despite the systematic removal of various constraints as required for real-world operation reflects the substantial advances made to deep-learning-based noise reduction.


Subjects
Deep Learning, Hearing Aids, Sensorineural Hearing Loss, Hearing Loss, Speech Perception, Humans, Speech Intelligibility, Auditory Threshold
3.
J Speech Lang Hear Res; 66(5): 1853-1866, 2023 05 09.
Article in English | MEDLINE | ID: mdl-36944186

ABSTRACT

PURPOSE: Background noise reduces speech intelligibility. Time-frequency (T-F) masking is an established signal processing technique that improves intelligibility of neurotypical speech in background noise. Here, we investigated a novel application of T-F masking, assessing its potential to improve intelligibility of neurologically degraded speech in background noise. METHOD: Listener participants (N = 422) completed an intelligibility task either in the laboratory or online, listening to and transcribing audio recordings of neurotypical (control) and neurologically degraded (dysarthria) speech under three different processing types: speech in quiet (quiet), speech mixed with cafeteria noise (noise), and speech mixed with cafeteria noise and then processed by an ideal quantized mask (IQM) to remove the noise. RESULTS: We observed significant reductions in intelligibility of dysarthric speech, even at highly favorable signal-to-noise ratios (+11 to +23 dB) that did not impact neurotypical speech. We also observed significant intelligibility improvements from speech in noise to IQM-processed speech for both control and dysarthric speech across a wide range of noise levels. Furthermore, the overall benefit of IQM processing for dysarthric speech was comparable with that for the control speech in background noise, as were the intelligibility data collected in the laboratory versus online. CONCLUSIONS: This study demonstrates proof of concept, validating the application of T-F masks to a neurologically degraded speech signal. Given that intelligibility challenges greatly impact communication, and thus the lives of people with dysarthria and their communication partners, the development of clinical tools to enhance intelligibility in this clinical population is critical.


Subjects
Dysarthria, Speech Perception, Humans, Dysarthria/etiology, Dysarthria/therapy, Speech Intelligibility, Auditory Perception, Cognition, Laboratories, Perceptual Masking
4.
J Speech Lang Hear Res; 65(9): 3548-3565, 2022 09 12.
Article in English | MEDLINE | ID: mdl-35973100

ABSTRACT

PURPOSE: A dual-task paradigm was implemented to investigate how noise type and sentence context may interact with age and hearing loss to impact word recall during speech recognition. METHOD: Three noise types with varying degrees of temporal/spectrotemporal modulation were used: speech-shaped noise, speech-modulated noise, and three-talker babble. Participant groups included younger listeners with normal hearing (NH), older listeners with near-normal hearing, and older listeners with sensorineural hearing loss. An adaptive measure was used to establish the signal-to-noise ratio approximating 70% sentence recognition for each participant in each noise type. A word-recall task was then implemented while matching speech-recognition performance across noise types and participant groups. Random-intercept linear mixed-effects models were used to determine the effects of and interactions between noise type, sentence context, and participant group on word recall. RESULTS: The results suggest that noise type does not significantly impact word recall when word-recognition performance is controlled. When data were pooled across noise types, compared with quiet, and recall was assessed, older listeners with near-normal hearing performed well when either quiet backgrounds or high sentence context (or both) were present, but older listeners with hearing loss performed well only when both quiet backgrounds and high sentence context were present. Younger listeners with NH were robust to the detrimental effects of noise and low context. CONCLUSIONS: The general presence of noise has the potential to decrease word recall, but the type of noise does not appear to significantly impact this observation when overall task difficulty is controlled. The presence of noise, as well as deficits related to age and/or hearing loss, appears to limit the cognitive processing resources available for working memory during conversation in difficult listening environments. The conversation environments that impact these resources appear to differ depending on age and/or hearing status.


Subjects
Sensorineural Hearing Loss, Hearing Loss, Speech Perception, Aged, Sensorineural Hearing Loss/psychology, Humans, Noise, Semantics
5.
J Acoust Soc Am; 150(5): 3976, 2021 11.
Article in English | MEDLINE | ID: mdl-34852625

ABSTRACT

The fundamental requirement for real-time operation of a speech-processing algorithm is causality: that it operate without utilizing future time frames. In the present study, the performance of a fully causal deep computational auditory scene analysis algorithm was assessed. Target sentences were isolated from complex interference consisting of an interfering talker and concurrent room reverberation. The talker- and corpus/channel-independent model used Dense-UNet and temporal convolutional networks and estimated both the magnitude and phase of the target speech. Mean algorithm benefit was significant in every condition. Mean benefit for hearing-impaired (HI) listeners across all conditions was 46.4 percentage points. The cost of converting the algorithm to causal processing was also assessed by comparison with a prior non-causal version. Intelligibility decrements for HI and normal-hearing listeners from non-causal to causal processing were present in most but not all conditions, and these decrements were statistically significant in half of the conditions tested, those representing the greater levels of complex interference. Although a cost associated with causal processing was present in most conditions, it may be considered modest relative to the overall level of benefit.


Subjects
Deep Learning, Sensorineural Hearing Loss, Speech Perception, Algorithms, Humans, Speech Intelligibility
6.
J Acoust Soc Am; 150(4): 2526, 2021 10.
Article in English | MEDLINE | ID: mdl-34717521

ABSTRACT

The practical efficacy of deep learning based speaker separation and/or dereverberation hinges on its ability to generalize to conditions not employed during neural network training. The current study was designed to assess the ability to generalize across extremely different training versus test environments. Training and testing were performed using different languages having no known common ancestry and correspondingly large linguistic differences: English for training and Mandarin for testing. Additional generalizations included untrained speech corpus/recording channel, target-to-interferer energy ratios, reverberation room impulse responses, and test talkers. A deep computational auditory scene analysis algorithm, employing complex time-frequency masking to estimate both magnitude and phase, was used to segregate two concurrent talkers and simultaneously remove large amounts of room reverberation to increase the intelligibility of a target talker. Significant intelligibility improvements were observed for the normal-hearing listeners in every condition. Benefit averaged 43.5 percentage points across conditions and was comparable to that obtained when training and testing were both performed in English. Benefit is projected to be considerably larger for individuals with hearing impairment. It is concluded that a properly designed and trained deep speaker separation/dereverberation network can be capable of generalization across vastly different acoustic environments that include different languages.


Subjects
Deep Learning, Hearing Loss, Speech Perception, Humans, Language, Perceptual Masking, Speech Intelligibility
7.
J Acoust Soc Am; 149(6): 3943, 2021 06.
Article in English | MEDLINE | ID: mdl-34241481

ABSTRACT

Real-time operation is critical for noise reduction in hearing technology. The essential requirement of real-time operation is causality: that an algorithm does not use future time-frame information and instead completes its operation by the end of the current time frame. This requirement is extended here through the concept of "effectively causal" processing, in which future time-frame information within the brief delay tolerance of the human speech-perception mechanism is used. Effectively causal deep learning was used to separate speech from background noise and improve intelligibility for hearing-impaired listeners. A single-microphone, gated convolutional recurrent network was used to perform complex spectral mapping. By estimating both the real and imaginary parts of the noise-free speech, both the magnitude and phase of the estimated noise-free speech were obtained. The deep neural network was trained using a large set of noises and tested using complex noises not employed during training. Significant algorithm benefit was observed in every condition, and was largest for those with the greatest hearing loss. Allowable delays across different communication settings are reviewed and assessed. The current work demonstrates that effectively causal deep learning can significantly improve intelligibility for one of the largest populations of need in challenging conditions involving untrained background noises.
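The bounded-lookahead idea behind "effectively causal" operation can be sketched independently of any network. The helper below is purely illustrative (it is not the gated convolutional recurrent network of the study): output for a frame is emitted only once its small allowance of future frames has arrived, so the algorithmic delay equals the lookahead.

```python
from collections import deque

def effectively_causal_stream(frames, lookahead, process):
    """Process a frame stream using at most `lookahead` future frames.

    `process(context)` receives the current frame followed by up to
    `lookahead` future frames; the output for frame t is produced only
    after frame t + lookahead has arrived (hypothetical sketch)."""
    buf = deque()
    out = []
    for frame in frames:
        buf.append(frame)
        if len(buf) > lookahead:
            # Enough future context has arrived for the oldest frame.
            out.append(process(list(buf)))
            buf.popleft()
    # Flush the tail: remaining frames see whatever context is left.
    while buf:
        out.append(process(list(buf)))
        buf.popleft()
    return out
```

With `lookahead=0` this reduces to strictly causal, frame-by-frame operation.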


Subjects
Deep Learning, Hearing Aids, Sensorineural Hearing Loss, Hearing Loss, Speech Perception, Algorithms, Hearing, Hearing Loss/diagnosis, Sensorineural Hearing Loss/diagnosis, Humans, Speech Intelligibility
8.
J Acoust Soc Am; 148(3): 1552, 2020 09.
Article in English | MEDLINE | ID: mdl-33003879

ABSTRACT

Adverse listening conditions involve glimpses of spectro-temporal speech information. This study investigated whether the acoustic organization of the spectro-temporal masking pattern affects speech glimpsing in "checkerboard" noise. The regularity and coherence of the masking pattern were varied. Regularity was reduced by randomizing the spectral or temporal gating of the masking noise. Coherence involved the spectral alignment of frequency bands across time or the temporal alignment of gated onsets/offsets across frequency bands. Experiment 1 investigated the effect of spectral or temporal coherence. Experiment 2 investigated independent and combined factors of regularity and coherence. Performance was best in spectro-temporally modulated noise having larger glimpses. Generally, performance also improved as the regularity and coherence of masker fluctuations increased, with regularity having a stronger effect than coherence. An acoustic glimpsing model suggested that the effect of regularity (but not coherence) could be partially attributed to the availability of glimpses retained after energetic masking. Performance tended to be better with maskers that were spectrally coherent as compared to temporally coherent. Overall, performance was best when the spectro-temporal masking pattern imposed even spectral sampling and minimal temporal uncertainty, indicating that listeners use reliable masking patterns to aid in spectro-temporal speech glimpsing.
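As a rough illustration of the stimuli described above, a fully regular, coherent checkerboard masking pattern can be generated by alternating on/off cells over frequency bands and time gates, and its temporal regularity reduced by shuffling each band's gate sequence. The dimensions and manipulation here are illustrative sketches, not the study's exact stimulus construction:

```python
import random

def checkerboard_mask(n_bands, n_gates):
    """Regular, spectro-temporally coherent checkerboard pattern:
    adjacent cells alternate between noise-on (1) and noise-off (0)."""
    return [[(b + t) % 2 for t in range(n_gates)] for b in range(n_bands)]

def randomize_temporal_gating(mask, seed=0):
    """Reduce temporal regularity by shuffling each band's gate sequence
    independently, preserving each band's on/off proportion."""
    rng = random.Random(seed)
    out = []
    for row in mask:
        row = row[:]        # copy so the regular pattern is untouched
        rng.shuffle(row)
        out.append(row)
    return out
```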


Subjects
Perceptual Masking, Speech Perception, Acoustic Stimulation, Adolescent, Female, Humans, Speech, Speech Intelligibility, Uncertainty, Young Adult
9.
J Acoust Soc Am; 147(6): 4106, 2020 06.
Article in English | MEDLINE | ID: mdl-32611178

ABSTRACT

Deep learning based speech separation or noise reduction needs to generalize to voices not encountered during training and to operate under multiple corruptions. The current study provides such a demonstration for hearing-impaired (HI) listeners. Sentence intelligibility was assessed under conditions of a single interfering talker and substantial amounts of room reverberation. A talker-independent deep computational auditory scene analysis (CASA) algorithm was employed, in which talkers were separated and dereverberated in each time frame (simultaneous grouping stage), then the separated frames were organized to form two streams (sequential grouping stage). The deep neural networks consisted of specialized convolutional neural networks, one based on U-Net and the other a temporal convolutional network. It was found that every HI (and normal-hearing, NH) listener received algorithm benefit in every condition. Benefit averaged across all conditions ranged from 52 to 76 percentage points for individual HI listeners and averaged 65 points. Further, processed HI intelligibility significantly exceeded unprocessed NH intelligibility. Although the current utterance-based model was not implemented as a real-time system, a perspective on this important issue is provided. It is concluded that deep CASA represents a powerful framework capable of producing large increases in HI intelligibility for potentially any two voices.


Subjects
Deep Learning, Sensorineural Hearing Loss, Speech Perception, Algorithms, Hearing, Humans, Speech Intelligibility
10.
J Acoust Soc Am; 145(6): EL581, 2019 06.
Article in English | MEDLINE | ID: mdl-31255108

ABSTRACT

Hearing-impaired listeners' intolerance to background noise during speech perception is well known. The current study employed speech materials free of ceiling effects to reveal the optimal trade-off between rejecting noise and retaining speech during time-frequency masking. This relative criterion value (-7 dB) was found to hold across noise types that differ in acoustic spectro-temporal complexity. It was also found that listeners with hearing impairment and those with normal hearing performed optimally at this same value, suggesting no true noise intolerance once time-frequency units containing speech are extracted.
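The time-frequency masking described above can be sketched as an ideal binary mask in which a unit is retained when its local SNR exceeds the overall mixture SNR plus a relative local criterion, here set to the -7 dB value the study identified. The array layout and power inputs are illustrative assumptions:

```python
import math

def ideal_binary_mask(speech_pow, noise_pow, overall_snr_db, lc_db=-7.0):
    """Ideal binary mask with a relative local criterion (LC).

    speech_pow and noise_pow are parallel 2D lists (frequency x time)
    of linear power values; a T-F unit is kept (1) when its local SNR
    exceeds the overall mixture SNR plus lc_db, else discarded (0)."""
    threshold_db = overall_snr_db + lc_db
    mask = []
    for s_row, n_row in zip(speech_pow, noise_pow):
        row = []
        for s, n in zip(s_row, n_row):
            local_snr_db = 10.0 * math.log10(s / n) if n > 0 else float("inf")
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask
```

Lowering `lc_db` retains more speech at the cost of admitting more noise; the study's finding is that -7 dB is the best trade-off for both listener groups.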


Subjects
Auditory Threshold/physiology, Hearing Loss/physiopathology, Noise, Speech Perception/physiology, Speech/physiology, Adult, Auditory Perception/physiology, Female, Sensorineural Hearing Loss/physiopathology, Humans, Young Adult
11.
J Acoust Soc Am; 145(3): 1378, 2019 03.
Article in English | MEDLINE | ID: mdl-31067936

ABSTRACT

For deep learning based speech segregation to have translational significance as a noise-reduction tool, it must perform in a wide variety of acoustic environments. In the current study, performance was examined when target speech was subjected to interference from a single talker and room reverberation. Conditions were compared in which an algorithm was trained to remove both reverberation and interfering speech, or only interfering speech. A recurrent neural network incorporating bidirectional long short-term memory was trained to estimate the ideal ratio mask corresponding to target speech. Substantial intelligibility improvements were found for hearing-impaired (HI) and normal-hearing (NH) listeners across a range of target-to-interferer ratios (TIRs). HI listeners performed better with reverberation removed, whereas NH listeners demonstrated no difference. Algorithm benefit averaged 56 percentage points for the HI listeners at the least-favorable TIR, allowing these listeners to perform numerically better than young NH listeners without processing. The current study highlights the difficulty associated with perceiving speech in reverberant-noisy environments, and it extends the range of environments in which deep learning based speech segregation can be effectively applied. This increasingly wide array of environments includes not only a variety of background noises and interfering speech, but also room reverberation.
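The ideal ratio mask mentioned above has a standard textbook form: the speech-to-total power ratio in each time-frequency unit, compressed by an exponent (commonly 0.5). The abstract does not state the exact variant used, so the following is the common definition rather than the study's implementation:

```python
def ideal_ratio_mask(speech_pow, noise_pow, beta=0.5):
    """Ideal ratio mask value for one T-F unit.

    speech_pow and noise_pow are linear powers; the mask is speech power
    over total power, compressed by exponent beta (0.5 is the common
    choice, making the mask a gain applied to the mixture magnitude)."""
    return (speech_pow / (speech_pow + noise_pow)) ** beta
```

Unlike a binary mask, the ratio mask varies smoothly from 0 (noise-dominated) to 1 (speech-dominated), which is what the network is trained to regress.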


Subjects
Deep Learning, Hearing Aids/standards, Sensorineural Hearing Loss/rehabilitation, Speech Intelligibility, Speech Recognition Software, Aged, Female, Humans, Male, Middle Aged, Signal-to-Noise Ratio, Speech Perception
12.
J Acoust Soc Am; 144(3): 1627, 2018 09.
Article in English | MEDLINE | ID: mdl-30424625

ABSTRACT

Recently, deep learning based speech segregation has been shown to improve human speech intelligibility in noisy environments. However, one important factor not yet considered is room reverberation, which characterizes typical daily environments. The combination of reverberation and background noise can severely degrade speech intelligibility for hearing-impaired (HI) listeners. In the current study, a deep learning based time-frequency masking algorithm was proposed to address both room reverberation and background noise. Specifically, a deep neural network was trained to estimate the ideal ratio mask, where anechoic-clean speech was considered as the desired signal. Intelligibility testing was conducted under reverberant-noisy conditions with reverberation time T 60 = 0.6 s, plus speech-shaped noise or babble noise at various signal-to-noise ratios. The experiments demonstrated that substantial speech intelligibility improvements were obtained for HI listeners. The algorithm was also somewhat beneficial for normal-hearing (NH) listeners. In addition, sentence intelligibility scores for HI listeners with algorithm processing approached or matched those of young-adult NH listeners without processing. The current study represents a step toward deploying deep learning algorithms to help the speech understanding of HI listeners in everyday conditions.


Subjects
Algorithms, Deep Learning, Sensorineural Hearing Loss/therapy, Noise, Perceptual Masking/physiology, Speech Intelligibility/physiology, Aged, Female, Hearing Aids, Sensorineural Hearing Loss/physiopathology, Humans, Male, Middle Aged, Noise/adverse effects, Young Adult
13.
J Acoust Soc Am; 144(3): 1392, 2018 09.
Article in English | MEDLINE | ID: mdl-30424638

ABSTRACT

Time-frequency (T-F) masks represent powerful tools to increase the intelligibility of speech in background noise. Translational relevance is provided by their accurate estimation based only on the signal-plus-noise mixture, using deep learning or other machine-learning techniques. In the current study, a technique is designed to capture the benefits of existing techniques. In the ideal quantized mask (IQM), speech and noise are partitioned into T-F units, and each unit receives one of N attenuations according to its signal-to-noise ratio. It was found that as few as four to eight attenuation steps (IQM4, IQM8) improved intelligibility over the ideal binary mask (IBM, having two attenuation steps), and equaled the intelligibility resulting from the ideal ratio mask (IRM, having a theoretically infinite number of steps). Sound-quality ratings and rankings of noisy speech processed by the IQM4 and IQM8 were also superior to that processed by the IBM and equaled or exceeded that processed by the IRM. It is concluded that the intelligibility and sound-quality advantages of infinite attenuation resolution can be captured by an IQM having only a very small number of steps. Further, the classification-based nature of the IQM might provide algorithmic advantages over the regression-based IRM during machine estimation.
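A sketch of the IQM's core idea, quantizing each unit's local SNR into one of N attenuation steps, is given below. The SNR range, step spacing, and attenuation floor are illustrative choices, not values from the study:

```python
def ideal_quantized_mask_gain(local_snr_db, n_steps=8,
                              snr_lo=-15.0, snr_hi=15.0,
                              atten_floor_db=-30.0):
    """Quantize a T-F unit's local SNR into one of n_steps attenuations.

    The SNR is clamped into [snr_lo, snr_hi] and mapped to a step index;
    step 0 receives the floor attenuation and the top step receives 0 dB
    (no attenuation). With n_steps=2 this reduces to a binary mask."""
    snr = min(max(local_snr_db, snr_lo), snr_hi)
    frac = (snr - snr_lo) / (snr_hi - snr_lo)
    step = min(int(frac * n_steps), n_steps - 1)
    # Linearly spaced attenuations from the floor up to 0 dB.
    return atten_floor_db * (1.0 - step / (n_steps - 1))
```

The IBM and IRM of the abstract are the two extremes of this scheme: two steps versus a continuum; the study's finding is that four to eight steps already capture the continuum's benefit.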


Subjects
Acoustic Stimulation/methods, Noise, Perceptual Masking/physiology, Sound Spectrography/methods, Speech Intelligibility/physiology, Speech Perception/physiology, Adult, Female, Humans, Male, Speech Acoustics, Young Adult
14.
J Speech Lang Hear Res; 61(11): 2804-2813, 2018 11 08.
Article in English | MEDLINE | ID: mdl-30458525

ABSTRACT

Purpose: The goal of this study was to examine the role of carrier cues in sound source segregation and the possibility of enhancing the intelligibility of 2 sentences presented simultaneously. Dual-carrier (DC) processing (Apoux, Youngdahl, Yoho, & Healy, 2015) was used to introduce synthetic carrier cues in vocoded speech. Method: Listeners with normal hearing heard sentences processed either with a DC or with a traditional single-carrier (SC) vocoder. One group was asked to repeat both sentences in a sentence pair (Experiment 1). The other group was asked to repeat only 1 sentence of the pair and was provided additional segregation cues involving onset asynchrony (Experiment 2). Results: Both experiments showed that not only is the "target" sentence more intelligible in DC than in SC processing, but the "background" sentence intelligibility is equally enhanced. The participants did not benefit from the additional segregation cues. Conclusions: The data showed a clear benefit of using a distinct carrier to convey each sentence (i.e., DC processing). Accordingly, the poor speech intelligibility in noise typically observed with SC-vocoded speech may be partly attributed to the envelopes of independent sound sources sharing the same carrier. Moreover, this work suggests that noise reduction may not be the only viable option to improve speech intelligibility in noise for users of cochlear implants. Alternative approaches aimed at enhancing sound source segregation, such as DC processing, may help to improve speech intelligibility while preserving and enhancing the background.


Subjects
Cochlear Implants, Speech Intelligibility, Adult, Speech Audiometry, Female, Humans, Male, Noise, Perceptual Masking, Phonetics, Young Adult
15.
J Acoust Soc Am; 143(5): 3047, 2018 05.
Article in English | MEDLINE | ID: mdl-29857753

ABSTRACT

Speech recognition in fluctuating maskers is influenced by the spectro-temporal properties of the noise. Three experiments examined different temporal and spectro-temporal noise properties. Experiment 1 replicated previous work by highlighting maximum performance at a temporal gating rate of 4-8 Hz. Experiment 2 involved spectro-temporal glimpses. Performance was best with the largest glimpses, and performance with small glimpses approached that for continuous noise matched to the average level of the modulated noise. Better performance occurred with periodic than for random spectro-temporal glimpses. Finally, time and frequency for spectro-temporal glimpses were dissociated in experiment 3. Larger spectral glimpses were more beneficial than smaller, and minimum performance was observed at a gating rate of 4-8 Hz. The current results involving continuous speech in gated noise (slower and larger glimpses most advantageous) run counter to several results involving gated and/or filtered speech, where a larger number of smaller speech samples is often advantageous. This is because mechanisms of masking dominate, negating the advantages of better speech-information sampling. It is suggested that spectro-temporal glimpsing combines temporal glimpsing with additional processes of simultaneous masking and uncomodulation, and continuous speech in gated noise is a better model for real-world glimpsing than is gated and/or filtered speech.
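The temporal gating used in experiment 1, interrupting the masker at a fixed rate, can be sketched as a square-wave on/off envelope applied to the noise. The rate, duty cycle, and sample rate below are illustrative, not the study's stimulus parameters:

```python
def gating_envelope(duration_s, rate_hz, sample_rate=16000, duty=0.5):
    """Square-wave gating envelope for interrupting a noise masker.

    Returns per-sample gains: 1.0 while the noise is on, 0.0 while it is
    gated off, cycling at rate_hz with the given on-fraction (duty)."""
    period = sample_rate / rate_hz      # samples per gating cycle
    on_len = period * duty              # samples the noise stays on
    return [1.0 if (i % period) < on_len else 0.0
            for i in range(int(duration_s * sample_rate))]
```

Multiplying a noise waveform by this envelope yields the gated masker; the abstract reports best recognition at gating rates of 4-8 Hz.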


Subjects
Acoustic Stimulation/methods, Noise, Perceptual Masking/physiology, Speech Intelligibility/physiology, Speech Perception/physiology, Female, Humans, Male, Time Factors, Young Adult
16.
J Acoust Soc Am; 143(4): 2527, 2018 04.
Article in English | MEDLINE | ID: mdl-29716288

ABSTRACT

The degrading influence of noise on various critical bands of speech was assessed. A modified version of the compound method [Apoux and Healy (2012) J. Acoust. Soc. Am. 132, 1078-1087] was employed to establish this noise susceptibility for each speech band. Noise was added to the target speech band at various signal-to-noise ratios to determine the amount of noise required to reduce the contribution of that band by 50%. It was found that noise susceptibility is not equal across the speech spectrum, as is commonly assumed and incorporated into modern indexes. Instead, the signal-to-noise ratio required to equivalently impact various speech bands differed by as much as 13 dB. This noise susceptibility formed an irregular pattern across frequency, despite the use of multi-talker speech materials designed to reduce the potential influence of a particular talker's voice. But basic trends in the pattern of noise susceptibility across the spectrum emerged. Further, no systematic relationship was observed between noise susceptibility and speech band importance. It is argued here that susceptibility to noise and band importance are different phenomena, and that this distinction may be underappreciated in previous works.


Subjects
Acoustic Stimulation/methods, Auditory Perception/physiology, Auditory Threshold/physiology, Hearing/physiology, Noise, Speech Perception/physiology, Adult, Female, Humans, Male, Signal-to-Noise Ratio, Speech Discrimination Tests, Young Adult
17.
J Acoust Soc Am; 143(3): 1417, 2018 03.
Article in English | MEDLINE | ID: mdl-29604719

ABSTRACT

Band-importance functions created using the compound method [Apoux and Healy (2012). J. Acoust. Soc. Am. 132, 1078-1087] provide more detail than those generated using the ANSI technique, necessitating and allowing a re-examination of the influences of speech material and talker on the shape of the band-importance function. More specifically, the detailed functions may reflect, to a larger extent, acoustic idiosyncrasies of the individual talker's voice. Twenty-one band functions were created using standard speech materials and recordings by different talkers. The band-importance functions representing the same speech-material type produced by different talkers were found to be more similar to one another than functions representing the same talker producing different speech-material types. Thus, the primary finding was the relative strength of a speech-material effect and weakness of a talker effect. This speech-material effect extended to other materials in the same broad class (different sentence corpora) despite considerable differences in the specific materials. Characteristics of individual talkers' voices were not readily apparent in the functions, and the talker effect was restricted to more global aspects of talker (i.e., gender). Finally, the use of multiple talkers diminished any residual effect of the talker.


Subjects
Speech Intelligibility, Speech Perception, Voice Quality, Adult, Pure-Tone Audiometry, Speech Audiometry, Auditory Threshold, Female, Humans, Male, Noise, Perceptual Masking, Sex Factors, Speech Acoustics, Young Adult
18.
J Speech Lang Hear Res; 61(2): 420-427, 2018 02 15.
Article in English | MEDLINE | ID: mdl-29396579

ABSTRACT

Purpose: Psychoacoustic data indicate that infants and children are less likely than adults to focus on a spectral region containing an anticipated signal and are more susceptible to remote masking of a signal. These detection tasks suggest that infants and children, unlike adults, do not listen selectively. However, less is known about children's ability to listen selectively during speech recognition. Accordingly, the current study examines remote masking during speech recognition in children and adults. Method: Adults and 7- and 5-year-old children performed sentence recognition in the presence of various spectrally remote maskers. Intelligibility was determined for each remote-masker condition, and performance was compared across age groups. Results: It was found that speech recognition for 5-year-olds was reduced in the presence of spectrally remote noise, whereas the maskers had no effect on the 7-year-olds or adults. Maskers of different bandwidth and remoteness had similar effects. Conclusions: In accord with psychoacoustic data, young children do not appear to focus on a spectral region of interest and ignore other regions during speech recognition. This tendency may help account for their typically poorer speech perception in noise. This study also appears to capture an important developmental stage, during which a substantial refinement in spectral listening occurs.


Subjects
Perceptual Masking, Speech Perception, Adult, Child, Preschool Child, Female, Humans, Male, Pattern Recognition (Physiological), Psychoacoustics, Recognition (Psychology), Young Adult
19.
J Acoust Soc Am ; 141(6): 4230, 2017 06.
Article in English | MEDLINE | ID: mdl-28618817

ABSTRACT

Individuals with hearing impairment have particular difficulty perceptually segregating concurrent voices and understanding a talker in the presence of a competing voice, a task that individuals with normal hearing perform quite well. Compared to perceiving speech in other types of background noise, this listening situation presents a very different problem for both human and machine listeners. A machine-learning algorithm is introduced here to address it. A deep neural network was trained to estimate the ideal ratio mask for a male target talker in the presence of a female competing talker. The monaural algorithm produced sentence-intelligibility increases for hearing-impaired (HI) and normal-hearing (NH) listeners at various signal-to-noise ratios (SNRs). The benefit was largest for the HI listeners, averaging 59 percentage points at the least favorable SNR, with a maximum of 87 percentage points. The mean intelligibility achieved by the HI listeners using the algorithm was equivalent to that of young NH listeners without processing under conditions of identical interference. Possible reasons for the limited ability of HI listeners to perceptually segregate concurrent voices are reviewed, as are implementation considerations for algorithms like the current one.
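The training target named here, the ideal ratio mask, can be sketched directly. The sqrt(S²/(S² + N²)) definition below is the commonly used formulation over time-frequency magnitudes, not necessarily the exact variant of this study, and all array shapes are illustrative.

```python
import numpy as np

def ideal_ratio_mask(target_mag, interferer_mag, eps=1e-12):
    """Ideal ratio mask over a time-frequency magnitude representation:
    IRM = sqrt(S^2 / (S^2 + N^2)), bounded in [0, 1]. A DNN is trained
    to estimate this mask from features of the mixture alone."""
    s2 = np.asarray(target_mag, dtype=float) ** 2
    n2 = np.asarray(interferer_mag, dtype=float) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))  # eps avoids division by zero

# Toy magnitudes (time frames x frequency bins) standing in for STFT output.
rng = np.random.default_rng(0)
target = np.abs(rng.normal(size=(10, 64)))
interferer = np.abs(rng.normal(size=(10, 64)))

mask = ideal_ratio_mask(target, interferer)
enhanced = mask * (target + interferer)  # apply mask to the mixture magnitude
```

At test time the clean target is unavailable, so the network's estimate of this mask is applied to the mixture instead; the oracle version above defines the ceiling the estimate approaches.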


Subjects
Correction of Hearing Impairment/instrumentation, Deep Learning, Hearing Aids, Sensorineural Hearing Loss/rehabilitation, Perceptual Masking, Persons with Hearing Impairment/rehabilitation, Computer-Assisted Signal Processing, Speech Intelligibility, Speech Perception, Acoustic Stimulation, Aged, Speech Audiometry, Auditory Threshold, Comprehension, Female, Hearing, Sensorineural Hearing Loss/diagnosis, Sensorineural Hearing Loss/physiopathology, Sensorineural Hearing Loss/psychology, Humans, Male, Middle Aged, Persons with Hearing Impairment/psychology, Signal-to-Noise Ratio, Young Adult
20.
J Speech Lang Hear Res ; 60(7): 2081-2089, 2017 07 12.
Article in English | MEDLINE | ID: mdl-28632840

ABSTRACT

Purpose: Although there is evidence that the emotional valence of stimuli impacts lexical processes, there is limited work investigating its specific impact on lexical retrieval. The current study aimed to determine the degree to which the emotional valence of pictured stimuli impacts naming latencies in healthy younger and older adults. Method: Eighteen healthy younger adults and 18 healthy older adults named positive, negative, and neutral images, and reaction time was measured. Results: Reaction times for positive and negative images were significantly longer than those for neutral images, and did not differ significantly from each other. Whereas older adults demonstrated significantly longer naming latencies overall than younger adults, the age-related discrepancy in latency was far greater when naming emotional pictures. Conclusions: The emotional arousal of pictures appears to impact naming latency in younger and older adults. We hypothesize that the increase in naming latency for emotional stimuli results from a necessary disengagement of attentional resources from the emotional images before the naming task can be completed. This process may affect older adults disproportionately due to a decline in attentional resources in normal aging, combined with a greater attentional preference for emotional stimuli.


Subjects
Aging/psychology, Emotions, Visual Pattern Recognition, Psycholinguistics, Speech, Adolescent, Adult, Aged, Aged 80 and Over, Analysis of Variance, Female, Humans, Linear Models, Male, Middle Aged, Reaction Time, Young Adult