Results 1 - 20 of 11,406
1.
Psychol Aging ; 39(3): 262-274, 2024 May.
Article in English | MEDLINE | ID: mdl-38829339

ABSTRACT

The redundancy hypothesis proposes that older listeners need a larger array of acoustic cues than younger listeners for effective speech perception. This study tested the hypothesis by examining how aging affects the use of prosodic cues for speech segmentation in Mandarin Chinese. We examined how younger and older listeners perceived prosodic boundaries using three main prosodic cues (pause, final lengthening, and pitch change) across eight conditions involving different cue combinations. The stimuli consisted of syntactically ambiguous phrase pairs, each containing two or three objects. Participants (22 younger listeners and 22 older listeners) performed a speech recognition task in which they judged the number of objects they heard. Both groups primarily relied on the pause cue for identifying prosodic boundaries, using final lengthening and pitch change as secondary cues. However, older listeners showed reduced sensitivity to these cues and compensated by integrating the primary pause cue with the secondary pitch-change cue for more precise segmentation. The present study reveals older listeners' integration strategy in using prosodic cues for speech segmentation, supporting the redundancy hypothesis. (PsycInfo Database Record (c) 2024 APA, all rights reserved).


Subject(s)
Aging , Cues , Speech Perception , Humans , Speech Perception/physiology , Female , Male , Young Adult , Aged , Adult , Middle Aged , Aging/physiology , Aging/psychology , Pitch Perception/physiology , Age Factors
2.
Codas ; 36(3): e20230091, 2024.
Article in Portuguese, English | MEDLINE | ID: mdl-38836822

ABSTRACT

PURPOSE: To propose an instrument for assessing speech recognition in the presence of competing noise, to define its application strategy for use in clinical practice, and to obtain evidence of criterion validity and present reference values. METHODS: The study was conducted in three stages: organization of the material comprising the Word-with-Noise Test (Stage 1); definition of the instrument's application strategy (Stage 2); and investigation of criterion validity and definition of reference values for the test (Stage 3) through the evaluation of 50 normal-hearing adult subjects and 12 subjects with hearing loss. RESULTS: The Word-with-Noise Test consists of lists of monosyllabic and disyllabic words and speech-spectrum noise (Stage 1). The application strategy was defined as determination of the Speech Recognition Threshold with a fixed noise level of 55 dB HL (Stage 2). Regarding criterion validity, the instrument demonstrated adequate ability to distinguish between normal-hearing subjects and subjects with hearing loss (Stage 3). Reference values for the test were established as cut-off points expressed as signal-to-noise ratios: 1.47 dB for the monosyllabic stimulus and -2.02 dB for the disyllabic stimulus. CONCLUSION: The Word-with-Noise Test proved quick to administer and interpret, making it a useful tool in audiological clinical practice. Furthermore, it showed satisfactory evidence of criterion validity, with established reference values.
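
The reference values above are expressed as signal-to-noise ratios, so the core arithmetic of such a test is mixing a recorded word with masking noise at a prescribed SNR. A minimal NumPy sketch of that mixing step is shown below; the signals are random placeholders and the scaling convention is an assumption, not the test's calibration procedure.

```python
import numpy as np

def rms(x):
    """Root-mean-square level of a signal."""
    return np.sqrt(np.mean(x ** 2))

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db`, then mix.

    Assumes mono float arrays at the same sampling rate; the noise is
    looped/trimmed to the speech length.
    """
    noise = np.resize(noise, speech.shape)
    target_noise_rms = rms(speech) / (10 ** (snr_db / 20))
    return speech + noise * (target_noise_rms / rms(noise))

# Example: present a (placeholder) monosyllabic word at the reported +1.47 dB cut-off
rng = np.random.default_rng(0)
word = rng.standard_normal(16000) * 0.05           # stand-in for a recorded word
speech_noise = rng.standard_normal(16000) * 0.05   # stand-in for speech-spectrum noise
stimulus = mix_at_snr(word, speech_noise, snr_db=1.47)
```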


Subject(s)
Noise , Humans , Reference Values , Adult , Female , Male , Young Adult , Reproducibility of Results , Middle Aged , Speech Perception/physiology , Signal-To-Noise Ratio , Auditory Threshold/physiology , Case-Control Studies , Hearing Loss/diagnosis , Hearing Loss/physiopathology , Speech Reception Threshold Test/methods , Speech Reception Threshold Test/standards , Aged , Adolescent
3.
Cogn Res Princ Implic ; 9(1): 35, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38834918

ABSTRACT

Multilingual speakers can find speech recognition in everyday environments like restaurants and open-plan offices particularly challenging. In a world where speaking multiple languages is increasingly common, effective clinical and educational interventions will require a better understanding of how factors like multilingual contexts and listeners' language proficiency interact with adverse listening environments. For example, word and phrase recognition is facilitated when competing voices speak different languages. Is this due to a "release from masking" from lower-level acoustic differences between languages and talkers, or higher-level cognitive and linguistic factors? To address this question, we created a "one-man bilingual cocktail party" selective attention task using English and Mandarin speech from one bilingual talker to reduce low-level acoustic cues. In Experiment 1, 58 listeners more accurately recognized English targets when distracting speech was Mandarin compared to English. Bilingual Mandarin-English listeners experienced significantly more interference and intrusions from the Mandarin distractor than did English listeners, exacerbated by challenging target-to-masker ratios. In Experiment 2, 29 Mandarin-English bilingual listeners exhibited linguistic release from masking in both languages. Bilinguals experienced greater release from masking when attending to English, confirming an influence of linguistic knowledge on the "cocktail party" paradigm that is separate from primarily energetic masking effects. Effects of higher-order language processing and expertise emerge only in the most demanding target-to-masker contexts. The "one-man bilingual cocktail party" establishes a useful tool for future investigations and characterization of communication challenges in the large and growing worldwide community of Mandarin-English bilinguals.


Subject(s)
Attention , Multilingualism , Speech Perception , Humans , Speech Perception/physiology , Adult , Female , Male , Young Adult , Attention/physiology , Perceptual Masking/physiology , Psycholinguistics
4.
Sci Rep ; 14(1): 13241, 2024 Jun 09.
Article in English | MEDLINE | ID: mdl-38853168

ABSTRACT

Cochlear implants (CIs) do not offer the same level of effectiveness in noisy environments as in quiet settings. Current single-microphone noise reduction algorithms in hearing aids and CIs remove only predictable, stationary noise and are ineffective against realistic, non-stationary noise such as multi-talker interference. Recent developments in deep neural network (DNN) algorithms have achieved noteworthy performance in speech enhancement and separation, especially in removing speech noise. However, more work is needed to investigate the potential of DNN algorithms for removing speech noise when tested with listeners fitted with CIs. Here, we implemented two DNN algorithms that are well suited for applications in speech audio processing: (1) a recurrent neural network (RNN) and (2) SepFormer. The algorithms were trained with a customized dataset (~30 h) and then tested with thirteen CI listeners. Both the RNN and SepFormer algorithms significantly improved CI listeners' speech intelligibility in noise without compromising the overall perceived quality of speech. These algorithms not only increased intelligibility in stationary non-speech noise but also introduced a substantial improvement in non-stationary noise, where conventional signal processing strategies offer little benefit. These results show the promise of using DNN algorithms as a solution for listening challenges in multi-talker noise interference.
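
The abstract does not specify the architectures or training configuration; purely as an illustration of the general recipe behind RNN mask-based speech enhancement (estimate a time-frequency gain mask from the noisy magnitude spectrogram and minimize the distance to the clean target), a minimal PyTorch sketch might look as follows. All layer sizes, the loss, and the toy data are assumptions.

```python
import torch
import torch.nn as nn

class MaskRNN(nn.Module):
    """Toy recurrent mask estimator: noisy magnitude spectrogram in,
    per-bin gain mask (0..1) out. Sizes are illustrative only."""

    def __init__(self, n_freq=257, hidden=256, layers=2):
        super().__init__()
        self.rnn = nn.LSTM(n_freq, hidden, num_layers=layers,
                           batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, n_freq)

    def forward(self, noisy_mag):                 # (batch, frames, n_freq)
        h, _ = self.rnn(noisy_mag)
        mask = torch.sigmoid(self.proj(h))        # bounded gain per T-F bin
        return mask * noisy_mag                   # enhanced magnitude

# One training step sketch: minimize L1 distance to the clean magnitude
model = MaskRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
noisy = torch.rand(8, 100, 257)                   # placeholder batch
clean = torch.rand(8, 100, 257)
optimizer.zero_grad()
loss = nn.functional.l1_loss(model(noisy), clean)
loss.backward()
optimizer.step()
```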


Subject(s)
Algorithms , Cochlear Implants , Deep Learning , Noise , Speech Intelligibility , Humans , Female , Middle Aged , Male , Speech Perception/physiology , Aged , Adult , Neural Networks, Computer
5.
J Acoust Soc Am ; 155(5): 2934-2947, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38717201

ABSTRACT

Spatial separation and fundamental frequency (F0) separation are effective cues for improving the intelligibility of target speech in multi-talker scenarios. Previous studies predominantly focused on spatial configurations within the frontal hemifield, overlooking the ipsilateral side and the entire median plane, where localization confusion often occurs. This study investigated the impact of spatial and F0 separation on intelligibility under these underexplored spatial configurations. Speech reception thresholds were measured in three experiments for scenarios involving two to four talkers, either in the ipsilateral horizontal plane or in the entire median plane, using monotonized speech with varying F0s as stimuli. The results revealed that spatial separation in symmetrical positions (front-back symmetry in the ipsilateral horizontal plane, or front-back and up-down symmetry in the median plane) contributes positively to intelligibility. Both target direction and relative target-masker separation influence the masking release attributed to spatial separation. As the number of talkers exceeds two, the masking release from spatial separation diminishes. Nevertheless, F0 separation remains a remarkably effective cue and can even facilitate spatial separation in improving intelligibility. Further analysis indicated that current intelligibility models have difficulty accurately predicting intelligibility in the scenarios explored in this study.
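
Speech reception thresholds of the kind reported here are typically obtained with an adaptive up-down procedure that tracks the SNR corresponding to a criterion level of performance. A minimal, hypothetical one-down/one-up staircase sketch follows; the step size, reversal rule, and convergence criterion are assumptions, not the authors' exact protocol.

```python
def run_srt_staircase(present_trial, start_snr=0.0, step_db=2.0, n_reversals=8):
    """Track the SNR (dB) converging on ~50% correct.

    `present_trial(snr_db)` is a callable that plays one sentence at the
    given SNR and returns True if the listener repeated it correctly.
    """
    snr = start_snr
    last_direction = None
    reversal_snrs = []
    while len(reversal_snrs) < n_reversals:
        correct = present_trial(snr)
        direction = "down" if correct else "up"     # harder after a hit, easier after a miss
        if last_direction and direction != last_direction:
            reversal_snrs.append(snr)
        snr += -step_db if correct else step_db
        last_direction = direction
    final = reversal_snrs[-6:]                      # average the last few reversals
    return sum(final) / len(final)
```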


Subject(s)
Cues , Perceptual Masking , Sound Localization , Speech Intelligibility , Speech Perception , Humans , Female , Male , Young Adult , Adult , Speech Perception/physiology , Acoustic Stimulation , Auditory Threshold , Speech Acoustics , Speech Reception Threshold Test , Noise
6.
J Acoust Soc Am ; 155(5): 2990-3004, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38717206

ABSTRACT

Speakers can place their prosodic prominence on any locations within a sentence, generating focus prosody for listeners to perceive new information. This study aimed to investigate age-related changes in the bottom-up processing of focus perception in Jianghuai Mandarin by clarifying the perceptual cues and the auditory processing abilities involved in the identification of focus locations. Young, middle-aged, and older speakers of Jianghuai Mandarin completed a focus identification task and an auditory perception task. The results showed that increasing age led to a decrease in listeners' accuracy rate in identifying focus locations, with all participants performing the worst when dynamic pitch cues were inaccessible. Auditory processing abilities did not predict focus perception performance in young and middle-aged listeners but accounted significantly for the variance in older adults' performance. These findings suggest that age-related deteriorations in focus perception can be largely attributed to declined auditory processing of perceptual cues. Poor ability to extract frequency modulation cues may be the most important underlying psychoacoustic factor for older adults' difficulties in perceiving focus prosody in Jianghuai Mandarin. The results contribute to our understanding of the bottom-up mechanisms involved in linguistic prosody processing in aging adults, particularly in tonal languages.


Subject(s)
Aging , Cues , Speech Perception , Humans , Middle Aged , Aged , Male , Female , Aging/psychology , Aging/physiology , Young Adult , Adult , Speech Perception/physiology , Age Factors , Speech Acoustics , Acoustic Stimulation , Pitch Perception , Language , Voice Quality , Psychoacoustics , Audiometry, Speech
7.
JASA Express Lett ; 4(5)2024 May 01.
Article in English | MEDLINE | ID: mdl-38717469

ABSTRACT

The perceptual boundary between short and long categories depends on speech rate. We investigated the influence of speech rate on perceptual boundaries for short and long vowel and consonant contrasts by Spanish-English bilingual listeners and English monolinguals. Listeners tended to adapt their perceptual boundaries to speech rates, but the strategy differed between groups, especially for consonants. Understanding the factors that influence auditory processing in this population is essential for developing appropriate assessments of auditory comprehension. These findings have implications for the clinical care of older populations whose ability to rely on spectral and/or temporal information in the auditory signal may decline.


Subject(s)
Multilingualism , Speech Perception , Humans , Speech Perception/physiology , Female , Male , Adult , Phonetics , Young Adult
8.
Am Ann Deaf ; 168(5): 241-257, 2024.
Article in English | MEDLINE | ID: mdl-38766937

ABSTRACT

Our study investigated differences in speech performance and neurophysiological responses between school-age children with unilateral hearing loss (UHL) and typically developing (TD) peers. We recruited a total of 16 primary school-age children (UHL = 9, TD = 7), who were screened by doctors at Shin Kong Wu-Ho-Su Memorial Hospital. We used the Peabody Picture Vocabulary Test-Revised (PPVT-R) to test word comprehension, and the PPVT-R percentile rank (PR) value was proportional to the auditory memory score (from The Children's Oral Comprehension Test) in both groups. We then assessed the latency and amplitude of the auditory ERP P300 and found that P300 latency in the UHL group was prolonged compared with the TD group. Although children with UHL have typical hearing in one ear, our results suggest that long-term UHL may lead to atypical organization of brain areas responsible for auditory processing, and possibly visual perception, contributing to speech delay and learning difficulties.
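
P300 latency and amplitude are read from the averaged ERP waveform as the time and size of the largest positive deflection within a search window. A minimal NumPy sketch of that measurement is given below; the 250-500 ms window, the sampling rate, and the simulated waveform are assumptions for illustration, not the study's recording parameters.

```python
import numpy as np

def p300_peak(erp_uv, sfreq=500.0, tmin=0.25, tmax=0.5):
    """Return (latency_ms, amplitude_uV) of the largest positivity in the
    tmin..tmax window of an averaged ERP trace (1-D array, stimulus onset at sample 0)."""
    times = np.arange(erp_uv.size) / sfreq
    window = (times >= tmin) & (times <= tmax)
    idx = np.argmax(erp_uv[window])                  # most positive sample in the window
    peak_sample = np.flatnonzero(window)[idx]
    return times[peak_sample] * 1000.0, erp_uv[peak_sample]

# Usage with a simulated average waveform (fake P300 around 350 ms)
sfreq = 500.0
t = np.arange(0, 0.8, 1 / sfreq)
erp = 6.0 * np.exp(-((t - 0.35) ** 2) / (2 * 0.03 ** 2))
latency_ms, amp_uv = p300_peak(erp, sfreq)
```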


Subject(s)
Event-Related Potentials, P300 , Hearing Loss, Unilateral , Humans , Child , Event-Related Potentials, P300/physiology , Male , Female , Hearing Loss, Unilateral/physiopathology , Hearing Loss, Unilateral/rehabilitation , Reaction Time/physiology , Speech Perception/physiology , Evoked Potentials, Auditory/physiology , China , Case-Control Studies , Language , Comprehension
9.
JASA Express Lett ; 4(5)2024 May 01.
Article in English | MEDLINE | ID: mdl-38717468

ABSTRACT

This study evaluated whether adaptive training with time-compressed speech produces an age-dependent improvement in speech recognition in 14 adult cochlear-implant users. The protocol consisted of a pretest, 5 h of training, and a posttest using time-compressed speech and an adaptive procedure. There were significant improvements in time-compressed speech recognition at the posttest session following training (>5% in the average time-compressed speech recognition threshold) but no effects of age. These results are promising for the use of adaptive training in aural rehabilitation strategies for cochlear-implant users across the adult lifespan and possibly using speech signals, such as time-compressed speech, to train temporal processing.
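
Time-compressed speech of the kind used for this training is commonly generated with a phase-vocoder time-stretch, which shortens duration while preserving pitch. A minimal sketch using librosa is shown below; the input file name and the compression factor are placeholders, not the study's stimuli or protocol.

```python
import librosa
import soundfile as sf

# Load a sentence recording (placeholder path) and compress it to half its duration.
y, sr = librosa.load("sentence.wav", sr=None, mono=True)
y_fast = librosa.effects.time_stretch(y, rate=2.0)   # rate > 1 shortens duration, pitch preserved

sf.write("sentence_compressed.wav", y_fast, sr)
```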


Subject(s)
Cochlear Implants , Speech Perception , Humans , Speech Perception/physiology , Aged , Male , Middle Aged , Female , Adult , Aged, 80 and over , Cochlear Implantation/methods , Time Factors
10.
Trends Hear ; 28: 23312165241239541, 2024.
Article in English | MEDLINE | ID: mdl-38738337

ABSTRACT

Cochlear synaptopathy, a form of cochlear deafferentation, has been demonstrated in a number of animal species, including non-human primates. Both age and noise exposure contribute to synaptopathy in animal models, indicating that it may be a common type of auditory dysfunction in humans. Temporal bone and auditory physiological data suggest that age and occupational/military noise exposure also lead to synaptopathy in humans. The predicted perceptual consequences of synaptopathy include tinnitus, hyperacusis, and difficulty with speech-in-noise perception. However, confirming the perceptual impacts of this form of cochlear deafferentation presents a particular challenge because synaptopathy can only be confirmed through post-mortem temporal bone analysis and auditory perception is difficult to evaluate in animals. Animal data suggest that deafferentation leads to increased central gain, signs of tinnitus and abnormal loudness perception, and deficits in temporal processing and signal-in-noise detection. If equivalent changes occur in humans following deafferentation, this would be expected to increase the likelihood of developing tinnitus, hyperacusis, and difficulty with speech-in-noise perception. Physiological data from humans is consistent with the hypothesis that deafferentation is associated with increased central gain and a greater likelihood of tinnitus perception, while human data on the relationship between deafferentation and hyperacusis is extremely limited. Many human studies have investigated the relationship between physiological correlates of deafferentation and difficulty with speech-in-noise perception, with mixed findings. A non-linear relationship between deafferentation and speech perception may have contributed to the mixed results. When differences in sample characteristics and study measurements are considered, the findings may be more consistent.


Subject(s)
Cochlea , Speech Perception , Tinnitus , Humans , Cochlea/physiopathology , Tinnitus/physiopathology , Tinnitus/diagnosis , Animals , Speech Perception/physiology , Hyperacusis/physiopathology , Noise/adverse effects , Auditory Perception/physiology , Synapses/physiology , Hearing Loss, Noise-Induced/physiopathology , Hearing Loss, Noise-Induced/diagnosis , Loudness Perception
11.
Trends Hear ; 28: 23312165241246596, 2024.
Article in English | MEDLINE | ID: mdl-38738341

ABSTRACT

The auditory brainstem response (ABR) is a valuable clinical tool for objective hearing assessment, which is conventionally detected by averaging neural responses to thousands of short stimuli. Progressing beyond these unnatural stimuli, brainstem responses to continuous speech presented via earphones have been recently detected using linear temporal response functions (TRFs). Here, we extend earlier studies by measuring subcortical responses to continuous speech presented in the sound-field, and assess the amount of data needed to estimate brainstem TRFs. Electroencephalography (EEG) was recorded from 24 normal hearing participants while they listened to clicks and stories presented via earphones and loudspeakers. Subcortical TRFs were computed after accounting for non-linear processing in the auditory periphery by either stimulus rectification or an auditory nerve model. Our results demonstrated that subcortical responses to continuous speech could be reliably measured in the sound-field. TRFs estimated using auditory nerve models outperformed simple rectification, and 16 minutes of data was sufficient for the TRFs of all participants to show clear wave V peaks for both earphone and sound-field stimuli. Subcortical TRFs to continuous speech were highly consistent in both earphone and sound-field conditions, and with click ABRs. However, sound-field TRFs required slightly more data (16 minutes) to achieve clear wave V peaks compared to earphone TRFs (12 minutes), possibly due to effects of room acoustics. By investigating subcortical responses to sound-field speech stimuli, this study lays the groundwork for bringing objective hearing assessment closer to real-life conditions, which may lead to improved hearing evaluations and smart hearing technologies.
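
A temporal response function is, in essence, a set of regression weights across stimulus lags, commonly estimated with ridge regression between a non-linearly transformed stimulus (for example, a rectified waveform) and the EEG. A minimal NumPy sketch of the lag-matrix and ridge step is shown below; the lag range, regularization value, and toy signals are assumptions.

```python
import numpy as np

def lag_matrix(stimulus, max_lag):
    """Stack delayed copies of the stimulus: column k is the stimulus delayed by k samples."""
    n = stimulus.size
    X = np.zeros((n, max_lag))
    for k in range(max_lag):
        X[k:, k] = stimulus[: n - k]
    return X

def estimate_trf(stimulus, eeg, max_lag, lam=1e3):
    """Ridge-regression TRF: solve (X'X + lam*I) w = X'y."""
    X = lag_matrix(stimulus, max_lag)
    XtX = X.T @ X + lam * np.eye(max_lag)
    return np.linalg.solve(XtX, X.T @ eeg)

# Toy example at 4096 Hz: a half-wave rectified "stimulus" predicting one EEG channel
fs = 4096
rng = np.random.default_rng(1)
stim = np.maximum(rng.standard_normal(fs * 10), 0)    # crude rectification stand-in
eeg = np.convolve(stim, np.hanning(40), mode="same") + rng.standard_normal(fs * 10)
trf = estimate_trf(stim, eeg, max_lag=int(0.03 * fs)) # 0-30 ms of lags for subcortical responses
```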


Subject(s)
Acoustic Stimulation , Electroencephalography , Evoked Potentials, Auditory, Brain Stem , Speech Perception , Humans , Evoked Potentials, Auditory, Brain Stem/physiology , Male , Female , Speech Perception/physiology , Acoustic Stimulation/methods , Adult , Young Adult , Auditory Threshold/physiology , Time Factors , Cochlear Nerve/physiology , Healthy Volunteers
12.
Cogn Res Princ Implic ; 9(1): 29, 2024 05 12.
Article in English | MEDLINE | ID: mdl-38735013

ABSTRACT

Auditory stimuli that are relevant to a listener have the potential to capture focal attention even when unattended, the listener's own name being a particularly effective stimulus. We report two experiments to test the attention-capturing potential of the listener's own name in normal speech and time-compressed speech. In Experiment 1, 39 participants were tested with a visual word categorization task with uncompressed spoken names as background auditory distractors. Participants' word categorization performance was slower when hearing their own name rather than other names, and in a final test, they were faster at detecting their own name than other names. Experiment 2 used the same task paradigm, but the auditory distractors were time-compressed names. Three compression levels were tested with 25 participants in each condition. Participants' word categorization performance was again slower when hearing their own name than when hearing other names; the slowing was strongest with slight compression and weakest with intense compression. Personally relevant time-compressed speech has the potential to capture attention, but the degree of capture depends on the level of compression. Attention capture by time-compressed speech has practical significance and provides partial evidence for the duplex-mechanism account of auditory distraction.


Subject(s)
Attention , Names , Speech Perception , Humans , Attention/physiology , Female , Male , Speech Perception/physiology , Adult , Young Adult , Speech/physiology , Reaction Time/physiology , Acoustic Stimulation
13.
Cogn Sci ; 48(5): e13449, 2024 May.
Article in English | MEDLINE | ID: mdl-38773754

ABSTRACT

We recently reported strong, replicable (i.e., replicated) evidence for lexically mediated compensation for coarticulation (LCfC; Luthra et al., 2021), whereby lexical knowledge influences a prelexical process. Critically, evidence for LCfC provides robust support for interactive models of cognition that include top-down feedback and is inconsistent with autonomous models that allow only feedforward processing. McQueen, Jesse, and Mitterer (2023) offer five counter-arguments against our interpretation; we respond to each of those arguments here and conclude that top-down feedback provides the most parsimonious explanation of extant data.


Subject(s)
Speech Perception , Humans , Speech Perception/physiology , Cognition , Language
14.
Sci Rep ; 14(1): 11491, 2024 05 20.
Article in English | MEDLINE | ID: mdl-38769115

ABSTRACT

Several attempts at speech brain-computer interfacing (BCI) have been made to decode phonemes, sub-words, words, or sentences using invasive measurements, such as the electrocorticogram (ECoG), during auditory speech perception, overt speech, or imagined (covert) speech. Decoding sentences from covert speech is a challenging task. Sixteen epilepsy patients with intracranially implanted electrodes participated in this study, and ECoGs were recorded during overt and covert speech of eight Japanese sentences, each consisting of three tokens. In particular, a Transformer neural network model was applied to decode text sentences from covert speech, trained using ECoGs obtained during overt speech. We first examined the proposed Transformer model using the same task for training and testing, and then evaluated the model's performance when trained on the overt task for decoding covert speech. The Transformer model trained on covert speech achieved an average token error rate (TER) of 46.6% for decoding covert speech, whereas the model trained on overt speech achieved a TER of 46.3% (p > 0.05; d = 0.07). Therefore, the challenge of collecting training data for covert speech can be addressed using overt speech. Decoding performance for covert speech may improve further as more overt-speech recordings are employed for training.
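
The token error rate reported above is an edit (Levenshtein) distance between the decoded and reference token sequences, normalized by the reference length. A minimal sketch of that metric follows; the example tokens are invented placeholders.

```python
def token_error_rate(reference, hypothesis):
    """Levenshtein distance between token lists, divided by reference length."""
    n, m = len(reference), len(hypothesis)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[n][m] / max(n, 1)

# Example with invented three-token sentences (romanized placeholders)
ref = ["watashi", "wa", "taberu"]
hyp = ["watashi", "ga", "taberu"]
print(token_error_rate(ref, hyp))   # 0.333...
```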


Subject(s)
Brain-Computer Interfaces , Electrocorticography , Speech , Humans , Female , Male , Adult , Speech/physiology , Speech Perception/physiology , Young Adult , Feasibility Studies , Epilepsy/physiopathology , Neural Networks, Computer , Middle Aged , Adolescent
15.
Cereb Cortex ; 34(13): 84-93, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696598

ABSTRACT

Multimodal integration is crucial for human interaction, in particular for social communication, which relies on integrating information from various sensory modalities. Recently a third visual pathway specialized in social perception was proposed, which includes the right superior temporal sulcus (STS) playing a key role in processing socially relevant cues and high-level social perception. Importantly, it has also recently been proposed that the left STS contributes to audiovisual integration of speech processing. In this article, we propose that brain areas along the right STS that support multimodal integration for social perception and cognition can be considered homologs to those in the left, language-dominant hemisphere, sustaining multimodal integration of speech and semantic concepts fundamental for social communication. Emphasizing the significance of the left STS in multimodal integration and associated processes such as multimodal attention to socially relevant stimuli, we underscore its potential relevance in comprehending neurodevelopmental conditions characterized by challenges in social communication such as autism spectrum disorder (ASD). Further research into this left lateral processing stream holds the promise of enhancing our understanding of social communication in both typical development and ASD, which may lead to more effective interventions that could improve the quality of life for individuals with atypical neurodevelopment.


Subject(s)
Social Cognition , Speech Perception , Temporal Lobe , Humans , Temporal Lobe/physiology , Temporal Lobe/physiopathology , Speech Perception/physiology , Social Perception , Autistic Disorder/physiopathology , Autistic Disorder/psychology , Functional Laterality/physiology
16.
J Neural Eng ; 21(3)2024 May 22.
Article in English | MEDLINE | ID: mdl-38729132

ABSTRACT

Objective. This study develops a deep learning (DL) method for fast auditory attention decoding (AAD) using electroencephalography (EEG) from listeners with hearing impairment (HI). It addresses three classification tasks: differentiating noise from speech-in-noise, classifying the direction of attended speech (left vs. right), and identifying the activation status of hearing aid noise reduction algorithms (OFF vs. ON). These tasks contribute to our understanding of how hearing technology influences auditory processing in the hearing-impaired population. Approach. Deep convolutional neural network (DCNN) models were designed for each task. Two training strategies were employed to clarify the impact of data splitting on AAD tasks: inter-trial, where the testing set used classification windows from trials that the training set had not seen, and intra-trial, where the testing set used unseen classification windows from trials whose other segments were seen during training. The models were evaluated on EEG data from 31 participants with HI, listening to competing talkers amidst background noise. Main results. Using 1 s classification windows, the DCNN models achieved accuracy (ACC) of 69.8%, 73.3%, and 82.9% and area under the curve (AUC) of 77.2%, 80.6%, and 92.1% for the three tasks, respectively, with the inter-trial strategy. With the intra-trial strategy, they achieved ACC of 87.9%, 80.1%, and 97.5%, along with AUC of 94.6%, 89.1%, and 99.8%. Our DCNN models show good performance on short 1 s EEG samples, making them suitable for real-world applications. Conclusion: Our DCNN models successfully addressed three tasks with short 1 s EEG windows from participants with HI, showcasing their potential. While the inter-trial strategy demonstrated promise for assessing AAD, the intra-trial approach yielded inflated results, underscoring the important role of proper data splitting in EEG-based AAD tasks. Significance. Our findings showcase the promising potential of EEG-based tools for assessing auditory attention in clinical contexts and advancing hearing technology, while also promoting further exploration of alternative DL architectures and their potential constraints.
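
The inter-trial versus intra-trial contrast is ultimately a data-splitting question: if classification windows from the same trial appear in both the training and testing sets, accuracy is inflated by leakage. A hedged scikit-learn sketch of the two splitting schemes is given below; the array shapes and trial counts are placeholders, not the study's data.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit, ShuffleSplit

n_windows, n_features = 600, 64 * 128        # e.g. 1 s windows, 64 channels x 128 samples
X = np.random.randn(n_windows, n_features)   # placeholder EEG features
y = np.random.randint(0, 2, n_windows)       # e.g. attended-left vs. attended-right
trial_id = np.repeat(np.arange(60), 10)      # 10 windows per trial

# Inter-trial: whole trials are held out, so test windows come from unseen trials.
inter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(inter.split(X, y, groups=trial_id))

# Intra-trial: windows are shuffled regardless of trial, so test windows can
# share a trial with training windows -- the leakage-prone setting.
intra = ShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx2, test_idx2 = next(intra.split(X, y))
```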


Subject(s)
Attention , Auditory Perception , Deep Learning , Electroencephalography , Hearing Loss , Humans , Attention/physiology , Female , Electroencephalography/methods , Male , Middle Aged , Hearing Loss/physiopathology , Hearing Loss/rehabilitation , Hearing Loss/diagnosis , Aged , Auditory Perception/physiology , Noise , Adult , Hearing Aids , Speech Perception/physiology , Neural Networks, Computer
17.
Multisens Res ; 37(2): 125-141, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38714314

ABSTRACT

Trust is an aspect critical to human social interaction and research has identified many cues that help in the assimilation of this social trait. Two of these cues are the pitch of the voice and the width-to-height ratio of the face (fWHR). Additionally, research has indicated that the content of a spoken sentence itself has an effect on trustworthiness; a finding that has not yet been brought into multisensory research. The current research aims to investigate previously developed theories on trust in relation to vocal pitch, fWHR, and sentence content in a multimodal setting. Twenty-six female participants were asked to judge the trustworthiness of a voice speaking a neutral or romantic sentence while seeing a face. The average pitch of the voice and the fWHR were varied systematically. Results indicate that the content of the spoken message was an important predictor of trustworthiness extending into multimodality. Further, the mean pitch of the voice and fWHR of the face appeared to be useful indicators in a multimodal setting. These effects interacted with one another across modalities. The data demonstrate that trust in the voice is shaped by task-irrelevant visual stimuli. Future research is encouraged to clarify whether these findings remain consistent across genders, age groups, and languages.


Subject(s)
Face , Trust , Voice , Humans , Female , Voice/physiology , Young Adult , Adult , Face/physiology , Speech Perception/physiology , Pitch Perception/physiology , Facial Recognition/physiology , Cues , Adolescent
18.
Cereb Cortex ; 34(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38715408

ABSTRACT

Speech comprehension in noise depends on complex interactions between peripheral sensory and central cognitive systems. Despite having normal peripheral hearing, older adults show difficulties in speech comprehension. It remains unclear whether the brain's neural responses during speech perception could serve as indicators of aging. The current study examined whether individual brain activation during speech perception in different listening environments could predict age. We applied functional near-infrared spectroscopy to 93 normal-hearing human adults (20 to 70 years old) during a sentence listening task, which contained a quiet condition and four noisy conditions at different signal-to-noise ratios (SNR = 10, 5, 0, -5 dB). A data-driven approach, region-based brain-age predictive modeling, was adopted. We observed a significant behavioral decrease with age under the four noisy conditions, but not under the quiet condition. Brain activation in the SNR = 10 dB listening condition successfully predicted individuals' ages. Moreover, we found that the bilateral visual sensory cortex, left dorsal speech pathway, left cerebellum, right temporal-parietal junction area, right homolog of Wernicke's area, and right middle temporal gyrus contributed most to prediction performance. These results demonstrate that activation of regions involved in the sensory-motor mapping of sound, especially under noisy conditions, can be a more sensitive measure for age prediction than external behavioral measures.
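
Region-based brain-age predictive modeling generally amounts to regressing chronological age on region-wise activation features under cross-validation. A hedged scikit-learn sketch is shown below; the ridge model, feature dimensions, and random data are assumptions rather than the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict, KFold

n_subjects, n_regions = 93, 40
X = np.random.randn(n_subjects, n_regions)        # placeholder fNIRS activation per region
age = np.random.uniform(20, 70, n_subjects)       # chronological age (years)

model = Ridge(alpha=1.0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
predicted_age = cross_val_predict(model, X, age, cv=cv)

mae = np.mean(np.abs(predicted_age - age))        # mean absolute error of the age prediction
r = np.corrcoef(predicted_age, age)[0, 1]         # predicted vs. actual age correlation
```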


Subject(s)
Aging , Brain , Comprehension , Noise , Spectroscopy, Near-Infrared , Speech Perception , Humans , Adult , Speech Perception/physiology , Male , Female , Spectroscopy, Near-Infrared/methods , Middle Aged , Young Adult , Aged , Comprehension/physiology , Brain/physiology , Brain/diagnostic imaging , Aging/physiology , Brain Mapping/methods , Acoustic Stimulation/methods
19.
Otol Neurotol ; 45(5): e381-e384, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38728553

ABSTRACT

OBJECTIVE: To examine patient preference after stapedotomy versus cochlear implantation in a unique case of a patient with symmetrical profound mixed hearing loss and similar postoperative speech perception improvement. PATIENTS: An adult patient with bilateral symmetrical far advanced otosclerosis, with profound mixed hearing loss. INTERVENTION: Stapedotomy in the left ear, cochlear implantation in the right ear. MAIN OUTCOME MEASURE: Performance on behavioral audiometry, and subjective report of hearing and intervention preference. RESULTS: A patient successfully underwent left stapedotomy and subsequent cochlear implantation on the right side, per patient preference. Preoperative audiometric characteristics were similar between ears (pure-tone average [PTA] [R: 114; L: 113 dB]; word recognition score [WRS]: 22%). Postprocedural audiometry demonstrated significant improvement after stapedotomy (PTA: 59 dB, WRS: 75%) and from cochlear implant (PTA: 20 dB, WRS: 60%). The patient subjectively reported a preference for the cochlear implant ear despite having substantial gains from stapedotomy. A nuanced discussion highlighting potentially overlooked benefits of cochlear implants in far advanced otosclerosis is conducted. CONCLUSION: In comparison with stapedotomy and hearing aids, cochlear implantation generally permits greater access to sound among patients with far advanced otosclerosis. Though the cochlear implant literature mainly focuses on speech perception outcomes, an underappreciated benefit of cochlear implantation is the high likelihood of achieving "normal" sound levels across the audiogram.
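
The pure-tone averages quoted above are simple arithmetic: the mean of air-conduction thresholds over a small set of frequencies (conventionally 0.5, 1, and 2 kHz, often with 4 kHz added). The report does not state which convention was used; the sketch below illustrates the calculation with invented threshold values.

```python
def pure_tone_average(thresholds_db_hl, freqs=(500, 1000, 2000, 4000)):
    """Mean audiometric threshold (dB HL) over the chosen frequencies."""
    return sum(thresholds_db_hl[f] for f in freqs) / len(freqs)

# Hypothetical post-operative audiogram (dB HL per frequency in Hz), for illustration only
audiogram = {250: 50, 500: 55, 1000: 60, 2000: 60, 4000: 65, 8000: 70}
print(pure_tone_average(audiogram))                        # 4-frequency PTA: 60.0
print(pure_tone_average(audiogram, (500, 1000, 2000)))     # 3-frequency PTA
```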


Subject(s)
Cochlear Implantation , Otosclerosis , Speech Perception , Stapes Surgery , Humans , Otosclerosis/surgery , Stapes Surgery/methods , Cochlear Implantation/methods , Speech Perception/physiology , Treatment Outcome , Male , Middle Aged , Hearing Loss, Mixed Conductive-Sensorineural/surgery , Audiometry, Pure-Tone , Patient Preference , Female , Adult
20.
Int J Pediatr Otorhinolaryngol ; 180: 111968, 2024 May.
Article in English | MEDLINE | ID: mdl-38714045

ABSTRACT

AIM & OBJECTIVES: The study aimed to compare P1 latency and P1-N1 amplitude with receptive and expressive language ages in children using a cochlear implant (CI) in one ear and a hearing aid (HA) in the non-implanted ear. METHODS: The study included 30 children (18 males and 12 females) aged between 48 and 96 months. The age at which the children received the CI ranged from 42 to 69 months. A within-subject research design was used, and participants were selected through purposive sampling. Auditory late latency responses (ALLR) were assessed using the Intelligent Hearing Systems platform to measure P1 latency and P1-N1 amplitude. The Assessment Checklist for Speech-Language Skills (ACSLS) was employed to evaluate receptive and expressive language age. Both assessments were conducted after cochlear implantation. RESULTS: A total of 30 children participated in the study, with a mean implant age (duration of CI use) of 20.03 months (SD: 8.14 months). The mean P1 latency and P1-N1 amplitude were 129.50 ms (SD: 15.05 ms) and 6.93 µV (SD: 2.24 µV), respectively. Correlation analysis revealed no significant association between ALLR measures and receptive or expressive language ages. However, there was a significant negative correlation between P1 latency and implant age (Spearman's rho = -0.371, p = 0.043). CONCLUSIONS: The study suggests that P1 latency, an index of auditory maturation, may not be a reliable marker for predicting language outcomes. It can be concluded that language development is likely influenced by factors beyond auditory maturation alone.
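
The reported association between P1 latency and implant age is a Spearman rank correlation, which can be reproduced in a few lines. A minimal SciPy sketch follows; the arrays are invented placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder values: P1 latency (ms) and duration of implant use (months) for each child
p1_latency_ms = np.array([145, 138, 131, 128, 122, 119, 115])
implant_age_months = np.array([8, 12, 15, 20, 24, 28, 33])

rho, p_value = spearmanr(p1_latency_ms, implant_age_months)
print(f"Spearman's rho = {rho:.3f}, p = {p_value:.3f}")   # negative rho: longer use, earlier P1
```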


Subject(s)
Cochlear Implants , Language Development , Humans , Male , Female , Child, Preschool , Child , Cochlear Implantation/methods , Reaction Time/physiology , Deafness/surgery , Deafness/rehabilitation , Evoked Potentials, Auditory/physiology , Age Factors , Speech Perception/physiology