Pesquisa | Portal Regional da BVS (teste)

The role of vowel and consonant onsets in neural tracking of natural speech.

Jalilpour Monesi, Mohammad; Vanthornhout, Jonas; Francart, Tom; Van Hamme, Hugo.

J Neural Eng ; 21(1)2024 01 11.

Artigo em Inglês | MEDLINE | ID: mdl-38205849

RESUMO

Objective. To investigate how the auditory system processes natural speech, models have been created to relate the electroencephalography (EEG) signal of a person listening to speech to various representations of the speech. Mainly the speech envelope has been used, but also phonetic representations. We investigated to which degree of granularity phonetic representations can be related to the EEG signal.Approach. We used recorded EEG signals from 105 subjects while they listened to fairy tale stories. We utilized speech representations, including onset of any phone, vowel-consonant onsets, broad phonetic class (BPC) onsets, and narrow phonetic class onsets, and related them to EEG using forward modeling and match-mismatch tasks. In forward modeling, we used a linear model to predict EEG from speech representations. In the match-mismatch task, we trained a long short term memory based model to determine which of two candidate speech segments matches with a given EEG segment.Main results. Our results show that vowel-consonant onsets outperform onsets of any phone in both tasks, which suggests that neural tracking of the vowel vs. consonant exists in the EEG to some degree. We also observed that vowel (syllable nucleus) onsets exhibit a more consistent representation in EEG compared to syllable onsets.Significance. Finally, our findings suggest that neural tracking previously thought to be associated with BPCs might actually originate from vowel-consonant onsets rather than the differentiation between different phonetic classes.

Assuntos

Eletroencefalografia , Fala , Humanos , Modelos Lineares

Robust neural tracking of linguistic speech representations using a convolutional neural network.

Puffay, Corentin; Vanthornhout, Jonas; Gillis, Marlies; Accou, Bernd; Van Hamme, Hugo; Francart, Tom.

J Neural Eng ; 20(4)2023 08 30.

Artigo em Inglês | MEDLINE | ID: mdl-37595606

RESUMO

Objective.When listening to continuous speech, populations of neurons in the brain track different features of the signal. Neural tracking can be measured by relating the electroencephalography (EEG) and the speech signal. Recent studies have shown a significant contribution of linguistic features over acoustic neural tracking using linear models. However, linear models cannot model the nonlinear dynamics of the brain. To overcome this, we use a convolutional neural network (CNN) that relates EEG to linguistic features using phoneme or word onsets as a control and has the capacity to model non-linear relations.Approach.We integrate phoneme- and word-based linguistic features (phoneme surprisal, cohort entropy (CE), word surprisal (WS) and word frequency (WF)) in our nonlinear CNN model and investigate if they carry additional information on top of lexical features (phoneme and word onsets). We then compare the performance of our nonlinear CNN with that of a linear encoder and a linearized CNN.Main results.For the non-linear CNN, we found a significant contribution of CE over phoneme onsets and of WS and WF over word onsets. Moreover, the non-linear CNN outperformed the linear baselines.Significance.Measuring coding of linguistic features in the brain is important for auditory neuroscience research and applications that involve objectively measuring speech understanding. With linear models, this is measurable, but the effects are very small. The proposed non-linear CNN model yields larger differences between linguistic and lexical models and, therefore, could show effects that would otherwise be unmeasurable and may, in the future, lead to improved within-subject measures and shorter recordings.

Assuntos

Neurônios , Fala , Humanos , Nervo Coclear , Linguística , Redes Neurais de Computação

Relating EEG to continuous speech using deep neural networks: a review.

Puffay, Corentin; Accou, Bernd; Bollens, Lies; Monesi, Mohammad Jalilpour; Vanthornhout, Jonas; Van Hamme, Hugo; Francart, Tom.

J Neural Eng ; 20(4)2023 08 03.

Artigo em Inglês | MEDLINE | ID: mdl-37442115

RESUMO

Objective.When a person listens to continuous speech, a corresponding response is elicited in the brain and can be recorded using electroencephalography (EEG). Linear models are presently used to relate the EEG recording to the corresponding speech signal. The ability of linear models to find a mapping between these two signals is used as a measure of neural tracking of speech. Such models are limited as they assume linearity in the EEG-speech relationship, which omits the nonlinear dynamics of the brain. As an alternative, deep learning models have recently been used to relate EEG to continuous speech.Approach.This paper reviews and comments on deep-learning-based studies that relate EEG to continuous speech in single- or multiple-speakers paradigms. We point out recurrent methodological pitfalls and the need for a standard benchmark of model analysis.Main results.We gathered 29 studies. The main methodological issues we found are biased cross-validations, data leakage leading to over-fitted models, or disproportionate data size compared to the model's complexity. In addition, we address requirements for a standard benchmark model analysis, such as public datasets, common evaluation metrics, and good practices for the match-mismatch task.Significance.We present a review paper summarizing the main deep-learning-based studies that relate EEG to speech while addressing methodological pitfalls and important considerations for this newly expanding field. Our study is particularly relevant given the growing application of deep learning in EEG-speech decoding.

Assuntos

Eletroencefalografia , Fala , Humanos , Fala/fisiologia , Eletroencefalografia/métodos , Redes Neurais de Computação , Encéfalo/fisiologia , Percepção Auditiva/fisiologia

Predicting speech intelligibility from EEG in a non-linear classification paradigm.

Accou, Bernd; Jalilpour Monesi, Mohammad; Van Hamme, Hugo; Francart, Tom.

J Neural Eng ; 18(6)2021 11 15.

Artigo em Inglês | MEDLINE | ID: mdl-34706347

RESUMO

Objective.Currently, only behavioral speech understanding tests are available, which require active participation of the person being tested. As this is infeasible for certain populations, an objective measure of speech intelligibility is required. Recently, brain imaging data has been used to establish a relationship between stimulus and brain response. Linear models have been successfully linked to speech intelligibility but require per-subject training. We present a deep-learning-based model incorporating dilated convolutions that operates in a match/mismatch paradigm. The accuracy of the model's match/mismatch predictions can be used as a proxy for speech intelligibility without subject-specific (re)training.Approach.We evaluated the performance of the model as a function of input segment length, electroencephalography (EEG) frequency band and receptive field size while comparing it to multiple baseline models. Next, we evaluated performance on held-out data and finetuning. Finally, we established a link between the accuracy of our model and the state-of-the-art behavioral MATRIX test.Main results.The dilated convolutional model significantly outperformed the baseline models for every input segment length, for all EEG frequency bands except the delta and theta band, and receptive field sizes between 250 and 500 ms. Additionally, finetuning significantly increased the accuracy on a held-out dataset. Finally, a significant correlation (r= 0.59,p= 0.0154) was found between the speech reception threshold (SRT) estimated using the behavioral MATRIX test and our objective method.Significance.Our method is the first to predict the SRT from EEG for unseen subjects, contributing to objective measures of speech intelligibility.

Assuntos

Inteligibilidade da Fala , Percepção da Fala , Estimulação Acústica , Encéfalo , Eletroencefalografia/métodos , Audição/fisiologia , Humanos , Inteligibilidade da Fala/fisiologia , Percepção da Fala/fisiologia

Linear versus deep learning methods for noisy speech separation for EEG-informed attention decoding.

Das, Neetha; Zegers, Jeroen; Van Hamme, Hugo; Francart, Tom; Bertrand, Alexander.

J Neural Eng ; 17(4): 046039, 2020 08 19.

Artigo em Inglês | MEDLINE | ID: mdl-32679578

RESUMO

OBJECTIVE: A hearing aid's noise reduction algorithm cannot infer to which speaker the user intends to listen to. Auditory attention decoding (AAD) algorithms allow to infer this information from neural signals, which leads to the concept of neuro-steered hearing aids. We aim to evaluate and demonstrate the feasibility of AAD-supported speech enhancement in challenging noisy conditions based on electroencephalography recordings. APPROACH: The AAD performance with a linear versus a deep neural network (DNN) based speaker separation was evaluated for same-gender speaker mixtures using three different speaker positions and three different noise conditions. MAIN RESULTS: AAD results based on the linear approach were found to be at least on par and sometimes even better than pure DNN-based approaches in terms of AAD accuracy in all tested conditions. However, when using the DNN to support a linear data-driven beamformer, a performance improvement over the purely linear approach was obtained in the most challenging scenarios. The use of multiple microphones was also found to improve speaker separation and AAD performance over single-microphone systems. SIGNIFICANCE: Recent proof-of-concept studies in this context each focus on a different method in a different experimental setting, which makes it hard to compare them. Furthermore, they are tested in highly idealized experimental conditions, which are still far from a realistic hearing aid setting. This work provides a systematic comparison of a linear and non-linear neuro-steered speech enhancement model, as well as a more realistic validation in challenging conditions.

Assuntos

Aprendizado Profundo , Percepção da Fala , Estimulação Acústica , Atenção , Eletroencefalografia , Fala

Automated speech analysis to improve TMS-based language mapping: Algorithm and proof of concept.

Seynaeve, Laura; Baby, Deepak; Van Hamme, Hugo; De Vleeschouwer, Steven; Dupont, Patrick; Van Paesschen, Wim.

Brain Stimul ; 13(1): 267-269, 2020.

Artigo em Inglês | MEDLINE | ID: mdl-31631056

Monitoring activities of daily living using Wireless Acoustic Sensor Networks in clean and noisy conditions.

Vuegen, Lode; Van Den Broeck, Bert; Karsmakers, Peter; Van hamme, Hugo; Vanrumste, Bart.

Annu Int Conf IEEE Eng Med Biol Soc ; 2015: 4966-9, 2015.

Artigo em Inglês | MEDLINE | ID: mdl-26737406

RESUMO

This work examines the use of a Wireless Acoustic Sensor Network (WASN) for the classification of clinically relevant activities of daily living (ADL) of elderly people. The aim of this research is to automatically compile a summary report about the performed ADLs which can be easily interpreted by caregivers. In this work, the classification performance of the WASN will be evaluated in both clean and noisy conditions. Results indicate that the classification performance of the WASN is 75.3±4.3% on clean acoustic data selected from the node receiving with the highest SNR. By incorporating spatial information extracted by the WASN, the classification accuracy further increases to 78.6±1.4%. In addition, the classification performance of the WASN in noisy conditions is in absolute average 8.1% to 9.0% more accurate compared to highest obtained single microphone results.

Assuntos

Acústica/instrumentação , Atividades Cotidianas , Monitorização Ambulatorial/métodos , Tecnologia sem Fio , Idoso , Cuidadores , Humanos , Monitorização Ambulatorial/instrumentação , Processamento de Sinais Assistido por Computador , Razão Sinal-Ruído

Speech recognition technology in CI rehabilitation.

Nogueira, Waldo; Vanpoucke, Filiep; Dykmans, Philippe; De Raeve, Leo; Van Hamme, Hugo; Roelens, Jan.

Cochlear Implants Int ; 11 Suppl 1: 449-53, 2010 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-21756671

Assuntos

Implante Coclear/reabilitação , Surdez/cirurgia , Interface para o Reconhecimento da Fala , Fonoterapia/instrumentação , Adolescente , Adulto , Bélgica , Criança , Implante Coclear/métodos , Implantes Cocleares , Surdez/diagnóstico , Surdez/reabilitação , Feminino , Humanos , Masculino , Adulto Jovem

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA