Search | VHL Regional Portal

1.

Electrocochleographic frequency-following responses as a potential marker of age-related cochlear neural degeneration.

Temboury-Gutierrez, Miguel; Märcher-Rørsted, Jonatan; Bille, Michael; Yde, Jesper; Encina-Llamas, Gerard; Hjortkjær, Jens; Dau, Torsten.

Hear Res ; 446: 109005, 2024 May.

Article in English | MEDLINE | ID: mdl-38598943

ABSTRACT

Auditory nerve (AN) fibers that innervate inner hair cells in the cochlea degenerate with advancing age. It has been proposed that age-related reductions in brainstem frequency-following responses (FFR) to the carrier of low-frequency, high-intensity pure tones may partially reflect this neural loss in the cochlea (Märcher-Rørsted et al., 2022). If the loss of AN fibers is the primary factor contributing to age-related changes in the brainstem FFR, then the FFR could serve as an indicator of cochlear neural degeneration. In this study, we employed electrocochleography (ECochG) to investigate the effects of age on frequency-following neurophonic potentials, i.e., neural responses phase-locked to the carrier frequency of the tone stimulus. We compared these findings to the brainstem-generated FFRs obtained simultaneously using the same stimulation. We conducted recordings in young and older individuals with normal hearing. Responses to pure tones (250 ms, 516 and 1086 Hz, 85 dB SPL) and clicks were recorded using both ECochG at the tympanic membrane and traditional scalp electroencephalographic (EEG) recordings of the FFR. Distortion product otoacoustic emissions (DPOAE) were also collected. In the ECochG recordings, sustained AN neurophonic (ANN) responses to tonal stimulation, as well as the click-evoked compound action potential (CAP) of the AN, were significantly reduced in the older listeners compared to young controls, despite normal audiometric thresholds. In the EEG recordings, brainstem FFRs to the same tone stimulation were also diminished in the older participants. Unlike the reduced AN CAP response, the transient-evoked wave-V remained unaffected. These findings could indicate that a decreased number of AN fibers contributes to the response in the older participants. The results suggest that the scalp-recorded FFR, as opposed to the clinical standard wave-V of the auditory brainstem response, may serve as a more reliable indicator of age-related cochlear neural degeneration.

Subject(s)

Acoustic Stimulation , Aging , Audiometry, Evoked Response , Cochlea , Cochlear Nerve , Evoked Potentials, Auditory, Brain Stem , Nerve Degeneration , Humans , Female , Cochlea/physiopathology , Cochlea/innervation , Adult , Aged , Male , Middle Aged , Young Adult , Age Factors , Cochlear Nerve/physiopathology , Aging/physiology , Electroencephalography , Audiometry, Pure-Tone , Auditory Threshold , Presbycusis/physiopathology , Presbycusis/diagnosis , Predictive Value of Tests , Time Factors

2.

Modulation transfer functions for audiovisual speech.

Pedersen, Nicolai F; Dau, Torsten; Hansen, Lars Kai; Hjortkjær, Jens.

PLoS Comput Biol ; 18(7): e1010273, 2022 07.

Article in English | MEDLINE | ID: mdl-35852989

ABSTRACT

Temporal synchrony between facial motion and acoustic modulations is a hallmark feature of audiovisual speech. The moving face and mouth during natural speech is known to be correlated with low-frequency acoustic envelope fluctuations (below 10 Hz), but the precise rates at which envelope information is synchronized with motion in different parts of the face are less clear. Here, we used regularized canonical correlation analysis (rCCA) to learn speech envelope filters whose outputs correlate with motion in different parts of the speakers face. We leveraged recent advances in video-based 3D facial landmark estimation allowing us to examine statistical envelope-face correlations across a large number of speakers (â¼4000). Specifically, rCCA was used to learn modulation transfer functions (MTFs) for the speech envelope that significantly predict correlation with facial motion across different speakers. The AV analysis revealed bandpass speech envelope filters at distinct temporal scales. A first set of MTFs showed peaks around 3-4 Hz and were correlated with mouth movements. A second set of MTFs captured envelope fluctuations in the 1-2 Hz range correlated with more global face and head motion. These two distinctive timescales emerged only as a property of natural AV speech statistics across many speakers. A similar analysis of fewer speakers performing a controlled speech task highlighted only the well-known temporal modulations around 4 Hz correlated with orofacial motion. The different bandpass ranges of AV correlation align notably with the average rates at which syllables (3-4 Hz) and phrases (1-2 Hz) are produced in natural speech. Whereas periodicities at the syllable rate are evident in the envelope spectrum of the speech signal itself, slower 1-2 Hz regularities thus only become prominent when considering crossmodal signal statistics. This may indicate a motor origin of temporal regularities at the timescales of syllables and phrases in natural speech.

Subject(s)

Speech Perception , Speech , Acoustic Stimulation , Acoustics , Time Factors

3.

Editorial: Neural Tracking: Closing the Gap Between Neurophysiology and Translational Medicine.

Di Liberto, Giovanni M; Hjortkjær, Jens; Mesgarani, Nima.

Front Neurosci ; 16: 872600, 2022.

Article in English | MEDLINE | ID: mdl-35368278

4.

Mapping cortico-subcortical sensitivity to 4 Hz amplitude modulation depth in human auditory system with functional MRI.

Fuglsang, Søren A; Madsen, Kristoffer H; Puonti, Oula; Hjortkjær, Jens; Siebner, Hartwig R.

Neuroimage ; 246: 118745, 2022 02 01.

Article in English | MEDLINE | ID: mdl-34808364

ABSTRACT

Temporal modulations in the envelope of acoustic waveforms at rates around 4 Hz constitute a strong acoustic cue in speech and other natural sounds. It is often assumed that the ascending auditory pathway is increasingly sensitive to slow amplitude modulation (AM), but sensitivity to AM is typically considered separately for individual stages of the auditory system. Here, we used blood oxygen level dependent (BOLD) fMRI in twenty human subjects (10 male) to measure sensitivity of regional neural activity in the auditory system to 4 Hz temporal modulations. Participants were exposed to AM noise stimuli varying parametrically in modulation depth to characterize modulation-depth effects on BOLD responses. A Bayesian hierarchical modeling approach was used to model potentially nonlinear relations between AM depth and group-level BOLD responses in auditory regions of interest (ROIs). Sound stimulation activated the auditory brainstem and cortex structures in single subjects. BOLD responses to noise exposure in core and belt auditory cortices scaled positively with modulation depth. This finding was corroborated by whole-brain cluster-level inference. Sensitivity to AM depth variations was particularly pronounced in the Heschl's gyrus but also found in higher-order auditory cortical regions. None of the sound-responsive subcortical auditory structures showed a BOLD response profile that reflected the parametric variation in AM depth. The results are compatible with the notion that early auditory cortical regions play a key role in processing low-rate modulation content of sounds in the human auditory system.

Subject(s)

Auditory Cortex/physiology , Auditory Perception/physiology , Brain Mapping/methods , Brain Stem/physiology , Magnetic Resonance Imaging/methods , Acoustic Stimulation , Adult , Auditory Cortex/diagnostic imaging , Brain Stem/diagnostic imaging , Female , Humans , Male , Young Adult

5.

Age-related reduction in frequency-following responses as a potential marker of cochlear neural degeneration.

Märcher-Rørsted, Jonatan; Encina-Llamas, Gerard; Dau, Torsten; Liberman, M Charles; Wu, Pei-Zhe; Hjortkjær, Jens.

Hear Res ; 414: 108411, 2022 02.

Article in English | MEDLINE | ID: mdl-34929535

ABSTRACT

Healthy aging may be associated with neural degeneration in the cochlea even before clinical hearing loss emerges. Reduction in frequency-following responses (FFRs) to tonal carriers in older clinically normal-hearing listeners has previously been reported, and has been argued to reflect an age-dependent decline in temporal processing in the central auditory system. Alternatively, age-dependent loss of auditory nerve fibers (ANFs) may have little effect on audiometric sensitivity and yet compromise the precision of neural phase-locking relying on joint activity across populations of fibers. This peripheral loss may, in turn, contribute to reduced neural synchrony in the brainstem as reflected in the FFR. Here, we combined human electrophysiology and auditory nerve (AN) modeling to investigate whether age-related changes in the FFR would be consistent with peripheral neural degeneration. FFRs elicited by pure tones and frequency sweeps at carrier frequencies between 200 and 1200 Hz were obtained in older (ages 48-76) and younger (ages 20-30) listeners, both groups having clinically normal audiometric thresholds up to 6 kHz. The same stimuli were presented to a computational model of the AN in which age-related loss of hair cells or ANFs was modelled using human histopathological data. In the older human listeners, the measured FFRs to both sweeps and pure tones were found to be reduced across the carrier frequencies examined. These FFR reductions were consistent with model simulations of age-related ANF loss. In model simulations, the phase-locked response produced by the population of remaining fibers decreased proportionally with increasing loss of the ANFs. Basal-turn loss of inner hair cells also reduced synchronous activity at lower frequencies, albeit to a lesser degree. Model simulations of age-related threshold elevation further indicated that outer hair cell dysfunction had no negative effect on phase-locked AN responses. These results are consistent with a peripheral source of the FFR reductions observed in older normal-hearing listeners, and indicate that FFRs at lower carrier frequencies may potentially be a sensitive marker of peripheral neural degeneration.

Subject(s)

Cochlear Nerve , Hearing Loss, Sensorineural , Adult , Aged , Audiometry , Auditory Threshold/physiology , Hair Cells, Auditory, Inner , Hair Cells, Auditory, Outer , Humans , Middle Aged , Young Adult

6.

Identification and Discrimination of Sound Textures in Hearing-Impaired and Older Listeners.

Scheuregger, Oliver; Hjortkjær, Jens; Dau, Torsten.

Trends Hear ; 25: 23312165211065608, 2021.

Article in English | MEDLINE | ID: mdl-34939472

ABSTRACT

Sound textures are a broad class of sounds defined by their homogeneous temporal structure. It has been suggested that sound texture perception is mediated by time-averaged summary statistics measured from early stages of the auditory system. The ability of young normal-hearing (NH) listeners to identify synthetic sound textures increases as the statistics of the synthetic texture approach those of its real-world counterpart. In sound texture discrimination, young NH listeners utilize the fine temporal stimulus information for short-duration stimuli, whereas they switch to a time-averaged statistical representation as the stimulus' duration increases. The present study investigated how younger and older listeners with a sensorineural hearing impairment perform in the corresponding texture identification and discrimination tasks in which the stimuli were amplified to compensate for the individual listeners' loss of audibility. In both hearing impaired (HI) listeners and NH controls, sound texture identification performance increased as the number of statistics imposed during the synthesis stage increased, but hearing impairment was accompanied by a significant reduction in overall identification accuracy. Sound texture discrimination performance was measured across listener groups categorized by age and hearing loss. Sound texture discrimination performance was unaffected by hearing loss at all excerpt durations. The older listeners' sound texture and exemplar discrimination performance decreased for signals of short excerpt duration, with older HI listeners performing better than older NH listeners. The results suggest that the time-averaged statistic representations of sound textures provide listeners with cues which are robust to the effects of age and sensorineural hearing loss.

Subject(s)

Hearing Loss, Sensorineural , Hearing Loss , Speech Perception , Auditory Perception , Auditory Threshold , Hearing , Hearing Loss, Sensorineural/diagnosis , Humans , Sound

7.

Auditory stimulus-response modeling with a match-mismatch task.

de Cheveigné, Alain; Slaney, Malcolm; Fuglsang, Søren A; Hjortkjaer, Jens.

J Neural Eng ; 18(4)2021 05 04.

Article in English | MEDLINE | ID: mdl-33849003

ABSTRACT

Objective.An auditory stimulus can be related to the brain response that it evokes by a stimulus-response model fit to the data. This offers insight into perceptual processes within the brain and is also of potential use for devices such as brain computer interfaces (BCIs). The quality of the model can be quantified by measuring the fit with a regression problem, or by applying it to a classification task and measuring its performance.Approach.Here we focus on amatch-mismatch(MM) task that entails deciding whether a segment of brain signal matches, via a model, the auditory stimulus that evoked it.Main results. Using these metrics, we describe a range of models of increasing complexity that we compare to methods in the literature, showing state-of-the-art performance. We document in detail one particular implementation, calibrated on a publicly-available database, that can serve as a robust reference to evaluate future developments.Significance.The MM task allows stimulus-response models to be evaluated in the limit of very high model accuracy, making it an attractive alternative to the more commonly used task of auditory attention detection. The MM task does not require class labels, so it is immune to mislabeling, and it is applicable to data recorded in listening scenarios with only one sound source, thus it is cheap to obtain large quantities of training and testing data. Performance metrics from this task, associated with regression accuracy, provide complementary insights into the relation between stimulus and response, as well as information about discriminatory power directly applicable to BCI applications.

Subject(s)

Brain-Computer Interfaces , Electroencephalography , Attention , Auditory Perception , Brain

8.

Neural Measures of Pitch Processing in EEG Responses to Running Speech.

Bachmann, Florine L; MacDonald, Ewen N; Hjortkjær, Jens.

Front Neurosci ; 15: 738408, 2021.

Article in English | MEDLINE | ID: mdl-35002597

ABSTRACT

Linearized encoding models are increasingly employed to model cortical responses to running speech. Recent extensions to subcortical responses suggest clinical perspectives, potentially complementing auditory brainstem responses (ABRs) or frequency-following responses (FFRs) that are current clinical standards. However, while it is well-known that the auditory brainstem responds both to transient amplitude variations and the stimulus periodicity that gives rise to pitch, these features co-vary in running speech. Here, we discuss challenges in disentangling the features that drive the subcortical response to running speech. Cortical and subcortical electroencephalographic (EEG) responses to running speech from 19 normal-hearing listeners (12 female) were analyzed. Using forward regression models, we confirm that responses to the rectified broadband speech signal yield temporal response functions consistent with wave V of the ABR, as shown in previous work. Peak latency and amplitude of the speech-evoked brainstem response were correlated with standard click-evoked ABRs recorded at the vertex electrode (Cz). Similar responses could be obtained using the fundamental frequency (F0) of the speech signal as model predictor. However, simulations indicated that dissociating responses to temporal fine structure at the F0 from broadband amplitude variations is not possible given the high co-variance of the features and the poor signal-to-noise ratio (SNR) of subcortical EEG responses. In cortex, both simulations and data replicated previous findings indicating that envelope tracking on frontal electrodes can be dissociated from responses to slow variations in F0 (relative pitch). Yet, no association between subcortical F0-tracking and cortical responses to relative pitch could be detected. These results indicate that while subcortical speech responses are comparable to click-evoked ABRs, dissociating pitch-related processing in the auditory brainstem may be challenging with natural speech stimuli.

9.

Brain-informed speech separation (BISS) for enhancement of target speaker in multitalker speech perception.

Ceolini, Enea; Hjortkjær, Jens; Wong, Daniel D E; O'Sullivan, James; Raghavan, Vinay S; Herrero, Jose; Mehta, Ashesh D; Liu, Shih-Chii; Mesgarani, Nima.

Neuroimage ; 223: 117282, 2020 12.

Article in English | MEDLINE | ID: mdl-32828921

ABSTRACT

Hearing-impaired people often struggle to follow the speech stream of an individual talker in noisy environments. Recent studies show that the brain tracks attended speech and that the attended talker can be decoded from neural data on a single-trial level. This raises the possibility of "neuro-steered" hearing devices in which the brain-decoded intention of a hearing-impaired listener is used to enhance the voice of the attended speaker from a speech separation front-end. So far, methods that use this paradigm have focused on optimizing the brain decoding and the acoustic speech separation independently. In this work, we propose a novel framework called brain-informed speech separation (BISS)1 in which the information about the attended speech, as decoded from the subject's brain, is directly used to perform speech separation in the front-end. We present a deep learning model that uses neural data to extract the clean audio signal that a listener is attending to from a multi-talker speech mixture. We show that the framework can be applied successfully to the decoded output from either invasive intracranial electroencephalography (iEEG) or non-invasive electroencephalography (EEG) recordings from hearing-impaired subjects. It also results in improved speech separation, even in scenes with background noise. The generalization capability of the system renders it a perfect candidate for neuro-steered hearing-assistive devices.

Subject(s)

Brain/physiology , Electroencephalography , Signal Processing, Computer-Assisted , Speech Acoustics , Speech Perception/physiology , Acoustic Stimulation , Adult , Algorithms , Deep Learning , Hearing Loss/physiopathology , Humans , Middle Aged

10.

Effects of Sensorineural Hearing Loss on Cortical Synchronization to Competing Speech during Selective Attention.

Fuglsang, Søren A; Märcher-Rørsted, Jonatan; Dau, Torsten; Hjortkjær, Jens.

J Neurosci ; 40(12): 2562-2572, 2020 03 18.

Article in English | MEDLINE | ID: mdl-32094201

ABSTRACT

When selectively attending to a speech stream in multi-talker scenarios, low-frequency cortical activity is known to synchronize selectively to fluctuations in the attended speech signal. Older listeners with age-related sensorineural hearing loss (presbycusis) often struggle to understand speech in such situations, even when wearing a hearing aid. Yet, it is unclear whether a peripheral hearing loss degrades the attentional modulation of cortical speech tracking. Here, we used psychoacoustics and electroencephalography (EEG) in male and female human listeners to examine potential effects of hearing loss on EEG correlates of speech envelope synchronization in cortex. Behaviorally, older hearing-impaired (HI) listeners showed degraded speech-in-noise recognition and reduced temporal acuity compared with age-matched normal-hearing (NH) controls. During EEG recordings, we used a selective attention task with two spatially separated simultaneous speech streams where NH and HI listeners both showed high speech recognition performance. Low-frequency (<10 Hz) envelope-entrained EEG responses were enhanced in the HI listeners, both for the attended speech, but also for tone sequences modulated at slow rates (4 Hz) during passive listening. Compared with the attended speech, responses to the ignored stream were found to be reduced in both HI and NH listeners, allowing for the attended target to be classified from single-trial EEG data with similar high accuracy in the two groups. However, despite robust attention-modulated speech entrainment, the HI listeners rated the competing speech task to be more difficult. These results suggest that speech-in-noise problems experienced by older HI listeners are not necessarily associated with degraded attentional selection.SIGNIFICANCE STATEMENT People with age-related sensorineural hearing loss often struggle to follow speech in the presence of competing talkers. It is currently unclear whether hearing impairment may impair the ability to use selective attention to suppress distracting speech in situations when the distractor is well segregated from the target. Here, we report amplified envelope-entrained cortical EEG responses to attended speech and to simple tones modulated at speech rates (4 Hz) in listeners with age-related hearing loss. Critically, despite increased self-reported listening difficulties, cortical synchronization to speech mixtures was robustly modulated by selective attention in listeners with hearing loss. This allowed the attended talker to be classified from single-trial EEG responses with high accuracy in both older hearing-impaired listeners and age-matched normal-hearing controls.

Subject(s)

Attention/physiology , Cortical Synchronization , Hearing Loss, Sensorineural/physiopathology , Hearing Loss, Sensorineural/psychology , Acoustic Stimulation , Aged , Electroencephalography , Evoked Potentials, Auditory , Female , Humans , Male , Middle Aged , Psychoacoustics , Psychomotor Performance , Recognition, Psychology , Speech Perception

11.

Cortical oscillations and entrainment in speech processing during working memory load.

Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten.

Eur J Neurosci ; 51(5): 1279-1289, 2020 03.

Article in English | MEDLINE | ID: mdl-29392835

ABSTRACT

Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity (< 13 Hz). We found that increases in both types of WM load (background noise and n-back level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment.

Subject(s)

Auditory Cortex , Speech , Auditory Perception , Electroencephalography , Humans , Memory, Short-Term

12.

Perception of Musical Tension in Cochlear Implant Listeners.

Spangmose, Steffen; Hjortkjær, Jens; Marozeau, Jeremy.

Front Neurosci ; 13: 987, 2019.

Article in English | MEDLINE | ID: mdl-31680795

ABSTRACT

Despite the difficulties experienced by cochlear implant (CI) users in perceiving pitch and harmony, it is not uncommon to see CI users listening to music, or even playing an instrument. Listening to music is a complex process that relies not only on low-level percepts, such as pitch or timbre, but also on emotional reactions or the ability to perceive musical sequences as patterns of tension and release. CI users engaged in musical activities might experience some of these higher-level musical features. The goal of this study is to evaluate CI users' ability to perceive musical tension. Nine CI listeners (CIL) and nine normal-hearing listeners (NHL) were asked to rate musical tension on a continuous visual analog slider during music listening. The subjects listened to a 4 min recording of Mozart's Piano Sonata No. 4 (K282) performed by an experienced pianist. In addition to the original piece, four modified versions were also tested to identify which features might influence the responses to the music in the two groups. In each version, one musical feature of the piece was altered: tone pitch, intensity, rhythm, or tempo. Surprisingly, CIL and NHL rated overall musical tension in a very similar way in the original piece. However, the results from the different modifications revealed that while NHL ratings were strongly affected by music with random pitch tones (but preserved intensity and timing information), CIL ratings were not. Rating judgments of both groups were similarly affected by modifications of rhythm and tempo. Our study indicates that CI users can understand higher-level musical aspects as indexed by musical tension ratings. The results suggest that although most CI users have difficulties perceiving pitch, additional music cues, such as tempo and dynamics might contribute positively to their experience of music.

13.

Multiway canonical correlation analysis of brain data.

de Cheveigné, Alain; Di Liberto, Giovanni M; Arzounian, Dorothée; Wong, Daniel D E; Hjortkjær, Jens; Fuglsang, Søren; Parra, Lucas C.

Neuroimage ; 186: 728-740, 2019 02 01.

Article in English | MEDLINE | ID: mdl-30496819

ABSTRACT

Brain data recorded with electroencephalography (EEG), magnetoencephalography (MEG) and related techniques often have poor signal-to-noise ratios due to the presence of multiple competing sources and artifacts. A common remedy is to average responses over repeats of the same stimulus, but this is not applicable for temporally extended stimuli that are presented only once (speech, music, movies, natural sound). An alternative is to average responses over multiple subjects that were presented with identical stimuli, but differences in geometry of brain sources and sensors reduce the effectiveness of this solution. Multiway canonical correlation analysis (MCCA) brings a solution to this problem by allowing data from multiple subjects to be fused in such a way as to extract components common to all. This paper reviews the method, offers application examples that illustrate its effectiveness, and outlines the caveats and risks entailed by the method.

Subject(s)

Brain/physiology , Data Interpretation, Statistical , Electroencephalography/methods , Magnetoencephalography/methods , Models, Theoretical , Adult , Humans

14.

A Comparison of Regularization Methods in Forward and Backward Models for Auditory Attention Decoding.

Wong, Daniel D E; Fuglsang, Søren A; Hjortkjær, Jens; Ceolini, Enea; Slaney, Malcolm; de Cheveigné, Alain.

Front Neurosci ; 12: 531, 2018.

Article in English | MEDLINE | ID: mdl-30131670

ABSTRACT

The decoding of selective auditory attention from noninvasive electroencephalogram (EEG) data is of interest in brain computer interface and auditory perception research. The current state-of-the-art approaches for decoding the attentional selection of listeners are based on linear mappings between features of sound streams and EEG responses (forward model), or vice versa (backward model). It has been shown that when the envelope of attended speech and EEG responses are used to derive such mapping functions, the model estimates can be used to discriminate between attended and unattended talkers. However, the predictive/reconstructive performance of the models is dependent on how the model parameters are estimated. There exist a number of model estimation methods that have been published, along with a variety of datasets. It is currently unclear if any of these methods perform better than others, as they have not yet been compared side by side on a single standardized dataset in a controlled fashion. Here, we present a comparative study of the ability of different estimation methods to classify attended speakers from multi-channel EEG data. The performance of the model estimation methods is evaluated using different performance metrics on a set of labeled EEG data from 18 subjects listening to mixtures of two speech streams. We find that when forward models predict the EEG from the attended audio, regularized models do not improve regression or classification accuracies. When backward models decode the attended speech from the EEG, regularization provides higher regression and classification accuracies.

15.

Decoding the auditory brain with canonical component analysis.

de Cheveigné, Alain; Wong, Daniel D E; Di Liberto, Giovanni M; Hjortkjær, Jens; Slaney, Malcolm; Lalor, Edmund.

Neuroimage ; 172: 206-216, 2018 05 15.

Article in English | MEDLINE | ID: mdl-29378317

ABSTRACT

The relation between a stimulus and the evoked brain response can shed light on perceptual processes within the brain. Signals derived from this relation can also be harnessed to control external devices for Brain Computer Interface (BCI) applications. While the classic event-related potential (ERP) is appropriate for isolated stimuli, more sophisticated "decoding" strategies are needed to address continuous stimuli such as speech, music or environmental sounds. Here we describe an approach based on Canonical Correlation Analysis (CCA) that finds the optimal transform to apply to both the stimulus and the response to reveal correlations between the two. Compared to prior methods based on forward or backward models for stimulus-response mapping, CCA finds significantly higher correlation scores, thus providing increased sensitivity to relatively small effects, and supports classifier schemes that yield higher classification scores. CCA strips the brain response of variance unrelated to the stimulus, and the stimulus representation of variance that does not affect the response, and thus improves observations of the relation between stimulus and response.

Subject(s)

Brain Mapping/methods , Brain/physiology , Signal Processing, Computer-Assisted , Acoustic Stimulation , Electroencephalography/methods , Evoked Potentials, Auditory/physiology , Humans , Magnetoencephalography/methods

16.

Task-Modulated Cortical Representations of Natural Sound Source Categories.

Hjortkjær, Jens; Kassuba, Tanja; Madsen, Kristoffer H; Skov, Martin; Siebner, Hartwig R.

Cereb Cortex ; 28(1): 295-306, 2018 01 01.

Article in English | MEDLINE | ID: mdl-29069292

ABSTRACT

In everyday sound environments, we recognize sound sources and events by attending to relevant aspects of an acoustic input. Evidence about the cortical mechanisms involved in extracting relevant category information from natural sounds is, however, limited to speech. Here, we used functional MRI to measure cortical response patterns while human listeners categorized real-world sounds created by objects of different solid materials (glass, metal, wood) manipulated by different sound-producing actions (striking, rattling, dropping). In different sessions, subjects had to identify either material or action categories in the same sound stimuli. The sound-producing action and the material of the sound source could be decoded from multivoxel activity patterns in auditory cortex, including Heschl's gyrus and planum temporale. Importantly, decoding success depended on task relevance and category discriminability. Action categories were more accurately decoded in auditory cortex when subjects identified action information. Conversely, the material of the same sound sources was decoded with higher accuracy in the inferior frontal cortex during material identification. Representational similarity analyses indicated that both early and higher-order auditory cortex selectively enhanced spectrotemporal features relevant to the target category. Together, the results indicate a cortical selection mechanism that favors task-relevant information in the processing of nonvocal sound categories.

Subject(s)

Auditory Perception/physiology , Cerebral Cortex/physiology , Acoustic Stimulation/methods , Adult , Attention/physiology , Brain Mapping , Cerebral Cortex/diagnostic imaging , Cerebrovascular Circulation/physiology , Female , Humans , Magnetic Resonance Imaging , Male , Neuropsychological Tests , Oxygen/blood , Young Adult

17.

Subcortical and cortical correlates of pitch discrimination: Evidence for two levels of neuroplasticity in musicians.

Bianchi, Federica; Hjortkjær, Jens; Santurette, Sébastien; Zatorre, Robert J; Siebner, Hartwig R; Dau, Torsten.

Neuroimage ; 163: 398-412, 2017 12.

Article in English | MEDLINE | ID: mdl-28774646

ABSTRACT

Musicians are highly trained to discriminate fine pitch changes but the neural bases of this ability are poorly understood. It is unclear whether such training-dependent differences in pitch processing arise already in the subcortical auditory system or are linked to more central stages. To address this question, we combined psychoacoustic testing with functional MRI to measure cortical and subcortical responses in musicians and non-musicians during a pitch-discrimination task. First, we estimated behavioral pitch-discrimination thresholds for complex tones with harmonic components that were either resolved or unresolved in the auditory system. Musicians outperformed non-musicians, showing lower pitch-discrimination thresholds in both conditions. The same participants underwent task-related functional MRI, while they performed a similar pitch-discrimination task. To account for the between-group differences in pitch-discrimination, task difficulty was adjusted to each individual's pitch-discrimination ability. Relative to non-musicians, musicians showed increased neural responses to complex tones with either resolved or unresolved harmonics especially in right-hemispheric areas, comprising the right superior temporal gyrus, Heschl's gyrus, insular cortex, inferior frontal gyrus, and in the inferior colliculus. Both subcortical and cortical neural responses predicted the individual pitch-discrimination performance. However, functional activity in the inferior colliculus correlated with differences in pitch discrimination across all participants, but not within the musicians group alone. Only neural activity in the right auditory cortex scaled with the fine pitch-discrimination thresholds within the musicians. These findings suggest two levels of neuroplasticity in musicians, whereby training-dependent changes in pitch processing arise at the collicular level and are preserved and further enhanced in the right auditory cortex.

Subject(s)

Auditory Cortex/physiology , Pitch Discrimination/physiology , Adult , Evoked Potentials, Auditory/physiology , Female , Humans , Magnetic Resonance Imaging , Male , Neuronal Plasticity , Young Adult

18.

Noise-robust cortical tracking of attended speech in real-world acoustic scenes.

Fuglsang, Søren Asp; Dau, Torsten; Hjortkjær, Jens.

Neuroimage ; 156: 435-444, 2017 08 01.

Article in English | MEDLINE | ID: mdl-28412441

ABSTRACT

Selectively attending to one speaker in a multi-speaker scenario is thought to synchronize low-frequency cortical activity to the attended speech signal. In recent studies, reconstruction of speech from single-trial electroencephalogram (EEG) data has been used to decode which talker a listener is attending to in a two-talker situation. It is currently unclear how this generalizes to more complex sound environments. Behaviorally, speech perception is robust to the acoustic distortions that listeners typically encounter in everyday life, but it is unknown whether this is mirrored by a noise-robust neural tracking of attended speech. Here we used advanced acoustic simulations to recreate real-world acoustic scenes in the laboratory. In virtual acoustic realities with varying amounts of reverberation and number of interfering talkers, listeners selectively attended to the speech stream of a particular talker. Across the different listening environments, we found that the attended talker could be accurately decoded from single-trial EEG data irrespective of the different distortions in the acoustic input. For highly reverberant environments, speech envelopes reconstructed from neural responses to the distorted stimuli resembled the original clean signal more than the distorted input. With reverberant speech, we observed a late cortical response to the attended speech stream that encoded temporal modulations in the speech signal without its reverberant distortion. Single-trial attention decoding accuracies based on 40-50s long blocks of data from 64 scalp electrodes were equally high (80-90% correct) in all considered listening environments and remained statistically significant using down to 10 scalp electrodes and short (<30-s) unaveraged EEG segments. In contrast to the robust decoding of the attended talker we found that decoding of the unattended talker deteriorated with the acoustic distortions. These results suggest that cortical activity tracks an attended speech signal in a way that is invariant to acoustic distortions encountered in real-life sound environments. Noise-robust attention decoding additionally suggests a potential utility of stimulus reconstruction techniques in attention-controlled brain-computer interfaces.

Subject(s)

Attention/physiology , Auditory Cortex/physiology , Speech Perception/physiology , Acoustic Stimulation , Adult , Electroencephalography , Female , Humans , Male , Noise , Young Adult

19.

Spectral and temporal cues for perception of material and action categories in impacted sound sources.

Hjortkjær, Jens; McAdams, Stephen.

J Acoust Soc Am ; 140(1): 409, 2016 07.

Article in English | MEDLINE | ID: mdl-27475165

ABSTRACT

In two experiments, similarity ratings and categorization performance with recorded impact sounds representing three material categories (wood, metal, glass) being manipulated by three different categories of action (drop, strike, rattle) were examined. Previous research focusing on single impact sounds suggests that temporal cues related to damping are essential for material discrimination, but spectral cues are potentially more efficient for discriminating materials manipulated by different actions that include multiple impacts (e.g., dropping, rattling). Perceived similarity between material categories across different actions was correlated with the distribution of long-term spectral energy (spectral centroid). Similarity between action categories was described by the temporal distribution of envelope energy (temporal centroid) or by the density of impacts. Moreover, perceptual similarity correlated with the pattern of confusion in categorization judgments. Listeners tended to confuse materials with similar spectral centroids, and actions with similar temporal centroids and onset densities. To confirm the influence of these different features, spectral cues were removed by applying the envelopes of the original sounds to a broadband noise carrier. Without spectral cues, listeners retained sensitivity to action categories but not to material categories. Conversely, listeners recognized material but not action categories after envelope scrambling that preserved long-term spectral content.

20.

Impact of Background Noise and Sentence Complexity on Processing Demands during Sentence Comprehension.

Wendt, Dorothea; Dau, Torsten; Hjortkjær, Jens.

Front Psychol ; 7: 345, 2016.

Article in English | MEDLINE | ID: mdl-27014152

ABSTRACT

Speech comprehension in adverse listening conditions can be effortful even when speech is fully intelligible. Acoustical distortions typically make speech comprehension more effortful, but effort also depends on linguistic aspects of the speech signal, such as its syntactic complexity. In the present study, pupil dilations, and subjective effort ratings were recorded in 20 normal-hearing participants while performing a sentence comprehension task. The sentences were either syntactically simple (subject-first sentence structure) or complex (object-first sentence structure) and were presented in two levels of background noise both corresponding to high intelligibility. A digit span and a reading span test were used to assess individual differences in the participants' working memory capacity (WMC). The results showed that the subjectively rated effort was mostly affected by the noise level and less by syntactic complexity. Conversely, pupil dilations increased with syntactic complexity but only showed a small effect of the noise level. Participants with higher WMC showed increased pupil responses in the higher-level noise condition but rated sentence comprehension as being less effortful compared to participants with lower WMC. Overall, the results demonstrate that pupil dilations and subjectively rated effort represent different aspects of effort. Furthermore, the results indicate that effort can vary in situations with high speech intelligibility.

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL