Results 1 - 20 of 84
1.
Brain Sci ; 14(3)2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38539585

ABSTRACT

Brain-Computer Interfaces (BCIs) aim to establish a pathway between the brain and an external device without the involvement of the motor system, relying exclusively on neural signals. Such systems have the potential to provide a means of communication for patients who have lost the ability to speak due to a neurological disorder. Traditional methodologies for decoding imagined speech directly from brain signals often deploy static classifiers, that is, decoders that are computed once at the beginning of the experiment and remain unchanged throughout BCI use. However, this approach might be inadequate to effectively handle the non-stationary nature of electroencephalography (EEG) signals and the learning that accompanies BCI use, as the optimal parameters are expected to change over time, all the more so in a real-time setting. To address this limitation, we developed an adaptive classifier that updates its parameters based on the incoming data in real time. We first identified optimal parameters (the update coefficient, UC) to be used in an adaptive Linear Discriminant Analysis (LDA) classifier, using a previously recorded EEG dataset acquired while healthy participants controlled a binary BCI based on imagined syllable decoding. We subsequently tested the effectiveness of this optimization in a real-time BCI control setting. Twenty healthy participants performed two BCI control sessions based on the imagery of two syllables, using a static LDA and an adaptive LDA classifier in randomized order. As hypothesized, the adaptive classifier led to better performance than the static one in this real-time BCI control task. Furthermore, the optimal parameters for the adaptive classifier were closely aligned in both datasets, acquired using the same syllable imagery task. These findings highlight the effectiveness and reliability of adaptive LDA classifiers for real-time imagined speech decoding. Such an improvement can shorten training time and favor the development of multi-class BCIs, which is of clear interest for non-invasive systems, typically characterized by low decoding accuracy.
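For readers unfamiliar with adaptive LDA, the sketch below illustrates how a binary LDA decoder can blend each new labelled trial into its class means and shared covariance using an update coefficient (UC). The class structure, variable names and exact update rule are illustrative assumptions made for this summary, not the authors' implementation.

    import numpy as np

    class AdaptiveLDA:
        """Binary LDA whose class means and shared covariance are updated
        online with an update coefficient (UC). Illustrative sketch only;
        not the implementation used in the study."""

        def __init__(self, n_features, uc=0.05):
            self.uc = uc                            # UC = 0 reduces to a static LDA
            self.means = np.zeros((2, n_features))
            self.cov = np.eye(n_features)

        def fit(self, X, y):
            # Static initial estimate from calibration data
            for c in (0, 1):
                self.means[c] = X[y == c].mean(axis=0)
            self.cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            return self

        def predict(self, x):
            w = np.linalg.solve(self.cov, self.means[1] - self.means[0])
            b = -0.5 * w @ (self.means[0] + self.means[1])
            return int(w @ x + b > 0)

        def update(self, x, label):
            # Online adaptation: blend the new trial into mean and covariance
            self.means[label] = (1 - self.uc) * self.means[label] + self.uc * x
            d = (x - self.means[label])[:, None]
            self.cov = (1 - self.uc) * self.cov + self.uc * (d @ d.T)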

2.
bioRxiv ; 2024 Jan 21.
Article in English | MEDLINE | ID: mdl-37961305

ABSTRACT

Traditional models of speech perception posit that neural activity encodes speech through a hierarchy of cognitive processes, from low-level representations of acoustic and phonetic features to high-level semantic encoding. Yet it remains unknown how neural representations are transformed across levels of the speech hierarchy. Here, we analyzed unique microelectrode array recordings of neuronal spiking activity from the human left anterior superior temporal gyrus, a brain region at the interface between phonetic and semantic speech processing, during a semantic categorization task and natural speech perception. We identified distinct neural manifolds for semantic and phonetic features, with a functional separation of the corresponding low-dimensional trajectories. Moreover, phonetic and semantic representations were encoded concurrently and reflected in power increases in the beta and low-gamma local field potentials, suggesting top-down predictive and bottom-up cumulative processes. Our results are the first to demonstrate mechanisms for hierarchical speech transformations that are specific to neuronal population dynamics.

3.
PLoS Comput Biol ; 19(11): e1011595, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37934766

ABSTRACT

Natural speech perception requires processing the ongoing acoustic input while keeping in mind the preceding one and predicting the next. This complex computational problem could be handled by a dynamic multi-timescale hierarchical inferential process that coordinates the information flow up and down the language network hierarchy. Using a predictive coding computational model (Precoss-β) that identifies individual syllables online from continuous speech, we address the advantage of rhythmically modulating up- and down-going information flows, and whether beta oscillations could be optimal for this. In the model, and consistent with experimental data, theta and low-gamma neural frequency scales ensure syllable tracking and phoneme-level speech encoding, respectively, while the beta rhythm is associated with inferential processes. We show that a rhythmic alternation of bottom-up and top-down processing regimes improves syllable recognition, and that optimal efficacy is reached when the alternation of bottom-up and top-down regimes, via oscillating prediction error precisions, is in the beta range (around 20-30 Hz). These results not only demonstrate the advantage of a rhythmic alternation of up- and down-going information flows, but also show that the low-beta range is optimal given sensory analysis at theta and low-gamma scales. While specific to speech processing, the notion of alternating bottom-up and top-down processes with frequency multiplexing might generalize to other cognitive architectures.
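As a rough illustration of the core idea only (not the Precoss-β model itself), the toy sketch below lets prediction-error precision oscillate at a beta frequency, so that updating alternates between a sensory-driven (bottom-up) regime and a prior-driven (top-down) regime; every parameter here is an assumption chosen for illustration.

    import numpy as np

    def beta_gated_inference(obs, prior, f_beta=25.0, fs=1000.0, gain=0.2):
        """Toy estimator in which prediction-error precision oscillates at a
        beta frequency, alternating bottom-up (sensory) and top-down (prior)
        updating. Illustrative only; not the Precoss-β model."""
        t = np.arange(len(obs)) / fs
        precision = 0.5 * (1 + np.sin(2 * np.pi * f_beta * t))   # 0..1 at ~25 Hz
        x = float(prior)
        estimates = np.empty(len(obs))
        for i, y in enumerate(obs):
            bottom_up = precision[i] * (y - x)           # precision-weighted sensory error
            top_down = (1 - precision[i]) * (prior - x)  # pull back toward the prior
            x += gain * (bottom_up + top_down)
            estimates[i] = x
        return estimates

    # Example: noisy input around 1.0 with a prior of 0.0
    est = beta_gated_inference(1.0 + 0.3 * np.random.randn(1000), prior=0.0)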


Subject(s)
Speech Perception , Speech , Beta Rhythm , Language , Recognition, Psychology
4.
J Neurosci ; 43(40): 6779-6795, 2023 10 04.
Article in English | MEDLINE | ID: mdl-37607822

ABSTRACT

Communication difficulties are one of the core criteria in diagnosing autism spectrum disorder (ASD), and are often characterized by speech reception difficulties, whose biological underpinnings are not yet identified. This deficit could denote atypical neuronal ensemble activity, as reflected by neural oscillations. Atypical cross-frequency oscillation coupling, in particular, could disrupt the joint tracking and prediction of dynamic acoustic stimuli, a dual process that is essential for speech comprehension. Whether such oscillatory anomalies already exist in very young children with ASD, and with what specificity they relate to individual language reception capacity, is unknown. We collected neural activity data using electroencephalography (EEG) in 64 very young children with and without ASD (mean age 3; 17 females, 47 males) while they were exposed to naturalistic-continuous speech. EEG power in frequency bands typically associated with phrase-level chunking (δ, 1-3 Hz), phonemic encoding (low-γ, 25-35 Hz), and top-down control (β, 12-20 Hz) was markedly reduced in ASD relative to typically developing (TD) children. Speech neural tracking by δ and θ (4-8 Hz) oscillations was also weaker in ASD compared with TD children. After controlling for gaze-pattern differences, we found that the classical θ/γ coupling was replaced by an atypical β/γ coupling in children with ASD. This anomaly was the single most specific predictor of individual speech reception difficulties in ASD children. These findings suggest that early interventions (e.g., neurostimulation) targeting the disruption of β/γ coupling and the upregulation of θ/γ coupling could improve speech processing coordination in young children with ASD and help them engage in oral interactions.

SIGNIFICANCE STATEMENT: Very young children already present marked alterations of neural oscillatory activity in response to natural speech at the time of autism spectrum disorder (ASD) diagnosis. Hierarchical processing of phonemic-range and syllabic-range information (θ/γ coupling) is disrupted in ASD children. Abnormal bottom-up (low-γ) and top-down (low-β) coordination specifically predicts speech reception deficits in very young children with ASD, but not other cognitive deficits.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Male , Female , Humans , Child , Child, Preschool , Speech/physiology , Autism Spectrum Disorder/diagnosis , Electroencephalography , Acoustic Stimulation
5.
Cell Rep Med ; 4(7): 101115, 2023 07 18.
Article in English | MEDLINE | ID: mdl-37467714

ABSTRACT

Tang et al.1 report a noninvasive brain-computer interface (BCI) that reconstructs perceived and intended continuous language from semantic brain responses. The study offers new possibilities to radically facilitate neural speech decoder applications and addresses concerns about misuse in non-medical scenarios.


Subject(s)
Brain-Computer Interfaces , Reading , Brain , Language
6.
PLoS Biol ; 21(3): e3002046, 2023 03.
Article in English | MEDLINE | ID: mdl-36947552

ABSTRACT

Understanding speech requires mapping fleeting and often ambiguous soundwaves to meaning. While humans are known to exploit their capacity to contextualize to facilitate this process, how internal knowledge is deployed online remains an open question. Here, we present a model that extracts multiple levels of information from continuous speech online. The model applies linguistic and nonlinguistic knowledge to speech processing, by periodically generating top-down predictions and incorporating bottom-up incoming evidence in a nested temporal hierarchy. We show that a nonlinguistic context level provides semantic predictions informed by sensory inputs, which are crucial for disambiguating among multiple meanings of the same word. The explicit knowledge hierarchy of the model enables a more holistic account of the neurophysiological responses to speech compared to using lexical predictions generated by a neural network language model (GPT-2). We also show that hierarchical predictions reduce peripheral processing via minimizing uncertainty and prediction error. With this proof-of-concept model, we demonstrate that the deployment of hierarchical predictions is a possible strategy for the brain to dynamically utilize structured knowledge and make sense of the speech input.


Subject(s)
Comprehension , Speech Perception , Humans , Comprehension/physiology , Speech , Speech Perception/physiology , Brain/physiology , Language
7.
Nat Commun ; 13(1): 48, 2022 01 10.
Article in English | MEDLINE | ID: mdl-35013268

ABSTRACT

Reconstructing intended speech from neural activity using brain-computer interfaces holds great promise for people with severe speech production deficits. While decoding overt speech has progressed, decoding imagined speech has met limited success, mainly because the associated neural signals are weak and variable compared to overt speech, and hence difficult to decode by learning algorithms. We obtained three electrocorticography datasets from 13 patients, with electrodes implanted for epilepsy evaluation, who performed overt and imagined speech production tasks. Based on recent theories of speech neural processing, we extracted consistent and specific neural features usable for future brain-computer interfaces, and assessed their performance to discriminate speech items in articulatory, phonetic, and vocalic representation spaces. While high-frequency activity provided the best signal for overt speech, both low- and higher-frequency power and local cross-frequency dynamics contributed to imagined speech decoding, in particular in the phonetic and vocalic, i.e. perceptual, spaces. These findings show that low-frequency power and cross-frequency dynamics contain key information for imagined speech decoding.


Subject(s)
Brain-Computer Interfaces , Electrocorticography , Language , Speech , Adult , Brain/diagnostic imaging , Brain Mapping , Electrodes , Female , Humans , Imagination , Male , Middle Aged , Phonetics , Young Adult
8.
Neuroimage ; 231: 117864, 2021 05 01.
Article in English | MEDLINE | ID: mdl-33592241

ABSTRACT

Both electroencephalography (EEG) and functional Magnetic Resonance Imaging (fMRI) are non-invasive methods that show complementary aspects of human brain activity. Although they measure different proxies of brain activity, the blood-oxygenation signal (fMRI) and neurophysiological recordings (EEG) are indirectly coupled. Both the electrophysiological and BOLD signals can map the underlying functional connectivity structure at the whole-brain scale at different timescales. Previous work demonstrated a moderate but significant correlation between the resting-state functional connectivity of the two modalities; however, simultaneous EEG-fMRI can be measured with a wide range of technical setups, and the reliability of those measures across setups remains unknown. This is true notably with respect to different magnetic field strengths (low and high field) and different spatial sampling of EEG (medium- to high-density electrode coverage). Here, we investigated the reproducibility of the bimodal EEG-fMRI functional connectome in the most comprehensive resting-state simultaneous EEG-fMRI dataset compiled to date, including a total of 72 subjects from four different imaging centers. Data were acquired from 1.5T, 3T and 7T scanners with simultaneously recorded EEG using 64 or 256 electrodes. We demonstrate that the whole-brain monomodal connectivity reproducibly correlates across different datasets and that a moderate crossmodal correlation between EEG and fMRI connectivity of r ≈ 0.3 can be reproducibly extracted in low- and high-field scanners. The crossmodal correlation was strongest in the EEG-β frequency band but was present across all frequency bands. Both homotopic and within-intrinsic-connectivity-network (ICN) connections contributed the most to the crossmodal relationship. This study confirms, using a considerably diverse range of recording setups, that simultaneous EEG-fMRI offers a consistent estimate of multimodal functional connectomes in healthy subjects that are dominantly linked through a functional core of ICNs spanning the different timescales measured by EEG and fMRI. This opens new avenues for estimating the dynamics of brain function and provides a better understanding of interactions between EEG and fMRI measures. This observed level of reproducibility also defines a baseline for the study of alterations of this coupling in pathological conditions and their role as potential clinical markers.
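As a minimal sketch of how such a crossmodal correlation can be quantified, the snippet below correlates the unique region pairs of an EEG band-limited connectivity matrix with those of an fMRI connectivity matrix, assuming both use the same parcellation; it is not the authors' pipeline.

    import numpy as np
    from scipy.stats import pearsonr

    def crossmodal_connectome_correlation(fc_eeg, fc_fmri):
        """Correlate the upper triangles of two region-by-region functional
        connectivity matrices (e.g. EEG beta-band FC vs fMRI FC).
        Illustrative sketch; not the published analysis pipeline."""
        assert fc_eeg.shape == fc_fmri.shape
        iu = np.triu_indices_from(fc_eeg, k=1)      # unique region pairs
        return pearsonr(fc_eeg[iu], fc_fmri[iu])

    # Example with random symmetric matrices standing in for real connectomes
    n = 68
    a = np.random.rand(n, n); fc_eeg = (a + a.T) / 2
    b = 0.3 * fc_eeg + 0.7 * np.random.rand(n, n); fc_fmri = (b + b.T) / 2
    r, p = crossmodal_connectome_correlation(fc_eeg, fc_fmri)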


Subject(s)
Brain/diagnostic imaging , Connectome/standards , Databases, Factual/standards , Electroencephalography/standards , Magnetic Resonance Imaging/standards , Nerve Net/diagnostic imaging , Adolescent , Adult , Brain/physiology , Connectome/methods , Electroencephalography/methods , Female , Humans , Magnetic Resonance Imaging/methods , Male , Middle Aged , Nerve Net/physiology , Reproducibility of Results , Young Adult
9.
Reprod Biomed Soc Online ; 11: 89-95, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33336088

ABSTRACT

Human choice and interventions that could seem to threaten the course of 'nature' or 'chance' are at the heart of controversies over assisted reproductive technology across Western countries. These debates focus predominantly on so-called 'selective reproductive technology'. While today the technique of in-vitro fertilization (IVF) raises few political and bioethical debates in France and other Western countries, concerns remain that human intervention might replace 'natural' processes, threatening human procreation. These polemics focus on situations that require a decision, notably embryo selection and the fate of spare frozen embryos. The choices involved are induced by the technology and organized by the law. In the French legal system, IVF patients and professionals have the opportunity and, to a certain extent, the responsibility to decide on the status of in-vitro embryos. This article shows that, in these situations, both IVF patients and professionals invoke outside agencies ('instances tierces'), both to avoid making decisions and to recover a world order in which procreation is not entirely subject to human decision. In short, there is a need to feel that procreation is not entirely dependent on human intervention; that individuals do not decide everything. It appears that the choices that are made, their nature and the type of outside agency that is invoked are highly situated.

10.
Sci Adv ; 6(45)2020 11.
Article in English | MEDLINE | ID: mdl-33148648

ABSTRACT

When we see our interlocutor, our brain seamlessly extracts visual cues from their face and processes them along with the sound of their voice, making speech an intrinsically multimodal signal. Visual cues are especially important in noisy environments, when the auditory signal is less reliable. Neuronal oscillations might be involved in the cortical processing of audiovisual speech by selecting which sensory channel contributes more to perception. To test this, we designed computer-generated naturalistic audiovisual speech stimuli where one mismatched phoneme-viseme pair in a key word of sentences created bistable perception. Neurophysiological recordings (high-density scalp and intracranial electroencephalography) revealed that the precise phase angle of theta-band oscillations in posterior temporal and occipital cortex of the right hemisphere was crucial to select whether the auditory or the visual speech cue drove perception. We demonstrate that the phase of cortical oscillations acts as an instrument for sensory selection in audiovisual speech processing.


Subject(s)
Speech Perception , Speech , Acoustic Stimulation , Cues , Speech/physiology , Speech Perception/physiology , Visual Perception/physiology
11.
Sci Rep ; 10(1): 18009, 2020 10 22.
Article in English | MEDLINE | ID: mdl-33093570

ABSTRACT

In face-to-face communication, audio-visual (AV) stimuli can be fused, combined or perceived as mismatching. While the left superior temporal sulcus (STS) is presumably the locus of AV integration, the process leading to combination is unknown. Based on previous modelling work, we hypothesize that combination results from a complex dynamic originating in a failure to integrate AV inputs, followed by a reconstruction of the most plausible AV sequence. In two different behavioural tasks and one MEG experiment, we observed that combination is more time-demanding than fusion. Using time- and source-resolved human MEG analyses with linear and dynamic causal models, we show that both fusion and combination involve early detection of AV incongruence in the STS, whereas combination is further associated with enhanced activity of AV asynchrony-sensitive regions (auditory and inferior frontal cortices). Based on neural signal decoding, we finally show that only combination can be decoded from inferior frontal gyrus (IFG) activity, and that combination is decoded later than fusion in the STS. These results indicate that the outcome of AV speech integration primarily depends on whether or not the STS converges onto an existing multimodal syllable representation, and that combination results from subsequent temporal processing, presumably the off-line re-ordering of incongruent AV stimuli.

12.
J Neural Eng ; 17(5): 056028, 2020 10 15.
Article in English | MEDLINE | ID: mdl-33055383

ABSTRACT

OBJECTIVE: A current challenge of neurotechnologies is to develop speech brain-computer interfaces aiming at restoring communication in people unable to speak. To achieve a proof of concept of such a system, neural activity of patients implanted for clinical reasons can be recorded while they speak. Using such simultaneously recorded audio and neural data, decoders can be built to predict speech features using features extracted from brain signals. A typical neural feature is the spectral power of field potentials in the high-gamma frequency band, which happens to overlap the frequency range of speech acoustic signals, especially the fundamental frequency of the voice. Here, we analyzed human electrocorticographic and intracortical recordings during speech production and perception as well as a rat microelectrocorticographic recording during sound perception. We observed that several datasets, recorded with different recording setups, contained spectrotemporal features highly correlated with those of the sound produced by or delivered to the participants, especially within the high-gamma band and above, strongly suggesting a contamination of electrophysiological recordings by the sound signal. This study investigated the presence of acoustic contamination and its possible source. APPROACH: We developed analysis methods and a statistical criterion to objectively assess the presence or absence of contamination-specific correlations, which we used to screen several datasets from five centers worldwide. MAIN RESULTS: Not all, but several datasets, recorded in a variety of conditions, showed significant evidence of acoustic contamination. Three out of five centers were concerned by the phenomenon. In a recording showing high contamination, the use of high-gamma band features dramatically improved the performance of linear decoding of acoustic speech features, while such improvement was very limited for another recording showing no significant contamination. Further analysis and in vitro replication suggest that the contamination is caused by the mechanical action of the sound waves on the cables and connectors along the recording chain, transforming sound vibrations into an undesired electrical noise that affects the biopotential measurements. SIGNIFICANCE: Although this study does not per se question the presence of speech-relevant physiological information in the high-gamma range and above (multiunit activity), it alerts to the fact that acoustic contamination of neural signals should be checked for and eliminated before investigating the cortical dynamics of these processes. To this end, we make available a toolbox implementing the proposed statistical approach to quickly assess the extent of contamination in an electrophysiological recording (https://doi.org/10.5281/zenodo.3929296).
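The published toolbox linked above implements the actual statistical criterion; the sketch below only conveys the basic screening idea, correlating the spectral power time courses of the audio and a neural channel frequency by frequency, under the assumption that both signals are aligned and sampled at the same rate.

    import numpy as np
    from scipy.signal import spectrogram

    def contamination_profile(audio, neural, fs, nperseg=256):
        """Correlate the time course of spectral power in the audio signal with
        that of one neural channel, frequency by frequency. High correlations in
        the speech / high-gamma range suggest acoustic contamination.
        Rough sketch of the idea; not the published toolbox (see Zenodo link)."""
        f, _, s_audio = spectrogram(audio, fs, nperseg=nperseg)
        _, _, s_neural = spectrogram(neural, fs, nperseg=nperseg)
        corr = np.array([np.corrcoef(s_audio[i], s_neural[i])[0, 1]
                         for i in range(len(f))])
        return f, corr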


Subject(s)
Speech Perception , Speech , Acoustic Stimulation , Acoustics , Animals , Brain , Humans , Noise , Rats
13.
PLoS Biol ; 18(9): e3000833, 2020 09.
Article in English | MEDLINE | ID: mdl-32898188

ABSTRACT

The phonological deficit in dyslexia is associated with altered low-gamma oscillatory function in left auditory cortex, but a causal relationship between oscillatory function and phonemic processing has never been established. After confirming a deficit at 30 Hz with electroencephalography (EEG), we applied 20 minutes of transcranial alternating current stimulation (tACS) to transiently restore this activity in adults with dyslexia. The intervention significantly improved phonological processing and reading accuracy as measured immediately after tACS. The effect occurred selectively for a 30-Hz stimulation in the dyslexia group. Importantly, we observed that the focal intervention over the left auditory cortex also decreased 30-Hz activity in the right superior temporal cortex, resulting in reinstating a left dominance for the oscillatory response. These findings establish a causal role of neural oscillations in phonological processing and offer solid neurophysiological grounds for a potential correction of low-gamma anomalies and for alleviating the phonological deficit in dyslexia.


Subject(s)
Dyslexia/therapy , Reading , Speech Perception , Adolescent , Adult , Auditory Cortex/physiopathology , Auditory Cortex/radiation effects , Dyslexia/physiopathology , Electroencephalography , Evoked Potentials, Auditory/physiology , Evoked Potentials, Auditory/radiation effects , Female , Humans , Male , Middle Aged , Phonetics , Speech Perception/physiology , Speech Perception/radiation effects , Transcranial Direct Current Stimulation/methods , Verbal Behavior/physiology , Verbal Behavior/radiation effects , Young Adult
14.
Sci Rep ; 10(1): 15540, 2020 09 23.
Article in English | MEDLINE | ID: mdl-32968127

ABSTRACT

Natural speech is processed in the brain as a mixture of auditory and visual features. An example of the importance of visual speech is the McGurk effect and related perceptual illusions that result from mismatching auditory and visual syllables. Although the McGurk effect has widely been applied to the exploration of audio-visual speech processing, it relies on isolated syllables, which severely limits the conclusions that can be drawn from the paradigm. In addition, the extreme variability and inconsistent quality of the stimuli usually employed prevent comparability across studies. To overcome these limitations, we present an innovative methodology using 3D virtual characters with realistic lip movements synchronized with computer-synthesized speech. We used commercially accessible and affordable tools to facilitate reproducibility and comparability, and the set-up was validated on 24 participants performing a perception task. Within complete and meaningful French sentences, we paired a labiodental fricative viseme (i.e. /v/) with a bilabial occlusive phoneme (i.e. /b/). This audiovisual mismatch is known to induce the illusion of hearing /v/ in a proportion of trials. We tested the rate of the illusion while varying the magnitude of background noise and audiovisual lag. Overall, the effect was observed in 40% of trials. The proportion rose to about 50% with added background noise and up to 66% when controlling for phonetic features. Our results conclusively demonstrate that computer-generated speech stimuli are a judicious choice, and that they can supplement natural speech with higher control over stimulus timing and content.


Subject(s)
Speech Perception , Speech , Visual Perception , Adult , Auditory Perception , Female , Humans , Male , Semantics , Young Adult
15.
Netw Neurosci ; 4(3): 658-677, 2020.
Article in English | MEDLINE | ID: mdl-32885120

ABSTRACT

Concurrent electroencephalography (EEG) and functional magnetic resonance imaging (fMRI) bridge brain connectivity across timescales. During concurrent EEG-fMRI resting-state recordings, whole-brain functional connectivity (FC) strength is spatially correlated across modalities. However, cross-modal investigations have commonly remained correlational, and joint analysis of EEG-fMRI connectivity is largely unexplored. Here we investigated whether there exist (spatially) independent FC networks linked between modalities. We applied the recently proposed hybrid connectivity independent component analysis (connICA) framework to two concurrent EEG-fMRI resting-state datasets (40 subjects in total). Two robust components were found across both datasets. The first component has a uniformly distributed EEG frequency fingerprint linked mainly to intrinsic connectivity networks (ICNs) in both modalities. Conversely, the second component is sensitive to different EEG frequencies and is primarily linked to intra-ICN connectivity in fMRI but to inter-ICN connectivity in EEG. The first hybrid component suggests that connectivity dynamics within well-known ICNs span timescales, from the millisecond range in all canonical frequencies of EEG FC to the second range of fMRI FC. Conversely, the second component additionally exposes linked but spatially divergent neuronal processing at the two timescales. This work reveals the existence of joint spatially independent components, suggesting that parts of resting-state connectivity are co-expressed in a linked manner across EEG and fMRI over individuals.
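A schematic sketch of the connICA idea is given below: each subject's fMRI and EEG connectivity matrices are vectorized, concatenated into one joint feature vector, and decomposed with ICA across subjects. It assumes a common parcellation for both modalities and uses scikit-learn's FastICA rather than the authors' implementation.

    import numpy as np
    from sklearn.decomposition import FastICA

    def hybrid_connica(fc_fmri, fc_eeg, n_components=5):
        """connICA-style sketch: concatenate vectorized fMRI and EEG functional
        connectivity per subject and extract joint independent components.
        fc_fmri, fc_eeg: arrays of shape (n_subjects, n_regions, n_regions).
        Illustrative only; not the published implementation."""
        iu = np.triu_indices(fc_fmri.shape[1], k=1)
        X = np.hstack([fc[:, iu[0], iu[1]] for fc in (fc_fmri, fc_eeg)])  # subjects x edges
        ica = FastICA(n_components=n_components, random_state=0, max_iter=1000)
        subject_weights = ica.fit_transform(X)   # expression of each component per subject
        components = ica.components_             # joint fMRI+EEG edge patterns
        return subject_weights, components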

16.
Neuroimage ; 219: 116998, 2020 10 01.
Article in English | MEDLINE | ID: mdl-32480035

ABSTRACT

Long-range connectivity has become the most studied feature of human functional Magnetic Resonance Imaging (fMRI), yet the spatial and temporal relationship between its whole-brain dynamics and electrophysiological connectivity remains largely unknown. fMRI-derived functional connectivity exhibits spatial reconfigurations, or time-varying dynamics, at infraslow (<0.1 Hz) speeds. Conversely, electrophysiological connectivity is based on cross-region coupling of fast oscillations (~1-100 Hz). It is unclear whether such fast oscillation-based coupling varies at infraslow speeds, temporally coinciding with infraslow dynamics across the fMRI-based connectome. If so, does the association of fMRI-derived and electrophysiological dynamics spatially vary over the connectome across the functionally distinct electrophysiological oscillation bands? In two concurrent electroencephalography (EEG)-fMRI resting-state datasets, oscillation-based coherence in all canonical bands (delta through gamma) indeed reconfigured at infraslow speeds in tandem with fMRI-derived connectivity changes in corresponding region pairs. Interestingly, irrespective of EEG frequency band, the cross-modal tie of connectivity dynamics comprised a large proportion of connections distributed across the entire connectome. However, there were frequency-specific differences in the relative strength of the cross-modal association. This association was strongest in visual-to-somatomotor connections for slower EEG bands, and in connections involving the Default Mode Network for faster EEG bands. Methodologically, the findings imply that neural connectivity dynamics can be reliably measured by fMRI despite heavy susceptibility to noise, and by EEG despite the shortcomings of source reconstruction. Biologically, the findings provide evidence that, in contrast with the known territories of oscillation power, oscillation coupling in all bands slowly reconfigures in a highly distributed manner across the whole-brain connectome.


Subject(s)
Brain/physiology , Connectome/methods , Electroencephalography/methods , Magnetic Resonance Imaging/methods , Nerve Net/physiology , Adolescent , Adult , Brain/diagnostic imaging , Female , Humans , Male , Nerve Net/diagnostic imaging , Young Adult
17.
Nat Commun ; 11(1): 3117, 2020 06 19.
Article in English | MEDLINE | ID: mdl-32561726

ABSTRACT

On-line comprehension of natural speech requires segmenting the acoustic stream into discrete linguistic elements. This process is argued to rely on theta-gamma oscillation coupling, which can parse syllables and encode them in decipherable neural activity. Speech comprehension also strongly depends on contextual cues that help predict speech structure and content. To explore the effects of theta-gamma coupling on bottom-up/top-down dynamics during on-line syllable identification, we designed a computational model (Precoss: predictive coding and oscillations for speech) that can recognise syllable sequences in continuous speech. The model uses predictions from internal spectro-temporal representations of syllables and theta oscillations to signal syllable onsets and duration. Syllable recognition is best when theta-gamma coupling is used to temporally align spectro-temporal predictions with the acoustic input. This neurocomputational modelling work demonstrates that the notions of predictive coding and neural oscillations can be brought together to account for on-line dynamic sensory processing.


Subject(s)
Auditory Cortex/physiology , Gamma Rhythm/physiology , Models, Neurological , Speech Perception/physiology , Theta Rhythm/physiology , Acoustic Stimulation , Comprehension/physiology , Computer Simulation , Cues , Humans , Phonetics
18.
Sci Rep ; 10(1): 7637, 2020 05 06.
Article in English | MEDLINE | ID: mdl-32376909

ABSTRACT

The traditional approach in neuroscience relies on encoding models in which brain responses are related to different stimuli in order to establish dependencies. In decoding tasks, on the contrary, brain responses are used to predict the stimuli, and traditionally, the signals are assumed to be stationary within trials, which is rarely the case for natural stimuli. We hypothesize that a decoding model treating each experimental trial as a realization of a random process more likely reflects the statistical properties of the underlying process than the assumption of stationarity. Here, we propose a coherence-based spectro-spatial filter that allows stimulus features to be reconstructed from brain signal features. The proposed method extracts common patterns between features of the brain signals and the stimuli that produced them. These patterns, originating from different recording electrodes, are combined, forming a spatial filter that produces a unified prediction of the presented stimulus. This approach takes into account the frequency, phase, and spatial distribution of brain features, hence avoiding the need to manually predefine specific frequency bands of interest or phase relationships between stimulus and brain responses. Furthermore, the model does not require the tuning of hyper-parameters, significantly reducing its computational load. Using three different cognitive tasks (motor movements, speech perception, and speech production), we show that the proposed method consistently improves stimulus feature predictions in terms of correlation (group averages of 0.74 for motor movements, 0.84 for speech perception, and 0.74 for speech production) in comparison with other methods based on regularized multivariate regression, probabilistic graphical models and artificial neural networks. Furthermore, the model parameters revealed the anatomical regions and spectral components that were discriminant in the different cognitive tasks. This novel method not only provides a useful tool to address fundamental neuroscience questions, but could also be applied to neuroprosthetics.
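The sketch below conveys only the underlying intuition of a coherence-based spectro-spatial filter: each electrode is weighted by its spectral coherence with the stimulus feature, and the weighted electrodes are combined into a single prediction. It is a simplified stand-in with assumed parameters, not the authors' method.

    import numpy as np
    from scipy.signal import coherence

    def coherence_weighted_prediction(brain, stimulus, fs, nperseg=256):
        """Weight each electrode by its magnitude-squared coherence with the
        stimulus feature and combine them into one prediction time course.
        brain: (n_channels, n_samples); stimulus: (n_samples,).
        Very rough sketch; not the method proposed in the paper."""
        weights = np.empty(brain.shape[0])
        for ch in range(brain.shape[0]):
            _, cxy = coherence(brain[ch], stimulus, fs=fs, nperseg=nperseg)
            weights[ch] = cxy.mean()              # overall coupling of this channel
        weights /= weights.sum()
        # z-score channels so that the weighted sum is on a comparable scale
        z = (brain - brain.mean(axis=1, keepdims=True)) / brain.std(axis=1, keepdims=True)
        return weights @ z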


Subject(s)
Cerebral Cortex/physiology , Electroencephalography , Electrophysiological Phenomena , Models, Neurological , Sense of Coherence , Adult , Algorithms , Brain Mapping , Cerebral Cortex/diagnostic imaging , Electroencephalography/methods , Female , Humans , Magnetic Resonance Imaging , Magnetoencephalography , Psychomotor Performance , Speech Perception , Young Adult
19.
Neuroimage ; 218: 116882, 2020 09.
Article in English | MEDLINE | ID: mdl-32439539

ABSTRACT

Neural oscillations in auditory cortex are argued to support parsing and representing speech constituents at their corresponding temporal scales. Yet, how incoming sensory information interacts with ongoing spontaneous brain activity, what features of the neuronal microcircuitry underlie spontaneous and stimulus-evoked spectral fingerprints, and what these fingerprints entail for stimulus encoding, remain largely open questions. We used a combination of human invasive electrophysiology, computational modeling and decoding techniques to assess the information encoding properties of brain activity and to relate them to a plausible underlying neuronal microarchitecture. We analyzed intracortical auditory EEG activity from 10 patients while they were listening to short sentences. Pre-stimulus neural activity in early auditory cortical regions often exhibited power spectra with a shoulder in the delta range and a small bump in the beta range. Speech decreased power in the beta range, and increased power in the delta-theta and gamma ranges. Using multivariate machine learning techniques, we assessed the spectral profile of information content for two aspects of speech processing: detection and discrimination. We obtained better phase than power information decoding, and a bimodal spectral profile of information content with better decoding at low (delta-theta) and high (gamma) frequencies than at intermediate (beta) frequencies. These experimental data were reproduced by a simple rate model made of two subnetworks with different timescales, each composed of coupled excitatory and inhibitory units, and connected via a negative feedback loop. Modeling and experimental results were similar in terms of pre-stimulus spectral profile (except for the iEEG beta bump), spectral modulations with speech, and spectral profile of information content. Altogether, we provide converging evidence from both univariate spectral analysis and decoding approaches for a dual timescale processing infrastructure in human auditory cortex, and show that it is consistent with the dynamics of a simple rate model.
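To make the modelling component concrete, here is a toy rate model in the spirit of the one described: two excitatory-inhibitory pairs with slow and fast time constants, the fast subnetwork feeding back negatively onto the slow one. All gains and time constants are illustrative assumptions, not the fitted parameters of the study.

    import numpy as np

    def two_timescale_rate_model(T=2.0, dt=1e-3, drive=0.5):
        """Toy rate model: a slow and a fast excitatory-inhibitory pair coupled
        by negative feedback from the fast onto the slow subnetwork.
        Parameters are illustrative, not those fitted in the paper."""
        relu = lambda x: np.maximum(x, 0.0)
        tau_s, tau_f = 0.100, 0.010               # slow vs fast time constants (s)
        e_s = i_s = e_f = i_f = 0.1
        out = np.zeros((int(T / dt), 2))
        for k in range(out.shape[0]):
            de_s = (-e_s + relu(1.5 * e_s - i_s + drive - 0.5 * e_f)) / tau_s
            di_s = (-i_s + relu(1.2 * e_s)) / tau_s
            de_f = (-e_f + relu(1.5 * e_f - i_f + e_s)) / tau_f
            di_f = (-i_f + relu(1.3 * e_f)) / tau_f
            e_s, i_s = e_s + dt * de_s, i_s + dt * di_s
            e_f, i_f = e_f + dt * de_f, i_f + dt * di_f
            out[k] = e_s, e_f
        return out    # columns: slow and fast excitatory rates over time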


Subject(s)
Auditory Cortex/physiology , Computer Simulation , Speech Perception/physiology , Adult , Electrocorticography , Female , Humans , Male , Signal Processing, Computer-Assisted
20.
Elife ; 9, 2020 03 30.
Article in English | MEDLINE | ID: mdl-32223894

ABSTRACT

Speech perception presumably arises from internal models of how specific sensory features are associated with speech sounds. These features change constantly (e.g. different speakers, articulation modes, etc.), and listeners need to recalibrate their internal models by appropriately weighing new versus old evidence. Models of speech recalibration classically ignore this volatility. The effect of volatility in tasks where sensory cues were associated with arbitrary experimenter-defined categories was well described by models that continuously adapt the learning rate while keeping a single representation of the category. Using neurocomputational modelling, we show that recalibration of natural speech sound categories is better described by representing the latter at different time scales. We illustrate our proposal by modeling fast recalibration of speech sounds after experiencing the McGurk effect. We propose that working representations of speech categories are driven both by their current environment and by their long-term memory representations.


People can distinguish words or syllables even though they may sound different with every speaker. This striking ability reflects the fact that our brain is continually modifying the way we recognise and interpret the spoken word based on what we have heard before, by comparing past experience with the most recent one to update expectations. This phenomenon also occurs in the McGurk effect: an auditory illusion in which someone hears one syllable but sees a person saying another syllable and ends up perceiving a third distinct sound. Abstract models, which provide a functional rather than a mechanistic description of what the brain does, can test how humans use expectations and prior knowledge to interpret the information delivered by the senses at any given moment. Olasagasti and Giraud have now built an abstract model of how brains recalibrate perception of natural speech sounds. By fitting the model with existing experimental data using the McGurk effect, the results suggest that, rather than using a single sound representation that is adjusted with each sensory experience, the brain recalibrates sounds at two different timescales. Over and above slow "procedural" learning, the findings show that there is also rapid recalibration of how different sounds are interpreted. This working representation of speech enables adaptation to changing or noisy environments and illustrates that the process is far more dynamic and flexible than previously thought.


Subject(s)
Computer Simulation , Phonetics , Speech Perception , Speech/classification , Acoustic Stimulation , Auditory Perception , Humans , Speech/physiology , Time Factors