Results 1 - 13 of 13
1.
J Acoust Soc Am ; 155(4): 2314-2326, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38557736

ABSTRACT

Sound zones are used to reproduce individual audio content to multiple people in a room using a set of loudspeakers with controllable input signals. To allow the reproduced individual audio to adapt dynamically, e.g., to moving listeners, changes in the number of listeners, or changing room transfer functions, an adaptive formulation is proposed. This formulation is based on frequency-domain block adaptive filters and given room transfer functions. To reduce computational complexity, the system is extended to subband processing without cross-adaptive filters. The computational savings come from recognizing that sound zones consist of part-solutions that are inherently band limited; hence, several subbands can be ignored. To validate the theoretical findings, a 27-channel loudspeaker array was constructed, and measurements were performed in anechoic and reflective environments. The results show that the subband solution performs identically to the full-rate solution but at reduced computational complexity.
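As an illustrative aside (not taken from the paper): the frequency-domain block adaptive filtering that this formulation builds on can be sketched as an unconstrained block LMS with a per-bin normalized update; the per-bin structure is also what makes it cheap to skip band-limited subbands. A minimal single-channel Python sketch, where the function name, block length, and step size are assumptions:

    import numpy as np

    def fd_block_lms(x, d, block_len=256, mu=0.5):
        """Adapt a filter in the frequency domain so that filtering the
        input x approximates the desired signal d, one block at a time
        (unconstrained overlap-save block LMS)."""
        n_fft = 2 * block_len                    # FFT size for overlap-save
        W = np.zeros(n_fft, dtype=complex)       # adaptive filter, freq. domain
        x_old = np.zeros(block_len)
        y_blocks = []
        for b in range(len(x) // block_len):
            x_new = x[b * block_len:(b + 1) * block_len]
            X = np.fft.fft(np.concatenate([x_old, x_new]))
            y = np.real(np.fft.ifft(X * W))[block_len:]   # valid output samples
            e = d[b * block_len:(b + 1) * block_len] - y  # block error
            E = np.fft.fft(np.concatenate([np.zeros(block_len), e]))
            W += mu * np.conj(X) * E / (np.abs(X) ** 2 + 1e-8)  # per-bin update
            x_old = x_new
            y_blocks.append(y)
        return W, np.concatenate(y_blocks)

In a bin-wise view of this update, bins known to carry no energy for a given part-solution can simply be left out of the loop, which is the source of the savings the abstract reports.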

2.
Eur J Neurosci ; 59(8): 2059-2074, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38303522

ABSTRACT

Linear models are becoming increasingly popular for investigating brain activity in response to continuous and naturalistic stimuli. In the context of auditory perception, these predictive models can be 'encoding', when stimulus features are used to predict brain activity, or 'decoding', when neural features are used to reconstruct the audio stimuli. These linear models are a central component of some brain-computer interfaces that can be integrated into hearing assistive devices (e.g., hearing aids). Such advanced neurotechnologies have been widely investigated for speech stimuli but rarely for music. Recent attempts at neural tracking of music show that reconstruction performance is reduced compared with speech decoding. The present study investigates the performance of stimulus reconstruction and electroencephalogram prediction (decoding and encoding models) based on the cortical entrainment to temporal variations of the audio stimuli for both music and speech listening. Three hypotheses that may explain differences between speech and music stimulus reconstruction were tested to assess the importance of speech-specific acoustic and linguistic factors. While the results obtained with encoding models suggest different underlying cortical processing between speech and music listening, no differences were found in terms of reconstruction of the stimuli or the cortical data. The results suggest that envelope-based linear modelling can be used to study both speech and music listening, despite the differences in the underlying cortical mechanisms.


Subject(s)
Music, Speech Perception, Auditory Perception/physiology, Speech, Speech Perception/physiology, Electroencephalography, Acoustic Stimulation
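As an illustrative note (not from the paper): a decoding (backward) model of this kind is typically a ridge regression from time-lagged EEG to the stimulus envelope. A minimal Python sketch, where the lag range, the ridge parameter lam, and the function names are assumptions:

    import numpy as np

    def lag_matrix(eeg, lags):
        """Stack time-lagged copies of each EEG channel.
        eeg: (time x channels); returns (time x channels*len(lags))."""
        T, C = eeg.shape
        X = np.zeros((T, C * len(lags)))
        for i, L in enumerate(lags):
            shifted = np.roll(eeg, L, axis=0)
            if L > 0:
                shifted[:L] = 0
            elif L < 0:
                shifted[L:] = 0
            X[:, i * C:(i + 1) * C] = shifted
        return X

    def train_decoder(eeg, envelope, lags=range(0, 32), lam=1e3):
        """Ridge regression mapping lagged EEG to the stimulus envelope."""
        X = lag_matrix(eeg, lags)
        return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]),
                               X.T @ envelope)

Reconstruction performance is then usually the Pearson correlation between a held-out envelope and lag_matrix(eeg, lags) @ weights; an encoding model is the same regression run in the opposite direction, from stimulus features to EEG.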
3.
Ear Hear ; 45(3): 721-729, 2024.
Article in English | MEDLINE | ID: mdl-38287477

ABSTRACT

OBJECTIVES: Background noise and linguistic violations have been shown to increase listening effort. The present study examines the effects of the interaction between background noise and linguistic violations on subjective listening effort and frontal theta oscillations during effortful listening. DESIGN: Thirty-two normal-hearing listeners participated in this study. The linguistic violation was operationalized as sentences versus random words (strings). Behavioral and electroencephalography data were collected while participants listened to sentences and strings in background noise at different signal-to-noise ratios (SNRs: -9, -6, -3, and 0 dB), maintained them in memory for about 3 sec in the presence of background noise, and then chose the correct sequence of words from a base matrix of words. RESULTS: Results showed interaction effects of SNR and speech type on effort ratings. Although strings were inherently more effortful than sentences, decreasing the SNR from 0 to -9 dB (in 3 dB steps) increased effort ratings more for sentences than for strings at each step, suggesting a more pronounced effect of noise on sentence processing than on string processing at low SNRs. Results also showed a significant interaction between SNR and speech type on frontal theta event-related synchronization during the retention interval: strings exhibited higher frontal theta event-related synchronization than sentences at an SNR of 0 dB, suggesting increased verbal working memory demand for strings under challenging listening conditions. CONCLUSIONS: The study demonstrated that the interplay between linguistic violations and background noise shapes perceived effort and cognitive load during speech comprehension under challenging listening conditions. The differential impact of noise on processing sentences versus strings highlights the influential role of context and cognitive resource allocation in the processing of speech.


Subject(s)
Speech Perception, Humans, Noise, Linguistics, Hearing Tests, Memory, Short-Term
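As an illustrative note (not from the paper): frontal theta event-related synchronization is commonly quantified as the percent change in band power during the retention window relative to a baseline window. A minimal single-channel Python sketch; the filter order, window indices, and function name are assumptions:

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def theta_ers(eeg, fs, baseline_idx, retention_idx):
        """Percent change in theta (4-7 Hz) power during retention
        relative to baseline, for a single EEG channel."""
        b, a = butter(4, [4 / (fs / 2), 7 / (fs / 2)], btype="band")
        power = np.abs(hilbert(filtfilt(b, a, eeg))) ** 2
        p_base = power[baseline_idx].mean()
        p_task = power[retention_idx].mean()
        return 100.0 * (p_task - p_base) / p_base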
4.
J Acoust Soc Am ; 155(1): 757-768, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38284823

ABSTRACT

Sound zone methods aim to control the sound field produced by an array of loudspeakers to render a given audio content in specific areas while making it almost inaudible in others. At low frequencies, control filters are based on information about the electro-acoustic path between loudspeakers and listening areas, contained in the room impulse responses (RIRs). This information can be acquired wirelessly through ubiquitous networks of microphones. In that case, and for real-time applications in general, short acquisition and processing times are critical. In addition, limiting the amount of data that must be retrieved and processed can also reduce computational demands. Furthermore, such a framework would enable fast adaptation of control filters in changing acoustic environments. This work explores reducing the amount of time and information required to compute control filters when rendering and updating low-frequency sound zones. Using real RIR measurements, it is demonstrated that in some standard acoustic rooms, acquisition times on the order of a few hundred milliseconds are sufficient for accurately rendering sound zones. Moreover, an additional amount of information can be removed from the acquired RIRs without degrading the performance.
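As an illustrative note (the abstract does not specify its filter design, so this is an assumption): a common way to compute low-frequency sound-zone control filters from measured RIRs is per-frequency pressure matching, solved as a regularized least-squares problem. A minimal Python sketch, where the dark-zone weight kappa, the regularizer lam, and the function name are illustrative:

    import numpy as np

    def pressure_matching(Hb, Hd, p_target, kappa=1.0, lam=1e-4):
        """Loudspeaker driving weights q for one frequency bin: match the
        target pressure in the bright zone while penalizing energy leaked
        into the dark zone. Hb: (bright mics x speakers), Hd: (dark mics
        x speakers), p_target: (bright mics,) complex target pressures."""
        n_spk = Hb.shape[1]
        A = Hb.conj().T @ Hb + kappa * Hd.conj().T @ Hd + lam * np.eye(n_spk)
        return np.linalg.solve(A, Hb.conj().T @ p_target)

The transfer matrices Hb and Hd come from Fourier-transforming the measured RIRs, which is where shorter acquisition times and truncated RIRs enter the trade-off the paper studies.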

5.
Eur J Neurosci ; 58(11): 4357-4370, 2023 12.
Article in English | MEDLINE | ID: mdl-37984406

ABSTRACT

Listening effort can be defined as a measure of the cognitive resources used by listeners to perform a listening task. Various methods have been proposed to measure this effort, yet their reliability has not been established, a crucial step before their application in research or clinical settings. This study encompassed 32 participants undertaking speech-in-noise tasks across two sessions, approximately a week apart. They listened to sentences and word lists at varying signal-to-noise ratios (SNRs: -9, -6, -3 and 0 dB) and then retained them for roughly 3 s. We evaluated the test-retest reliability of self-reported effort ratings and of theta (4-7 Hz) and alpha (8-13 Hz) oscillatory power, previously suggested as neural markers of listening effort. Additionally, we examined the reliability of the percentage of correct words. Both relative and absolute reliability were assessed using intraclass correlation coefficients (ICC) and Bland-Altman analysis. We also computed the standard error of measurement (SEM) and the smallest detectable change (SDC). Our findings indicated heightened frontal midline theta power for word lists compared with sentences during the retention phase at high SNRs (0 dB, -3 dB), likely indicating a greater memory load for word lists. We observed an impact of SNR on alpha power in the right central region during the listening phase and on frontal theta power during the retention phase for sentences. Overall, the reliability analysis demonstrated satisfactory between-session variability for correct words and effort ratings. However, the neural measures (frontal midline theta power and right central alpha power) displayed substantial variability, even though group-level outcomes appeared consistent across sessions.


Subject(s)
Listening Effort, Speech Perception, Humans, Self Report, Reproducibility of Results, Noise
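As an illustrative note (not from the paper): the absolute-reliability quantities named in the abstract follow directly from the ICC and the between-session differences. A minimal Python sketch, with the ICC assumed to be computed elsewhere and the function name an assumption:

    import numpy as np

    def reliability_stats(session1, session2, icc):
        """SEM, SDC, and Bland-Altman bias / 95% limits of agreement for
        two measurement sessions (1-D arrays, one value per participant)."""
        sd = np.std(np.concatenate([session1, session2]), ddof=1)
        sem = sd * np.sqrt(1.0 - icc)          # standard error of measurement
        sdc = 1.96 * np.sqrt(2.0) * sem        # smallest detectable change
        d = session1 - session2
        bias = d.mean()                        # Bland-Altman bias
        loa = (bias - 1.96 * d.std(ddof=1),    # 95% limits of agreement
               bias + 1.96 * d.std(ddof=1))
        return sem, sdc, bias, loa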
6.
Article in English | MEDLINE | ID: mdl-37390005

ABSTRACT

It has been demonstrated that, from cortical recordings, it is possible to detect which speaker a person is attending to in a cocktail party scenario. The stimulus reconstruction approach, based on linear regression, has been shown to be usable for reconstructing an approximation of the envelopes of the sounds attended and not attended to by a listener from electroencephalography (EEG) data. When the reconstructed envelopes are compared with the stimulus envelopes, a higher correlation is observed for the attended sound. Most studies have focused on speech listening, and only a few have investigated the performance and mechanisms of auditory attention decoding during music listening. In the present study, auditory attention detection (AAD) techniques that have proven successful for speech listening were applied to a situation where the listener is actively listening to music concomitant with a distracting sound. Results show that AAD can be successful for both speech and music listening, while showing differences in reconstruction accuracy. The results also highlight the importance of the training data used in the construction of the model. This study is a first attempt to decode auditory attention from EEG data in situations where both music and speech are present. The results indicate that linear regression can also be used for AAD when listening to music, provided the model is trained on musical signals.


Subject(s)
Music, Speech Perception, Humans, Speech, Acoustic Stimulation/methods, Auditory Perception, Electroencephalography/methods
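As an illustrative note (not from the paper): once a stimulus-reconstruction model (such as the ridge-regression decoder sketched under entry 2) has produced an envelope estimate, the attention decision itself reduces to a correlation comparison. A minimal Python sketch; all names are assumptions:

    import numpy as np

    def decode_attention(reconstructed, env_music, env_distractor):
        """Correlation-based AAD decision: the candidate stream whose
        envelope correlates best with the EEG-reconstructed envelope
        is taken to be the attended one."""
        r_m = np.corrcoef(reconstructed, env_music)[0, 1]
        r_d = np.corrcoef(reconstructed, env_distractor)[0, 1]
        return ("music", r_m, r_d) if r_m > r_d else ("distractor", r_m, r_d)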
7.
J Cogn Neurosci ; 35(8): 1301-1311, 2023 08 01.
Article in English | MEDLINE | ID: mdl-37379482

ABSTRACT

The envelope of a speech signal is tracked by neural activity in the cerebral cortex. The cortical tracking occurs mainly in two frequency bands, theta (4-8 Hz) and delta (1-4 Hz). Tracking in the faster theta band has been mostly associated with lower-level acoustic processing, such as the parsing of syllables, whereas the slower tracking in the delta band relates to higher-level linguistic information of words and word sequences. However, much regarding the more specific association between cortical tracking and acoustic as well as linguistic processing remains to be uncovered. Here, we recorded EEG responses to both meaningful sentences and random word lists in different levels of signal-to-noise ratios (SNRs) that lead to different levels of speech comprehension as well as listening effort. We then related the neural signals to the acoustic stimuli by computing the phase-locking value (PLV) between the EEG recordings and the speech envelope. We found that the PLV in the delta band increases with increasing SNR for sentences but not for the random word lists, showing that the PLV in this frequency band reflects linguistic information. When attempting to disentangle the effects of SNR, speech comprehension, and listening effort, we observed a trend that the PLV in the delta band might reflect listening effort rather than the other two variables, although the effect was not statistically significant. In summary, our study shows that the PLV in the delta band reflects linguistic information and might be related to listening effort.


Subject(s)
Auditory Cortex, Speech Perception, Humans, Speech/physiology, Electroencephalography, Speech Perception/physiology, Auditory Cortex/physiology, Linguistics, Acoustic Stimulation
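As an illustrative note (not from the paper): the phase-locking value between the EEG and the speech envelope is the magnitude of the average phase-difference vector of the two band-limited signals. A minimal Python sketch; the filter order and function name are assumptions:

    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def plv(eeg_channel, envelope, fs, band=(1.0, 4.0)):
        """Phase-locking value between one EEG channel and the speech
        envelope, both band-pass filtered (delta band by default)."""
        b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
        phase_eeg = np.angle(hilbert(filtfilt(b, a, eeg_channel)))
        phase_env = np.angle(hilbert(filtfilt(b, a, envelope)))
        return np.abs(np.mean(np.exp(1j * (phase_eeg - phase_env))))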
8.
Front Neurosci ; 16: 932959, 2022.
Article in English | MEDLINE | ID: mdl-36017182

ABSTRACT

Objectives: Comprehension of speech in adverse listening conditions is challenging for hearing-impaired (HI) individuals. Noise reduction (NR) schemes in hearing aids (HAs) have demonstrated the capability to help HI individuals overcome these challenges. The objective of this study was to investigate the effect of NR processing (inactive, where the NR feature was switched off, vs. active, where the NR feature was switched on) on correlates of listening effort across two different background noise levels [+3 dB signal-to-noise ratio (SNR) and +8 dB SNR] by using a phase synchrony analysis of electroencephalogram (EEG) signals. Design: The EEG was recorded while 22 HI participants fitted with HAs performed a continuous speech-in-noise (SiN) task in the presence of background noise and a competing talker. The phase synchrony within eight regions of interest (ROIs) and four conventional EEG bands was computed by using a multivariate phase synchrony measure. Results: The results demonstrated that the activation of NR in HAs affects the EEG phase synchrony in the parietal ROI differently at low SNR than at high SNR. The relationship between the conditions of the listening task and phase synchrony in the parietal ROI was nonlinear. Conclusion: We showed that the activation of NR schemes in HAs can nonlinearly reduce correlates of listening effort as estimated by EEG-based phase synchrony. We contend that investigation of the phase synchrony within ROIs can reflect the effects of HAs in HI individuals in ecological listening conditions.
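As an illustrative note: the paper uses a specific multivariate phase synchrony measure; a simple stand-in that conveys the idea is the mean pairwise phase-locking value across the channels of an ROI. A minimal Python sketch, assuming the ROI data are already band-pass filtered; all names are assumptions:

    import numpy as np
    from scipy.signal import hilbert

    def roi_phase_synchrony(roi_eeg):
        """Mean pairwise phase-locking value across the channels of one
        ROI (channels x time), assuming band-pass filtered input."""
        phases = np.angle(hilbert(roi_eeg, axis=1))
        n_ch = roi_eeg.shape[0]
        pairs = [np.abs(np.mean(np.exp(1j * (phases[i] - phases[j]))))
                 for i in range(n_ch) for j in range(i + 1, n_ch)]
        return float(np.mean(pairs))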

9.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 531-534, 2021 11.
Article in English | MEDLINE | ID: mdl-34891349

ABSTRACT

Comprehension of speech in noise is a challenge for hearing-impaired (HI) individuals. Electroencephalography (EEG) provides a tool to investigate the effect of different speech signal-to-noise ratios (SNRs). Most EEG studies have focused on spectral power in well-defined frequency bands, such as the alpha band. In this study, we investigate how local functional connectivity, i.e., functional connectivity within a localized region of the brain, is affected by two levels of SNR. Twenty-two HI participants performed a continuous speech-in-noise task at two different SNRs (+3 dB and +8 dB). The local connectivity within eight regions of interest was computed by using a multivariate phase synchrony measure on the EEG data. The results showed that phase synchrony increased in the parietal and frontal areas in response to increasing SNR. We contend that local connectivity measures can be used to discriminate between speech-evoked EEG responses at different SNRs.


Subject(s)
Speech Perception, Speech, Electroencephalography Phase Synchronization, Humans, Noise, Signal-To-Noise Ratio
10.
Entropy (Basel); 23(8), 2021 Jul 24.
Article in English | MEDLINE | ID: mdl-34441087

ABSTRACT

We study the increase in per-sample differential entropy rate of random sequences and processes after being passed through a non-minimum-phase (NMP) discrete-time, linear time-invariant (LTI) filter G. For LTI discrete-time filters and random processes, it has long been established by Theorem 14 in Shannon's seminal paper that this entropy gain, 𝒢(G), equals the integral of log|G(e^{jω})|. In this note, we first show that Shannon's Theorem 14 does not hold in general. Then, we prove that, when comparing the input differential entropy to that of the entire (longer) output of G, the entropy gain equals 𝒢(G). We show that the entropy gain between equal-length input and output sequences is upper bounded by 𝒢(G) and arises if and only if there exists an output additive disturbance with finite differential entropy (no matter how small) or a random initial state. Unlike what happens with linear maps, the entropy gain in this case depends on the distribution of all the signals involved. We illustrate some of the consequences of these results by presenting their implications in three different problems. Specifically: conditions for equality in an information inequality of importance in networked control problems; extending to a much broader class of sources the existing results on the rate-distortion function for non-stationary Gaussian sources; and an observation on the capacity of auto-regressive Gaussian channels with feedback.
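For reference, the entropy-gain expression the abstract refers to can be written out as follows (a standard formulation, not quoted from the paper):

    \mathcal{G}(G) \;=\; \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \bigl| G(e^{j\omega}) \bigr| \, d\omega

For a stable rational filter with unit leading coefficient, Jensen's formula reduces this integral to \sum_{|z_k|>1} \log|z_k|, the sum over the zeros of G outside the unit circle, which is strictly positive precisely when G is non-minimum-phase.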

11.
Entropy (Basel); 23(5), 2021 Apr 26.
Article in English | MEDLINE | ID: mdl-33925905

ABSTRACT

We present novel data-processing inequalities relating the mutual information and the directed information in systems with feedback. The internal deterministic blocks within such systems are restricted only to be causal mappings, but are allowed to be non-linear and time-varying and, when randomized by their own external random inputs, can yield any stochastic mapping. These randomized blocks can, for example, represent source encoders, decoders, or even communication channels. Moreover, the involved signals can be arbitrarily distributed. Our first main result relates mutual and directed information and can be interpreted as a law of conservation of information flow. Our second main result is a pair of data-processing inequalities (one the conditional version of the other) between nested pairs of random sequences entirely within the closed loop. Our third main result introduces and characterizes the notion of in-the-loop (ITL) transmission rate for channel coding scenarios in which the messages are internal to the loop. Interestingly, in this case the conventional notions of transmission rate associated with the entropy of the messages and of channel capacity based on maximizing the mutual information between the messages and the output turn out to be inadequate. Instead, as we show, the ITL transmission rate is the unique notion of rate for which a channel code attains zero error probability if and only if the ITL rate does not exceed the corresponding directed information rate from messages to decoded messages. We apply our data-processing inequalities to show that the supremum of achievable (in the usual channel coding sense) ITL transmission rates is upper bounded by the supremum of the directed information rate across the communication channel. Moreover, we present an example in which this upper bound is attained. Finally, we further illustrate the applicability of our results by discussing how they make possible the generalization of two fundamental inequalities known in the networked control literature.
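For readers unfamiliar with the central quantity: directed information (in Massey's standard formulation, not quoted from the paper) replaces the full conditioning of mutual information with causal conditioning:

    I(X^n \to Y^n) \;=\; \sum_{i=1}^{n} I(X^i ; Y_i \mid Y^{i-1})

whereas the mutual information decomposes as I(X^n; Y^n) = \sum_{i=1}^{n} I(X^n; Y_i \mid Y^{i-1}). Only past and present inputs X^i enter each term of the directed information, which is what makes it the meaningful quantity inside feedback loops.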

12.
Entropy (Basel); 22(8), 2020 Jul 30.
Article in English | MEDLINE | ID: mdl-33286612

ABSTRACT

In this paper, we derive lower and upper bounds on the optimal performance theoretically attainable (OPTA) of a two-user multi-input multi-output (MIMO) causal encoding and causal decoding problem. Each user's source model is described by a multidimensional Markov source driven by an additive i.i.d. noise process, subject to three classes of spatio-temporal distortion constraints. To characterize the lower bounds, we use state augmentation techniques and a data-processing theorem, which recovers a variant of the rate-distortion function as an information measure, known in the literature as nonanticipatory ϵ-entropy or the sequential or nonanticipative RDF. We derive lower-bound characterizations for a system driven by an i.i.d. Gaussian noise process, which we solve using a semidefinite programming (SDP) algorithm for all three classes of distortion constraints. We obtain closed-form solutions when the system's noise is possibly non-Gaussian for both users and when only one of the users is described by a source model driven by a Gaussian noise process. To obtain the upper bounds, we use the best linear forward test-channel realization, which corresponds to the optimal test-channel realization when the system is driven by a Gaussian noise process, and apply a sequential causal DPCM-based scheme with a feedback loop followed by a scaled entropy-coded dithered quantization (ECDQ) scheme, leading to upper bounds with certain performance guarantees. Then, we use the linear forward test channel as a benchmark to obtain upper bounds on the OPTA when the system is driven by an additive i.i.d. non-Gaussian noise process. We support our framework with various simulation studies.
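For orientation (one standard way of writing it, not quoted from the paper), the nonanticipative rate-distortion function optimizes directed information over causal reproduction kernels:

    R^{na}_n(D) \;=\; \inf_{\substack{\{ P(y_i \mid y^{i-1},\, x^i) \}_{i=1}^{n} \;:\; \mathbb{E}\left[ d(X^n, Y^n) \right] \le D}} \; \frac{1}{n} \, I(X^n \to Y^n)

The causal conditioning P(y_i | y^{i-1}, x^i) is what distinguishes it from the classical RDF, where the reproduction may depend on the entire source sequence.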

13.
Entropy (Basel); 22(10), 2020 Oct 03.
Article in English | MEDLINE | ID: mdl-33286893

ABSTRACT

We propose a new estimator to measure directed dependencies in time series. The dimensionality of the data is first reduced using a new non-uniform embedding technique, in which the variables are ranked according to a weighted sum of the amount of new information and the improvement in prediction accuracy that they provide. Then, using a greedy approach, the most informative subsets are selected in an iterative way. The algorithm terminates when the highest-ranked variable is not able to significantly improve the accuracy of the prediction compared with that obtained using the already selected subsets. In a simulation study, we compare our estimator with existing state-of-the-art methods at different data lengths and strengths of directed dependencies. It is demonstrated that the proposed estimator has significantly higher accuracy than existing methods, especially in the difficult case where the data are highly correlated and coupled. Moreover, we show that its false detection of directed dependencies due to instantaneous coupling effects is lower than that of existing measures. We also show the applicability of the proposed estimator on real intracranial electroencephalography data.
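As an illustrative note: the greedy selection loop described in the abstract can be sketched as follows in Python, with the scoring function (the weighted sum of new information and prediction-accuracy improvement) supplied by the caller; the names and the stopping threshold are assumptions:

    def greedy_embedding(candidates, score, threshold=0.0):
        """Greedy non-uniform embedding: repeatedly add the lagged variable
        that most increases score(subset), a user-supplied weighted sum of
        new information and prediction-accuracy improvement, and stop when
        the best remaining candidate improves it by no more than threshold."""
        selected = []
        remaining = list(candidates)
        while remaining:
            best = max(remaining, key=lambda c: score(selected + [c]))
            gain = score(selected + [best]) - (score(selected) if selected else 0.0)
            if gain <= threshold:
                break
            selected.append(best)
            remaining.remove(best)
        return selected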
