Results 1 - 20 of 95
1.
PLoS One ; 17(12): e0277801, 2022.
Article in English | MEDLINE | ID: mdl-36454948

ABSTRACT

The human brain networks responsible for selectively listening to a voice amid other talkers remain to be clarified. The present study aimed to investigate relationships between cortical activity and performance in a speech-in-speech task, before (Experiment I) and after training-induced improvements (Experiment II). In Experiment I, 74 participants performed a speech-in-speech task while their cortical activity was measured using a functional near-infrared spectroscopy (fNIRS) device. One target talker and one masker talker were presented simultaneously at three different target-to-masker ratios (TMRs): adverse, intermediate, and favorable. Behavioral results show that performance increased monotonically with TMR in some participants, whereas in others it failed to decrease, or even improved, in the adverse-TMR condition. On the neural level, an extensive brain network including frontal (left prefrontal cortex, right dorsolateral prefrontal cortex, and bilateral inferior frontal gyri) and temporal (bilateral auditory cortex) regions was recruited more strongly in the intermediate condition than in the other two. Additionally, activity in the bilateral frontal gyri and left auditory cortex was positively correlated with behavioral performance in the adverse-TMR condition. In Experiment II, 27 participants whose performance was poorest in the adverse-TMR condition of Experiment I were trained to improve their performance in that condition. Results show significant performance improvements, along with decreased activity in the bilateral inferior frontal gyri, right dorsolateral prefrontal cortex, left inferior parietal cortex, and right auditory cortex in the adverse-TMR condition after training. Arguably, this lower neural activity reflects more efficient masker inhibition after speech-in-speech training. As speech-in-noise tasks also engage frontal and temporal regions, we suggest that, regardless of the type of masking (speech or noise), the complexity of the task prompts the involvement of a similar brain network. Furthermore, the initially substantial cognitive recruitment is reduced following training, yielding an economy of cognitive resources.


Subject(s)
Auditory Cortex , Speech Intelligibility , Humans , Prefrontal Cortex/diagnostic imaging , Parietal Lobe , Dorsolateral Prefrontal Cortex
2.
Heliyon ; 8(6): e09631, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35734572

ABSTRACT

Numerous studies have shown that task-evoked pupil dilation is an objective marker of cognitive activity and listening effort. However, these studies differ in their experimental and analysis methods. Whereas most studies focus on a single method, the present study compared different pupil-dilation analysis methods, including different normalization techniques, baseline periods, and baseline durations, to assess their influence on the outcomes of pupillometry results obtained in an auditory task. To that purpose, we used pupillometry data recorded in response to words in noise in hearing-impaired individuals. The start time of the baseline relative to stimulus timing turned out to have a significant influence on conclusions. In particular, a significant interaction between the effects of signal-to-noise ratio and hearing-aid use on pupil dilation was observed when the baseline period started early relative to the word, an effect likely related to anticipatory, pre-stimulus cognitive processes such as attention mobilization. This was the case even when only correct-response trials were included in the analyses, so that any confounding effect of performance in the word-repetition task was eliminated. Different normalization methods and baseline durations yielded similar results; however, the use of z-score transformation homogenized variability across conditions without affecting the qualitative pattern of the results. The consistency of results regardless of normalization method, and the fact that differences in pupil dilation and subjective measures of listening effort could be observed despite perfect performance in the task, underline the relevance of pupillometry as an objective measure of listening effort.
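For readers who want to try the comparison themselves, here is a minimal sketch of the three normalization schemes at issue (subtractive, divisive, and z-score baseline correction). The sampling rate, epoch layout, and parameter names are illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

def normalize_pupil(trials, fs=60.0, stim_onset=2.0,
                    baseline_start=-0.5, baseline_dur=0.5,
                    method="zscore"):
    """Baseline-correct trial-aligned pupil traces.

    trials         : (n_trials, n_samples) pupil diameter
    stim_onset     : word onset within the epoch, in seconds
    baseline_start : baseline start relative to word onset (s); the study
                     found this choice can change the statistical outcome
    method         : "subtractive", "divisive", or "zscore"
    """
    i0 = int(round((stim_onset + baseline_start) * fs))
    i1 = i0 + int(round(baseline_dur * fs))
    base = trials[:, i0:i1]
    mean = base.mean(axis=1, keepdims=True)
    if method == "subtractive":
        return trials - mean
    if method == "divisive":           # percent change from baseline
        return 100.0 * (trials - mean) / mean
    if method == "zscore":             # homogenizes variability across conditions
        return (trials - mean) / base.std(axis=1, keepdims=True)
    raise ValueError(f"unknown method: {method}")

# Example: 40 trials of 10-s epochs sampled at 60 Hz (invented numbers)
rng = np.random.default_rng(0)
raw = 3.0 + 0.1 * rng.standard_normal((40, 600))
z = normalize_pupil(raw, method="zscore")
```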

3.
J Acoust Soc Am ; 151(3): 1557, 2022 03.
Article in English | MEDLINE | ID: mdl-35364949

ABSTRACT

It is not always easy to follow a conversation in a noisy environment. To distinguish between two speakers, a listener must mobilize many perceptual and cognitive processes to maintain attention on a target voice and avoid shifting attention to the background noise. Here we introduce an intelligibility task with long stimuli, the Long-SWoRD test. This protocol allows participants to fully benefit from cognitive resources, such as semantic knowledge, to separate two talkers in a realistic listening environment. Moreover, the task provides experimenters with a means to infer fluctuations in auditory selective attention. Two experiments document the performance of normal-hearing listeners in situations where the perceptual separability of the competing voices ranges from easy to hard, using a combination of voice and binaural cues. The results show a strong effect of voice differences when the voices are presented diotically. In addition, analysis of the influence of semantic context on the pattern of responses indicates that semantic information induces a response bias, both when the competing voices are distinguishable and when they are indistinguishable from one another.


Subject(s)
Speech Perception , Speech , Cues , Humans , Perceptual Masking , Semantics , Speech Perception/physiology
4.
Front Neurosci ; 15: 674112, 2021.
Article in English | MEDLINE | ID: mdl-34966252

ABSTRACT

During the past decade, several studies have identified electroencephalographic (EEG) correlates of selective auditory attention to speech. In these studies, listeners are typically instructed to focus on one of two concurrent speech streams (the "target") while ignoring the other (the "masker"). EEG signals are recorded while participants perform this task and are subsequently analyzed to recover the attended stream. An assumption often made in these studies is that the participant's attention remains focused on the target throughout the test. To check this assumption, and to assess when a participant's attention in a concurrent-speech listening task was directed toward the target, the masker, or neither, we designed a behavioral listen-then-recall task (the Long-SWoRD test). After listening to two simultaneous short stories, participants had to identify, on a computer screen, keywords from the target story randomly interspersed among words from the masker story and words from neither story. To modulate task difficulty, and hence the likelihood of attentional switches, masker stories were originally uttered by the same talker as the target stories; the masker voice parameters were then manipulated to parametrically control the similarity of the two streams, from clearly dissimilar to almost identical. While participants listened to the stories, EEG signals were measured and subsequently analyzed using a temporal response function (TRF) model to reconstruct the speech stimuli. Responses in the behavioral recall task were used to infer, retrospectively, when attention was directed toward the target, the masker, or neither. During the model-training phase, the results of these behavioral-data-driven inferences were used as inputs to the model, in addition to the EEG signals, to determine whether this additional information would improve stimulus-reconstruction accuracy relative to models trained under the assumption that the listener's attention was unwaveringly focused on the target. Results from 21 participants show that information regarding the actual, as opposed to assumed, attentional focus can be used advantageously during model training to enhance the subsequent (test-phase) accuracy of auditory stimulus reconstruction based on EEG signals. This is especially the case in challenging listening situations, where the participants' attention is less likely to remain focused entirely on the target talker. In situations where the two competing voices are clearly distinct and easily separated perceptually, the assumption that listeners are able to stay focused on the target is reasonable. The behavioral recall protocol introduced here provides experimenters with a means to behaviorally track fluctuations in auditory selective attention, including in combined behavioral/neurophysiological studies.
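The stimulus-reconstruction step described in this abstract can be summarized in a few lines of code. The sketch below is a minimal ridge-regression ("backward") decoder of the kind commonly paired with TRF analyses; the lag count, regularization value, and array shapes are assumptions for illustration, not the parameters used in the study.

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack shifted copies of each channel so that the envelope at time t
    is predicted from EEG samples at t .. t + n_lags - 1 (the cortical
    response lags the stimulus, so the decoder looks "forward" in the EEG).
    eeg: (n_times, n_channels) -> (n_times, n_channels * n_lags)"""
    n_t, n_ch = eeg.shape
    X = np.zeros((n_t, n_ch * n_lags))
    for k in range(n_lags):
        X[:n_t - k, k * n_ch:(k + 1) * n_ch] = eeg[k:]
    return X

def train_decoder(eeg, envelope, n_lags=32, lam=1e3):
    """Fit a ridge-regression decoder mapping lagged EEG to the
    attended-speech envelope (the core of TRF stimulus reconstruction)."""
    X = lag_matrix(eeg, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)

def reconstruct(eeg, w, n_lags=32):
    return lag_matrix(eeg, n_lags) @ w

# Attention decoding: correlate the reconstruction with each talker's
# envelope and label the segment as attended to whichever correlates best.
rng = np.random.default_rng(0)
eeg = rng.standard_normal((4096, 64))            # toy 64-channel recording
env = 0.3 * eeg[:, 0] + rng.standard_normal(4096)
w = train_decoder(eeg, env)
r = np.corrcoef(reconstruct(eeg, w), env)[0, 1]
```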

5.
J Speech Lang Hear Res ; 64(4): 1413-1419, 2021 04 14.
Article in English | MEDLINE | ID: mdl-33820426

ABSTRACT

Purpose: The aim of the study was to investigate changes in autonomic function, as measured by heart rate variability, in individuals with tinnitus following acoustic therapy implemented using tinnitus maskers presented via hearing aids. Method: Twenty-six individuals with tinnitus and hearing impairment completed an 8-week field trial wearing hearing aids providing acoustic therapy via three tinnitus masker options set just below minimum masking level. Tinnitus handicap was measured using the Tinnitus Handicap Inventory at baseline (before starting acoustic therapy) and posttreatment (at the end of the 8-week trial). Resting heart rate and heart rate variability were measured using electrocardiography at baseline and posttreatment. Results: There was a significant decrease in tinnitus handicap posttreatment compared to baseline. There was no change in heart rate, but there was a significant increase in heart rate variability posttreatment compared to baseline. Conclusions: Acoustic therapy using tinnitus maskers delivered via hearing aids provided tinnitus relief and produced a concurrent increase in heart rate variability, suggesting a decrease in stress. Heart rate variability is a potential biomarker for tracking the efficacy of acoustic therapy; however, further research is required.


Subject(s)
Hearing Aids , Hearing Loss , Tinnitus , Acoustic Stimulation , Acoustics , Heart Rate , Humans , Tinnitus/therapy
6.
J Acoust Soc Am ; 149(1): 259, 2021 01.
Article in English | MEDLINE | ID: mdl-33514136

ABSTRACT

The ability to discriminate frequency differences between pure tones declines as the duration of the interstimulus interval (ISI) increases. The conventional explanation for this finding is that pitch representations gradually decay from auditory short-term memory; gradual decay means that internal noise increases as ISI duration increases. Another possibility is that pitch representations experience "sudden death," disappearing from memory without a trace; sudden death means that listeners guess (respond at random) more often when the ISIs are longer. Since internal noise and guessing probabilities influence the shape of psychometric functions in different ways, they can be estimated simultaneously. Eleven amateur musicians performed a two-interval, two-alternative forced-choice frequency-discrimination task. The frequencies of the first tones were roved, and frequency differences and ISI durations were manipulated across trials. Data were analyzed using Bayesian models that simultaneously estimated internal noise and guessing probabilities. On average across listeners, internal noise increased monotonically as a function of ISI duration, suggesting that gradual decay occurred. The guessing rate decreased as ISI duration increased from 0.5 to 2 s, but then increased with further increases in ISI duration, suggesting that sudden death occurred, but perhaps only at longer ISIs. These results are problematic for decay-only models of discrimination and contrast with those from a study on visual short-term memory, which found that over similar durations, visual representations experienced little gradual decay yet substantial sudden death.
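The logic that lets internal noise and guessing be estimated simultaneously is easiest to see in a model: noise flattens the slope of the psychometric function, whereas guessing lowers its upper asymptote. The sketch below is a maximum-likelihood simplification of that idea, not the hierarchical Bayesian models actually used; the data values are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def p_correct(df, sigma, guess):
    """2I-2AFC psychometric function. With probability `guess` the listener
    responds at random (sudden death); otherwise the decision is limited by
    internal noise `sigma` (gradual decay)."""
    return guess * 0.5 + (1.0 - guess) * norm.cdf(np.abs(df) / (sigma * np.sqrt(2)))

def fit(df, n_correct, n_trials):
    """Maximum-likelihood estimates of (sigma, guess) from trial counts."""
    def nll(params):
        sigma = np.exp(params[0])                    # keep sigma positive
        guess = 1.0 / (1.0 + np.exp(-params[1]))     # keep guess in (0, 1)
        p = np.clip(p_correct(df, sigma, guess), 1e-9, 1 - 1e-9)
        return -np.sum(n_correct * np.log(p) + (n_trials - n_correct) * np.log(1 - p))
    res = minimize(nll, x0=[0.0, -2.0], method="Nelder-Mead")
    return np.exp(res.x[0]), 1.0 / (1.0 + np.exp(-res.x[1]))

# Invented data: frequency differences and counts out of 100 trials each;
# an asymptote below 100% is the signature of a nonzero guessing rate.
df = np.array([0.2, 0.5, 1.0, 2.0, 4.0])
n_trials = np.full(5, 100)
n_correct = np.array([55, 62, 78, 88, 92])
print(fit(df, n_correct, n_trials))
```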


Subject(s)
Memory, Short-Term , Music , Pitch Discrimination , Bayes Theorem , Humans , Noise
7.
J Acoust Soc Am ; 147(1): 371, 2020 01.
Article in English | MEDLINE | ID: mdl-32006971

ABSTRACT

Perceptual anchors are representations of stimulus features stored in long-term memory rather than short-term memory. The present study investigated whether listeners use perceptual anchors to improve pure-tone frequency discrimination. Ten amateur musicians performed a two-interval, two-alternative forced-choice frequency-discrimination experiment. In one half of the experiment, the frequency of the first tone was fixed across trials, and in the other half, the frequency of the first tone was roved widely across trials. The durations of the interstimulus intervals (ISIs) and the frequency differences between the tones on each trial were also manipulated. The data were analyzed with a Bayesian model that assumed that performance was limited by sensory noise (related to the initial encoding of the stimuli), memory noise (which increased proportionally to the ISI), fluctuations in attention, and response bias. It was hypothesized that memory-noise variance increased more rapidly during roved-frequency discrimination than fixed-frequency discrimination because listeners used perceptual anchors in the latter condition. The results supported this hypothesis. The results also suggested that listeners experienced more lapses in attention during roved-frequency discrimination.
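To make the perceptual-anchor hypothesis concrete, the following sketch computes predicted sensitivity under the assumed variance structure, with memory-noise variance growing linearly with the ISI; the parameter values and the linear-growth form are illustrative assumptions, not fitted estimates from the study.

```python
import numpy as np

def dprime(df, isi, sigma_sensory, memory_rate):
    """Predicted sensitivity when memory-noise variance grows with the ISI:
    var_total = 2 * sigma_sensory**2 + memory_rate * isi."""
    return df / np.sqrt(2.0 * sigma_sensory**2 + memory_rate * isi)

isi = np.array([0.5, 2.0, 4.0, 8.0])          # seconds
# Hypothesis under test: an anchor slows memory-noise growth, so the
# fixed-frequency condition should decline more shallowly than roving.
print(dprime(5.0, isi, sigma_sensory=2.0, memory_rate=4.0))  # roved
print(dprime(5.0, isi, sigma_sensory=2.0, memory_rate=1.0))  # fixed (anchor)
```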


Subject(s)
Auditory Perception , Memory, Long-Term , Pitch Discrimination , Acoustic Stimulation , Adult , Bayes Theorem , Female , Humans , Male , Psychophysics , Young Adult
8.
Trends Hear ; 23: 2331216519886707, 2019.
Article in English | MEDLINE | ID: mdl-31722636

ABSTRACT

There is increasing evidence that hearing-impaired (HI) individuals do not use the same listening strategies as normal-hearing (NH) individuals, even when wearing optimally fitted hearing aids. From this perspective, better characterization of individual perceptual strategies is an important step toward designing more effective speech-processing algorithms. Here, we describe two complementary approaches for (a) revealing the acoustic cues used by a participant in a /d/-/g/ categorization task in noise and (b) measuring the relative contributions of these cues to the decision. Both approaches involve natural speech recordings altered by the addition of a "bump noise." The bumps were narrowband bursts of noise localized at the spectrotemporal locations of the acoustic cues, allowing the experimenter to manipulate the consonant percept. Cue-weighting strategies were estimated for three groups of participants: 17 NH listeners, 18 HI listeners with high-frequency loss, and 15 HI listeners with flat loss. HI participants were provided with individual frequency-dependent amplification to compensate for their hearing loss. Although all listeners relied more heavily on the high-frequency cue than on the low-frequency cue, considerable variability was observed in the individual weights, mostly explained by differences in internal noise. Individuals with high-frequency loss relied slightly less heavily on the high-frequency cue, relative to the low-frequency cue, than NH individuals, suggesting a possible influence of supra-threshold deficits on cue-weighting strategies. Altogether, these results suggest a need for individually tailored speech-in-noise processing in hearing aids if more effective speech discriminability in noise is to be achieved.
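One common way to measure relative cue contributions of the kind described here is to regress trial-by-trial responses on the per-trial cue strengths. The sketch below does this with a logistic regression on simulated data; it illustrates the general approach, not the estimation procedure used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated per-trial signed strengths of the low- and high-frequency bump
# cues, and the listener's binary /d/-/g/ responses (all values invented).
rng = np.random.default_rng(1)
cues = rng.normal(size=(500, 2))              # columns: low-freq, high-freq cue
true_w = np.array([0.4, 1.2])                 # heavier reliance on high-freq cue
internal_noise = rng.normal(scale=1.0, size=500)
responses = (cues @ true_w + internal_noise > 0).astype(int)

model = LogisticRegression().fit(cues, responses)
w = np.abs(model.coef_[0])
print("relative cue weights:", w / w.sum())
# Higher internal noise flattens both fitted slopes, which is how weight
# variability across listeners can be traced back to noise differences.
```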


Subject(s)
Hearing Loss, High-Frequency/pathology , Hearing Loss, Sensorineural/pathology , Persons With Hearing Impairments/statistics & numerical data , Speech Perception , Adult , Aged , Cues , Female , Hearing Aids , Humans , Male , Middle Aged , Noise , Young Adult
9.
Ear Hear ; 40(4): 938-950, 2019.
Article in English | MEDLINE | ID: mdl-30461444

ABSTRACT

OBJECTIVES: The objective of this work was to build a 15-item short-form of the Speech, Spatial and Qualities of Hearing Scale (SSQ) that maintains the three-factor structure of the full form, using a data-driven approach consistent with internationally recognized procedures for short-form building. This included validation of the new short-form on an independent sample and an in-depth comparative analysis of all existing full and short SSQ forms. DESIGN: Data from a previous study involving 98 normal-hearing (NH) individuals and 196 hearing-impaired (HI) individuals who did not wear hearing aids, along with results from several other published SSQ studies, were used to develop the short-form. Data from a new, independent sample of 35 NH individuals and 88 HI hearing-aid wearers were used to validate the new short-form. Factor and hierarchical cluster analyses were used to check the factor structure and internal consistency of the new short-form. In addition, the new short-form was compared with all other SSQ forms, including the full SSQ, the German SSQ15, the SSQ12, and the SSQ5. Construct validity was further assessed by testing statistical relationships between scores and audiometric factors, including pure-tone threshold averages (PTAs) and left/right PTA asymmetry. Receiver operating characteristic (ROC) analyses were used to compare the ability of the different SSQ forms to discriminate between NH and HI individuals (both hearing-aid wearers and non-wearers). RESULTS: Compared with all other SSQ forms, including the full SSQ, the new short-form showed negligible cross-loading across the three main subscales and greater discriminatory power between NH and HI subjects (as indicated by a larger area under the ROC curve), as well as between the main subscales (especially Speech and Qualities). Moreover, the new 5-item Spatial subscale showed increased sensitivity to left/right PTA asymmetry. Very good internal consistency and homogeneity, and high correlations with the full SSQ, were obtained for all short-forms. CONCLUSIONS: While maintaining the three-factor structure of the full SSQ, and exceeding the latter in terms of construct validity and sensitivity to audiometric variables, the new 15-item SSQ affords a substantial reduction in the number of items and, thus, in test time. Based on overall scores, Speech subscores, or Spatial subscores, but not Qualities subscores, the 15-item SSQ appears to be more sensitive than the full SSQ to differences in self-evaluated hearing abilities between NH and HI subjects.


Subject(s)
Hearing Aids , Hearing Loss/rehabilitation , Patient Reported Outcome Measures , Adolescent , Adult , Aged , Aged, 80 and over , Audiometry, Pure-Tone , Case-Control Studies , Cluster Analysis , Factor Analysis, Statistical , Female , Hearing Loss/physiopathology , Humans , Male , Middle Aged , ROC Curve , Reproducibility of Results , Speech Perception , Young Adult
10.
J Acoust Soc Am ; 144(4): 2462, 2018 10.
Article in English | MEDLINE | ID: mdl-30404465

ABSTRACT

In order to perceive meaningful speech, the auditory system must recognize different phonemes amidst a noisy and variable acoustic signal. To better understand the processing mechanisms underlying this ability, evoked cortical responses to different spoken consonants were measured with electroencephalography (EEG). Using multivariate pattern analysis (MVPA), binary classifiers attempted to discriminate between the EEG activity evoked by two given consonants at each peri-stimulus time sample, providing a dynamic measure of their cortical dissimilarity. To examine the relationship between representations at the auditory periphery and in cortex, MVPA was also applied to modeled auditory-nerve (AN) responses to the consonants, and the time-evolving AN-based and EEG-based dissimilarities were compared with one another. Cortical dissimilarities between consonants were commensurate with their articulatory distinctions, particularly their manner of articulation and, to a lesser extent, their voicing. Furthermore, cortical distinctions between consonants in two periods of activity, centered at 130 and 400 ms after onset, aligned with their peripheral dissimilarities in distinct onset and post-onset periods, respectively. By relating speech representations across the articulatory, peripheral, and cortical domains, these findings advance our understanding of the transformations along the auditory pathway that underlie the ability to perceive speech.
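The time-resolved MVPA described here amounts to training a binary classifier on the cross-channel activity pattern at every time sample. A minimal sketch follows, using scikit-learn on simulated epochs; the classifier choice, cross-validation scheme, and data dimensions are assumptions, not those of the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def timewise_decoding(epochs, labels, cv=5):
    """Decode a consonant pair from EEG at each peri-stimulus sample.

    epochs : (n_trials, n_channels, n_times) evoked responses
    labels : (n_trials,) which of the two consonants was presented
    Returns a per-sample cross-validated accuracy curve; accuracy above
    chance indexes the cortical dissimilarity of the two consonants.
    """
    n_trials, n_ch, n_times = epochs.shape
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    acc = np.empty(n_times)
    for t in range(n_times):
        acc[t] = cross_val_score(clf, epochs[:, :, t], labels, cv=cv).mean()
    return acc

# Toy example: 80 trials, 32 channels, 100 samples, weak class difference
rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 40)
epochs = rng.standard_normal((80, 32, 100))
epochs[labels == 1, :, 40:60] += 0.4          # "evoked" difference mid-epoch
curve = timewise_decoding(epochs, labels)
```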


Subject(s)
Auditory Cortex/physiology , Auditory Pathways/physiology , Speech Perception , Adult , Female , Humans , Male , Phonetics
11.
J Acoust Soc Am ; 143(6): 3665, 2018 06.
Article in English | MEDLINE | ID: mdl-29960504

ABSTRACT

Using a same-different discrimination task, it has been shown that discrimination performance for sequences of complex tones varying just detectably in pitch is less dependent on sequence length (1, 2, or 4 elements) when the tones contain resolved harmonics than when they do not [Cousineau, Demany, and Pressnitzer (2009). J. Acoust. Soc. Am. 126, 3179-3187]. This effect had been attributed to the activation of automatic frequency-shift detectors (FSDs) by the shifts in resolved harmonics. The present study provides evidence against this hypothesis by showing that the sequence-processing advantage found for complex tones with resolved harmonics is not found for pure tones or other sounds presumed to activate FSDs (narrow bands of noise, and wide-band noises eliciting pitch sensations due to interaural phase shifts). The present results also indicate that for pitch sequences, processing performance is largely unrelated to pitch salience per se: for a fixed level of discriminability between sequence elements, sequences of elements with salient pitches are not necessarily better processed than sequences of elements with less salient pitches. An ideal-observer model for the same-different binary-sequence discrimination task is also developed, allowing d' to be computed for this task using numerical methods.

12.
J Acoust Soc Am ; 142(4): 2386, 2017 10.
Article in English | MEDLINE | ID: mdl-29092591

ABSTRACT

To better understand hearing-aid benefit during natural listening, this study examined the added demand placed by the goal of understanding speech over the more typically studied goal of simply recognizing speech sounds. The study compared hearing-aid benefit in two conditions and examined factors that might account for the observed benefits. In the phonetic condition, listeners needed only to identify the correct sound to make a correct response. In the semantic condition, listeners had to understand what they had heard to respond correctly, because the answer did not include any keywords from the spoken speech. Hearing aids provided significant benefit for listeners in the phonetic condition. In the semantic condition, on the other hand, there were large inter-individual differences, with many listeners not experiencing any benefit from aiding. Neither a battery of cognitive and linguistic tests nor age could explain this variability. Furthermore, analysis of psychometric functions showed that enhancing the fidelity of the target speech by improving the signal-to-noise ratio had a larger impact on listeners' performance in the phonetic condition than in the semantic condition. These results demonstrate the importance of incorporating naturalistic elements into simulations of multi-talker listening when assessing the benefits of intervention for communication success.


Subject(s)
Hearing Aids , Hearing Loss, Sensorineural , Speech Perception , Aged , Aged, 80 and over , Analysis of Variance , Auditory Threshold , Female , Hearing Loss, Sensorineural/rehabilitation , Humans , Intelligence Tests , Male , Middle Aged , Neuropsychological Tests , Signal-To-Noise Ratio , Vocabulary
13.
Curr Biol ; 27(5): 743-750, 2017 Mar 06.
Article in English | MEDLINE | ID: mdl-28238657

ABSTRACT

Noise is a ubiquitous source of errors in all forms of communication [1]. Noise-induced errors in speech communication, for example, make it difficult for humans to converse in noisy social settings, a challenge aptly named the "cocktail party problem" [2]. Many nonhuman animals also communicate acoustically in noisy social groups and thus face biologically analogous problems [3]. However, we know little about how the perceptual systems of receivers are evolutionarily adapted to avoid the costs of noise-induced errors in communication. In this study of Cope's gray treefrog (Hyla chrysoscelis; Hylidae), we investigated whether receivers exploit a potential statistical regularity present in noisy acoustic scenes to reduce errors in signal recognition and discrimination. We developed an anatomical/physiological model of the peripheral auditory system to show that temporal correlation in amplitude fluctuations across the frequency spectrum ("comodulation") [4-6] is a feature of the noise generated by large breeding choruses of sexually advertising males. In four psychophysical experiments, we investigated whether females exploit comodulation in background noise to mitigate noise-induced errors in evolutionarily critical mate-choice decisions. Subjects experienced fewer errors in recognizing conspecific calls and in selecting the calls of high-quality mates in the presence of simulated chorus noise that was comodulated. These data show unequivocally, and for the first time, that exploiting statistical regularities present in noisy acoustic scenes is an important biological strategy for solving cocktail-party-like problems in nonhuman animal communication.
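Comodulation, the key stimulus property here, means that the amplitude envelopes of different frequency bands rise and fall together. The sketch below generates comodulated versus independently modulated multiband noise; the band edges, envelope cutoff, and envelope construction are illustrative assumptions, not the chorus-shaped noise actually used in the experiments.

```python
import numpy as np

def chorus_noise(fs=44100, dur=1.0,
                 bands=((500, 1000), (1000, 2000), (2000, 4000)),
                 comodulated=True, env_cutoff=10.0, seed=0):
    """Multiband noise whose band envelopes are either shared across
    bands (comodulated) or statistically independent."""
    rng = np.random.default_rng(seed)
    n = int(fs * dur)
    freqs = np.fft.rfftfreq(n, 1.0 / fs)

    def bandpass_noise(lo, hi):
        spec = np.fft.rfft(rng.standard_normal(n))
        spec[(freqs < lo) | (freqs > hi)] = 0.0
        return np.fft.irfft(spec, n)

    def slow_envelope():
        spec = np.fft.rfft(rng.standard_normal(n))
        spec[freqs > env_cutoff] = 0.0
        e = np.fft.irfft(spec, n)
        return 1.0 + e / np.abs(e).max()       # positive, slowly varying

    shared = slow_envelope()
    out = np.zeros(n)
    for lo, hi in bands:
        env = shared if comodulated else slow_envelope()
        out += bandpass_noise(lo, hi) * env
    return out / np.abs(out).max()

masker_cm = chorus_noise(comodulated=True)     # correlated band envelopes
masker_in = chorus_noise(comodulated=False)    # independent band envelopes
```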


Subject(s)
Anura/physiology , Perceptual Masking , Sound Localization , Vocalization, Animal , Animals , Female , Male , Psychophysics
14.
Ear Hear ; 38(4): 465-474, 2017.
Article in English | MEDLINE | ID: mdl-28169839

ABSTRACT

OBJECTIVES: The goal of this study was to examine whether individuals are using speech intelligibility to determine how much noise they are willing to accept while listening to running speech. Previous research has shown that the amount of background noise that an individual is willing to accept while listening to speech is predictive of his or her likelihood of success with hearing aids. If it were possible to determine the criterion by which individuals make this judgment, then it may be possible to alter this cue, especially for those who are unlikely to be successful with hearing aids, and thereby improve their chances of success with hearing aids. DESIGN: Twenty-one individuals with normal hearing and 21 with sensorineural hearing loss participated in this study. In each group, there were 7 with a low, moderate, and high acceptance of background noise, as determined by the Acceptable Noise Level (ANL) test. (During the ANL test, listeners adjusted speech to their most comfortable listening level, then background noise was added, and they adjusted it to the maximum level that they were "willing to put up with" while listening to the speech.) Participants also performed a modified version of the ANL test in which the speech was fixed at four different levels (50, 63, 75, and 88 dBA), and they adjusted only the level of the background noise. The authors calculated speech intelligibility index (SII) scores for each participant and test level. SII scores ranged from 0 (no speech information is present) to 1 (100% of the speech information is present). The authors considered a participant's results to be consistent with a speech intelligibility-based listening criterion if his or her SIIs remained constant across all of the test conditions. RESULTS: For all but one of the participants with normal hearing, their SIIs remained constant across the entire 38-dB range of speech levels. For all participants with hearing loss, the SII increased with speech level. CONCLUSIONS: For most listeners with normal hearing, their ANLs were consistent with the use of speech intelligibility as a listening cue; for listeners with hearing impairment, they were not. Future studies should determine what cues these individuals are using when selecting an ANL. Having a better understanding of these cues may help audiologists design and optimize treatment options for their patients.
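The constancy check at the heart of this design is easy to state in code: if a listener keeps the same speech-intelligibility criterion, an SII-style index should stay flat as the speech level changes. The sketch below uses a deliberately crude audibility index with assumed band-importance weights; it is not the ANSI S3.5 SII computation used by the authors.

```python
import numpy as np

# Illustrative five-band example; the standard uses more bands and several
# corrections, and these importance weights are assumptions, not ANSI values.
importance = np.array([0.15, 0.25, 0.30, 0.20, 0.10])   # sums to 1

def sii_like(speech_db, noise_db):
    """Crude SII-style index: per-band audibility = SNR mapped from
    [-15, +15] dB onto [0, 1], weighted by band importance."""
    snr = np.asarray(speech_db) - np.asarray(noise_db)
    audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(importance * audibility))

# An intelligibility-based ANL criterion predicts a constant index as the
# listener re-adjusts the noise at each fixed speech level.
for level in (50, 63, 75, 88):
    noise = level - 10.0                      # listener holds a 10-dB SNR
    print(level, round(sii_like(np.full(5, level), np.full(5, noise)), 3))
```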


Subject(s)
Hearing Loss, Sensorineural/physiopathology , Noise , Speech Perception , Case-Control Studies , Cues , Hearing Aids , Hearing Loss, Sensorineural/rehabilitation , Humans , Prognosis , Speech Intelligibility , Treatment Outcome
15.
PLoS One ; 11(8): e0159975, 2016.
Article in English | MEDLINE | ID: mdl-27536884

ABSTRACT

Binaural pitch diplacusis refers to a perceptual anomaly whereby the same sound is perceived as having a different pitch depending on whether it is presented in the left or the right ear. Results in the literature suggest that this phenomenon is more prevalent, and larger, in individuals with asymmetric hearing loss than in individuals with symmetric hearing. However, because studies devoted to this effect have thus far involved small samples, the prevalence of the effect, and its relationship with interaural asymmetries in hearing thresholds, remain unclear. In this study, psychometric functions for interaural pitch comparisons were measured in 55 subjects, including 12 normal-hearing and 43 hearing-impaired participants. Statistically significant pitch differences between the left and right ears were observed in normal-hearing participants, but the effect was usually small (less than 1.5/16 octave, or about 7%). For the hearing-impaired participants, statistically significant interaural pitch differences were found in about three-quarters of the cases. Moreover, for about half of these participants, the difference exceeded 1.5/16 octaves and, in some participants, was as large as or larger than 1/4 octave. This was the case even for the lowest frequency tested, 500 Hz. The pitch differences were weakly, but significantly, correlated with the difference in hearing thresholds between the two ears, such that larger threshold asymmetries were statistically associated with larger pitch differences. For the vast majority of the hearing-impaired participants, the direction of the pitch differences was such that pitch was perceived as higher on the side with the higher (i.e., 'worse') hearing thresholds than on the opposite side. These findings are difficult to reconcile with purely temporal models of pitch perception, but may be accounted for by place-based or spectrotemporal models.


Subject(s)
Auditory Threshold , Functional Laterality , Pitch Perception , Adult , Aged , Case-Control Studies , Female , Hearing Loss/physiopathology , Hearing Tests , Humans , Male , Middle Aged , Young Adult
16.
eNeuro ; 3(3)2016.
Article in English | MEDLINE | ID: mdl-27294198

ABSTRACT

Successful speech perception in real-world environments requires that the auditory system segregate competing voices that overlap in frequency and time into separate streams. Vowels are major constituents of speech and are composed of frequencies (harmonics) that are integer multiples of a common fundamental frequency (F0). The pitch and identity of a vowel are determined by its F0 and spectral envelope (formant structure), respectively. When two spectrally overlapping vowels differing in F0 are presented concurrently, they can be readily perceived as two separate "auditory objects" with pitches at their respective F0s. A difference in pitch between two simultaneous vowels provides a powerful cue for their segregation, which in turn facilitates their individual identification. The neural mechanisms underlying the segregation of concurrent vowels based on pitch differences are poorly understood. Here, we examine neural population responses in macaque primary auditory cortex (A1) to single and double concurrent vowels (/a/ and /i/) that differ in F0 such that they are heard as two separate auditory objects with distinct pitches. We find that neural population responses in A1 can resolve, via a rate-place code, the lower harmonics of both single and double concurrent vowels. Furthermore, we show that the formant structures, and hence the identities, of single vowels can be reliably recovered from the neural representation of double concurrent vowels. We conclude that A1 contains sufficient spectral information to enable concurrent vowel segregation and identification by downstream cortical areas.


Subject(s)
Auditory Cortex/physiology , Neurons/physiology , Speech Perception/physiology , Acoustic Stimulation , Animals , Macaca fascicularis , Male , Microelectrodes , Phonetics , Pitch Perception/physiology
17.
Annu Int Conf IEEE Eng Med Biol Soc ; 2016: 72-76, 2016 Aug.
Article in English | MEDLINE | ID: mdl-28268284

ABSTRACT

The first revolution in hearing aids came from nonlinear amplification, which allows better compensation for both soft and loud sounds. The second revolution stemmed from the introduction of digital signal processing, which allows better programmability and more sophisticated algorithms. The third revolution in hearing aids is wireless connectivity, which allows seamless communication between a pair of hearing aids and with a growing number of external devices. Each revolution has fundamentally transformed hearing aids and pushed the entire industry forward significantly. Machine learning has received significant attention in recent years and has been applied in many other industries, e.g., robotics, speech recognition, genetics, and crowdsourcing. We argue that the next revolution in hearing aids is machine intelligence. In fact, this revolution is already quietly happening. We review developments in at least three major areas: applications of machine learning in speech enhancement; applications of machine learning in the individualization and customization of signal-processing algorithms; and applications of machine learning in improving the efficiency and effectiveness of clinical tests. With the advent of the Internet of Things, these developments will accelerate. This revolution will bring patient satisfaction to a level never seen before.


Subject(s)
Hearing Aids , Machine Learning , Algorithms , Hearing Tests/instrumentation , Hearing Tests/methods , Humans , Precision Medicine/methods , Signal Processing, Computer-Assisted , Speech Intelligibility , Speech Perception
18.
Atten Percept Psychophys ; 77(4): 1448-60, 2015 May.
Article in English | MEDLINE | ID: mdl-25724517

ABSTRACT

Proportion correct (Pc) is a fundamental measure of task performance in psychophysics. The maximum Pc score that can be achieved by an optimal (maximum-likelihood) observer in a given task is of both theoretical and practical importance, because it sets an upper limit on human performance. Within the framework of signal detection theory, analytical solutions for computing the maximum Pc score have been established for several common experimental paradigms under the assumption of Gaussian additive internal noise. However, as the scope of applications of psychophysical signal detection theory expands, the need is growing for psychophysicists to compute maximum Pc scores for situations involving non-Gaussian (internal or stimulus-induced) noise. In this article, we provide a general formula for computing the maximum Pc in various psychophysical experimental paradigms for arbitrary probability distributions of sensory activity. Moreover, easy-to-use MATLAB code implementing the formula is provided. Practical applications of the formula are illustrated, and its accuracy is evaluated, for two paradigms and two types of probability distributions (uniform and Gaussian). The results demonstrate that Pc scores computed using the formula remain accurate even for continuous probability distributions, as long as the conversion from continuous probability density functions to discrete probability mass functions is supported by a sufficiently high sampling resolution. We hope that the exposition in this article, and the freely available MATLAB code, facilitates calculations of maximum performance for a wider range of experimental situations, as well as explorations of the impact of different assumptions concerning internal-noise distributions on maximum performance in psychophysical experiments.
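The article's formula is implemented in freely available MATLAB code; as a rough companion, here is a Python sketch of the same kind of computation for the simplest case: 2AFC with a pick-the-larger decision rule, which is optimal when the likelihood ratio is monotone. The discretization step illustrates the abstract's point about sampling resolution; this is not the paper's general formula.

```python
import numpy as np
from scipy.stats import norm, uniform

def max_pc_2afc(pdf_signal, pdf_noise, x):
    """Maximum proportion correct for 2AFC when the observer picks the
    interval with the larger observation: Pc = P(S > N) + 0.5 * P(S == N).
    Continuous densities are discretized on grid `x` into probability
    masses, so accuracy depends on the grid's sampling resolution."""
    dx = x[1] - x[0]
    ps = pdf_signal(x) * dx                   # discrete probability masses
    pn = pdf_noise(x) * dx
    Fn = np.cumsum(pn) - pn                   # P(N at a strictly lower bin)
    return float(np.sum(ps * (Fn + 0.5 * pn)))

x = np.linspace(-10, 10, 20001)
# Gaussian internal noise, d' = 1: closed form is Phi(1/sqrt(2)) ~= 0.760
print(max_pc_2afc(norm(1, 1).pdf, norm(0, 1).pdf, x))
# Uniform internal noise instead of Gaussian (same grid, no closed form needed)
print(max_pc_2afc(uniform(0.1, 1).pdf, uniform(-0.5, 1).pdf, x))
```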


Subject(s)
Normal Distribution , Probability , Psychophysics/methods , Signal Detection, Psychological , Humans
19.
J Exp Psychol Hum Percept Perform ; 40(6): 2338-47, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25365571

ABSTRACT

The question of what makes a good melody has interested composers, music theorists, and psychologists alike. Many of the observed principles of good "melodic continuation" involve melodic contour: the pattern of rising and falling pitch within a sequence. Previous work has shown that contour perception can extend beyond pitch to other auditory dimensions, such as brightness and loudness. Here, we show that the generalization of contour perception to nontraditional dimensions also extends to melodic expectations. In the first experiment, subjective ratings for 3-tone sequences that varied in brightness or loudness conformed to the same general contour-based expectations as pitch sequences. In the second experiment, we modified the sequence of melody presentation such that melodies with the same beginning were blocked together. This change produced substantively different results, but the patterns of ratings remained similar across the 3 auditory dimensions. Taken together, these results suggest that (a) certain well-known principles of melodic expectation (such as the expectation for a reversal following a skip) depend on long-term context, and (b) these expectations are not unique to the dimension of pitch and may instead reflect more general principles of perceptual organization.


Subject(s)
Anticipation, Psychological , Auditory Perception , Judgment , Loudness Perception , Music , Pitch Discrimination , Adolescent , Adult , Female , Humans , Male , Psychoacoustics , Sound Spectrography , Young Adult
20.
J Neurosci ; 34(37): 12425-43, 2014 Sep 10.
Article in English | MEDLINE | ID: mdl-25209282

ABSTRACT

The ability to attend to a particular sound in a noisy environment is an essential aspect of hearing. To accomplish this feat, the auditory system must segregate sounds that overlap in frequency and time. Many natural sounds, such as human voices, consist of harmonics of a common fundamental frequency (F0). Such harmonic complex tones (HCTs) evoke a pitch corresponding to their F0. A difference in pitch between simultaneous HCTs provides a powerful cue for their segregation. The neural mechanisms underlying concurrent sound segregation based on pitch differences are poorly understood. Here, we examined neural responses in monkey primary auditory cortex (A1) to two concurrent HCTs that differed in F0 such that they are heard as two separate "auditory objects" with distinct pitches. We found that A1 can resolve, via a rate-place code, the lower harmonics of both HCTs, a prerequisite for deriving their pitches and for their perceptual segregation. Onset asynchrony between the HCTs enhanced the neural representation of their harmonics, paralleling their improved perceptual segregation in humans. Pitches of the concurrent HCTs could also be temporally represented by neuronal phase-locking at their respective F0s. Furthermore, a model of A1 responses using harmonic templates could qualitatively reproduce psychophysical data on concurrent sound segregation in humans. Finally, we identified a possible intracortical homolog of the "object-related negativity" recorded noninvasively in humans, which correlates with the perceptual segregation of concurrent sounds. Findings indicate that A1 contains sufficient spectral and temporal information for segregating concurrent sounds based on differences in pitch.


Subject(s)
Auditory Cortex/physiology , Evoked Potentials, Auditory/physiology , Nerve Net/physiology , Pattern Recognition, Physiological/physiology , Pitch Perception/physiology , Animals , Brain Mapping , Cues , Haplorhini , Humans , Macaca fascicularis , Male , Perceptual Masking