Results 1 - 20 of 37
1.
J Acoust Soc Am ; 155(3): 2099-2113, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38483206

ABSTRACT

Acoustic context influences speech perception, but contextual variability restricts this influence. Assgari and Stilp [J. Acoust. Soc. Am. 138, 3023-3032 (2015)] demonstrated that when categorizing vowels, trial-to-trial variability in who spoke the preceding context sentence, but not variability in the sentence contents, diminished the resulting spectral contrast effects (perceptual shifts in categorization stemming from spectral differences between sounds). Yet, how such contextual variability affects temporal contrast effects (TCEs) (also known as speaking rate normalization; categorization shifts stemming from temporal differences) is unknown. Here, stimuli were the same context sentences and conditions (one talker saying one sentence, one talker saying 200 sentences, 200 talkers saying 200 sentences) used in Assgari and Stilp, but set to fast or slow speaking rates to encourage perception of target words as "tier" or "deer," respectively. In Experiment 1, sentence variability and talker variability each diminished TCE magnitudes; talker variability also produced shallower psychometric function slopes. In Experiment 2, when speaking rates were matched across the 200-sentences conditions, neither TCE magnitudes nor slopes differed across conditions. In Experiment 3, matching slow and fast rates across all conditions failed to produce equal TCEs and slopes everywhere. Results suggest a complex interplay between acoustic, talker, and sentence variability in shaping TCEs in speech perception.
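Effects like these are typically quantified by fitting a psychometric function per condition and comparing category boundaries and slopes. Below is a minimal sketch of that analysis step; the logistic form is standard, but the data values and parameter names are made up for illustration, not taken from the study.

```python
# Fit a logistic psychometric function to hypothetical proportions of "deer"
# responses across a duration continuum, then read off boundary and slope.
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """x0 = category boundary (continuum step); k = slope."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 8)                      # hypothetical continuum steps
p_deer = np.array([0.05, 0.10, 0.25, 0.55, 0.80, 0.92, 0.97])

(x0, k), _ = curve_fit(logistic, steps, p_deer, p0=[4.0, 1.0])
print(f"boundary = {x0:.2f}, slope = {k:.2f}")
```

In this framing, a TCE appears as a boundary (x0) shift between fast- and slow-context conditions, and the shallower slopes in Experiment 1 correspond to smaller k values.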


Subject(s)
Speech; Acoustics; Psychometrics; Sound; Humans
2.
Atten Percept Psychophys ; 86(3): 991-1007, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38216848

ABSTRACT

Musicians display a variety of auditory perceptual benefits relative to people with little or no musical training; these benefits are collectively referred to as the "musician advantage." Importantly, musicians consistently outperform nonmusicians on tasks relating to pitch, but reports are mixed as to whether musicians outperform nonmusicians on timbre-related tasks. Due to their experience manipulating the timbre of their instrument or voice in performance, we hypothesized that musicians would be more sensitive to acoustic context effects stemming from the spectral changes in timbre across a musical context passage (played by a string quintet then filtered) and a target instrument sound (French horn or tenor saxophone; Experiment 1). Additionally, we investigated the role of a musician's primary instrument of instruction by recruiting French horn and tenor saxophone players to complete the same task (Experiment 2). Consistent with the musician advantage literature, musicians exhibited superior pitch discrimination relative to nonmusicians. Contrary to our main hypothesis, there was no difference between musicians and nonmusicians in how spectral context effects shaped instrument sound categorization. Thus, musicians may only outperform nonmusicians for some auditory skills relevant to music (e.g., pitch perception) but not others (e.g., timbre perception via spectral differences).


Subject(s)
Music; Pitch Discrimination; Humans; Female; Young Adult; Male; Adult; Timbre Perception; Pitch Perception; Practice, Psychological
3.
Atten Percept Psychophys ; 85(7): 2488-2501, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37258892

ABSTRACT

Listeners show perceptual benefits (faster and/or more accurate responses) when perceiving speech spoken by a single talker versus multiple talkers, a phenomenon known as talker adaptation. While near-exclusively studied with speech and talkers, some aspects of talker adaptation might reflect domain-general processes. Music, like speech, is a sound class replete with acoustic variation, such as a multitude of pitch and instrument possibilities. Thus, it was hypothesized that perceptual benefits from structure in the acoustic signal (i.e., hearing the same sound source on every trial) are not specific to speech but rather a general auditory response. Forty nonmusician participants completed a simple musical task that mirrored talker adaptation paradigms. Low- or high-pitched notes were presented in single- and mixed-instrument blocks. Reflecting both music research on pitch-timbre interdependence and traditional "talker" adaptation paradigms, listeners were faster to make pitch judgments when presented with a single instrument timbre than when the timbre was selected from one of four instruments from trial to trial. A second experiment ruled out the possibility that participants were simply responding faster to the specific instrument chosen as the single-instrument timbre. Consistent with general theoretical approaches to perception, perceptual benefits from signal structure are not limited to speech.


Subject(s)
Music; Speech Perception; Humans; Pitch Perception/physiology; Hearing; Hearing Tests; Speech Perception/physiology
4.
JASA Express Lett ; 3(5), 2023 May 01.
Article in English | MEDLINE | ID: mdl-37219432

ABSTRACT

When speaking in noisy conditions or to a hearing-impaired listener, talkers often use clear speech, which is typically slower than conversational speech. In other research, changes in speaking rate affect speech perception through speaking rate normalization: Slower context sounds encourage perception of subsequent sounds as faster, and vice versa. Here, on each trial, listeners heard a context sentence before the target word (which varied from "deer" to "tier"). Clear and slowed conversational context sentences elicited more "deer" responses than conversational sentences, consistent with rate normalization. Changing speaking styles aids speech intelligibility but might also produce other outcomes that alter sound/word recognition.
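For illustration, the slowed conversational condition can be approximated with off-the-shelf time stretching. This sketch assumes librosa and a hypothetical file name; it is not the authors' stimulus-preparation pipeline.

```python
# Create a slowed version of a context sentence via phase-vocoder time
# stretching, which lengthens duration without shifting pitch.
import librosa
import soundfile as sf

y, sr = librosa.load("context_sentence.wav", sr=None)  # hypothetical file
y_slow = librosa.effects.time_stretch(y, rate=0.7)     # rate < 1 slows speech
sf.write("context_sentence_slow.wav", y_slow, sr)
```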


Subject(s)
Speech Intelligibility; Speech Perception; Hearing; Hydrolases; Sound
5.
J Acoust Soc Am ; 153(4): 2426, 2023 Apr 01.
Article in English | MEDLINE | ID: mdl-37092945

ABSTRACT

Speech sound perception is influenced by the spectral properties of surrounding sounds. For example, listeners perceive /g/ (lower F3 onset) more often after sounds with prominent high-F3 frequencies and perceive /d/ (higher F3 onset) more often after sounds with prominent low-F3 frequencies. These biases are known as spectral contrast effects (SCEs). Much of this work examined differences between long-term average spectra (LTAS) of preceding sounds and target speech sounds. Post hoc analyses by Stilp and Assgari [(2021) Atten. Percept. Psychophys. 83(6) 2694-2708] revealed that spectra of the last 475 ms of precursor sentences, not the entire LTAS, best predicted biases in consonant categorization. Here, the influences of proximal (last 500 ms) versus distal (before the last 500 ms) portions of precursor sentences on subsequent consonant categorization were compared. Sentences emphasized different frequency regions in each temporal window (e.g., distal low-F3 emphasis, proximal high-F3 emphasis, and vice versa) naturally or via filtering. In both cases, shifts in consonant categorization were produced in accordance with spectral properties of the proximal window. This was replicated when the distal window did not emphasize either frequency region, but the proximal window did. Results endorse closer consideration of patterns of spectral energy over time in preceding sounds, not just their LTAS.
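The windowing logic can be sketched as follows; numpy is assumed, noise stands in for a recorded sentence, and the split point follows the 500-ms definition above.

```python
# Compare average spectra of the distal (early) and proximal (last 500 ms)
# portions of a context, the two windows contrasted in this study.
import numpy as np

def spectrum_db(x, sr):
    """Magnitude spectrum of a segment, in dB, with its frequency axis."""
    mag = np.abs(np.fft.rfft(x))
    return 20 * np.log10(mag + 1e-12), np.fft.rfftfreq(len(x), 1 / sr)

sr = 16000
y = np.random.default_rng(0).standard_normal(3 * sr)   # stand-in sentence

split = len(y) - int(0.5 * sr)            # boundary 500 ms before the target
distal_db, _ = spectrum_db(y[:split], sr)
proximal_db, freqs = spectrum_db(y[split:], sr)
```

Comparing energy in the low- versus high-F3 regions of each window then indicates which portion should drive the predicted categorization shift.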


Subject(s)
Speech Perception; Sound; Phonetics; Language; Bias
6.
Article in English | MEDLINE | ID: mdl-36417128

ABSTRACT

Familiarity with a talker's voice provides numerous benefits to speech perception, including faster responses and improved intelligibility in quiet and in noise. Yet, it is unclear whether familiarity facilitates talker adaptation, that is, the processing benefit stemming from hearing speech from one talker compared to multiple different talkers. Here, listeners completed a speeded recognition task for words presented in either single-talker or multiple-talker blocks. Talkers were either famous (the last five Presidents of the United States of America) or non-famous (other male politicians of similar ages). Participants either received no information about the talkers before the word recognition task (Experiments 1 and 3) or heard the talkers and saw their names first (Experiment 2). As expected, responses were faster in the single-talker blocks than in the multiple-talker blocks. Famous voices elicited faster responses in Experiment 1, but familiarity effects were extinguished in Experiment 2, possibly because all voices had been heard recently before the experiment. When talkers were counterbalanced across single-talker and mixed-talker blocks in Experiment 3, no familiarity effects were observed. Predictions that familiarity facilitates talker adaptation (a smaller increase in response times from single- to multiple-talker blocks for famous voices) were not confirmed. Thus, talker familiarity might not augment adaptation to a consistent talker.

7.
J Acoust Soc Am ; 152(3): 1842, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36182316

ABSTRACT

Perception of speech sounds has a long history of being compared to perception of nonspeech sounds, with rich and enduring debates regarding how closely they share similar underlying processes. In many instances, perception of nonspeech sounds is directly compared to that of speech sounds without a clear explanation of how related these sounds are to the speech they are selected to mirror (or not mirror). While the extreme acoustic variability of speech sounds is well documented, this variability is bounded by the common source of a human vocal tract. Nonspeech sounds do not share a common source, and as such, exhibit even greater acoustic variability than that observed for speech. This increased variability raises important questions about how well perception of a given nonspeech sound might resemble or model perception of speech sounds. Here, we offer a brief review of extremely diverse nonspeech stimuli that have been used in the efforts to better understand perception of speech sounds. The review is organized according to increasing spectrotemporal complexity: random noise, pure tones, multitone complexes, environmental sounds, music, speech excerpts that are not recognized as speech, and sinewave speech. Considerations are offered for stimulus selection in nonspeech perception experiments moving forward.


Subject(s)
Speech Perception; Acoustic Stimulation; Humans; Phonetics; Sound; Sound Spectrography; Speech
8.
J Acoust Soc Am ; 152(1): 55, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35931547

ABSTRACT

Spectral properties of earlier sounds (context) influence recognition of later sounds (target). Acoustic variability in context stimuli can disrupt this process. When mean fundamental frequencies (f0's) of preceding context sentences were highly variable across trials, shifts in target vowel categorization [due to spectral contrast effects (SCEs)] were smaller than when sentence mean f0's were less variable; when sentences were rearranged to exhibit high or low variability in mean first formant frequencies (F1) in a given block, SCE magnitudes were equivalent [Assgari, Theodore, and Stilp (2019) J. Acoust. Soc. Am. 145(3), 1443-1454]. However, since sentences were originally chosen based on variability in mean f0, stimuli underrepresented the extent to which mean F1 could vary. Here, target vowels (/ɪ/-/ɛ/) were categorized following context sentences that varied substantially in mean F1 (experiment 1) or mean F3 (experiment 2) with variability in mean f0 held constant. In experiment 1, SCE magnitudes were equivalent whether context sentences had high or low variability in mean F1; the same pattern was observed in experiment 2 for new sentences with high or low variability in mean F3. Variability in some acoustic properties (mean f0) can be more perceptually consequential than others (mean F1, mean F3), but these results may be task-dependent.
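Sorting sentences into high- versus low-variability sets presupposes per-sentence formant estimates. A common textbook route is LPC root-finding, sketched below under stated assumptions (librosa for the LPC fit, noise as a placeholder signal, and a crude frequency criterion for F1).

```python
# Rough F1 estimate for one frame via LPC root-finding.
import numpy as np
import librosa

sr = 10000
y = np.random.default_rng(1).standard_normal(sr // 2)   # placeholder "speech"

a = librosa.lpc(y, order=10)                      # all-pole model coefficients
roots = [r for r in np.roots(a) if r.imag > 0]    # keep upper-half-plane roots
freqs = sorted(np.angle(r) * sr / (2 * np.pi) for r in roots)
candidates = [f for f in freqs if 200 < f < 1000] # plausible F1 range
f1_estimate = candidates[0] if candidates else None
```

Averaging such frame-level estimates over the voiced portions of a sentence yields the per-sentence means whose variability was manipulated here.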


Subject(s)
Phonetics; Speech Perception; Sound; Sound Spectrography; Speech Acoustics
9.
J Acoust Soc Am ; 150(4): 2806, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34717452

ABSTRACT

When spectra differ between earlier (context) and later (target) sounds, listeners perceive larger spectral changes than are physically present. When context sounds (e.g., a sentence) possess relatively higher frequencies, the target sound (e.g., a vowel sound) is perceived as possessing relatively lower frequencies, and vice versa. These spectral contrast effects (SCEs) are pervasive in auditory perception, but studies traditionally employed contexts with high spectrotemporal variability that made it difficult to understand exactly when context spectral properties biased perception. Here, contexts were speech-shaped noise divided into four consecutive 500-ms epochs. Contexts were filtered to amplify low-F1 (100-400 Hz) or high-F1 (550-850 Hz) frequencies to encourage target perception of /ɛ/ ("bet") or /ɪ/ ("bit"), respectively, via SCEs. Spectral peaks in the context ranged from its initial epoch(s) to its entire duration (onset paradigm), ranged from its final epoch(s) to its entire duration (offset paradigm), or were present for only one epoch (single paradigm). SCE magnitudes increased as spectral-peak durations increased and/or occurred later in the context (closer to the target). Contrary to predictions, brief early spectral peaks still biased subsequent target categorization. Results are compared to related experiments using speech contexts, and physiological and/or psychoacoustic idiosyncrasies of the noise contexts are considered.
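The epoch construction lends itself to a simple sketch: an FFT gain mask boosts one F1 region within chosen 500-ms epochs. Band edges follow the abstract; the sample rate, gain, and noise stand-in are illustrative assumptions.

```python
# Build a four-epoch noise context with a low-F1 spectral peak in one epoch
# (the "single" paradigm); other paradigms amplify more epochs.
import numpy as np

def amplify_band(x, sr, lo, hi, gain_db):
    """Boost the [lo, hi] Hz band by gain_db via a frequency-domain mask."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    spec[band] *= 10 ** (gain_db / 20)
    return np.fft.irfft(spec, n=len(x))

sr = 16000
rng = np.random.default_rng(2)
epochs = [rng.standard_normal(sr // 2) for _ in range(4)]  # four 500-ms epochs

epochs[2] = amplify_band(epochs[2], sr, 100, 400, 20)  # low-F1 peak, epoch 3
context = np.concatenate(epochs)
```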


Subject(s)
Speech Perception; Auditory Perception; Noise/adverse effects; Phonetics; Psychoacoustics; Speech
10.
Atten Percept Psychophys ; 83(6): 2694-2708, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33987821

ABSTRACT

Speech perception, like all perception, takes place in context. Recognition of a given speech sound is influenced by the acoustic properties of surrounding sounds. When the spectral composition of earlier (context) sounds (e.g., a sentence with more energy at lower third formant [F3] frequencies) differs from that of a later (target) sound (e.g., consonant with intermediate F3 onset frequency), the auditory system magnifies this difference, biasing target categorization (e.g., towards higher-F3-onset /d/). Historically, these studies used filters to force context stimuli to possess certain spectral compositions. Recently, these effects were produced using unfiltered context sounds that already possessed the desired spectral compositions (Stilp & Assgari, 2019, Attention, Perception, & Psychophysics, 81, 2037-2052). Here, this natural signal statistics approach is extended to consonant categorization (/g/-/d/). Context sentences were either unfiltered (already possessing the desired spectral composition) or filtered (to imbue specific spectral characteristics). Long-term spectral characteristics of unfiltered contexts were poor predictors of shifts in consonant categorization, but short-term characteristics (last 475 ms) were excellent predictors. This diverges from vowel data, where long-term and shorter-term intervals (last 1,000 ms) were equally strong predictors. Thus, time scale plays a critical role in how listeners attune to signal statistics in the acoustic environment.


Subject(s)
Phonetics; Speech Perception; Acoustic Stimulation; Humans; Language; Sound; Sound Spectrography; Speech Acoustics
11.
Hear Res ; 392: 107983, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32464456

ABSTRACT

Perception of a sound is influenced by spectral properties of surrounding sounds. When frequencies are absent in a preceding acoustic context before being introduced in a subsequent target sound, detection of those frequencies is facilitated via an auditory enhancement effect (EE). When spectral composition differs across a preceding context and subsequent target sound, those differences are perceptually magnified and perception shifts via a spectral contrast effect (SCE). Each effect is thought to receive contributions from peripheral and central neural processing, but the relative contributions are unclear. The present experiments manipulated ear of presentation to elucidate the degrees to which peripheral and central processes contributed to each effect in speech perception. In Experiment 1, EE and SCE magnitudes in consonant categorization were substantially diminished through contralateral presentation of contexts and targets compared to ipsilateral or bilateral presentations. In Experiment 2, spectrally complementary contexts were presented dichotically followed by the target in only one ear. This arrangement was predicted to produce context effects peripherally and cancel them centrally, but the competing contralateral context minimally decreased effect magnitudes. Results confirm peripheral and central contributions to EEs and SCEs in speech perception, but both effects appear to be primarily due to peripheral processing.


Subject(s)
Cues; Noise/adverse effects; Perceptual Masking; Speech Perception; Acoustic Stimulation; Humans; Sound Spectrography; Time Factors
12.
Atten Percept Psychophys ; 82(5): 2237-2243, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32077069

ABSTRACT

Speech perception is challenged by indexical variability. A litany of studies on talker normalization has demonstrated that hearing multiple talkers incurs processing costs (e.g., lower accuracy, increased response time) compared to hearing a single talker. However, when reframing these studies in terms of stimulus structure, it is evident that past tests of multiple-talker (i.e., low structure) and single-talker (i.e., high structure) conditions are not representative of the graded nature of indexical variation in the environment. Here we tested the hypothesis that processing costs incurred by multiple-talker conditions would abate given increased stimulus structure. We tested this hypothesis by manipulating the degree to which talkers' voices differed acoustically (Experiment 1) and the frequency with which talkers' voices changed (Experiment 2) in multiple-talker conditions. Listeners performed a speeded classification task for words containing vowels that varied in acoustic-phonemic ambiguity. In Experiment 1, response times progressively decreased as acoustic variability among talkers' voices decreased. In Experiment 2, blocking talkers within mixed-talker conditions led to more similar response times between single-talker and multiple-talker conditions. Neither result interacted with the acoustic-phonemic ambiguity of the target vowels. Thus, the results showed that indexical structure mediated the processing costs incurred by hearing different talkers. This is consistent with the Efficient Coding Hypothesis, which proposes that sensory and perceptual processing are facilitated by stimulus structure. Defining the roles and limits of stimulus structure in speech perception is an important direction for future research.


Subject(s)
Speech Perception; Voice; Acoustics; Hearing; Humans; Reaction Time
13.
J Acoust Soc Am ; 146(2): 1503, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31472539

ABSTRACT

The auditory system is remarkably sensitive to changes in the acoustic environment. This is exemplified by two classic effects of preceding spectral context on perception. In auditory enhancement effects (EEs), the absence and subsequent insertion of a frequency component increases its salience. In spectral contrast effects (SCEs), spectral differences between earlier and later (target) sounds are perceptually magnified, biasing target sound categorization. These effects have been suggested to be related, but have largely been studied separately. Here, EEs and SCEs are demonstrated using the same speech materials. In Experiment 1, listeners categorized vowels (/ɪ/-/ɛ/) or consonants (/d/-/g/) following a sentence processed by a bandpass or bandstop filter (vowel tasks: 100-400 or 550-850 Hz; consonant tasks: 1700-2700 or 2700-3700 Hz). Bandpass filtering produced SCEs and bandstop filtering produced EEs, with effect magnitudes significantly correlated at the individual differences level. In Experiment 2, context sentences were processed by variable-depth notch filters in these frequency regions (-5 to -20 dB). EE magnitudes increased at larger notch depths, growing linearly in consonant categorization. This parallels previous research where SCEs increased linearly for larger spectral peaks in the context sentence. These results link EEs and SCEs, as both shape speech categorization in orderly ways.
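Both context manipulations amount to frequency-dependent gain, so they can be approximated with FFT masks. The band edges and notch depths below follow the abstract; the mask-based implementation and noise stand-in are assumptions, not the study's exact filters.

```python
# Approximate a bandpass context (SCE paradigm) and a variable-depth notch
# context (EE paradigm) over the same frequency region.
import numpy as np

def band_gain(x, sr, lo, hi, gain_db, invert=False):
    """Apply gain_db inside [lo, hi] Hz (or outside it, if invert=True)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / sr)
    band = (freqs >= lo) & (freqs <= hi)
    spec[~band if invert else band] *= 10 ** (gain_db / 20)
    return np.fft.irfft(spec, n=len(x))

sr = 16000
sentence = np.random.default_rng(3).standard_normal(2 * sr)  # stand-in

bandpass = band_gain(sentence, sr, 100, 400, -60, invert=True)  # SCE context
notch_20 = band_gain(sentence, sr, 100, 400, -20)               # EE context
```

Sweeping the notch depth from -5 to -20 dB reproduces the Experiment 2 manipulation.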


Subject(s)
Phonetics; Speech Acoustics; Speech Perception/physiology; Female; Humans; Male; Signal-To-Noise Ratio; Young Adult
14.
J Acoust Soc Am ; 145(3): 1443, 2019 Mar.
Article in English | MEDLINE | ID: mdl-31067942

ABSTRACT

The perception of any given sound is influenced by surrounding sounds. When successive sounds differ in their spectral compositions, these differences may be perceptually magnified, resulting in spectral contrast effects (SCEs). For example, listeners are more likely to perceive /ɪ/ (low F1) following sentences with higher F1 frequencies; listeners are also more likely to perceive /ɛ/ (high F1) following sentences with lower F1 frequencies. Previous research showed that SCEs for vowel categorization were attenuated when sentence contexts were spoken by different talkers [Assgari and Stilp (2015), J. Acoust. Soc. Am. 138(5), 3023-3032], but the locus of this diminished contextual influence was not specified. Here, three experiments examined implications of variable talker acoustics for SCEs in the categorization of /ɪ/ and /ɛ/. The results showed that SCEs were smaller when the mean fundamental frequency (f0) of context sentences was highly variable across talkers compared to when mean f0 was more consistent, even when talker gender was held constant. In contrast, SCE magnitudes were not influenced by variability in mean F1. These findings suggest that talker variability attenuates SCEs due to diminished consistency of f0 as a contextual influence. Connections between these results and talker normalization are considered.

15.
Atten Percept Psychophys ; 81(4): 861-883, 2019 May.
Article in English | MEDLINE | ID: mdl-30937673

ABSTRACT

An information-theoretic framework is proposed that has the potential to dissolve (rather than attempt to solve) multiple long-standing problems concerning speech perception. By this view, speech perception can be reframed as a series of processes through which sensitivity to information (that which changes and/or is unpredictable) becomes increasingly sophisticated and shaped by experience. Problems concerning the appropriate objects of perception (gestures vs. sounds), rate normalization, variance consequent to articulation, and talker normalization are reframed, or even dissolved, within this information-theoretic framework. Application of discriminative models founded on information theory provides a productive approach to answering questions concerning perception of speech, and perception most broadly.
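The sense of "information" here is Shannon's; as a standard anchor (not the paper's own formalism), entropy quantifies the average unpredictability that perception is proposed to track:

```latex
H(X) = -\sum_{i} p(x_i) \log_2 p(x_i) \quad \text{(bits)}
```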


Subject(s)
Models, Theoretical; Speech Perception; Gestures; Humans; Phonetics; Speech Intelligibility
16.
Atten Percept Psychophys ; 81(6): 2037-2052, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30887381

ABSTRACT

All perception takes place in context. Recognition of a given speech sound is influenced by the acoustic properties of surrounding sounds. When the spectral composition of earlier (context) sounds (e.g., more energy at lower first formant [F1] frequencies) differs from that of a later (target) sound (e.g., vowel with intermediate F1), the auditory system magnifies this difference, biasing target categorization (e.g., towards higher-F1 /ɛ/). Historically, these studies used filters to force context sounds to possess desired spectral compositions. This approach is agnostic to the natural signal statistics of speech (inherent spectral compositions without any additional manipulations). The auditory system is thought to be attuned to such stimulus statistics, but this has gone untested. Here, vowel categorization was measured following unfiltered (already possessing the desired spectral composition) or filtered sentences (to match spectral characteristics of unfiltered sentences). Vowel categorization was biased in both cases, with larger biases as the spectral prominences in context sentences increased. This confirms sensitivity to natural signal statistics, extending spectral context effects in speech perception to more naturalistic listening conditions. Importantly, categorization biases were smaller and more variable following unfiltered sentences, raising important questions about how faithfully experiments using filtered contexts model everyday speech perception.


Subject(s)
Phonetics; Speech Acoustics; Speech Perception; Adult; Female; Humans; Language; Male; Speech; Time Factors
17.
Atten Percept Psychophys ; 81(4): 1119-1126, 2019 May.
Article in English | MEDLINE | ID: mdl-30725437

ABSTRACT

Auditory perception is shaped by spectral properties of surrounding sounds. For example, when spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs; i.e., categorization boundary shifts) that bias perception of later sounds. SCEs affect perception of speech and nonspeech sounds alike (Stilp, Alexander, Kiefte, & Kluender in Attention, Perception, & Psychophysics, 72(2), 470-480, 2010). When categorizing speech sounds, SCE magnitudes increased linearly with greater spectral differences between contexts and target sounds (Stilp, Anderson, & Winn in Journal of the Acoustical Society of America, 137(6), 3466-3476, 2015; Stilp & Alexander in Proceedings of Meetings on Acoustics, 26, 2016; Stilp & Assgari in Journal of the Acoustical Society of America, 141(2), EL153-EL158, 2017). The present experiment tested whether this acute context sensitivity generalized to nonspeech categorization. Listeners categorized musical instrument target sounds that varied from French horn to tenor saxophone. Before each target, listeners heard a 1-second string quintet sample processed by filters that reflected part of (25%, 50%, 75%) or the full (100%) difference between horn and saxophone spectra. Larger filter gains increased spectral distinctness across context and target sounds, and resulting SCE magnitudes increased linearly, parallel to speech categorization. Thus, a highly sensitive relationship between context spectra and target categorization appears to be fundamental to auditory perception.
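The graded filters can be sketched as scaled difference spectra imposed on the context. The spectra and quintet sample below are random placeholders for the horn/saxophone long-term spectra and the actual recording; only the 25-100% scaling scheme comes from the abstract.

```python
# Impose 25/50/75/100% of the horn-minus-saxophone spectral difference
# on a context sound via frequency-domain gains.
import numpy as np

rng = np.random.default_rng(4)
horn_db = rng.normal(0, 3, 257)     # placeholder long-term spectrum (dB)
sax_db = rng.normal(0, 3, 257)      # placeholder long-term spectrum (dB)
quintet = rng.standard_normal(512)  # placeholder context sample

diff_db = horn_db - sax_db          # full spectral difference, in dB
contexts = {}
for pct in (0.25, 0.50, 0.75, 1.00):
    gains = 10 ** (pct * diff_db / 20)           # scale the difference
    spec = np.fft.rfft(quintet) * gains          # 512 samples -> 257 bins
    contexts[pct] = np.fft.irfft(spec, n=512)
```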


Subject(s)
Auditory Perception/physiology; Music/psychology; Sound Spectrography; Acoustic Stimulation; Adult; Attentional Bias; Female; Hearing; Humans; Male; Sound
18.
J Acoust Soc Am ; 143(4): 2460, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29716264

ABSTRACT

Natural sounds have substantial acoustic structure (predictability, nonrandomness) in their spectral and temporal compositions. Listeners are expected to exploit this structure to distinguish simultaneous sound sources; however, previous studies confounded acoustic structure and listening experience. Here, sensitivity to acoustic structure in novel sounds was measured in discrimination and identification tasks. Complementary signal-processing strategies independently varied relative acoustic entropy (the inverse of acoustic structure) across frequency or time. In one condition, the instantaneous frequency of low-pass-filtered 300-ms random noise was rescaled to 5 kHz bandwidth and resynthesized. In another condition, the instantaneous frequency of a short gated 5-kHz noise was resampled up to 300 ms. In both cases, entropy relative to full bandwidth or full duration was a fraction of that in 300-ms noise sampled at 10 kHz. Discrimination of sounds improved with less relative entropy. Listeners identified a probe sound as a target sound (1%, 3.2%, or 10% relative entropy) that repeated amidst distractor sounds (1%, 10%, or 100% relative entropy) at 0 dB SNR. Performance depended on differences in relative entropy between targets and distractors. Lower-relative-entropy targets were better identified against higher-relative-entropy distractors than against lower-relative-entropy distractors; higher-relative-entropy targets were better identified amidst lower-relative-entropy distractors. Results were consistent across signal-processing strategies.
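For orientation, one common way to quantify spectral unpredictability is the Shannon entropy of a normalized power spectrum. The sketch below is illustrative; the study's "relative entropy" measure was defined over its specific resynthesis scheme, not this formula.

```python
# Shannon entropy of a sound's normalized power spectrum, in bits.
import numpy as np

def spectral_entropy(x):
    power = np.abs(np.fft.rfft(x)) ** 2
    p = power / power.sum()                  # treat spectrum as a distribution
    return -np.sum(p * np.log2(p + 1e-16))

rng = np.random.default_rng(5)
noise = rng.standard_normal(3000)            # 300 ms at 10 kHz
print(spectral_entropy(noise))               # near-maximal for white noise
```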


Subject(s)
Acoustic Stimulation/methods; Auditory Perception/physiology; Discrimination, Psychological/physiology; Psychoacoustics; Sound Localization/physiology; Sound; Case-Control Studies; Humans; Signal Processing, Computer-Assisted
19.
Atten Percept Psychophys ; 80(5): 1300-1310, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29492759

ABSTRACT

Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception.

SIGNIFICANCE STATEMENT

Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur frequently in everyday speech perception.
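For scale, the gains involved are modest: converting decibels to linear amplitude with the standard identity shows that the +3 dB threshold corresponds to only about a 41% increase in amplitude.

```latex
g = 10^{\Delta L / 20}, \qquad 10^{3/20} \approx 1.41
```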


Subject(s)
Phonetics; Speech Perception/physiology; Adaptation, Physiological/physiology; Adult; Auditory Perception/physiology; Female; Hearing/physiology; Humans; Male; Self Concept; Sound Spectrography/methods; Speech/physiology; Young Adult
20.
J Acoust Soc Am ; 141(2): EL153, 2017 Feb.
Article in English | MEDLINE | ID: mdl-28253661

ABSTRACT

When spectral properties differ across successive sounds, this difference is perceptually magnified, resulting in spectral contrast effects (SCEs). Recently, Stilp, Anderson, and Winn [(2015) J. Acoust. Soc. Am. 137(6), 3466-3476] revealed that SCEs are graded: more prominent spectral peaks in preceding sounds produced larger SCEs (i.e., category boundary shifts) in categorization of subsequent vowels. Here, a similar relationship between spectral context and SCEs was replicated in categorization of voiced stop consonants. By generalizing this relationship across consonants and vowels, different spectral cues, and different frequency regions, acute and graded sensitivity to spectral context appears to be pervasive in speech perception.
