Search | VHL Regional Portal

1.

Lexically Mediated Compensation for Coarticulation Still as Elusive as a White Christmash.

McQueen, James M; Jesse, Alexandra; Mitterer, Holger.

Cogn Sci ; 47(9): e13342, 2023 09.

Article in English | MEDLINE | ID: mdl-37715483

ABSTRACT

Luthra, Peraza-Santiago, Beeson, Saltzman, Crinnion, and Magnuson (2021) present data from the lexically mediated compensation for coarticulation paradigm that they claim provides conclusive evidence in favor of top-down processing in speech perception. We argue here that this evidence does not support that conclusion. The findings are open to alternative explanations, and we give data in support of one of them (that there is an acoustic confound in the materials). Lexically mediated compensation for coarticulation thus remains elusive, while prior data from the paradigm instead challenge the idea that there is top-down processing in online speech recognition.

Subject(s)

Acoustics , Speech Perception , Humans

2.

Hearing and speech processing in midlife.

Helfer, Karen S; Jesse, Alexandra.

Hear Res ; 402: 108097, 2021 03 15.

Article in English | MEDLINE | ID: mdl-33706999

ABSTRACT

Middle-aged adults often report a decline in their ability to understand speech in adverse listening situations. However, there has been relatively little research devoted to identifying how early aging affects speech processing, as the majority of investigations into senescent changes in speech understanding compare performance in groups of young and older adults. This paper provides an overview of research on hearing and speech perception in middle-aged adults. Topics covered include both objective and subjective (self-perceived) hearing and speech understanding, listening effort, and audiovisual speech perception. This review ends with justification for future research needed to define the nature, consequences, and remediation of hearing problems in middle-aged adults.

Subject(s)

Hearing , Speech Perception , Speech , Humans , Listening Effort , Noise

3.

Sentence context guides phonetic retuning to speaker idiosyncrasies.

Jesse, Alexandra.

J Exp Psychol Learn Mem Cogn ; 47(1): 184-194, 2021 Jan.

Article in English | MEDLINE | ID: mdl-31855000

ABSTRACT

Speakers vary in their pronunciations of the sounds in their native language. Listeners use lexical knowledge to adjust their phonetic categories to speakers' idiosyncratic pronunciations. Lexical information can, however, be inconclusive or become available too late to guide this phonetic retuning. Sentence context is known to affect lexical processing, and listeners are typically more likely to categorize steps of a phonetic continuum in line with the semantic content of a sentence. In a series of experiments, we tested whether preceding sentence context can guide phonetic retuning. During a passive-listening exposure phase, English listeners heard a sound ambiguous between /s/ and /f/ spliced into the onset position of minimal word pairs (e.g., sin vs. fin). Sentence context disambiguated these minimal pairs as /s/-initial for 1 group of listeners and as /f/-initial for another group. At subsequent test, listeners categorized more steps on a /sa/-/fa/ continuum in line with their prior exposure; that is, when sentence context had disambiguated the ambiguous sound during exposure as /s/, listeners gave more /s/ responses than /f/ responses at test. These aftereffects occurred independently of whether contrastive phonemes from the respective other category were provided. No phonetic retuning was found when the disambiguating sentence contexts were replaced with neutral ones. Overall, these results provide evidence that sentence context can guide phonetic retuning, therefore expanding the usefulness of phonetic retuning as a tool for listeners to accommodate speakers. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

Subject(s)

Phonetics , Speech Perception , Speech , Female , Humans , Semantics , Young Adult

4.

Fixating the eyes of a speaker provides sufficient visual information to modulate early auditory processing.

Kaplan, Elina; Jesse, Alexandra.

Biol Psychol ; 146: 107724, 2019 09.

Article in English | MEDLINE | ID: mdl-31323242

ABSTRACT

In face-to-face conversations, when listeners process and combine information obtained from hearing and seeing a speaker, they mostly look at the eyes rather than at the more informative mouth region. Measuring event-related potentials, we tested whether fixating the speaker's eyes is sufficient for gathering enough visual speech information to modulate early auditory processing, or whether covert attention to the speaker's mouth is needed. Results showed that when listeners fixated the eye region of the speaker, the amplitudes of the auditory evoked N1 and P2 were reduced when listeners heard and saw the speaker than when they only heard her. These cross-modal interactions also occurred when, in addition, attention was restricted to the speaker's eye region. Fixating the speaker's eyes thus provides listeners with sufficient visual information to facilitate early auditory processing. The spread of covert attention to the mouth area is not needed to observe audiovisual interactions.

Subject(s)

Auditory Perception/physiology , Eye , Fixation, Ocular/physiology , Social Perception , Electroencephalography , Evoked Potentials/physiology , Evoked Potentials, Auditory/physiology , Female , Humans , Male , Mouth , Photic Stimulation , Speech Perception , Young Adult

5.

Regressive spectral assimilation bias in speech perception.

Rysling, Amanda; Jesse, Alexandra; Kingston, John.

Atten Percept Psychophys ; 81(4): 1127-1146, 2019 May.

Article in English | MEDLINE | ID: mdl-31114954

ABSTRACT

Speech perception presents a parsing problem: construing information from the acoustic input we receive as evidence for the speech sounds we recognize as language. Most work on segmental perception has focused on how listeners use differences between successive speech sounds to solve this problem. Prominent models either assume (a) that listeners attribute acoustics to the sounds whose articulation created them, or (b) that the auditory system exaggerates the changes in the auditory quality of the incoming speech signal. Both approaches predict contrast effects in that listeners will usually judge two successive phones to be distinct from each other. Few studies have examined cases in which listeners hear two sounds in a row as similar, apparently failing to differentiate them. We examine such under-studied cases. In a series of experiments, listeners were faced with ambiguity about the identity of the first of two successive phones. Listeners consistently heard the first sound as spectrally similar to the second sound in a manner suggesting that they construed the transitions between the two as evidence about the identity of the first. In these and previously reported studies, they seemed to default to this construal when the signal was not sufficiently informative for them to do otherwise. These effects go unaccounted for in the two prominent models of speech perception, but they parallel known domain-general effects in perceptual processing, and as such are likely a consequence of the structure of the human auditory system.

Subject(s)

Acoustic Stimulation/psychology , Sound Spectrography , Speech Perception/physiology , Acoustics , Adult , Bias , Female , Hearing , Humans , Male , Phonetics , Speech , Young Adult

6.

Lexical Influences on Errors in Masked Speech Perception in Younger, Middle-Aged, and Older Adults.

Jesse, Alexandra; Helfer, Karen S.

J Speech Lang Hear Res ; 62(4S): 1152-1166, 2019 04 26.

Article in English | MEDLINE | ID: mdl-31026195

ABSTRACT

Purpose In situations with a competing talker, lexical properties of words in both streams affect the recognition of words in the to-be-attended target stream. In this study, we tested whether these lexical properties also influence the type of errors made by listeners across the adult life span. Method Errors from a corpus collected by Helfer and Jesse (2015) were categorized as phonologically similar to words in the target and/or masker streams. Younger, middle-aged, and older listeners had produced these errors when trying to identify key words from a target stream while ignoring a single-talker masker. Neighborhood density and lexical frequency of target words and masker words had been manipulated independently. Results Lexical properties of target words influenced all types of errors. With higher frequency maskers, the probability of responding with a masker word increased and the phonological influence of target words decreased. Lower levels of lexical competition for maskers increased the probability that listeners reported a word phonologically related to both masker and target words. The influence of masker words increased across the adult life span, as evidenced by phonological intrusions into responses and the temporary failure in selectively attending to the target stream. The effects of lexical properties on error patterns, however, were consistent across age groups. Conclusions The ease of recognition of words in both attended and unattended speech influences the breakdown of speech perception. These influences remain robust across the adult life span.

Subject(s)

Age Factors , Perceptual Masking/physiology , Phonetics , Recognition, Psychology/physiology , Speech Perception/physiology , Adult , Aged , Auditory Threshold , Female , Humans , Male , Middle Aged , Young Adult

7.

Attentional resources contribute to the perceptual learning of talker idiosyncrasies in audiovisual speech.

Jesse, Alexandra; Kaplan, Elina.

Atten Percept Psychophys ; 81(4): 1006-1019, 2019 May.

Article in English | MEDLINE | ID: mdl-30684204

ABSTRACT

To recognize audiovisual speech, listeners evaluate and combine information obtained from the auditory and visual modalities. Listeners also use information from one modality to adjust their phonetic categories to a talker's idiosyncrasy encountered in the other modality. In this study, we examined whether the outcome of this cross-modal recalibration relies on attentional resources. In a standard recalibration experiment in Experiment 1, participants heard an ambiguous sound, disambiguated by the accompanying visual speech as either /p/ or /t/. Participants' primary task was to attend to the audiovisual speech while either monitoring a tone sequence for a target tone or ignoring the tones. Listeners subsequently categorized the steps of an auditory /p/-/t/ continuum more often in line with their exposure. The aftereffect of phonetic recalibration was reduced, but not eliminated, by attentional load during exposure. In Experiment 2, participants saw an ambiguous visual speech gesture that was disambiguated auditorily as either /p/ or /t/. At test, listeners categorized the steps of a visual /p/-/t/ continuum more often in line with the prior exposure. Imposing load in the auditory modality during exposure did not reduce the aftereffect of this type of cross-modal phonetic recalibration. Together, these results suggest that auditory attentional resources are needed for the processing of auditory speech and/or for the shifting of auditory phonetic category boundaries. Listeners thus need to dedicate attentional resources in order to accommodate talker idiosyncrasies in audiovisual speech.

Subject(s)

Acoustic Stimulation/psychology , Attention , Auditory Perception/physiology , Phonetics , Photic Stimulation/methods , Acoustic Stimulation/methods , Adult , Female , Humans , Learning , Male , Young Adult

8.

Adult dyslexic readers benefit less from visual input during audiovisual speech processing: fMRI evidence.

Francisco, Ana A; Takashima, Atsuko; McQueen, James M; van den Bunt, Mark; Jesse, Alexandra; Groen, Margriet A.

Neuropsychologia ; 117: 454-471, 2018 08.

Article in English | MEDLINE | ID: mdl-29990508

ABSTRACT

The aim of the present fMRI study was to investigate whether typical and dyslexic adult readers differed in the neural correlates of audiovisual speech processing. We tested for Blood Oxygen-Level Dependent (BOLD) activity differences between these two groups in a 1-back task, as they processed written (word, illegal consonant strings) and spoken (auditory, visual and audiovisual) stimuli. When processing written stimuli, dyslexic readers showed reduced activity in the supramarginal gyrus, a region suggested to play an important role in phonological processing, but only when they processed strings of consonants, not when they read words. During the speech perception tasks, dyslexic readers were only slower than typical readers in their behavioral responses in the visual speech condition. Additionally, dyslexic readers presented reduced neural activation in the auditory, the visual, and the audiovisual speech conditions. The groups also differed in terms of superadditivity, with dyslexic readers showing decreased neural activation in the regions of interest. An additional analysis focusing on vision-related processing during the audiovisual condition showed diminished activation for the dyslexic readers in a fusiform gyrus cluster. Our results thus suggest that there are differences in audiovisual speech processing between dyslexic and normal readers. These differences might be explained by difficulties in processing the unisensory components of audiovisual speech, more specifically, dyslexic readers may benefit less from visual information during audiovisual speech processing than typical readers. Given that visual speech processing supports the development of phonological skills fundamental in reading, differences in processing of visual speech could contribute to differences in reading ability between typical and dyslexic readers.

Subject(s)

Cerebral Cortex/diagnostic imaging , Dyslexia/diagnostic imaging , Magnetic Resonance Imaging , Speech Perception/physiology , Speech/physiology , Acoustic Stimulation , Adult , Brain Mapping , Female , Humans , Image Processing, Computer-Assisted , Male , Oxygen/blood , Photic Stimulation , Reading , Young Adult

9.

Learning to recognize unfamiliar talkers: Listeners rapidly form representations of facial dynamic signatures.

Jesse, Alexandra; Bartoli, Michael.

Cognition ; 176: 195-208, 2018 07.

Article in English | MEDLINE | ID: mdl-29604468

ABSTRACT

Seeing the motion of a talking face can be sufficient to recognize personally highly familiar speakers, suggesting that dynamic facial information is stored in long-term representations for familiar speakers. In the present study, we tested whether talking-related facial dynamic information can guide the learning of unfamiliar speakers. Participants were asked to identify speakers from configuration-normalized point-light displays showing only the biological motion that speakers produced while saying short sentences. During an initial learning phase, feedback was given. During test, listeners identified speakers from point-light displays of the training sentences and of new sentences. Listeners learned to identify two speakers, and four speakers in another experiment, from visual dynamic information alone. Learning was evident already after very little exposure. Furthermore, listeners formed abstract representations of visual dynamic signatures that allowed them to recognize speakers at test even from new linguistic materials. Control experiments showed that any potentially remaining static information in the point-light displays was not sufficient to guide learning and that listeners learned to recognize the identity, rather than the sex, of the speakers, as learning was also found when speakers were of the same sex. Overall, these results demonstrate that listeners can learn to identify unfamiliar speakers from the motion they produce during talking. Listeners thus establish abstract representations of the talking-related dynamic facial motion signatures of unfamiliar speakers already from limited exposure.

Subject(s)

Facial Recognition , Motion Perception , Recognition, Psychology , Speech Perception , Acoustic Stimulation , Adolescent , Adult , Female , Humans , Male , Photic Stimulation , Young Adult

10.

Low-frequency fine-structure cues allow for the online use of lexical stress during spoken-word recognition in spectrally degraded speech.

Kong, Ying-Yee; Jesse, Alexandra.

J Acoust Soc Am ; 141(1): 373, 2017 01.

Article in English | MEDLINE | ID: mdl-28147573

ABSTRACT

English listeners use suprasegmental cues to lexical stress during spoken-word recognition. Prosodic cues are, however, less salient in spectrally degraded speech, as provided by cochlear implants. The present study examined how spectral degradation with and without low-frequency fine-structure information affects normal-hearing listeners' ability to benefit from suprasegmental cues to lexical stress in online spoken-word recognition. To simulate electric hearing, an eight-channel vocoder spectrally degraded the stimuli while preserving temporal envelope information. Additional lowpass-filtered speech was presented to the opposite ear to simulate bimodal hearing. Using a visual world paradigm, listeners' eye fixations to four printed words (target, competitor, two distractors) were tracked, while hearing a word. The target and competitor overlapped segmentally in their first two syllables but mismatched suprasegmentally in their first syllables, as the initial syllable received primary stress in one word and secondary stress in the other (e.g., "'admiral," "'admi'ration"). In the vocoder-only condition, listeners were unable to use lexical stress to recognize targets before segmental information disambiguated them from competitors. With additional lowpass-filtered speech, however, listeners efficiently processed prosodic information to speed up online word recognition. Low-frequency fine-structure cues in simulated bimodal hearing allowed listeners to benefit from suprasegmental cues to lexical stress during word recognition.

Subject(s)

Cues , Recognition, Psychology , Speech Acoustics , Speech Intelligibility , Speech Perception , Voice Quality , Acoustic Stimulation , Female , Humans , Male , Photic Stimulation , Time Factors , Visual Perception , Young Adult

11.

English Listeners Use Suprasegmental Cues to Lexical Stress Early During Spoken-Word Recognition.

Jesse, Alexandra; Poellmann, Katja; Kong, Ying-Yee.

J Speech Lang Hear Res ; 60(1): 190-198, 2017 01 01.

Article in English | MEDLINE | ID: mdl-28056135

ABSTRACT

Purpose: We used an eye-tracking technique to investigate whether English listeners use suprasegmental information about lexical stress to speed up the recognition of spoken words in English. Method: In a visual world paradigm, 24 young English listeners followed spoken instructions to choose 1 of 4 printed referents on a computer screen (e.g., "Click on the word admiral"). Displays contained a critical pair of words (e.g., 'admiral-'admi'ration) that were segmentally identical for their first 2 syllables but differed suprasegmentally in their 1st syllable: One word began with primary lexical stress, and the other began with secondary lexical stress. All words had phrase-level prominence. Listeners' relative proportion of eye fixations on these words indicated their ability to differentiate them over time. Results: Before critical word pairs became segmentally distinguishable in their 3rd syllables, participants fixated target words more than their stress competitors, but only if targets had initial primary lexical stress. The degree to which stress competitors were fixated was independent of their stress pattern. Conclusions: Suprasegmental information about lexical stress modulates the time course of spoken-word recognition. Specifically, suprasegmental information on the primary-stressed syllable of words with phrase-level prominence helps in distinguishing the word from phonological competitors with secondary lexical stress.

Subject(s)

Phonetics , Speech Perception , Eye Movement Measurements , Fixation, Ocular , Humans , Pattern Recognition, Physiological , Reading , Recognition, Psychology , Young Adult

12.

A General Audiovisual Temporal Processing Deficit in Adult Readers With Dyslexia.

Francisco, Ana A; Jesse, Alexandra; Groen, Margriet A; McQueen, James M.

J Speech Lang Hear Res ; 60(1): 144-158, 2017 01 01.

Article in English | MEDLINE | ID: mdl-28056152

ABSTRACT

Purpose: Because reading is an audiovisual process, reading impairment may reflect an audiovisual processing deficit. The aim of the present study was to test the existence and scope of such a deficit in adult readers with dyslexia. Method: We tested 39 typical readers and 51 adult readers with dyslexia on their sensitivity to the simultaneity of audiovisual speech and nonspeech stimuli, their time window of audiovisual integration for speech (using incongruent /aCa/ syllables), and their audiovisual perception of phonetic categories. Results: Adult readers with dyslexia showed less sensitivity to audiovisual simultaneity than typical readers for both speech and nonspeech events. We found no differences between readers with dyslexia and typical readers in the temporal window of integration for audiovisual speech or in the audiovisual perception of phonetic categories. Conclusions: The results suggest an audiovisual temporal deficit in dyslexia that is not specific to speech-related events. But the differences found for audiovisual temporal sensitivity did not translate into a deficit in audiovisual speech perception. Hence, there seems to be a hiatus between simultaneity judgment and perception, suggesting a multisensory system that uses different mechanisms across tasks. Alternatively, it is possible that the audiovisual deficit in dyslexia is only observable when explicit judgments about audiovisual simultaneity are required.

Subject(s)

Auditory Perception , Dyslexia , Phonetics , Reading , Speech Perception , Dyslexia/psychology , Female , Humans , Judgment , Language Tests , Male , Young Adult

13.

Audiovisual alignment of co-speech gestures to speech supports word learning in 2-year-olds.

Jesse, Alexandra; Johnson, Elizabeth K.

J Exp Child Psychol ; 145: 1-10, 2016 May.

Article in English | MEDLINE | ID: mdl-26765249

ABSTRACT

Analyses of caregiver-child communication suggest that an adult tends to highlight objects in a child's visual scene by moving them in a manner that is temporally aligned with the adult's speech productions. Here, we used the looking-while-listening paradigm to examine whether 25-month-olds use audiovisual temporal alignment to disambiguate and learn novel word-referent mappings in a difficult word-learning task. Videos of two equally interesting and animated novel objects were simultaneously presented to children, but the movement of only one of the objects was aligned with an accompanying object-labeling audio track. No social cues (e.g., pointing, eye gaze, touch) were available to the children because the speaker was edited out of the videos. Immediately afterward, toddlers were presented with still images of the two objects and asked to look at one or the other. Toddlers looked reliably longer to the labeled object, demonstrating their acquisition of the novel word-referent mapping. A control condition showed that children's performance was not solely due to the single unambiguous labeling that had occurred at experiment onset. We conclude that the temporal link between a speaker's utterances and the motion they imposed on the referent object helps toddlers to deduce a speaker's intended reference in a difficult word-learning scenario. In combination with our previous work, these findings suggest that intersensory redundancy is a source of information used by language users of all ages. That is, intersensory redundancy is not just a word-learning tool used by young infants.

Subject(s)

Child Development/physiology , Gestures , Speech Perception/physiology , Verbal Learning/physiology , Visual Perception/physiology , Child, Preschool , Female , Humans , Language Development , Male

14.

Lexical influences on competing speech perception in younger, middle-aged, and older adults.

Helfer, Karen S; Jesse, Alexandra.

J Acoust Soc Am ; 138(1): 363-76, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26233036

ABSTRACT

The influence of lexical characteristics of words in to-be-attended and to-be-ignored speech streams was examined in a competing speech task. Older, middle-aged, and younger adults heard pairs of low-cloze probability sentences in which the frequency or neighborhood density of words was manipulated in either the target speech stream or the masking speech stream. All participants also completed a battery of cognitive measures. As expected, for all groups, target words that occur frequently or that are from sparse lexical neighborhoods were easier to recognize than words that are infrequent or from dense neighborhoods. Compared to other groups, these neighborhood density effects were largest for older adults; the frequency effect was largest for middle-aged adults. Lexical characteristics of words in the to-be-ignored speech stream also affected recognition of to-be-attended words, but only when overall performance was relatively good (that is, when younger participants listened to the speech streams at a more advantageous signal-to-noise ratio). For these listeners, to-be-ignored masker words from sparse neighborhoods interfered with recognition of target speech more than masker words from dense neighborhoods. Amount of hearing loss and cognitive abilities relating to attentional control modulated overall performance as well as the strength of lexical influences.

Subject(s)

Aging/physiology , Pattern Recognition, Physiological/physiology , Speech Perception/physiology , Vocabulary , Acoustic Stimulation , Adult , Aged , Aged, 80 and over , Aging/psychology , Attention/physiology , Audiometry, Pure-Tone , Auditory Threshold , Cognition/physiology , Executive Function , Female , Humans , Male , Memory, Short-Term/physiology , Middle Aged , Psychological Tests , Signal-To-Noise Ratio , Single-Blind Method , Stroop Test , Young Adult

15.

Working memory affects older adults' use of context in spoken-word recognition.

Janse, Esther; Jesse, Alexandra.

Q J Exp Psychol (Hove) ; 67(9): 1842-62, 2014.

Article in English | MEDLINE | ID: mdl-24443921

ABSTRACT

Many older listeners report difficulties in understanding speech in noisy situations. Working memory and other cognitive skills may modulate older listeners' ability to use context information to alleviate the effects of noise on spoken-word recognition. In the present study, we investigated whether verbal working memory predicts older adults' ability to immediately use context information in the recognition of words embedded in sentences, presented in different listening conditions. In a phoneme-monitoring task, older adults were asked to detect as fast and as accurately as possible target phonemes in sentences spoken by a target speaker. Target speech was presented without noise, with fluctuating speech-shaped noise, or with competing speech from a single distractor speaker. The gradient measure of contextual probability (derived from a separate offline rating study) affected the speed of recognition. Contextual facilitation was modulated by older listeners' verbal working memory (measured with a backward digit span task) and age across listening conditions. Working memory and age, as well as hearing loss, were also the most consistent predictors of overall listening performance. Older listeners' immediate benefit from context in spoken-word recognition thus relates to their ability to keep and update a semantic representation of the sentence content in working memory.

Subject(s)

Memory, Short-Term/physiology , Recognition, Psychology/physiology , Verbal Learning , Vocabulary , Acoustic Stimulation , Aged , Aged, 80 and over , Aging , Female , Humans , Individuality , Male , Middle Aged , Predictive Value of Tests , Reaction Time

16.

Suprasegmental lexical stress cues in visual speech can guide spoken-word recognition.

Jesse, Alexandra; McQueen, James M.

Q J Exp Psychol (Hove) ; 67(4): 793-808, 2014.

Article in English | MEDLINE | ID: mdl-24134065

ABSTRACT

Visual cues to the individual segments of speech and to sentence prosody guide speech recognition. The present study tested whether visual suprasegmental cues to the stress patterns of words can also constrain recognition. Dutch listeners use acoustic suprasegmental cues to lexical stress (changes in duration, amplitude, and pitch) in spoken-word recognition. We asked here whether they can also use visual suprasegmental cues. In two categorization experiments, Dutch participants saw a speaker say fragments of word pairs that were segmentally identical but differed in their stress realization (e.g., 'ca-vi from cavia "guinea pig" vs. 'ka-vi from kaviaar "caviar"). Participants were able to distinguish between these pairs from seeing a speaker alone. Only the presence of primary stress in the fragment, not its absence, was informative. Participants were able to distinguish visually primary from secondary stress on first syllables, but only when the fragment-bearing target word carried phrase-level emphasis. Furthermore, participants distinguished fragments with primary stress on their second syllable from those with secondary stress on their first syllable (e.g., pro-'jec from projector "projector" vs. 'pro-jec from projectiel "projectile"), independently of phrase-level emphasis. Seeing a speaker thus contributes to spoken-word recognition by providing suprasegmental information about the presence of primary lexical stress.

Subject(s)

Pattern Recognition, Visual/physiology , Phonetics , Recognition, Psychology/physiology , Semantics , Speech Acoustics , Speech Perception/physiology , Acoustic Stimulation , Association Learning , Cues , Female , Humans , Linear Models , Male , Photic Stimulation , Vocabulary , Young Adult

17.

Lexically guided retuning of visual phonetic categories.

van der Zande, Patrick; Jesse, Alexandra; Cutler, Anne.

J Acoust Soc Am ; 134(1): 562-71, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23862831

ABSTRACT

Listeners retune the boundaries between phonetic categories to adjust to individual speakers' productions. Lexical information, for example, indicates what an unusual sound is supposed to be, and boundary retuning then enables the speaker's sound to be included in the appropriate auditory phonetic category. In this study, it was investigated whether lexical knowledge that is known to guide the retuning of auditory phonetic categories, can also retune visual phonetic categories. In Experiment 1, exposure to a visual idiosyncrasy in ambiguous audiovisually presented target words in a lexical decision task indeed resulted in retuning of the visual category boundary based on the disambiguating lexical context. In Experiment 2 it was tested whether lexical information retunes visual categories directly, or indirectly through the generalization from retuned auditory phonetic categories. Here, participants were exposed to auditory-only versions of the same ambiguous target words as in Experiment 1. Auditory phonetic categories were retuned by lexical knowledge, but no shifts were observed for the visual phonetic categories. Lexical knowledge can therefore guide retuning of visual phonetic categories, but lexically guided retuning of auditory phonetic categories is not generalized to visual categories. Rather, listeners adjust auditory and visual phonetic categories to talker idiosyncrasies separately.

Subject(s)

Phonation , Phonetics , Reading , Semantics , Speech Acoustics , Speech Perception , Attention , Decision Making , Female , Generalization, Psychological , Humans , Male , Young Adult

18.

Tone of voice guides word learning in informative referential contexts.

Reinisch, Eva; Jesse, Alexandra; Nygaard, Lynne C.

Q J Exp Psychol (Hove) ; 66(6): 1227-40, 2013 Jun.

Article in English | MEDLINE | ID: mdl-23134484

ABSTRACT

Listeners infer which object in a visual scene a speaker refers to from the systematic variation of the speaker's tone of voice (ToV). We examined whether ToV also guides word learning. During exposure, participants heard novel adjectives (e.g., "daxen") spoken with a ToV representing hot, cold, strong, weak, big, or small while viewing picture pairs representing the meaning of the adjective and its antonym (e.g., elephant-ant for big-small). Eye fixations were recorded to monitor referent detection and learning. During test, participants heard the adjectives spoken with a neutral ToV, while selecting referents from familiar and unfamiliar picture pairs. Participants were able to learn the adjectives' meanings, and, even in the absence of informative ToV, generalize them to new referents. A second experiment addressed whether ToV provides sufficient information to infer the adjectival meaning or needs to operate within a referential context providing information about the relevant semantic dimension. Participants who saw printed versions of the novel words during exposure performed at chance during test. ToV, in conjunction with the referential context, thus serves as a cue to word meaning. ToV establishes relations between labels and referents for listeners to exploit in word learning.

Subject(s)

Semantics , Speech Perception/physiology , Verbal Learning/physiology , Vocabulary , Acoustic Stimulation , Association Learning , Attention/physiology , Female , Fixation, Ocular , Humans , Linear Models , Male , Pattern Recognition, Visual , Photic Stimulation , Reaction Time/physiology , Students , Universities

19.

Prosodic temporal alignment of co-speech gestures to speech facilitates referent resolution.

Jesse, Alexandra; Johnson, Elizabeth K.

J Exp Psychol Hum Percept Perform ; 38(6): 1567-81, 2012 Dec.

Article in English | MEDLINE | ID: mdl-22545598

ABSTRACT

Using a referent detection paradigm, we examined whether listeners can determine the object speakers are referring to by using the temporal alignment between the motion speakers impose on objects and their labeling utterances. Stimuli were created by videotaping speakers labeling a novel creature. Without being explicitly instructed to do so, speakers moved the creature during labeling. Trajectories of these motions were used to animate photographs of the creature. Participants in subsequent perception studies heard these labeling utterances while seeing side-by-side animations of two identical creatures in which only the target creature moved as originally intended by the speaker. Using the cross-modal temporal relationship between speech and referent motion, participants identified which creature the speaker was labeling, even when the labeling utterances were low-pass filtered to remove their semantic content or replaced by tone analogues. However, when the prosodic structure was eliminated by reversing the speech signal, participants no longer detected the referent as readily. These results provide strong support for a prosodic cross-modal alignment hypothesis. Speakers produce a perceptible link between the motion they impose upon a referent and the prosodic structure of their speech, and listeners readily use this prosodic cross-modal relationship to resolve referential ambiguity in word-learning situations.

Subject(s)

Concept Formation , Gestures , Speech Perception , Verbal Learning , Adult , Attention , Cues , Female , Humans , Male , Netherlands , Semantics , Time Factors , Visual Perception , Young Adult

20.

Speaking rate affects the perception of duration as a suprasegmental lexical-stress cue.

Reinisch, Eva; Jesse, Alexandra; McQueen, James M.

Lang Speech ; 54(Pt 2): 147-65, 2011 Jun.

Article in English | MEDLINE | ID: mdl-21848077

ABSTRACT

Three categorization experiments investigated whether the speaking rate of a preceding sentence influences durational cues to the perception of suprasegmental lexical-stress patterns. Dutch two-syllable word fragments had to be judged as coming from one of two longer words that matched the fragment segmentally but differed in lexical stress placement. Word pairs contrasted primary stress on either the first versus the second syllable or the first versus the third syllable. Duration of the initial or the second syllable of the fragments and rate of the preceding context (fast vs. slow) were manipulated. Listeners used speaking rate to decide about the degree of stress on initial syllables whether the syllables' absolute durations were informative about stress (Experiment Ia) or not (Experiment Ib). Rate effects on the second syllable were visible only when the initial syllable was ambiguous in duration with respect to the preceding rate context (Experiment 2). Absolute second syllable durations contributed little to stress perception (Experiment 3). These results suggest that speaking rate is used to disambiguate words and that rate-modulated stress cues are more important on initial than noninitial syllables. Speaking rate affects perception of suprasegmental information.

Subject(s)

Cues , Phonetics , Recognition, Psychology , Speech Acoustics , Speech Intelligibility , Speech Perception , Time Perception , Acoustic Stimulation , Audiometry, Speech , Humans , Time Factors

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL