Results 1 - 20 of 52
1.
Hum Mov Sci ; 93: 103180, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38266441

ABSTRACT

Developmental Coordination Disorder (DCD) is a movement disorder in which atypical sensory processing may underlie movement atypicality. However, whether altered sensory processing is domain-specific or global in nature remains an unanswered question. Here, we measured for the first time different aspects of sensory processing and spatiotemporal integration in the same cohort of adult participants with DCD (N = 16), possible DCD (pDCD, N = 12) and neurotypical adults (NT, N = 28). Haptic perception was reduced in both the DCD and the extended DCD + pDCD groups when compared to NT adults. Audio-visual integration, measured using the sound-induced double-flash illusion, was reduced only in DCD participants, and not in the extended DCD + pDCD group. While low-level sensory processing was altered in DCD, the more cognitive, higher-level ability to infer temporal dimensions from spatial information, and vice versa, as assessed with Tau-Kappa effects, was intact in DCD (and extended DCD + pDCD) participants. Both audio-visual integration and haptic perception difficulties correlated with the degree of self-reported DCD symptoms and were most apparent when comparing the DCD and NT groups directly rather than using the expanded DCD + pDCD group. The association of sensory difficulties with DCD symptoms suggests that perceptual differences contribute to motor difficulties in DCD via an underlying internal modelling mechanism.
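To make the reported brain-behaviour link concrete, here is a minimal, hypothetical sketch of how illusion susceptibility could be related to self-reported symptoms; the variable names, sample size and values are invented, and scipy's rank correlation stands in for whatever statistic the authors actually used.

```python
# Hypothetical sketch, not the study's analysis: correlate a per-participant
# sound-induced double-flash illusion rate with a self-report DCD symptom score.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
symptom_score = rng.uniform(10, 80, size=28)          # invented questionnaire totals
illusion_rate = 0.9 - 0.005 * symptom_score + rng.normal(0, 0.1, size=28)

# Rank correlation is robust to outliers and monotone nonlinearities.
rho, p = stats.spearmanr(symptom_score, illusion_rate)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```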


Subject(s)
Illusions , Motor Skills Disorders , Adult , Humans , Psychomotor Performance , Motor Skills Disorders/psychology , Stereognosis , Sensation
2.
Neurosci Lett ; 812: 137409, 2023 08 24.
Article in English | MEDLINE | ID: mdl-37487970

ABSTRACT

Neural oscillations subserve a broad range of speech processing and language comprehension functions. Using electroencephalography (EEG), we investigated the frequency-specific directed interactions between whole-brain regions while participants processed Chinese sentences presented in different modalities (auditory, visual, and audio-visual). The results indicate that low-frequency responses reflect the aggregation of information flow into the primary sensory cortices of the different modalities. Information flow dominated by high-frequency responses exhibited characteristics of bottom-up flow from left posterior temporal to left frontal regions. Top-down information flowing out of the left frontal lobe was reflected in a network pattern jointly dominated by low- and high-frequency rhythms. Overall, our results suggest that the brain may be modality-independent when processing higher-order language information.
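As a rough illustration of "frequency-specific directed interactions", the sketch below band-limits two synthetic channels and applies a time-domain Granger causality test from statsmodels. This is a deliberate simplification of spectral directed-connectivity methods, not the paper's pipeline; the channel roles, sampling rate and theta band are assumptions.

```python
# Simplified stand-in for frequency-resolved directed connectivity:
# band-pass filter, then a time-domain Granger test on the filtered data.
import numpy as np
from scipy.signal import butter, filtfilt
from statsmodels.tsa.stattools import grangercausalitytests

fs = 250                                    # assumed sampling rate (Hz)
rng = np.random.default_rng(1)
frontal = rng.normal(size=5000)             # stand-in "source" channel
temporal = np.roll(frontal, 5) + rng.normal(scale=0.5, size=5000)  # delayed copy + noise

# Restrict both channels to the theta band (4-8 Hz) before testing.
b, a = butter(4, [4, 8], btype="bandpass", fs=fs)
x = filtfilt(b, a, temporal)                # column 1: signal to be predicted
y = filtfilt(b, a, frontal)                 # column 2: candidate driver

# Does the past of `frontal` improve prediction of `temporal` in this band?
res = grangercausalitytests(np.column_stack([x, y]), maxlag=10)
print(res[10][0]["ssr_ftest"])              # (F, p, df_denom, df_num) at lag 10
```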


Subject(s)
Comprehension , Speech Perception , Humans , Comprehension/physiology , Brain Mapping/methods , Language , Brain/physiology , Frontal Lobe/physiology , Speech Perception/physiology , Magnetic Resonance Imaging
4.
Front Psychol ; 14: 1046672, 2023.
Article in English | MEDLINE | ID: mdl-37205083

ABSTRACT

Introduction: A singer's or speaker's Fach (voice type) should be appraised based on acoustic cues characterizing their voice. In practice, however, it is often influenced by the individual's physical appearance. This is especially distressing for transgender people, who may be excluded from formal singing because of a perceived mismatch between their voice and appearance. To eventually break down these visual biases, we need a better understanding of the conditions under which they occur. Specifically, we hypothesized that trans listeners (not actors) would be better able to resist such biases than cis listeners, precisely because they are more aware of appearance-voice dissociations. Methods: In an online study, 85 cisgender and 81 transgender participants were presented with 18 different actors singing or speaking short sentences. These actors covered six voice categories, from high/bright (traditionally feminine) to low/dark (traditionally masculine) voices: soprano, mezzo-soprano (henceforth mezzo), contralto (henceforth alto), tenor, baritone, and bass. Every participant provided voice type ratings for (1) audio-only (A) stimuli, to obtain an unbiased estimate of a given actor's voice type; (2) video-only (V) stimuli, to estimate the strength of the bias itself; and (3) combined audio-visual (AV) stimuli, to see how much visual cues would affect the evaluation of the audio. Results: Visual biases are not subtle and hold across the entire scale, shifting voice appraisal by about a third of the distance between adjacent voice types (for example, a third of the bass-to-baritone distance). This shift was 30% smaller for trans than for cis listeners, confirming our main hypothesis. The pattern was largely similar whether actors sang or spoke, though singing overall led to more feminine/high/bright ratings. Conclusion: This study is one of the first demonstrations that transgender listeners are better judges of a singer's or speaker's voice type because they are better able to separate an actor's voice from their appearance, a finding that opens avenues to fight more generally against implicit (or sometimes explicit) biases in voice appraisal.

5.
Mem Cognit ; 51(2): 349-370, 2023 02.
Article in English | MEDLINE | ID: mdl-36100821

ABSTRACT

In this study, we investigated the nature of long-term memory representations for naturalistic audio-visual scenes. Whereas previous research has shown that audio-visual scenes are recognized more accurately than their unimodal counterparts, it remains unclear whether this benefit stems from audio-visually integrated long-term memory representations or a summation of independent retrieval cues. We tested two predictions for audio-visually integrated memory representations. First, we used a modeling approach to test whether recognition performance for audio-visual scenes is more accurate than would be expected from independent retrieval cues. This analysis shows that audio-visual integration is not necessary to explain the benefit of audio-visual scenes relative to purely auditory or purely visual scenes. Second, we report a series of experiments investigating the occurrence of study-test congruency effects for unimodal and audio-visual scenes. Most importantly, visually encoded information was immune to additional auditory information presented during testing, whereas auditory encoded information was susceptible to additional visual information presented during testing. This renders a true integration of visual and auditory information in long-term memory representations unlikely. In sum, our results instead provide evidence for visual dominance in long-term memory. Whereas associative auditory information is capable of enhancing memory performance, the long-term memory representations appear to be primarily visual.
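The "independent retrieval cues" benchmark mentioned in the abstract can be illustrated with probability summation: if the auditory and visual cues each provide an independent chance to recognize the scene, the expected audio-visual hit rate follows directly from the unimodal rates. The numbers below are invented for illustration, not taken from the paper.

```python
# All hit rates below are invented for illustration.
p_auditory = 0.62      # hypothetical audio-only recognition accuracy
p_visual = 0.78        # hypothetical visual-only recognition accuracy

# Under independence, an audio-visual scene is missed only if both cues fail.
p_av_independent = 1 - (1 - p_auditory) * (1 - p_visual)

p_av_observed = 0.90   # hypothetical observed audio-visual accuracy
print(f"independence prediction: {p_av_independent:.3f}")
print(f"observed AV accuracy:    {p_av_observed:.3f}")
# Only accuracy clearly above the independence prediction would require
# positing an integrated audio-visual memory representation.
```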


Subject(s)
Memory, Long-Term , Visual Perception , Humans , Cognition , Cues , Recognition, Psychology
6.
Cereb Cortex ; 33(8): 4740-4751, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36178127

ABSTRACT

Human language units are hierarchical, and reading acquisition involves integrating multisensory information (typically from auditory and visual modalities) to access meaning. However, it is unclear how the brain processes and integrates language information at different linguistic units (words, phrases, and sentences) provided simultaneously in auditory and visual modalities. To address the issue, we presented participants with sequences of short Chinese sentences through auditory, visual, or combined audio-visual modalities while electroencephalographic responses were recorded. With a frequency tagging approach, we analyzed the neural representations of basic linguistic units (i.e. characters/monosyllabic words) and higher-level linguistic structures (i.e. phrases and sentences) across the 3 modalities separately. We found that audio-visual integration occurs in all linguistic units, and the brain areas involved in the integration varied across different linguistic levels. In particular, the integration of sentences activated the local left prefrontal area. Therefore, we used continuous theta-burst stimulation to verify that the left prefrontal cortex plays a vital role in the audio-visual integration of sentence information. Our findings suggest the advantage of bimodal language comprehension at hierarchical stages in language-related information processing and provide evidence for the causal role of the left prefrontal regions in processing information of audio-visual sentences.
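A minimal sketch of the frequency-tagging logic follows, with assumed presentation rates (4 Hz words, 2 Hz phrases, 1 Hz sentences) and synthetic data in place of EEG: spectral peaks at each rate index neural tracking of the corresponding linguistic level.

```python
# Illustrative frequency-tagging analysis on synthetic data; rates are assumed.
import numpy as np
from scipy.signal import welch

fs = 500                                   # assumed sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)               # 60 s of synthetic "EEG"
rng = np.random.default_rng(2)
eeg = (0.5 * np.sin(2 * np.pi * 4 * t)     # word/syllable rate (4 Hz)
       + 0.2 * np.sin(2 * np.pi * 2 * t)   # phrase rate (2 Hz)
       + 0.3 * np.sin(2 * np.pi * 1 * t)   # sentence rate (1 Hz)
       + rng.normal(scale=1.0, size=t.size))

freqs, psd = welch(eeg, fs=fs, nperseg=fs * 10)    # 0.1 Hz resolution
for level, target in [("sentence", 1.0), ("phrase", 2.0), ("word", 4.0)]:
    idx = np.argmin(np.abs(freqs - target))
    print(f"{level:8s} rate {target:.0f} Hz: power = {psd[idx]:.3f}")
```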


Subject(s)
Brain Mapping , Comprehension , Humans , Comprehension/physiology , Brain/physiology , Linguistics , Electroencephalography
7.
Front Neurosci ; 16: 797277, 2022.
Article in English | MEDLINE | ID: mdl-36440282

ABSTRACT

Emotional cues are expressed in many ways in daily life, and the emotional information we receive is often conveyed through multiple modalities. Successful social interaction requires combining multisensory cues to accurately determine the emotions of others. The integration mechanism of multimodal emotional information has been widely investigated. Different measures of brain activity have been used to localize the brain regions involved in the audio-visual integration of emotional information, mainly the bilateral superior temporal regions. However, the methods adopted in these studies are relatively simple, and the materials rarely contain speech information. The integration mechanism of emotional speech in the human brain therefore needs further examination. In this paper, an event-related functional magnetic resonance imaging (fMRI) study was conducted to explore the audio-visual integration mechanism of emotional speech in the human brain, using dynamic facial expressions and emotional speech to express emotions of different valences. Representational similarity analysis (RSA) based on regions of interest (ROIs), whole-brain searchlight analysis, modality conjunction analysis and supra-additive analysis were used to analyze and verify the role of relevant brain regions. In addition, a weighted RSA method was used to evaluate the contribution of each candidate model to the best-fitting model for each ROI. The results showed that only the left insula was detected by all methods, suggesting that it plays an important role in the audio-visual integration of emotional speech. The whole-brain searchlight, modality conjunction and supra-additive analyses together revealed that the bilateral middle temporal gyrus (MTG), right inferior parietal lobule and bilateral precuneus might also be involved in the audio-visual integration of emotional speech.
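For readers unfamiliar with RSA, the sketch below shows its core step on synthetic data: build a neural representational dissimilarity matrix (RDM) from ROI activity patterns and correlate it with a candidate model RDM. The condition labels, pattern sizes and distance metrics are assumptions, not the study's settings.

```python
# Core RSA step on synthetic data: compare a neural RDM with a model RDM.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
n_conditions, n_voxels = 12, 200
patterns = rng.normal(size=(n_conditions, n_voxels))   # ROI pattern per condition

# Neural RDM: pairwise correlation distance between condition patterns.
neural_rdm = pdist(patterns, metric="correlation")

# Candidate model RDM, e.g. emotion valence coded as 0/1/2 per condition.
valence = rng.integers(0, 3, size=n_conditions)
model_rdm = pdist(valence[:, None], metric="cityblock")

rho, p = spearmanr(neural_rdm, model_rdm)
print(f"ROI-to-model similarity: rho = {rho:.2f}, p = {p:.3f}")
```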

8.
Front Psychol ; 13: 879156, 2022.
Article in English | MEDLINE | ID: mdl-35928422

ABSTRACT

During the COVID-19 pandemic, questions have been raised about the impact of face masks on communication in classroom settings. However, it is unclear to what extent visual obstruction of the speaker's mouth or changes to the acoustic signal lead to speech processing difficulties, and whether these effects can be mitigated by semantic predictability, i.e., the availability of contextual information. The present study investigated the acoustic and visual effects of face masks on speech intelligibility and processing speed under varying semantic predictability. Twenty-six children (aged 8-12) and twenty-six adults performed an internet-based cued shadowing task, in which they had to repeat aloud the last word of sentences presented in audio-visual format. The results showed that children and adults made more mistakes and responded more slowly when listening to face mask speech compared to speech produced without a face mask. Adults were only significantly affected by face mask speech when both the acoustic and the visual signal were degraded. While acoustic mask effects were similar for children, removal of visual speech cues through the face mask affected children to a lesser degree. However, high semantic predictability reduced audio-visual mask effects, leading to full compensation of the acoustically degraded mask speech in the adult group. Even though children did not fully compensate for face mask speech with high semantic predictability, overall, they still profited from semantic cues in all conditions. Therefore, in classroom settings, strategies that increase contextual information such as building on students' prior knowledge, using keywords, and providing visual aids, are likely to help overcome any adverse face mask effects.

9.
Neuroimage Clin ; 33: 102942, 2022.
Article in English | MEDLINE | ID: mdl-35033811

ABSTRACT

In naturalistic situations, sounds are often perceived in conjunction with matching visual impressions. For example, we see and hear the neighbor's dog barking in the garden. Still, there is a good chance that we recognize the neighbor's dog even when we only hear it barking, but do not see it behind the fence. Previous studies with normal-hearing (NH) listeners have shown that the audio-visual presentation of a perceptual object (like an animal) increases the probability to recognize this object later on, even if the repeated presentation of this object occurs in a purely auditory condition. In patients with a cochlear implant (CI), however, the electrical hearing of sounds is impoverished, and the ability to recognize perceptual objects in auditory conditions is significantly limited. It is currently not well understood whether CI users - as NH listeners - show a multisensory facilitation for auditory recognition. The present study used event-related potentials (ERPs) and a continuous recognition paradigm with auditory and audio-visual stimuli to test the prediction that CI users show a benefit from audio-visual perception. Indeed, the congruent audio-visual context resulted in an improved recognition ability of objects in an auditory-only condition, both in the NH listeners and the CI users. The ERPs revealed a group-specific pattern of voltage topographies and correlations between these ERP maps and the auditory recognition ability, indicating a different processing of congruent audio-visual stimuli in CI users when compared to NH listeners. Taken together, our results point to distinct cortical processing of naturalistic audio-visual objects in CI users and NH listeners, which however allows both groups to improve the recognition ability of these objects in a purely auditory context. Our findings are of relevance for future clinical research since audio-visual perception might also improve the auditory rehabilitation after cochlear implantation.


Subject(s)
Cochlear Implantation , Cochlear Implants , Speech Perception , Acoustic Stimulation , Aged , Auditory Perception , Evoked Potentials , Humans , Recognition, Psychology , Visual Perception
10.
Front Psychol ; 12: 733494, 2021.
Article in English | MEDLINE | ID: mdl-34916991

ABSTRACT

Fluent reading is characterized by fast and effortless decoding of visual and phonological information. Here we used event-related potentials (ERPs) and neuropsychological testing to probe the neurocognitive basis of reading in a sample of children with a wide range of reading skills. We report data from 51 children who were measured at two time points, at the end of first grade (mean age 7.6 years) and at the end of fourth grade (mean age 10.5 years). The aim of this study was to clarify whether, in addition to behavioral measures, basic unimodal and bimodal neural measures help explain variance in the later reading outcome. Specifically, we addressed the question of whether, beyond the previously investigated unimodal measures of N1 print tuning and mismatch negativity (MMN), a bimodal measure of audiovisual integration (AV) contributes to, and possibly enhances, prediction of the later reading outcome. We found that the largest share of variance in reading was explained by the behavioral measures of rapid automatized naming (RAN), block design and vocabulary (46%). Furthermore, we demonstrated that both unimodal measures, N1 print tuning (16%) and filtered MMN (7%), predicted reading, suggesting that N1 print tuning at the early stage of reading acquisition is a particularly good predictor of the later reading outcome. Beyond the behavioral measures, the two unimodal neural measures explained an additional 7.2% of variance in reading, indicating that basic neural measures can improve prediction of the later reading outcome over behavioral measures alone. The AV congruency effect did not significantly predict reading. It is therefore possible that audiovisual congruency effects reflect higher levels of multisensory integration that are less important during the first year of learning to read and may gain in relevance later on.
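The incremental-variance logic can be sketched as a two-step regression: fit the reading outcome from the behavioural predictors, then add the neural measures and compare R². The data below are simulated and the coefficients arbitrary; only the procedure mirrors the abstract.

```python
# Two-step (hierarchical) regression sketch with simulated data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
n = 51
ran, block_design, vocab = rng.normal(size=(3, n))   # behavioural predictors
n1_tuning, mmn = rng.normal(size=(2, n))             # unimodal neural predictors
reading = 0.5 * ran + 0.3 * vocab + 0.25 * n1_tuning + rng.normal(scale=0.7, size=n)

behav = np.column_stack([ran, block_design, vocab])
full = np.column_stack([behav, n1_tuning, mmn])

r2_behav = LinearRegression().fit(behav, reading).score(behav, reading)
r2_full = LinearRegression().fit(full, reading).score(full, reading)
print(f"behavioural R2 = {r2_behav:.2f}, "
      f"added by neural measures = {r2_full - r2_behav:.2f}")
```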

11.
Behav Res Methods ; 53(6): 2502-2511, 2021 12.
Article in English | MEDLINE | ID: mdl-33948923

ABSTRACT

The Bluegrass corpus includes sentences from 40 pairs of speakers. Participants from the Bluegrass Region rated one speaker from each pair as having a native North American English accent and the other as having a foreign accent (Experiment 1). Furthermore, speakers within each pair looked very similar in appearance, in that participants rated them similarly likely to speak with a foreign accent (Experiment 2). For each speaker we selected eight sentences based on participants' ratings of difficulty (Experiment 3). The final corpus includes a selection of 640 sentences (80 speakers, 8 stimuli per speaker) freely available through the Open Science Framework. Each sentence can be downloaded in different formats (text, audio, video) so researchers can investigate how audio-visual information influences language processing. Researchers can contribute to the corpus by validating the stimuli with new populations, selecting additional sentences, or finding new TED videos featuring appropriate speakers to answer their research questions.


Subject(s)
Poa , Speech Perception , Humans , Language , Research Personnel
12.
Front Psychol ; 12: 629996, 2021.
Article in English | MEDLINE | ID: mdl-33679553

ABSTRACT

People can discriminate the synchrony between audio-visual scenes. However, the sensitivity of audio-visual synchrony perception can be affected by many factors. Using a simultaneity judgment task, the present study investigated whether the synchrony perception of complex audio-visual stimuli was affected by audio-visual causality and stimulus reliability. In Experiment 1, the results showed that audio-visual causality could increase one's sensitivity to audio-visual onset asynchrony (AVOA) of both action stimuli and speech stimuli. Moreover, participants were more tolerant of AVOA of speech stimuli than that of action stimuli in the high causality condition, whereas no significant difference between these two kinds of stimuli was found in the low causality condition. In Experiment 2, the speech stimuli were manipulated with either high or low stimulus reliability. The results revealed a significant interaction between audio-visual causality and stimulus reliability. Under the low causality condition, the percentage of "synchronous" responses of audio-visual intact stimuli was significantly higher than that of visual_intact/auditory_blurred stimuli and audio-visual blurred stimuli. In contrast, no significant difference among all levels of stimulus reliability was observed under the high causality condition. Our study supported the synergistic effect of top-down processing and bottom-up processing in audio-visual synchrony perception.
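A compact illustration of the simultaneity-judgment measure follows: the proportion of "synchronous" responses across audio-visual onset asynchronies (AVOA) is fit with a Gaussian whose width indexes tolerance of asynchrony. The response proportions and the choice of a Gaussian are illustrative assumptions, not the study's data or model.

```python
# Illustrative simultaneity-judgment fit on made-up data.
import numpy as np
from scipy.optimize import curve_fit

soa_ms = np.array([-400, -300, -200, -100, 0, 100, 200, 300, 400])
p_sync = np.array([0.10, 0.25, 0.60, 0.85, 0.95, 0.88, 0.65, 0.30, 0.12])

def gaussian(soa, peak, center, width):
    return peak * np.exp(-((soa - center) ** 2) / (2 * width ** 2))

(peak, center, width), _ = curve_fit(gaussian, soa_ms, p_sync, p0=[1.0, 0.0, 150.0])
print(f"peak = {peak:.2f}, point of subjective simultaneity = {center:.0f} ms, "
      f"window width (SD) = {width:.0f} ms")
# A larger fitted width means greater tolerance of asynchrony, the quantity
# that differed between the high- and low-causality conditions in the abstract.
```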

13.
Neuroimage Clin ; 30: 102588, 2021.
Article in English | MEDLINE | ID: mdl-33618236

ABSTRACT

One of the proposed issues underlying reading difficulties in dyslexia is insufficiently automatized letter-speech sound associations. In the current fMRI experiment, we employ text-based recalibration to investigate letter-speech sound mappings in 8- to 10-year-old children with and without dyslexia. Here, an ambiguous speech sound /a?a/ midway between /aba/ and /ada/ is combined with disambiguating "aba" or "ada" text, causing a perceptual shift of the ambiguous /a?a/ sound towards the text (recalibration). This perceptual shift has been found to be reduced in adults, but not in children, with dyslexia compared to typical readers. Our fMRI results show significantly reduced activation in the left fusiform gyrus in dyslexic compared to typical readers, despite comparable behavioural performance. Furthermore, enhanced audio-visual activation within this region was linked to better reading and phonological skills. In contrast, higher activation in bilateral superior temporal cortex was associated with lower letter-speech sound identification fluency. These findings reflect individual differences during the early stages of reading development, with reduced recruitment of the left fusiform gyrus in dyslexic readers and increased involvement of the superior temporal cortex in children with less automatized letter-speech sound associations.


Subject(s)
Dyslexia , Speech Perception , Adult , Child , Dyslexia/diagnostic imaging , Humans , Phonetics , Reading , Speech
14.
Cogn Process ; 22(2): 227-237, 2021 May.
Article in English | MEDLINE | ID: mdl-33404898

ABSTRACT

While previous research has shown that during mental imagery participants look back to areas visited during encoding, it is unclear what happens when the information presented during encoding is incongruent. To investigate this question, we presented 30 participants with incongruent audio-visual associations (e.g. the image of a car paired with the sound of a cat) and later asked them to create a congruent mental representation based on the auditory cue (e.g. to create a mental representation of a cat while hearing the sound of a cat). The results revealed that participants spent more time in the areas where they had previously seen the object and that incongruent audio-visual information during encoding did not appear to interfere with the generation and maintenance of mental images. This finding suggests that eye movements can be flexibly employed during mental imagery depending on the demands of the task.


Subject(s)
Eye Movements , Imagery, Psychotherapy , Animals , Cats , Humans , Sound , Visual Perception
15.
Percept Mot Skills ; 128(1): 59-79, 2021 Feb.
Article in English | MEDLINE | ID: mdl-32990163

ABSTRACT

In research studies on how people perceive simultaneously presented audiovisual information, researchers have often shown that the number of visual flashes participants perceive on a computer screen can be altered by varying the number of accompanying auditory, visual, or combined audiovisual cues or inducers. In the present study, we examined the effects of number-incongruent audiovisual inducer stimuli on the participants' perceived number of target flashes. We instructed 16 participants (eight males and eight females; Mage = 21.56; SDage = 1.93) to report their perceived number of target flashes while ignoring the visual and auditory inducers. Across 18 different experimental conditions, we presented one or two target flashes in association with varied numbers (0, 1, 2) of auditory and visual inducer stimuli. In the condition with one target flash paired with one visual and two auditory inducers, the number of visual inducers (i.e., one) had a greater influence on the number of perceived target flashes than did the number of auditory inducers (i.e., two). Under all other number incongruent audiovisual inducer conditions, the participants' perceived number of target flashes was influenced more by the number of auditory than the number of visual inducers. We discuss these findings in the context of perceptual grouping and perceptual temporal uncertainty.


Subject(s)
Illusions , Acoustic Stimulation , Adult , Auditory Perception , Child, Preschool , Female , Humans , Infant , Male , Photic Stimulation , Visual Perception , Young Adult
16.
Front Hum Neurosci ; 14: 576888, 2020.
Article in English | MEDLINE | ID: mdl-33192407

ABSTRACT

We investigated "musical effort" with an internationally renowned classical pianist while playing, listening to, and imagining music. We used pupillometry as an objective measure of mental effort and fMRI as an exploratory measure of effort with the same musical pieces. We also tested a group of non-professional pianists and non-musicians with pupillometry, and a small group of non-musicians with fMRI. This combined psychophysiological and neuroimaging approach revealed the cognitive work performed during different musical activities. We found that pupil diameters were largest when playing (regardless of whether sound was produced or not) compared to conditions with no movement (listening and imagery). We found positive correlations between the professional pianist's pupil diameters across different conditions with the same piano piece (normal playing, silenced playing, listening, imagining), which might indicate similar loads on cognitive resources as well as an intimate link with the motor imagery of sound-producing body motions and gestures. We also confirmed that musical imagery has a strong commonality with music listening in both pianists and musically naïve individuals. Neuroimaging provided evidence for a relationship between noradrenergic (NE) activity and mental workload or attentional intensity within the domain of music cognition. We found effort-related activity in the superior part of the locus coeruleus (LC), and, as with the pupil, listening and imagery engaged the LC-NE network less than the motor condition did. The pianists attended more intensively to the most difficult piece than the non-musicians, as shown by their larger pupils for that piece. Non-musicians were the most engaged by the music listening task, suggesting that the amount of attention allocated to the same task may follow a hierarchy of expertise, demanding less attentional effort in experts or performers than in novices. In the professional pianist, we found only weak evidence for a commonality between subjective effort (rated measure by measure) and the objective effort gauged with pupil diameter during listening. We suggest that psychophysiological methods like pupillometry can index mental effort in a manner that is not available to subjective awareness or introspection.

17.
Front Aging Neurosci ; 12: 571950, 2020.
Article in English | MEDLINE | ID: mdl-33192463

ABSTRACT

Audio-visual integration (AVI) is higher in attended than in unattended conditions. Here, we explored the AVI effect when attentional resources are competed for by additional visual distractors, and its aging effect, using single- and dual-task paradigms. The results showed a higher AVI effect under the single-task attentional-load condition than under the no-load and dual-task attentional-load conditions (all P < 0.05) in both older and younger groups, but the AVI effect was weaker and delayed in older adults compared to younger adults for all attentional-load conditions (all P < 0.05). Analysis of non-phase-locked oscillations for AVI showed higher theta and alpha oscillatory activity for the single-task condition than for the no-load and dual-task conditions, and the AVI oscillatory activity occurred mainly at Cz, CP1 and Oz in older adults but at Fz, FC1 and Cz in younger adults. In the dual task, the AVI effect was significantly negatively correlated with theta activity at FC1 (r² = 0.1468, P = 0.05) and Cz (r² = 0.1447, P = 0.048) and with alpha activity at Fz (r² = 0.1557, P = 0.043), FC1 (r² = 0.1042, P = 0.008) and Cz (r² = 0.0897, P = 0.010) in older adults but not in younger adults. These results suggest a reduction in AVI ability for peripheral stimuli and a shift of AVI-related oscillations from anterior to posterior regions in older adults as an adaptive mechanism.

18.
Brain Behav ; 10(9): e01759, 2020 09.
Article in English | MEDLINE | ID: mdl-32683799

ABSTRACT

INTRODUCTION: Previous studies have confirmed increased functional connectivity in elderly adults during the processing of simple audio-visual stimuli; however, it is unclear whether elderly adults maximize their performance by strengthening functional brain connectivity when processing dynamic audio-visual hand-held tool stimuli. The present study aimed to explore this question using global functional connectivity. METHODS: Twenty-one healthy elderly adults and 21 healthy younger adults were recruited to perform a dynamic hand-held tool recognition task with high- and low-intensity stimuli. RESULTS: Elderly adults exhibited higher areas under the curve for both the high-intensity (3.5 vs. 2.7) and low-intensity (3.0 vs. 1.2) stimuli, indicating a higher audio-visual integration ability, but showed a delayed and widened audio-visual integration window for both the high-intensity (390-690 ms vs. 360-560 ms) and low-intensity (460-690 ms vs. 430-500 ms) stimuli. Additionally, elderly adults exhibited higher theta-band (all p < .01) but lower alpha-, beta-, and gamma-band functional connectivity (all p < .05) than younger adults under both the high- and low-intensity-stimulus conditions when processing audio-visual stimuli, except for gamma-band functional connectivity under the high-intensity-stimulus condition. Furthermore, higher theta- and alpha-band functional connectivity was observed for audio-visual stimuli than for auditory and visual stimuli, and under the high-intensity-stimulus condition than under the low-intensity-stimulus condition. CONCLUSION: The higher theta-band functional connectivity in elderly adults was mainly due to higher attention allocation. The results further suggest that, in sensory processing, theta, alpha, beta, and gamma activity might participate in different stages of perception.
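Band-wise functional connectivity of the kind reported here is often estimated with the phase-locking value (PLV); the sketch below computes it for a synthetic channel pair in the theta, alpha, beta and gamma bands. The estimator choice and band edges are assumptions, not necessarily those used in the study.

```python
# Band-limited phase-locking value (PLV) between two synthetic channels.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def plv(x, y, band, fs):
    """Phase-locking value between two signals within a frequency band."""
    b, a = butter(4, band, btype="bandpass", fs=fs)
    phase_x = np.angle(hilbert(filtfilt(b, a, x)))
    phase_y = np.angle(hilbert(filtfilt(b, a, y)))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

fs = 500                                     # assumed sampling rate (Hz)
rng = np.random.default_rng(5)
ch1 = rng.normal(size=4 * fs)                # stand-ins for two EEG channels
ch2 = 0.6 * ch1 + rng.normal(size=4 * fs)

bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
for name, band in bands.items():
    print(f"{name:5s}: PLV = {plv(ch1, ch2, band, fs):.2f}")
```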


Subject(s)
Attention , Visual Perception , Brain/diagnostic imaging , Recognition, Psychology
19.
Front Neurorobot ; 14: 29, 2020.
Article in English | MEDLINE | ID: mdl-32499692

ABSTRACT

While interacting with the world, our senses and nervous system are constantly challenged to identify the origin and coherence of sensory input signals of various intensities. This problem becomes apparent when stimuli from different modalities need to be combined, e.g., to determine whether an auditory stimulus and a visual stimulus belong to the same object. To cope with this problem, humans and most other animal species are equipped with complex neural circuits that enable fast and reliable combination of signals from various sensory organs. This multisensory integration starts in the brain stem, to facilitate unconscious reflexes, and continues along ascending pathways to cortical areas for further processing. To investigate the underlying mechanisms in detail, we developed a canonical neural network model for multisensory integration that resembles neurophysiological findings. For example, the model comprises multisensory integration neurons that receive excitatory and inhibitory inputs from unimodal auditory and visual neurons, respectively, as well as feedback from cortex. Such feedback projections facilitate multisensory response enhancement and lead to the commonly observed inverse effectiveness of neural activity in multisensory neurons. Two versions of the model are implemented: a rate-based neural network model for qualitative analysis and a variant that employs spiking neurons for deployment on neuromorphic hardware. This dual approach allows us to create an evaluation environment for testing model performance with real-world inputs. As the deployment platform we chose IBM's neurosynaptic chip TrueNorth. Behavioral studies in humans indicate that temporal and spatial offsets as well as the reliability of stimuli are critical parameters for integrating signals from different modalities. The model reproduces such behavior in experiments with different sets of stimuli; in particular, model performance is tested for stimuli with varying spatial offset. In addition, we demonstrate that, due to the emergent properties of the network dynamics, model performance is close to optimal Bayesian inference for the integration of multimodal sensory signals. Furthermore, implementing the model on a neuromorphic processing chip enables a complete neuromorphic processing cascade from sensory perception to multisensory integration, and the evaluation of model performance with real-world inputs.
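A toy rate-based sketch in the spirit of the described model shows how a saturating nonlinearity on summed auditory and visual drive yields inverse effectiveness; the particular nonlinearity and parameters are illustrative, not taken from the paper.

```python
# Toy multisensory neuron: summed drive through a saturating nonlinearity.
def rate(drive, r_max=100.0, k=0.5):
    """Saturating (Naka-Rushton-like) firing-rate nonlinearity."""
    return r_max * drive / (drive + k)

for label, (aud, vis) in {"weak": (0.1, 0.1), "strong": (1.0, 1.0)}.items():
    best_unimodal = max(rate(aud), rate(vis))
    multisensory = rate(aud + vis)           # summed drive passed through saturation
    enhancement = 100 * (multisensory - best_unimodal) / best_unimodal
    print(f"{label:6s} inputs: multisensory enhancement = {enhancement:.0f}%")
# Weak inputs yield much larger relative enhancement than strong ones,
# i.e. the inverse effectiveness mentioned in the abstract.
```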

20.
Atten Percept Psychophys ; 82(7): 3544-3557, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32533526

ABSTRACT

Seeing a talker's face can aid audiovisual (AV) integration when speech is presented in noise. However, few studies have simultaneously manipulated auditory and visual degradation. We aimed to establish how degrading the auditory and visual signal affected AV integration. Where people look on the face in this context is also of interest; Buchan, Paré and Munhall (Brain Research, 1242, 162-171, 2008) found fixations on the mouth increased in the presence of auditory noise whilst Wilson, Alsius, Paré and Munhall (Journal of Speech, Language, and Hearing Research, 59(4), 601-615, 2016) found mouth fixations decreased with decreasing visual resolution. In Condition 1, participants listened to clear speech, and in Condition 2, participants listened to vocoded speech designed to simulate the information provided by a cochlear implant. Speech was presented in three levels of auditory noise and three levels of visual blurring. Adding noise to the auditory signal increased McGurk responses, while blurring the visual signal decreased McGurk responses. Participants fixated the mouth more on trials when the McGurk effect was perceived. Adding auditory noise led to people fixating the mouth more, while visual degradation led to people fixating the mouth less. Combined, the results suggest that modality preference and where people look during AV integration of incongruent syllables varies according to the quality of information available.
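To clarify what "vocoded speech" means in this context, the sketch below implements a bare-bones noise vocoder: split the signal into a few frequency bands, extract each band's amplitude envelope, and use it to modulate band-limited noise. The band edges, channel count and test signal are assumptions; real cochlear-implant simulations use calibrated parameters.

```python
# Bare-bones noise vocoder on a synthetic test signal (illustrative only).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def noise_vocode(signal, fs, edges=(100, 400, 1000, 2400, 6000)):
    """Crude 4-channel noise vocoder: each band's envelope modulates band noise."""
    rng = np.random.default_rng(0)
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
        band = filtfilt(b, a, signal)
        envelope = np.abs(hilbert(band))                   # slow amplitude contour
        carrier = filtfilt(b, a, rng.normal(size=signal.size))
        out += envelope * carrier
    return out / np.max(np.abs(out))

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
speech_like = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
vocoded = noise_vocode(speech_like, fs)
print(vocoded.shape, float(np.max(np.abs(vocoded))))
```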


Subject(s)
Eye Movements , Speech Perception , Auditory Perception , Humans , Speech , Visual Perception