Results 1 - 20 of 24
1.
Cogn Sci ; 48(5): e13449, 2024 May.
Article in English | MEDLINE | ID: mdl-38773754

ABSTRACT

We recently reported strong, replicable (i.e., replicated) evidence for lexically mediated compensation for coarticulation (LCfC; Luthra et al., 2021), whereby lexical knowledge influences a prelexical process. Critically, evidence for LCfC provides robust support for interactive models of cognition that include top-down feedback and is inconsistent with autonomous models that allow only feedforward processing. McQueen, Jesse, and Mitterer (2023) offer five counter-arguments against our interpretation; we respond to each of those arguments here and conclude that top-down feedback provides the most parsimonious explanation of extant data.


Subjects
Speech Perception, Humans, Speech Perception/physiology, Cognition, Language
2.
Atten Percept Psychophys ; 86(3): 942-961, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38383914

ABSTRACT

Listeners have many sources of information available in interpreting speech. Numerous theoretical frameworks and paradigms have established that various constraints impact the processing of speech sounds, but it remains unclear how listeners might simultaneously consider multiple cues, especially those that differ qualitatively (i.e., with respect to timing and/or modality) or quantitatively (i.e., with respect to cue reliability). Here, we establish that cross-modal identity priming can influence the interpretation of ambiguous phonemes (Exp. 1, N = 40) and show that two qualitatively distinct cues - namely, cross-modal identity priming and auditory co-articulatory context - have additive effects on phoneme identification (Exp. 2, N = 40). However, we find no effect of quantitative variation in a cue - specifically, changes in the reliability of the priming cue did not influence phoneme identification (Exp. 3a, N = 40; Exp. 3b, N = 40). Overall, we find that qualitatively distinct cues can additively influence phoneme identification. While many existing theoretical frameworks address constraint integration to some degree, our results provide a step towards understanding how information that differs in both timing and modality is integrated in online speech perception.
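The additive effect of two qualitatively distinct cues reported in this abstract is commonly modeled as additivity in log-odds (logit) space. The sketch below illustrates that idea only; the weights and cue codings are invented for demonstration and are not estimates from the study.

```python
import math

def logistic(x: float) -> float:
    """Standard logistic (inverse logit) function."""
    return 1.0 / (1.0 + math.exp(-x))

def p_s_response(bias: float, w_prime: float, prime: float,
                 w_context: float, context: float) -> float:
    """Probability of an /s/ response when cue effects add in log-odds.

    `prime` and `context` are coded 0/1 (cue absent/present).
    All weights are hypothetical.
    """
    return logistic(bias + w_prime * prime + w_context * context)

# Hypothetical weights: each cue independently shifts the log-odds of /s/.
BIAS, W_PRIME, W_CONTEXT = -1.0, 1.2, 0.8

p_neither = p_s_response(BIAS, W_PRIME, 0, W_CONTEXT, 0)
p_prime   = p_s_response(BIAS, W_PRIME, 1, W_CONTEXT, 0)
p_context = p_s_response(BIAS, W_PRIME, 0, W_CONTEXT, 1)
p_both    = p_s_response(BIAS, W_PRIME, 1, W_CONTEXT, 1)
```

Under this model, the shift produced by both cues together equals the sum of the shifts produced by each cue alone, measured in logits.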


Subjects
Cues, Phonetics, Speech Perception, Humans, Speech Perception/physiology, Young Adult, Female, Male, Adult
3.
J Indian Prosthodont Soc ; 24(1): 69-75, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38263560

ABSTRACT

AIM: The primary objective of this research was to assess and compare the impact of customized zirconia (Zr) and titanium (Ti) abutments, placed on early loaded dental implants, on both hard tissue (as measured by crestal bone level) and soft tissue (as assessed by sulcular bleeding index [SBI], probing depth [PD], and Pink Esthetic Score [PES]), through clinical and radiographic evaluation. SETTINGS AND DESIGN: The sample comprised 15 patients with a partially edentulous mandibular arch, in whom a total of 30 implants were surgically placed. Each patient received two implants in the posterior region of the mandible, where the bone density was classified as Type D2. In each patient, one implant was loaded with a Zr abutment and the other with a Ti abutment, yielding two groups of 15 early loaded dental implants each, restored with customized Zr and customized Ti abutments, respectively. MATERIALS AND METHODS: Hard- and soft-tissue changes were evaluated in both groups. Crestal bone loss (CBL) was assessed with cone beam computed tomography, and SBI, PD, and PES were evaluated using various indices at 2, 4, and 6 months postloading. STATISTICAL ANALYSIS USED: Data were subjected to statistical analysis, and quantitative comparisons between the groups were made using the paired t-test. RESULTS: Mean CBL was higher with the Ti abutment, but the difference between the two groups was not statistically significant. SBI and PD were higher for Zr, again with no statistically significant difference between the groups. Zr had a higher PES than the Ti abutment, and this difference was statistically significant. In the literature to date, the PES of Zr abutments has been shown to be better for provisional restorations in implant prostheses, but few studies support the same for final implant restorations. CONCLUSION: The study did not reveal a clear advantage of either Ti or Zr abutments over the other. Nevertheless, Zr abutments tended to produce a more favorable color response in the peri-implant mucosa and led to superior esthetic outcomes as measured by the PES.
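The paired t-test named in the abstract suits this design because the two abutment types were placed within the same patients. A minimal stdlib sketch of the statistic follows; the CBL values are hypothetical, not data from the study.

```python
import math
from statistics import mean, stdev

def paired_t(x: list[float], y: list[float]) -> tuple[float, int]:
    """Paired t statistic and degrees of freedom for within-subject pairs.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical crestal bone loss (mm) for the same patients' Ti vs. Zr implants.
cbl_ti = [0.42, 0.51, 0.38, 0.47, 0.55]
cbl_zr = [0.40, 0.45, 0.36, 0.46, 0.50]
t_stat, df = paired_t(cbl_ti, cbl_zr)
```

The resulting t statistic would then be compared against a t distribution with n - 1 degrees of freedom.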


Subjects
Metabolic Bone Diseases, Dental Implants, Zirconium, Humans, Titanium, Dental Esthetics
4.
Psychon Bull Rev ; 31(1): 104-121, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37580454

ABSTRACT

Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.


Subjects
Speech Perception, Humans, Speech, Phonetics, Attention
5.
Cognition ; 242: 105661, 2024 01.
Article in English | MEDLINE | ID: mdl-37944313

ABSTRACT

Whether top-down feedback modulates perception has deep implications for cognitive theories. Debate has been vigorous in the domain of spoken word recognition, where competing computational models and agreement on at least one diagnostic experimental paradigm suggest that the debate may eventually be resolvable. Norris and Cutler (2021) revisit arguments against lexical feedback in spoken word recognition models. They also incorrectly claim that recent computational demonstrations that feedback promotes accuracy and speed under noise (Magnuson et al., 2018) were due to the use of the Luce choice rule rather than adding noise to inputs (noise was in fact added directly to inputs). They also claim that feedback cannot improve word recognition because feedback cannot distinguish signal from noise. We have two goals in this paper. First, we correct the record about the simulations of Magnuson et al. (2018). Second, we explain how interactive activation models selectively sharpen signals via joint effects of feedback and lateral inhibition that boost lexically-coherent sublexical patterns over noise. We also review a growing body of behavioral and neural results consistent with feedback and inconsistent with autonomous (non-feedback) architectures, and conclude that parsimony supports feedback. We close by discussing the potential for synergy between autonomous and interactive approaches.
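The mechanism this abstract describes, in which feedback and lateral inhibition jointly boost lexically coherent sublexical patterns over noise, can be illustrated with a toy interactive-activation sketch. The two-word lexicon and every parameter below are invented for demonstration; this is not the published TRACE implementation.

```python
# Toy interactive-activation sketch: word-to-phoneme feedback plus
# word-to-word lateral inhibition boosts a lexically coherent phoneme
# ('t' in "cat") over a noise phoneme ('x') with identical bottom-up input.
LEXICON = ["cat", "cab"]
FEEDBACK = 0.3   # strength of word-to-phoneme feedback
INHIBIT = 0.2    # strength of lateral inhibition between words

def run(bottom_up, feedback_on, steps=5):
    """Settle activations; phoneme keys are (position, phoneme) tuples."""
    phon = dict(bottom_up)
    words = {w: 0.0 for w in LEXICON}
    for _ in range(steps):
        for w in LEXICON:
            # Bottom-up support: mean activation of the word's phonemes.
            support = sum(phon.get((i, p), 0.0) for i, p in enumerate(w)) / len(w)
            # Lateral inhibition from the strongest competitor.
            rival = max((words[v] for v in LEXICON if v != w), default=0.0)
            words[w] = min(1.0, max(0.0, support - INHIBIT * rival))
        if feedback_on:
            # Top-down feedback: each word excites its own phonemes.
            for w, a in words.items():
                for i, p in enumerate(w):
                    phon[(i, p)] = min(1.0, phon.get((i, p), 0.0) + FEEDBACK * a)
    return phon, words

# /t/ and noise phoneme /x/ get identical bottom-up support in position 2,
# but only /t/ is consistent with a word in the lexicon.
BOTTOM_UP = {(0, "c"): 1.0, (1, "a"): 1.0, (2, "t"): 0.6, (2, "x"): 0.6}
phon_fb, words_fb = run(BOTTOM_UP, feedback_on=True)
phon_off, words_off = run(BOTTOM_UP, feedback_on=False)
```

With feedback on, the lexically coherent phoneme pulls ahead of the noise phoneme; with feedback off, the two remain tied, mirroring the selective sharpening the authors describe.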


Subjects
Speech Perception, Feedback, Speech Perception/physiology, Language, Noise
6.
Neurobiol Lang (Camb) ; 4(1): 145-177, 2023.
Article in English | MEDLINE | ID: mdl-37229142

ABSTRACT

Though the right hemisphere has been implicated in talker processing, it is thought to play a minimal role in phonetic processing, at least relative to the left hemisphere. Recent evidence suggests that the right posterior temporal cortex may support learning of phonetic variation associated with a specific talker. In the current study, listeners heard a male talker and a female talker, one of whom produced an ambiguous fricative in /s/-biased lexical contexts (e.g., epi?ode) and one who produced it in /∫/-biased contexts (e.g., friend?ip). Listeners in a behavioral experiment (Experiment 1) showed evidence of lexically guided perceptual learning, categorizing ambiguous fricatives in line with their previous experience. Listeners in an fMRI experiment (Experiment 2) showed differential phonetic categorization as a function of talker, allowing for an investigation of the neural basis of talker-specific phonetic processing, though they did not exhibit perceptual learning (likely due to characteristics of our in-scanner headphones). Searchlight analyses revealed that the patterns of activation in the right superior temporal sulcus (STS) contained information about who was talking and what phoneme they produced. We take this as evidence that talker information and phonetic information are integrated in the right STS. Functional connectivity analyses suggested that the process of conditioning phonetic identity on talker information depends on the coordinated activity of a left-lateralized phonetic processing system and a right-lateralized talker processing system. Overall, these results clarify the mechanisms through which the right hemisphere supports talker-specific phonetic processing.

7.
Cognition ; 237: 105467, 2023 08.
Article in English | MEDLINE | ID: mdl-37148640

ABSTRACT

Multiple lines of research have developed training approaches that foster category learning, with important translational implications for education. Increasing exemplar variability, blocking or interleaving by category-relevant dimension, and providing explicit instructions about diagnostic dimensions each have been shown to facilitate category learning and/or generalization. However, laboratory research often must distill the character of natural input regularities that define real-world categories. As a result, much of what we know about category learning has come from studies with simplifying assumptions. We challenge the implicit expectation that these studies reflect the process of category learning of real-world input by creating an auditory category learning paradigm that intentionally violates some common simplifying assumptions of category learning tasks. Across five experiments and nearly 300 adult participants, we used training regimes previously shown to facilitate category learning, but here drew from a more complex and multidimensional category space with tens of thousands of unique exemplars. Learning was equivalently robust across training regimes that changed exemplar variability, altered the blocking of category exemplars, or provided explicit instructions of the category-diagnostic dimension. Each drove essentially equivalent accuracy measures of learning generalization following 40 min of training. These findings suggest that auditory category learning across complex input is not as susceptible to training regime manipulation as previously thought.


Subjects
Psychological Generalization, Learning, Adult, Humans, Concept Formation
8.
Brain Lang ; 240: 105264, 2023 05.
Article in English | MEDLINE | ID: mdl-37087863

ABSTRACT

Theories suggest that speech perception is informed by listeners' beliefs of what phonetic variation is typical of a talker. A previous fMRI study found right middle temporal gyrus (RMTG) sensitivity to whether a phonetic variant was typical of a talker, consistent with literature suggesting that the right hemisphere may play a key role in conditioning phonetic identity on talker information. The current work used transcranial magnetic stimulation (TMS) to test whether the RMTG plays a causal role in processing talker-specific phonetic variation. Listeners were exposed to talkers who differed in how they produced voiceless stop consonants while TMS was applied to RMTG, left MTG, or scalp vertex. Listeners subsequently showed near-ceiling performance in indicating which of two variants was typical of a trained talker, regardless of previous stimulation site. Thus, even though the RMTG is recruited for talker-specific phonetic processing, modulation of its function may have only modest consequences.


Subjects
Phonetics, Speech Perception, Humans, Transcranial Magnetic Stimulation, Temporal Lobe/diagnostic imaging, Speech Perception/physiology, Magnetic Resonance Imaging
9.
Brain Lang ; 226: 105070, 2022 03.
Article in English | MEDLINE | ID: mdl-35026449

ABSTRACT

The study of perceptual flexibility in speech depends on a variety of tasks that feature a large degree of variability between participants. Of critical interest is whether measures are consistent within an individual or across stimulus contexts. This is particularly key for individual difference designs that are deployed to examine the neural basis or clinical consequences of perceptual flexibility. In the present set of experiments, we assess the split-half reliability and construct validity of five measures of perceptual flexibility: three of learning in a native language context (e.g., understanding someone with a foreign accent) and two of learning in a non-native context (e.g., learning to categorize non-native speech sounds). We find that most of these tasks show an appreciable level of split-half reliability, although construct validity was sometimes weak. This provides good evidence for reliability for these tasks, while highlighting possible upper limits on expected effect sizes involving each measure.
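Split-half reliability, the metric this abstract reports, is typically computed by correlating participants' scores on two halves of the trials and correcting with the Spearman-Brown formula. A minimal sketch follows; the per-trial scores are hypothetical, not the study's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(trial_scores):
    """Odd-even split-half correlation, stepped up via Spearman-Brown.

    trial_scores[i] is one participant's list of trial-level scores.
    """
    even = [sum(s[0::2]) / len(s[0::2]) for s in trial_scores]
    odd = [sum(s[1::2]) / len(s[1::2]) for s in trial_scores]
    r = pearson_r(even, odd)
    return 2 * r / (1 + r)   # Spearman-Brown prophecy formula

# Hypothetical per-trial accuracy for five participants (not study data).
scores = [
    [0.90, 0.85, 0.88, 0.92],
    [0.50, 0.55, 0.52, 0.58],
    [0.72, 0.70, 0.75, 0.68],
    [0.30, 0.35, 0.32, 0.28],
    [0.95, 0.90, 0.93, 0.91],
]
reliability = split_half_reliability(scores)
```

The odd-even split guards against fatigue or learning trends that a first-half/second-half split would confound with measurement error.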


Subjects
Speech Perception, Speech, Humans, Language, Phonetics, Reproducibility of Results
10.
J Exp Psychol Hum Percept Perform ; 47(12): 1673-1680, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34881952

ABSTRACT

Determining how human listeners achieve phonetic constancy despite a variable mapping between the acoustics of speech and phonemic categories is the longest standing challenge in speech perception. A clue comes from studies where the talker changes randomly between stimuli, which slows processing compared with a single-talker baseline. These multitalker processing costs have been observed most often in speeded monitoring paradigms, where participants respond whenever a specific item occurs. Notably, the conventional paradigm imposes attentional demands via two forms of varied mapping in mixed-talker conditions. First, target recycling (i.e., allowing items to serve as targets on some trials but as distractors on others) potentially prevents the development of task automaticity. Second, in mixed trials, participants must respond to two unique stimuli (i.e., one target produced by each talker), whereas in blocked conditions, they need respond to only a single target token. We seek to understand how attentional demands influence talker normalization, as measured by multitalker processing costs. Across four experiments, multitalker processing costs persisted when target recycling was not allowed but diminished when only one stimulus served as the target on mixed trials. We discuss the logic of using varied mapping to elicit attentional effects and implications for theories of speech perception.


Subjects
Speech Perception, Acoustics, Attention, Humans, Phonetics, Speech
11.
Atten Percept Psychophys ; 83(6): 2367-2376, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33948883

ABSTRACT

Researchers have hypothesized that in order to accommodate variability in how talkers produce their speech sounds, listeners must perform a process of talker normalization. Consistent with this proposal, several studies have shown that spoken word recognition is slowed when speech is produced by multiple talkers compared with when all speech is produced by one talker (a multitalker processing cost). Nusbaum and colleagues have argued that talker normalization is modulated by attention (e.g., Nusbaum & Morin, 1992, Speech Perception, Production and Linguistic Structure, pp. 113-134). Some of the strongest evidence for this claim is from a speeded monitoring study where a group of participants who expected to hear two talkers showed a multitalker processing cost, but a separate group who expected one talker did not (Magnuson & Nusbaum, 2007, Journal of Experimental Psychology, 33[2], 391-409). In that study, however, the sample size was small and the crucial interaction was not significant. In this registered report, we present the results of a well-powered attempt to replicate those findings. In contrast to the previous study, we did not observe multitalker processing costs in either of our groups. To rule out the possibility that the null result was due to task constraints, we conducted a second experiment using a speeded classification task. As in Experiment 1, we found no influence of expectations on talker normalization, with no multitalker processing cost observed in either group. Our data suggest that the previous findings of Magnuson and Nusbaum (2007) be regarded with skepticism and that talker normalization may not be permeable to high-level expectations.


Subjects
Motivation, Speech Perception, Attention, Humans, Phonetics, Speech
12.
J Exp Psychol Learn Mem Cogn ; 47(4): 685-704, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33983786

ABSTRACT

A challenge for listeners is to learn the appropriate mapping between acoustics and phonetic categories for an individual talker. Lexically guided perceptual learning (LGPL) studies have shown that listeners can leverage lexical knowledge to guide this process. For instance, listeners learn to interpret ambiguous /s/-/∫/ blends as /s/ if they have previously encountered them in /s/-biased contexts like epi?ode. Here, we examined whether the degree of preceding lexical support might modulate the extent of perceptual learning. In Experiment 1, we first demonstrated that perceptual learning could be obtained in a modified LGPL paradigm where listeners were first biased to interpret ambiguous tokens as one phoneme (e.g., /s/) and then later as another (e.g., /∫/). In subsequent experiments, we tested whether the extent of learning differed depending on whether targets encountered predictive contexts or neutral contexts prior to the auditory target (e.g., epi?ode). Experiment 2 used auditory sentence contexts (e.g., "I love The Walking Dead and eagerly await every new . . ."), whereas Experiment 3 used written sentence contexts. In Experiment 4, participants did not receive sentence contexts but rather saw the written form of the target word (episode) or filler text (########) prior to hearing the critical auditory token. While we consistently observed effects of context on in-the-moment processing of critical words, the size of the learning effect was not modulated by the type of context. We hypothesize that boosting lexical support through preceding context may not strongly influence perceptual learning when ambiguous speech sounds can be identified solely from lexical information.


Subjects
Learning, Phonetics, Speech Perception, Writing, Female, Humans, Knowledge, Male
13.
Psychon Bull Rev ; 28(4): 1381-1389, 2021 Aug.
Article in English | MEDLINE | ID: mdl-33852158

ABSTRACT

Pervasive behavioral and neural evidence for predictive processing has led to claims that language processing depends upon predictive coding. Formally, predictive coding is a computational mechanism where only deviations from top-down expectations are passed between levels of representation. In many cognitive neuroscience studies, a reduction of signal for expected inputs is taken as being diagnostic of predictive coding. In the present work, we show that despite not explicitly implementing prediction, the TRACE model of speech perception exhibits this putative hallmark of predictive coding, with reductions in total lexical activation, total lexical feedback, and total phoneme activation when the input conforms to expectations. These findings may indicate that interactive activation is functionally equivalent or approximant to predictive coding or that caution is warranted in interpreting neural signal reduction as diagnostic of predictive coding.
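The defining computation this abstract describes, passing only deviations from top-down expectations between levels, can be sketched in a few lines. The feature vectors below are invented stand-ins for phoneme-level activations, chosen only to illustrate why an expected input yields a reduced upward signal.

```python
# Minimal predictive-coding sketch: a higher level sends a prediction down,
# and only the deviation (prediction error) is passed back up. Input that
# conforms to expectations therefore produces a smaller upward signal --
# the putative neural hallmark discussed in the abstract.

def prediction_error(inp, prediction):
    """Element-wise error passed up between levels."""
    return [i - p for i, p in zip(inp, prediction)]

def upward_signal(error):
    """Total magnitude of the error signal sent to the next level."""
    return sum(abs(e) for e in error)

prediction = [1.0, 0.0, 0.5]   # top-down expectation
expected   = [0.9, 0.1, 0.5]   # input conforming to expectations
surprising = [0.1, 0.9, 0.0]   # input violating expectations

signal_expected = upward_signal(prediction_error(expected, prediction))
signal_surprising = upward_signal(prediction_error(surprising, prediction))
```

The abstract's point is that an interactive-activation model like TRACE shows the same signal reduction for expected input without implementing this error-passing scheme, so the reduction alone cannot diagnose predictive coding.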


Subjects
Speech Perception, Humans, Language
14.
Cogn Sci ; 45(4): e12962, 2021 04.
Article in English | MEDLINE | ID: mdl-33877697

ABSTRACT

A long-standing question in cognitive science is how high-level knowledge is integrated with sensory input. For example, listeners can leverage lexical knowledge to interpret an ambiguous speech sound, but do such effects reflect direct top-down influences on perception or merely postperceptual biases? A critical test case in the domain of spoken word recognition is lexically mediated compensation for coarticulation (LCfC). Previous LCfC studies have shown that a lexically restored context phoneme (e.g., /s/ in Christma#) can alter the perceived place of articulation of a subsequent target phoneme (e.g., the initial phoneme of a stimulus from a tapes-capes continuum), consistent with the influence of an unambiguous context phoneme in the same position. Because this phoneme-to-phoneme compensation for coarticulation is considered sublexical, scientists agree that evidence for LCfC would constitute strong support for top-down interaction. However, results from previous LCfC studies have been inconsistent, and positive effects have often been small. Here, we conducted extensive piloting of stimuli prior to testing for LCfC. Specifically, we ensured that context items elicited robust phoneme restoration (e.g., that the final phoneme of Christma# was reliably identified as /s/) and that unambiguous context-final segments (e.g., a clear /s/ at the end of Christmas) drove reliable compensation for coarticulation for a subsequent target phoneme. We observed robust LCfC in a well-powered, preregistered experiment with these pretested items (N = 40) as well as in a direct replication study (N = 40). These results provide strong evidence in favor of computational models of spoken word recognition that include top-down feedback.


Subjects
Speech Perception, Humans, Phonetics
15.
Atten Percept Psychophys ; 83(5): 2217-2228, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33754298

ABSTRACT

Because different talkers produce their speech sounds differently, listeners benefit from maintaining distinct generative models (sets of beliefs) about the correspondence between acoustic information and phonetic categories for different talkers. A robust literature on phonetic recalibration indicates that when listeners encounter a talker who produces their speech sounds idiosyncratically (e.g., a talker who produces their /s/ sound atypically), they can update their generative model for that talker. Such recalibration has been shown to occur in a relatively talker-specific way. Because listeners in ecological situations often meet several new talkers at once, the present study considered how the process of simultaneously updating two distinct generative models compares to updating one model at a time. Listeners were exposed to two talkers, one who produced /s/ atypically and one who produced /∫/ atypically. Critically, these talkers only produced these sounds in contexts where lexical information disambiguated the phoneme's identity (e.g., epi_ode, flouri_ing). When initial exposure to the two talkers was blocked by voice (Experiment 1), listeners recalibrated to these talkers after relatively little exposure to each talker (32 instances per talker, of which 16 contained ambiguous fricatives). However, when the talkers were intermixed during learning (Experiment 2), listeners required more exposure trials before they were able to adapt to the idiosyncratic productions of these talkers (64 instances per talker, of which 32 contained ambiguous fricatives). Results suggest that there is a perceptual cost to simultaneously updating multiple distinct generative models, potentially because listeners must first select which generative model to update.


Subjects
Speech Perception, Voice, Humans, Learning, Phonetics, Sound
16.
Neurobiol Lang (Camb) ; 2(1): 138-151, 2021.
Article in English | MEDLINE | ID: mdl-37213418

ABSTRACT

Neurobiological models of speech perception posit that both left and right posterior temporal brain regions are involved in the early auditory analysis of speech sounds. However, frank deficits in speech perception are not readily observed in individuals with right hemisphere damage. Instead, damage to the right hemisphere is often associated with impairments in vocal identity processing. Herein lies an apparent paradox: The mapping between acoustics and speech sound categories can vary substantially across talkers, so why might right hemisphere damage selectively impair vocal identity processing without obvious effects on speech perception? In this review, I attempt to clarify the role of the right hemisphere in speech perception through a careful consideration of its role in processing vocal identity. I review evidence showing that right posterior superior temporal, right anterior superior temporal, and right inferior / middle frontal regions all play distinct roles in vocal identity processing. In considering the implications of these findings for neurobiological accounts of speech perception, I argue that the recruitment of right posterior superior temporal cortex during speech perception may specifically reflect the process of conditioning phonetic identity on talker information. I suggest that the relative lack of involvement of other right hemisphere regions in speech perception may be because speech perception does not necessarily place a high burden on talker processing systems, and I argue that the extant literature hints at potential subclinical impairments in the speech perception abilities of individuals with right hemisphere damage.

17.
Cogn Sci ; 44(12): e12917, 2020 12.
Article in English | MEDLINE | ID: mdl-33274485

ABSTRACT

Visual word recognition is facilitated by the presence of orthographic neighbors that mismatch the target word by a single letter substitution. However, researchers typically do not consider where neighbors mismatch the target. In light of evidence that some letter positions are more informative than others, we investigate whether the influence of orthographic neighbors differs across letter positions. To do so, we quantify the number of enemies at each letter position (how many neighbors mismatch the target word at that position). Analyses of reaction time data from a visual word naming task indicate that the influence of enemies differs across letter positions, with the negative impacts of enemies being most pronounced at letter positions where readers have low prior uncertainty about which letters they will encounter (i.e., positions with low entropy). To understand the computational mechanisms that give rise to such positional entropy effects, we introduce a new computational model, VOISeR (Visual Orthographic Input Serial Reader), which receives orthographic inputs in parallel and produces an over-time sequence of phonemes as output. VOISeR produces a similar pattern of results as in the human data, suggesting that positional entropy effects may emerge even when letters are not sampled serially. Finally, we demonstrate that these effects also emerge in human subjects' data from a lexical decision task, illustrating the generalizability of positional entropy effects across visual word recognition paradigms. Taken together, such work suggests that research into orthographic neighbor effects in visual word recognition should also consider differences between letter positions.
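The two quantities this abstract relates, the number of enemies at a letter position and that position's entropy, are straightforward to compute. The sketch below uses a tiny invented lexicon for illustration, not the study's corpus or the VOISeR model.

```python
import math
from collections import Counter

def positional_entropy(lexicon, pos):
    """Shannon entropy (bits) of the letter distribution at one position."""
    counts = Counter(w[pos] for w in lexicon)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def enemies(word, lexicon, pos):
    """Neighbors mismatching `word` only at `pos` (single substitution)."""
    return sum(
        1 for w in lexicon
        if w != word and len(w) == len(word)
        and w[pos] != word[pos]
        and all(w[i] == word[i] for i in range(len(word)) if i != pos)
    )

# Tiny illustrative lexicon (not the study's corpus).
LEX = ["cat", "cab", "can", "cot", "bat"]
```

For "cat" in this lexicon, position 2 has two enemies ("cab", "can") and high letter entropy, while position 0 has one enemy ("bat") and low entropy; the abstract's finding is that enemy effects are most harmful at the low-entropy positions.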


Subjects
Visual Pattern Recognition, Reading, Humans, Reaction Time
18.
J Cogn Neurosci ; 32(10): 2001-2012, 2020 10.
Article in English | MEDLINE | ID: mdl-32662731

ABSTRACT

A listener's interpretation of a given speech sound can vary probabilistically from moment to moment. Previous experience (i.e., the contexts in which one has encountered an ambiguous sound) can further influence the interpretation of speech, a phenomenon known as perceptual learning for speech. This study used multivoxel pattern analysis to query how neural patterns reflect perceptual learning, leveraging archival fMRI data from a lexically guided perceptual learning study conducted by Myers and Mesite [Myers, E. B., & Mesite, L. M. Neural systems underlying perceptual adjustment to non-standard speech tokens. Journal of Memory and Language, 76, 80-93, 2014]. In that study, participants first heard ambiguous /s/-/∫/ blends in either /s/-biased lexical contexts (epi_ode) or /∫/-biased contexts (refre_ing); subsequently, they performed a phonetic categorization task on tokens from an /asi/-/a∫i/ continuum. In the current work, a classifier was trained to distinguish between phonetic categorization trials in which participants heard unambiguous productions of /s/ and those in which they heard unambiguous productions of /∫/. The classifier was able to generalize this training to ambiguous tokens from the middle of the continuum on the basis of individual participants' trial-by-trial perception. We take these findings as evidence that perceptual learning for speech involves neural recalibration, such that the pattern of activation approximates the perceived category. Exploratory analyses showed that left parietal regions (supramarginal and angular gyri) and right temporal regions (superior, middle, and transverse temporal gyri) were most informative for categorization. Overall, our results inform an understanding of how moment-to-moment variability in speech perception is encoded in the brain.


Subjects
Speech Perception, Speech, Humans, Language, Learning, Phonetics
19.
Cogn Sci ; 44(4): e12823, 2020 04.
Article in English | MEDLINE | ID: mdl-32274861

ABSTRACT

Despite the lack of invariance problem (the many-to-many mapping between acoustics and percepts), human listeners experience phonetic constancy and typically perceive what a speaker intends. Most models of human speech recognition (HSR) have side-stepped this problem, working with abstract, idealized inputs and deferring the challenge of working with real speech. In contrast, carefully engineered deep learning networks allow robust, real-world automatic speech recognition (ASR). However, the complexities of deep learning architectures and training regimens make it difficult to use them to provide direct insights into mechanisms that may support HSR. In this brief article, we report preliminary results from a two-layer network that borrows one element from ASR, long short-term memory nodes, which provide dynamic memory for a range of temporal spans. This allows the model to learn to map real speech from multiple talkers to semantic targets with high accuracy, with human-like timecourse of lexical access and phonological competition. Internal representations emerge that resemble phonetically organized responses in human superior temporal gyrus, suggesting that the model develops a distributed phonological code despite no explicit training on phonetic or phonemic targets. The ability to work with real speech is a major advance for cognitive models of HSR.


Subjects
Computer Simulation, Neurological Models, Neural Networks, Speech Perception, Speech, Female, Humans, Male, Phonetics, Semantics
20.
Brain Lang ; 198: 104692, 2019 11.
Article in English | MEDLINE | ID: mdl-31522094

ABSTRACT

Research has implicated the left inferior frontal gyrus (LIFG) in mapping acoustic-phonetic input to sound category representations, both in native speech perception and non-native phonetic category learning. At issue is whether this sensitivity reflects access to phonetic category information per se or to explicit category labels, the latter often being required by experimental procedures. The current study employed an incidental learning paradigm designed to increase sensitivity to a difficult non-native phonetic contrast without inducing explicit awareness of the categorical nature of the stimuli. Functional MRI scans revealed frontal sensitivity to phonetic category structure both before and after learning. Additionally, individuals who succeeded most on the learning task showed the largest increases in frontal recruitment after learning. Overall, results suggest that processing novel phonetic category information entails a reliance on frontal brain regions, even in the absence of explicit category labels.


Subjects
Brain/physiology, Language, Phonetics, Speech Perception/physiology, Verbal Learning/physiology, Acoustics, Adult, Brain/diagnostic imaging, Female, Frontal Lobe/diagnostic imaging, Frontal Lobe/physiology, Humans, Magnetic Resonance Imaging/methods, Male, Prefrontal Cortex/diagnostic imaging, Prefrontal Cortex/physiology, Sound