Results 1 - 7 of 7
1.
Front Comput Neurosci; 16: 1057439, 2022.
Article in English | MEDLINE | ID: mdl-36618270

ABSTRACT

Introduction: In recent years, machines powered by deep learning have achieved near-human levels of performance in speech recognition. Machine and human performance in this domain have finally reached similar levels, despite huge differences in implementation, and so deep learning models can, in principle, serve as candidates for mechanistic models of the human auditory system. Methods: Using high-performance automatic speech recognition systems together with advanced non-invasive human neuroimaging technology, such as magnetoencephalography and multivariate pattern-information analysis, the current study aimed to relate machine-learned representations of speech to recorded human brain representations of the same speech. Results: In one direction, we found a quasi-hierarchical functional organization in human auditory cortex that qualitatively matched the hidden layers of deep artificial neural networks trained as part of an automatic speech recognizer. In the reverse direction, we modified the hidden-layer organization of the artificial neural network based on neural activation patterns in human brains. The result was a substantial improvement in word recognition accuracy and in the learned speech representations. Discussion: We have demonstrated that artificial and brain neural networks can be mutually informative in the domain of speech recognition.
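The core comparison in this abstract, relating hidden-layer representations of a network to brain response patterns, can be sketched with a representational-similarity-style analysis. Everything below is an illustrative stand-in: the arrays are random placeholders for real DNN activations and MEG sensor patterns, and the shapes and names are assumptions, not the study's actual pipeline.

```python
# Illustrative sketch: compare a hidden layer of an ASR network with brain
# response patterns via representational similarity analysis (RSA).
# All data are random stand-ins for real activations/recordings.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_stimuli = 40

def rdm(patterns):
    """Representational dissimilarity matrix: pairwise correlation distance
    between the response patterns evoked by each stimulus (condensed form)."""
    return pdist(patterns, metric="correlation")

# One pattern per stimulus: hidden-layer activations and MEG sensor patterns.
layer_acts = rng.normal(size=(n_stimuli, 256))   # hypothetical hidden layer
brain_acts = rng.normal(size=(n_stimuli, 102))   # hypothetical sensor pattern

# Second-order comparison: how similar are the two representational geometries?
rho, p = spearmanr(rdm(layer_acts), rdm(brain_acts))
print(f"layer-brain RDM correlation: rho={rho:.3f}, p={p:.3f}")
```

Repeating this per layer and per cortical region is one standard way to look for the quasi-hierarchical correspondence the abstract describes.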

2.
Vision Res; 145: 1-10, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29608936

ABSTRACT

In human visual processing, information from the visual field passes through numerous transformations before perceptual attributes such as colour are derived. The sequence of transforms involved in constructing perceptions of colour can be approximated by colour appearance models such as the CIE (2002) colour appearance model, abbreviated as CIECAM02. In this study, we test the plausibility of CIECAM02 as a model of colour processing by looking for evidence of its cortical entrainment. The CIECAM02 model predicts that colour is split into two opposing chromatic components, red-green and cyan-yellow (termed CIECAM02-a and CIECAM02-b, respectively), and an achromatic component (termed CIECAM02-A). Entrainment of cortical activity to the outputs of these components was estimated using measurements of electro- and magnetoencephalographic (EMEG) activity, recorded while healthy subjects watched videos of dots changing colour. We find entrainment to chromatic component CIECAM02-a at approximately 35 ms latency bilaterally in occipital lobe regions, and entrainment to achromatic component CIECAM02-A at approximately 75 ms latency, also bilaterally in occipital regions. For comparison, transforms from a less physiologically plausible model (CIELAB) were also tested, with no significant entrainment found.
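The chromatic/achromatic split the abstract describes can be illustrated with a crude opponent-colour decomposition. To be clear, the linear RGB combinations below are a simplified textbook-style stand-in: the real CIECAM02-a/b/A transforms involve chromatic adaptation and nonlinear compression and are not reproduced here.

```python
# Illustrative opponent-channel decomposition (NOT the actual CIECAM02 maths):
# split each colour into an achromatic signal and two opponent chromatic signals.
import numpy as np

def opponent_channels(rgb):
    """Map linear RGB rows (N, 3) to (achromatic, red-green, yellow-blue)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    achromatic = (r + g + b) / 3.0      # luminance-like component
    red_green = r - g                    # opponent chromatic axis 1
    yellow_blue = (r + g) / 2.0 - b      # opponent chromatic axis 2
    return np.stack([achromatic, red_green, yellow_blue], axis=1)

frames = np.array([[1.0, 0.0, 0.0],   # red dot
                   [0.0, 1.0, 0.0],   # green dot
                   [0.5, 0.5, 0.5]])  # grey dot: no chromatic opponency
print(opponent_channels(frames))
```

Time-courses of such components, computed frame by frame from the stimulus videos, are the kind of predictor signals that entrainment analyses test against the EMEG recordings.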


Subjects
Color Perception/physiology, Visual Cortex/physiology, Visual Pathways/physiology, Adolescent, Adult, Evoked Potentials, Visual, Female, Humans, Magnetoencephalography, Male, Models, Theoretical, Young Adult
3.
PLoS Comput Biol; 13(9): e1005617, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28945744

ABSTRACT

There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, and these provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental 'machine states', generated as the ASR analysis progresses over time, with the incremental 'brain states', measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features, defined on the basis of machine-extracted regularities in the speech-to-lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain.
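The patch-selectivity result above, cortical patches responding to individual machine-derived phonetic features, can be sketched as a per-patch correlation test. The data below are synthetic stand-ins (one simulated "selective" patch among noise patches), and the feature name is hypothetical; the study's actual multivariate method is more elaborate than this.

```python
# Illustrative sketch: find which simulated cortical "patches" track a
# machine-derived phonetic feature time-course. Synthetic data throughout.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(1)
n_t = 500
feature = rng.normal(size=n_t)       # hypothetical 'voicing' strength over time

# Ten patches: patch 3 carries the feature plus noise, the rest are pure noise.
patches = rng.normal(size=(10, n_t))
patches[3] += 2.0 * feature

# A patch counts as feature-selective if its correlation with the feature
# time-course is far beyond chance (very conservative threshold).
selective = [i for i in range(10)
             if pearsonr(patches[i], feature)[1] < 1e-8]
print("feature-selective patches:", selective)
```

In the real analysis this search runs over source-localized EMEG patches and many features, with permutation-based correction rather than a fixed p-value cutoff.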


Subjects
Brain/physiology, Models, Neurological, Neural Networks, Computer, Speech Perception/physiology, Speech Recognition Software, Adult, Electroencephalography, Female, Humans, Magnetoencephalography, Male, Young Adult
4.
J Comput Neurosci; 43(1): 1-4, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28643213

ABSTRACT

Describing the human brain in mathematical terms is an important ambition of neuroscience research, yet the challenges remain considerable. It was Alan Turing, writing in 1950, who first sought to demonstrate how time-consuming such an undertaking would be. By analogy with the computer program, Turing argued that arriving at a complete mathematical description of the mind would take well over a thousand years. In this opinion piece, we argue that, despite seventy years of progress in the field, his arguments remain both prescient and persuasive.


Subjects
Brain/physiology, Mathematics, Models, Neurological, Humans
5.
Hear Res; 344: 244-254, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27915027

ABSTRACT

A prominent feature of the auditory system is that neurons show tuning to audio frequency; each neuron has a characteristic frequency (CF) to which it is most sensitive. Furthermore, there is an orderly mapping of CF to position, which is called tonotopic organization and which is observed at many levels of the auditory system. In a previous study (Thwaites et al., 2016) we examined cortical entrainment to two auditory transforms predicted by a model of loudness, instantaneous loudness and short-term loudness, using speech as the input signal. The model is based on the assumption that neural activity is combined across CFs (i.e. across frequency channels) before the transform to short-term loudness. However, it is also possible that short-term loudness is determined on a channel-specific basis. Here we tested these possibilities by assessing neural entrainment to the overall and channel-specific instantaneous loudness and the overall and channel-specific short-term loudness. The results showed entrainment to channel-specific instantaneous loudness at latencies of 45 and 100 ms (bilaterally, in and around Heschl's gyrus). There was entrainment to overall instantaneous loudness at 165 ms in dorso-lateral sulcus (DLS). Entrainment to overall short-term loudness occurred primarily at 275 ms, bilaterally in DLS and superior temporal sulcus. There was only weak evidence for entrainment to channel-specific short-term loudness.
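The overall-versus-channel-specific distinction in this abstract can be illustrated by splitting a sound into frequency channels and comparing per-channel envelopes with their sum. The band edges and the Hilbert-envelope proxy below are illustrative choices, not the loudness model used in the study.

```python
# Illustrative sketch: channel-specific vs. overall predictors from a sound.
# Crude stand-ins for loudness-model outputs; bands chosen for demonstration.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
# Toy "speech": a 300 Hz component plus a weaker 2 kHz component.
sound = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

bands = [(100, 500), (500, 1500), (1500, 4000)]   # illustrative CF channels

def envelope(x, lo, hi):
    """Band-limit the signal, then take its Hilbert envelope."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))

# Channel-specific predictors: one envelope per frequency channel.
channel_env = np.array([envelope(sound, lo, hi) for lo, hi in bands])
# Overall predictor: combined across channels before any later transform.
overall_env = channel_env.sum(axis=0)
print(channel_env.shape, overall_env.shape)
```

Testing entrainment to each `channel_env` row separately, versus to `overall_env`, mirrors the abstract's contrast between channel-specific and overall loudness.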


Subjects
Auditory Pathways/physiology, Brain Mapping/methods, Cerebral Cortex/physiology, Hearing, Loudness Perception, Magnetoencephalography, Neurons/physiology, Speech Perception, Acoustic Stimulation, Adolescent, Adult, Audiometry, Speech, Auditory Pathways/cytology, Cerebral Cortex/cytology, Female, Humans, Male, Models, Neurological, Reaction Time, Time Factors, Young Adult
6.
Front Neurosci; 10: 183, 2016.
Article in English | MEDLINE | ID: mdl-27199645

ABSTRACT

Acoustic signals pass through numerous transforms in the auditory system before perceptual attributes such as loudness and pitch are derived. However, relatively little is known about exactly when these transformations happen, or where, cortically or sub-cortically, they occur. To examine this, we investigated the latencies and locations of cortical entrainment to two transforms predicted by a model of loudness perception for time-varying sounds: instantaneous loudness and short-term loudness, where the latter is hypothesized to be derived from the former and should therefore occur later in time. Entrainment of cortical activity was estimated from electro- and magneto-encephalographic (EMEG) activity, recorded while healthy subjects listened to continuous speech. There was entrainment to instantaneous loudness bilaterally at 45, 100, and 165 ms, in Heschl's gyrus, dorsal lateral sulcus, and Heschl's gyrus, respectively. Entrainment to short-term loudness was found in both the dorsal lateral sulcus and superior temporal sulcus at 275 ms. These results suggest that short-term loudness is derived from instantaneous loudness, and that this derivation occurs after processing in sub-cortical structures.
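The lag-scan logic behind latency estimates like "165 ms" can be sketched as follows: correlate a stimulus-derived predictor with the cortical signal at a range of lags and report the best-matching one. The data here are synthetic (a delayed, noisy copy of the predictor standing in for a neural recording); real analyses work on source-localized EMEG with permutation statistics.

```python
# Illustrative sketch of entrainment-latency estimation with synthetic data.
import numpy as np

rng = np.random.default_rng(2)
fs = 1000                            # 1 kHz sampling, so 1 sample = 1 ms
n = 5000
predictor = rng.normal(size=n)       # e.g. an instantaneous-loudness time-course

true_lag = 165                       # simulate a cortical response delayed 165 ms
neural = np.roll(predictor, true_lag) + 0.5 * rng.normal(size=n)

lags = np.arange(0, 400)

def corr_at(lag):
    """Correlation between the predictor and the neural signal shifted by lag."""
    return np.corrcoef(predictor[: n - lag], neural[lag:])[0, 1]

scores = np.array([corr_at(lag) for lag in lags])
print(f"estimated latency: {lags[scores.argmax()]} ms")
```

Scanning this score over lags and cortical locations yields latency/region profiles like the 45/100/165/275 ms results reported in the abstract.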

7.
Article in English | MEDLINE | ID: mdl-25713530

ABSTRACT

A primary objective for cognitive neuroscience is to identify how features of the sensory environment are encoded in neural activity. Current auditory models of loudness perception can be used to make detailed predictions about the neural activity of the cortex as an individual listens to speech. We used two such models (loudness-sones and loudness-phons), varying in their psychophysiological realism, to predict the instantaneous loudness contours produced by 480 isolated words. These two sets of 480 contours were used to search for electrophysiological evidence of loudness processing in whole-brain recordings of electro- and magneto-encephalographic (EMEG) activity, recorded while subjects listened to the words. The technique identified a bilateral sequence of loudness processes, predicted by the more realistic loudness-sones model, that begin in auditory cortex at ~80 ms and subsequently reappear, tracking progressively down the superior temporal sulcus (STS) at lags from 230 to 330 ms. The technique was then extended to search for regions sensitive to the fundamental frequency (F0) of the voiced parts of the speech. It identified a bilateral F0 process in auditory cortex at a lag of ~90 ms, which was not followed by activity in STS. The results suggest that loudness information is being used to guide the analysis of the speech stream as it proceeds beyond auditory cortex down STS toward the temporal pole.
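The sone/phon distinction behind the two model variants can be made concrete with Stevens' textbook rule: loudness in sones doubles for every 10-phon increase above 40 phons, so the two scales are nonlinearly related and yield differently shaped loudness contours. This simplified relation is a stand-in for, not a reproduction of, the study's loudness models.

```python
# Stevens' approximation relating the phon and sone loudness scales:
# 40 phons = 1 sone, and each +10 phons doubles the loudness in sones.

def phon_to_sone(level_phon: float) -> float:
    """Convert a loudness level in phons to loudness in sones."""
    return 2 ** ((level_phon - 40.0) / 10.0)

contour_phons = [40, 50, 60, 70]                    # a toy loudness contour
contour_sones = [phon_to_sone(L) for L in contour_phons]
print(contour_sones)   # equal 10-phon steps double the sone value each time
```

A contour that rises linearly in phons thus grows exponentially in sones, which is one way predictor sets built on the two scales can diverge and be distinguished against neural data.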
