Results 1 - 2 of 2
1.
Top Cogn Sci; 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38635667

ABSTRACT

According to the parallel architecture, syntactic and semantic information processing are two separate streams that interact selectively during language comprehension. While considerable effort has been devoted in psycho- and neurolinguistics to understanding the interplay of these processing mechanisms in human comprehension, the nature of this interaction in recent neural Large Language Models remains elusive. In this article, we revisit influential linguistic and behavioral experiments and evaluate the ability of a large language model, GPT-3, to perform these tasks. The model can solve semantic tasks independently of syntactic realization in a manner that resembles human behavior. However, the outcomes present a complex and variegated picture, leaving open the question of how Language Models could learn structured conceptual representations.

2.
Cogn Sci; 47(11): e13386, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38009752

ABSTRACT

Word co-occurrence patterns in language corpora contain a surprising amount of conceptual knowledge. Large language models (LLMs), trained to predict words in context, leverage these patterns to achieve impressive performance on diverse semantic tasks requiring world knowledge. An important but understudied question about LLMs' semantic abilities is whether they acquire generalized knowledge of common events. Here, we test whether five pretrained LLMs (from 2018's BERT to 2023's MPT) assign a higher likelihood to plausible descriptions of agent-patient interactions than to minimally different implausible versions of the same event. Using three curated sets of minimal sentence pairs (total n = 1215), we found that pretrained LLMs possess substantial event knowledge, outperforming other distributional language models. In particular, they almost always assign a higher likelihood to possible versus impossible events (The teacher bought the laptop vs. The laptop bought the teacher). However, LLMs show less consistent preferences for likely versus unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLM scores generalize well across syntactic variants (active vs. passive constructions) but less well across semantic variants (synonymous sentences), (iii) some LLM errors mirror human judgment ambiguity, and (iv) sentence plausibility serves as an organizing dimension in internal LLM representations. Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
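The core scoring protocol, assigning each member of a minimal sentence pair a likelihood under a language model and checking whether the plausible version scores higher, can be sketched as follows. A minimal sketch: a toy add-alpha-smoothed bigram model stands in for the pretrained LLMs (BERT, MPT, etc.) the study actually evaluates, and the tiny corpus, the example pair, and all function names here are illustrative assumptions, not the paper's materials.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over a list of whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sentence_logprob(sentence, unigrams, bigrams, vocab_size, alpha=1.0):
    """Add-alpha-smoothed bigram log-probability of a sentence."""
    tokens = ["<s>"] + sentence.lower().split()
    lp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        num = bigrams[(prev, cur)] + alpha
        den = unigrams[prev] + alpha * vocab_size
        lp += math.log(num / den)
    return lp

# Toy training corpus (illustrative stand-in for real distributional data).
corpus = [
    "the teacher bought the laptop",
    "the teacher bought the book",
    "the student bought the laptop",
]
uni, bi = train_bigram(corpus)
V = len(uni)

# A possible/impossible minimal pair in the style of the paper's example.
plausible = "the teacher bought the laptop"
implausible = "the laptop bought the teacher"

# The model "prefers" whichever member of the pair has the higher log-likelihood.
prefers_plausible = (
    sentence_logprob(plausible, uni, bi, V)
    > sentence_logprob(implausible, uni, bi, V)
)
```

Even this crude co-occurrence model prefers the plausible sentence, because "laptop bought" never occurs in its training data; the study's point is that large pretrained LLMs make such possible/impossible distinctions almost always, while likely/unlikely distinctions remain less consistent.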


Subjects
Language, Semantics, Male, Humans, Knowledge, Reading, Judgment