Search | VHL Regional Portal

An information-theoretic analysis of targeted regressions during reading.

Wilcox, Ethan Gotlieb; Pimentel, Tiago; Meister, Clara; Cotterell, Ryan.

Cognition ; 249: 105765, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38772254

ABSTRACT

Regressions, or backward saccades, are common during reading, accounting for between 5% and 20% of all saccades. And yet, relatively little is known about what causes them. We provide an information-theoretic operationalization for two previous qualitative hypotheses about regressions, which we dub reactivation and reanalysis. We argue that these hypotheses make different predictions about the pointwise mutual information or pmi between a regression's source and target. Intuitively, the pmi between two words measures how much more (or less) likely one word is to be present given the other. On one hand, the reactivation hypothesis predicts that regressions occur between words that are associated, implying high positive values of pmi. On the other hand, the reanalysis hypothesis predicts that regressions should occur between words that are not associated with each other, implying negative, low values of pmi. As a second theoretical contribution, we expand on previous theories by considering not only pmi but also expected values of pmi, E[pmi], where the expectation is taken over all possible realizations of the regression's target. The rationale for this is that language processing involves making inferences under uncertainty, and readers may be uncertain about what they have read, especially if a previous word was skipped. To test both theories, we use contemporary language models to estimate pmi-based statistics over word pairs in three corpora of eye tracking data in English, as well as in six languages across three language families (Indo-European, Uralic, and Turkic). Our results are consistent across languages and models tested: Positive values of pmi and E[pmi] consistently help to predict the patterns of regressions during reading, whereas negative values of pmi and E[pmi] do not. Our information-theoretic interpretation increases the predictive scope of both theories and our studies present the first systematic crosslinguistic analysis of regressions in the literature. Our results support the reactivation hypothesis and, more broadly, they expand the number of language processing behaviors that can be linked to information-theoretic principles.
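
The two quantities named in the abstract above, pmi and E[pmi], can be illustrated with a minimal sketch. It assumes the probabilities are already available (the abstract says they are estimated with language models, but does not give the estimator). The function names, the choice of base-2 logarithms, and the toy numbers are assumptions made only for illustration and are not taken from the paper.

```python
import math


def pmi(p_joint: float, p_source: float, p_target: float) -> float:
    """Pointwise mutual information (in bits) between two words.

    Positive when the words co-occur more often than independence
    would predict, negative when they co-occur less often.
    """
    return math.log2(p_joint / (p_source * p_target))


def expected_pmi(target_probs: dict, pmi_by_target: dict) -> float:
    """Expected pmi over possible realizations of the regression target.

    target_probs: hypothetical reader belief about which word occupied
                  the target position (probabilities summing to 1).
    pmi_by_target: pmi between the source word and each candidate.
    """
    return sum(p * pmi_by_target[w] for w, p in target_probs.items())


# Toy numbers, chosen only to make the arithmetic easy to follow.
associated = pmi(p_joint=0.1, p_source=0.2, p_target=0.25)    # joint is 2x the independent rate
independent = pmi(p_joint=0.05, p_source=0.2, p_target=0.25)  # joint equals the independent rate

belief = {"bank": 0.5, "tank": 0.5}
e_pmi = expected_pmi(belief, {"bank": 1.0, "tank": -1.0})
```

In this reading, a positive `associated` value is the pattern the reactivation hypothesis predicts for a regression's source and target, and `expected_pmi` averages over what the reader might believe the target word was, for example after skipping it.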

Subject(s)

Reading , Saccades , Humans , Saccades/physiology , Information Theory , Adult , Psycholinguistics , Young Adult

Large-scale evidence for logarithmic effects of word predictability on reading time.

Shain, Cory; Meister, Clara; Pimentel, Tiago; Cotterell, Ryan; Levy, Roger.

Proc Natl Acad Sci U S A ; 121(10): e2307876121, 2024 Mar 05.

Article in English | MEDLINE | ID: mdl-38422017

ABSTRACT

During real-time language comprehension, our minds rapidly decode complex meanings from sequences of words. The difficulty of doing so is known to be related to words' contextual predictability, but what cognitive processes do these predictability effects reflect? In one view, predictability effects reflect facilitation due to anticipatory processing of words that are predictable from context. This view predicts a linear effect of predictability on processing demand. In another view, predictability effects reflect the costs of probabilistic inference over sentence interpretations. This view predicts either a logarithmic or a superlogarithmic effect of predictability on processing demand, depending on whether it assumes pressures toward a uniform distribution of information over time. The empirical record is currently mixed. Here, we revisit this question at scale: We analyze six reading datasets, estimate next-word probabilities with diverse statistical language models, and model reading times using recent advances in nonlinear regression. Results support a logarithmic effect of word predictability on processing difficulty, which favors probabilistic inference as a key component of human language processing.
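
The contrast between a linear and a logarithmic effect of predictability, as described in the abstract above, can be sketched as follows. The `linear_cost` form (one minus probability) is an illustrative assumption and not the paper's model; `surprisal` is the standard negative log probability, here in bits.

```python
import math


def surprisal(p: float) -> float:
    """Logarithmic cost: negative log probability, in bits."""
    return -math.log2(p)


def linear_cost(p: float) -> float:
    """Hypothetical linear cost: decreases in direct proportion to probability."""
    return 1.0 - p


# Halving a word's probability adds the same cost (1 bit) under the
# logarithmic view, regardless of how probable the word was to begin with.
log_step_high = surprisal(0.25) - surprisal(0.5)
log_step_low = surprisal(0.005) - surprisal(0.01)

# Under the linear view, the same halving matters much less for words
# that were already improbable.
lin_step_high = linear_cost(0.25) - linear_cost(0.5)
lin_step_low = linear_cost(0.005) - linear_cost(0.01)
```

The two views therefore diverge most for low-probability words, where the logarithmic account still predicts a sizeable change in processing cost and the linear account predicts almost none.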

Subject(s)

Comprehension , Language , Humans , Models, Statistical

Quantifying gender bias towards politicians in cross-lingual language models.

Stanczak, Karolina; Ray Choudhury, Sagnik; Pimentel, Tiago; Cotterell, Ryan; Augenstein, Isabelle.

PLoS One ; 18(11): e0277640, 2023.

Article in English | MEDLINE | ID: mdl-38015835

ABSTRACT

Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language. The present paper introduces a simple method for probing language models to conduct a multilingual study of gender bias towards politicians. We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender. To this end, we curate a dataset of 250k politicians worldwide, including their names and gender. Our study is conducted in seven languages across six different language modeling architectures. The results demonstrate that pre-trained language models' stance towards politicians varies strongly across analyzed languages. We find that while some words such as "dead" and "designated" are associated with both male and female politicians, a few specific words such as "beautiful" and "divorced" are predominantly associated with female politicians. Finally, and contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.

Subject(s)

Multilingualism , Names , Humans , Female , Male , Sexism , Language , Bias

Visual Comparison of Language Model Adaptation.

Sevastjanova, Rita; Cakmak, Eren; Ravfogel, Shauli; Cotterell, Ryan; El-Assady, Mennatallah.

IEEE Trans Vis Comput Graph ; 29(1): 1178-1188, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36166530

ABSTRACT

Neural language models are widely used; however, their model parameters often need to be adapted to the specific domains and tasks of an application, which is time- and resource-consuming. Thus, adapters have recently been introduced as a lightweight alternative for model adaptation. They consist of a small set of task-specific parameters with a reduced training time and simple parameter composition. The simplicity of adapter training and composition comes along with new challenges, such as maintaining an overview of adapter properties and effectively comparing their produced embedding spaces. To help developers overcome these challenges, we provide a twofold contribution. First, in close collaboration with NLP researchers, we conducted a requirement analysis for an approach supporting adapter evaluation and detected, among others, the need for both intrinsic (i.e., embedding similarity-based) and extrinsic (i.e., prediction-based) explanation methods. Second, motivated by the gathered requirements, we designed a flexible visual analytics workspace that enables the comparison of adapter properties. In this paper, we discuss several design iterations and alternatives for interactive, comparative visual explanation methods. Our comparative visualizations show the differences in the adapted embedding vectors and prediction outcomes for diverse human-interpretable concepts (e.g., person names, human qualities). We evaluate our workspace through case studies and show that, for instance, an adapter trained on the language debiasing task according to context-0 (decontextualized) embeddings introduces a new type of bias where words (even gender-independent words such as countries) become more similar to female- than male pronouns. We demonstrate that these are artifacts of context-0 embeddings, and the adapter effectively eliminates the gender information from the contextualized word representations.

Subject(s)

Computer Graphics , Natural Language Processing , Male , Female , Humans , Language , Software , Artifacts
