Locality and expectation effects in Hindi preverbal constituent ordering.

Ranjan, Sidharth; Rajkumar, Rajakrishnan; Agarwal, Sumeet.

Cognition ; 223: 104959, 2022 06.

Article in English | MEDLINE | ID: mdl-35091261

ABSTRACT

We investigate the relative impact of two influential theories of language comprehension, viz., Dependency Locality Theory (Gibson, 2000; DLT) and Surprisal Theory (Hale, 2001; Levy, 2008), on preverbal constituent ordering in Hindi, a predominantly SOV language with flexible word order. Prior work in Hindi has shown that word order scrambling is influenced by information structure constraints in discourse. However, the impact of cognitively grounded factors on Hindi constituent ordering is relatively underexplored. We test the hypothesis that dependency length minimization is a significant predictor of syntactic choice, once information status and surprisal measures (estimated from n-gram, i.e., trigram, and incremental dependency parsing models) have been added to a machine learning model. Towards this end, we set up a framework to generate meaning-equivalent grammatical variants of Hindi sentences by linearizing preverbal constituents of projective dependency trees in the Hindi-Urdu Treebank (HUTB) corpus of written text. Our results indicate that dependency length displays a weak effect in predicting reference sentences (amidst variants) over and above the aforementioned predictors. Overall, trigram surprisal outperforms dependency length and parser surprisal by a huge margin and our analyses indicate that maximizing lexical predictability is the primary driving force behind preverbal constituent ordering choices in Hindi. The success of trigram surprisal notwithstanding, dependency length minimization predicts non-canonical reference sentences having fronted direct objects over variants containing the canonical word order, cases where surprisal estimates fail due to their bias towards frequent structures and word sequences. Locality effects persist over the Given-New preference of subject-object ordering in Hindi. Accessibility and local statistical biases discussed in the sentence processing literature are plausible explanations for the success of trigram surprisal. Further, we conjecture that the presence of case markers is a strong factor potentially overriding the pressure for dependency length minimization in Hindi. Finally, we discuss the implications of our findings for the information locality hypothesis and theories of language production.
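
As a rough illustration of the variant-generation and dependency-length ideas described in this abstract, the following minimal sketch enumerates orderings of preverbal constituents with the verb kept final and scores each by total dependency length. It is not the authors' framework: the function names, the toy tokens, and the use of plain word-position distance (rather than a DLT integration cost) are all assumptions made for illustration.

```python
from itertools import permutations

def total_dependency_length(order, heads):
    """Sum of |position(dependent) - position(head)| over all arcs.

    `order` is a list of tokens; `heads` maps each dependent token to its head.
    Distance is counted in word positions (an illustrative simplification).
    """
    position = {word: i for i, word in enumerate(order)}
    return sum(abs(position[dep] - position[head]) for dep, head in heads.items())

def preverbal_variants(constituents, verb):
    """Yield every ordering of the preverbal constituents, keeping the verb final."""
    for perm in permutations(constituents):
        yield [word for chunk in perm for word in chunk] + [verb]

# Hypothetical toy tree (English glosses): "old" modifies "man";
# "man" and "book" both depend on the sentence-final verb "read".
heads = {"old": "man", "man": "read", "book": "read"}
constituents = [["old", "man"], ["book"]]

scored = [(total_dependency_length(v, heads), v)
          for v in preverbal_variants(constituents, "read")]
for length, variant in sorted(scored):
    print(length, " ".join(variant))
```

In this toy case the ordering that places the longer constituent first yields the smaller total, which is the kind of preference dependency length minimization would predict in a verb-final clause.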

Subject(s)

Language, Motivation, Comprehension, Humans, Machine Learning, Writing

Investigating locality effects and surprisal in written English syntactic choice phenomena.

Rajkumar, Rajakrishnan; van Schijndel, Marten; White, Michael; Schuler, William.

Cognition ; 155: 204-232, 2016 10.

Article in English | MEDLINE | ID: mdl-27428810

ABSTRACT

We investigate the extent to which syntactic choice in written English is influenced by processing considerations as predicted by Gibson's (2000) Dependency Locality Theory (DLT) and Surprisal Theory (Hale, 2001; Levy, 2008). A long line of previous work attests that languages display a tendency for shorter dependencies, and in a previous corpus study, Temperley (2007) provided evidence that this tendency exerts a strong influence on constituent ordering choices. However, Temperley's study included no frequency-based controls, and subsequent work on sentence comprehension with broad-coverage eye-tracking corpora found weak or negative effects of DLT-based measures when frequency effects were statistically controlled for (Demberg & Keller, 2008; van Schijndel, Nguyen, & Schuler, 2013; van Schijndel & Schuler, 2013), calling into question the actual impact of dependency locality on syntactic choice phenomena. Going beyond Temperley's work, we show that DLT integration costs are indeed a significant predictor of syntactic choice in written English even in the presence of competing frequency-based and cognitively motivated control factors, including n-gram probability and PCFG surprisal as well as embedding depth (Wu, Bachrach, Cardenas, & Schuler, 2010; Yngve, 1960). Our study also shows that the predictions of dependency length and surprisal are only moderately correlated, a finding which mirrors Demberg & Keller's (2008) results for sentence comprehension. Further, we demonstrate that the efficacy of dependency length in predicting the corpus choice increases with increasing head-dependent distances. At the same time, we find that the tendency towards dependency locality is not always observed, and with pre-verbal adjuncts in particular, non-locality cases are found more often than not. In contrast, surprisal is effective in these cases, and the embedding depth measures further increase prediction accuracy. We discuss the implications of our findings for theories of language comprehension and production, and conclude with a discussion of questions our work raises for future research.
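
The n-gram control mentioned in this abstract can be illustrated with a small trigram surprisal calculation. This is only a sketch under stated assumptions: the add-one smoothing, the padding symbols, the function names, and the toy corpus are hypothetical choices for illustration, and the abstract does not say how the study's own n-gram model was estimated.

```python
import math
from collections import Counter

def train_trigram_counts(sentences):
    """Collect trigram and bigram-context counts from tokenized sentences."""
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence + ["</s>"]
        vocab.update(tokens)
        for i in range(2, len(tokens)):
            trigrams[tuple(tokens[i - 2:i + 1])] += 1
            contexts[tuple(tokens[i - 2:i])] += 1
    return trigrams, contexts, len(vocab)

def sentence_surprisal(sentence, trigrams, contexts, vocab_size):
    """Total surprisal in bits: sum of -log2 P(w_i | w_{i-2}, w_{i-1}).

    Probabilities use add-one smoothing (an assumption made for this sketch).
    """
    tokens = ["<s>", "<s>"] + sentence + ["</s>"]
    total = 0.0
    for i in range(2, len(tokens)):
        context = tuple(tokens[i - 2:i])
        count = trigrams[context + (tokens[i],)]
        prob = (count + 1) / (contexts[context] + vocab_size)
        total += -math.log2(prob)
    return total

# Hypothetical one-sentence training corpus and two competing orderings.
trigrams, contexts, vocab_size = train_trigram_counts([["a", "b", "c"]])
seen = sentence_surprisal(["a", "b", "c"], trigrams, contexts, vocab_size)
reordered = sentence_surprisal(["b", "a", "c"], trigrams, contexts, vocab_size)
print(seen < reordered)
```

An ordering that matches frequent word sequences receives lower total surprisal than a reordered variant, which is the sense in which an n-gram measure acts as a frequency-based predictor of the corpus choice.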

Subject(s)

Comprehension, Linguistics, Models, Psychological, Writing, Choice Behavior, Humans, Memory, Reading
