Results 1 - 20 of 44
1.
Dev Sci ; : e13543, 2024 Jul 04.
Article in English | MEDLINE | ID: mdl-38961809

ABSTRACT

There is substantial evidence that children's apparent omission of grammatical morphemes in utterances such as "She play tennis" and "Mummy eating" is in fact errors of commission in which contextually licensed unmarked forms encountered in the input are reproduced in a context-blind fashion. So how do children stop making such errors? In this study, we test the assumption that children's ability to recover from error is related to their developing sensitivity to longer-range dependencies. We use a pre-registered corpus analysis to explore the predictive value of different cues with regard to children's verb-marking errors and observe a developmental pattern consistent with this account. We look at context-independent cues (the identity of the specific verb being used) and at the relative value of context-dependent cues (the identity of the specific subject+verb sequence being used). We find that the only consistent effect across a group of 2- to 3-year-olds and a group of 3- to 4-year-olds is the relative frequency of unmarked forms of specific subject+verb sequences being used. The relative frequency of unmarked forms of the verb alone is predictive only in the younger age group. This is consistent with an account in which children recover from making errors by becoming progressively more sensitive to context, at first the immediately preceding lexical contexts (e.g., the subject that precedes the verb) and eventually more distant grammatical markers (e.g., the fronted auxiliary that precedes the subject in questions). RESEARCH HIGHLIGHTS: We provide a corpus analysis investigating input effects on young children's verb-marking errors (e.g., Mummy go) across development (between 2 and 4 years of age). We find evidence that these apparent errors of omission are in fact input-driven errors of commission that persist into the third year of life. 
We compare the relative effect on error rates of context-independent (e.g., verb) and context-dependent (e.g., subject+verb sequence) cues across developmental time. Our findings support the proposal that children recover from making verb-marking errors by becoming progressively more sensitive to preceding context.

2.
Front Psychol ; 15: 1384629, 2024.
Article in English | MEDLINE | ID: mdl-38784615

ABSTRACT

Dependency distance (DD) is an important factor in language processing and can affect the ease with which a sentence is understood. Previous studies have investigated the role of DD in L2 writing, but little is known about how the native language influences DD in L2 academic writing. This study is probably the first one that investigates, through a large dataset of over 400 million words, whether the native language of L2 writers influences the DD in their academic writings. Using a dataset of over 2.2 million abstracts of articles downloaded from Scopus in the fields of Arts & Humanities and Social Sciences, the study analyzes the DD patterns, parsed by the latest version of the syntactic parser Stanford CoreNLP 4.5.5, in the academic writing of L2 learners from different language backgrounds. It is found that native languages influence the DD of English L2 academic writings. When the mean dependency distance (MDD) of native languages is much longer than that of native English, the MDD of their English L2 academic writings will be much longer than that of English native academic writings. The findings of this study will deepen our insights into the influence of native language transfer on L2 academic writing, potentially shaping pedagogical strategies in L2 academic writing education.
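
As an illustration of the measure named in this abstract, the sketch below computes a mean dependency distance for one parsed sentence. It assumes the commonly used definition (the average absolute difference between the linear positions of each word and its head, with the root excluded); the abstract does not state the exact formula the authors applied, and the function name, the input format, and the example parse are hypothetical.

```python
def mean_dependency_distance(heads):
    """Return the mean absolute distance between each word and its head.

    heads[i] is the 1-based position of the head of word i + 1;
    a value of 0 marks the root, which has no head and is skipped.
    This input format is an assumption made for the sketch.
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        return 0.0
    return sum(distances) / len(distances)


# Hypothetical parse of "The boy ate an apple":
# The -> boy (2), boy -> ate (3), ate = root (0), an -> apple (5), apple -> ate (3)
example_heads = [2, 3, 0, 5, 3]
print(mean_dependency_distance(example_heads))
```

In practice the head indices would come from a dependency parser's output rather than being written by hand, and a corpus-level value would average over all dependencies in all sentences.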

3.
Cognition ; 245: 105694, 2024 04.
Article in English | MEDLINE | ID: mdl-38309042

ABSTRACT

Most research regarding early word learning in English tends to make the simplifying assumption that there exists a one-to-one mapping between concrete objects and their labels. In the current work, we provide evidence that runs counter to this assumption, aligning English with more morphologically-rich languages. We suggest that even in a morphologically-poor language like English, real world language input to infants does not provide tidy 1-to-1 mappings. Instead, infants encounter many variant wordforms for familiar nouns (e.g. dog∼doggy∼dogs). We explore this wordform variability in 44 English-learning infants' naturalistic environments using a longitudinal corpus of infant-available speech. We look at both the frequency and composition of wordform variability. We find two broad categories of variability: referent-changing alterations, where words were pluralized or compounded (e.g. coat∼raincoats); and wordplay, where words changed form without a notable change in referent (e.g. bird∼birdie). We further find that wordplay occurs with a limited number of lemmas that are usually early-learned, high-frequency, and shorter. When looking at all wordform variability, we find that individual words with higher levels of wordform variability are learned earlier than words with fewer wordforms, over and above the effect of frequency.


Subject(s)
Language Development , Speech Perception , Infant , Humans , Animals , Dogs , Language , Verbal Learning , Learning , Speech
4.
Proc Natl Acad Sci U S A ; 121(1): e2220898120, 2024 01 02.
Article in English | MEDLINE | ID: mdl-38150495

ABSTRACT

Like biological species, words in language must compete to survive. Previously, it has been shown that language changes in response to cognitive constraints and over time becomes more learnable. Here, we use two complementary research paradigms to demonstrate how the survival of existing word forms can be predicted by psycholinguistic properties that impact language production. In the first study, we analyzed the survival of words in the context of interpersonal communication. We analyzed data from a large-scale serial-reproduction experiment in which stories were passed down along a transmission chain over multiple participants. The results show that words that are acquired earlier in life, more concrete, more arousing, and more emotional are more likely to survive retellings. We reason that the same trend might scale up to language evolution over multiple generations of natural language users. If that is the case, the same set of psycholinguistic properties should also account for the change of word frequency in natural language corpora over historical time. That is what we found in two large historical-language corpora (Study 2): Early acquisition, concreteness, and high arousal all predict increasing word frequency over the past 200 y. However, the two studies diverge with respect to the impact of word valence and word length, which we take up in the discussion. By bridging micro-level behavioral preferences and macro-level language patterns, our investigation sheds light on the cognitive mechanisms underlying word competition.


Subject(s)
Language , Psycholinguistics , Humans , Emotions/physiology , Arousal/physiology , Cognition
5.
JMIR Infodemiology ; 3: e48189, 2023 Sep 29.
Article in English | MEDLINE | ID: mdl-37773617

ABSTRACT

BACKGROUND: Methamphetamine is a highly addictive stimulant that affects the central nervous system. Crystal methamphetamine is a form of the drug resembling glass fragments or shiny bluish-white rocks that can be taken through smoking, swallowing, snorting, or injecting the powder once it has been dissolved in water or alcohol. OBJECTIVE: The objective of this study is to examine how identities are socially (discursively) constructed by people who use methamphetamine within a subreddit for people who regularly use crystal meth. METHODS: Using a mixed methods approach, we analyzed 1000 threads (318,422 words) from a subreddit for regular crystal meth users. The qualitative component of the analysis used concordancing and corpus-based discourse analysis to identify discursive themes informed by assemblage theory. The quantitative portion of the analysis used corpus linguistic techniques including keyword analysis to identify words occurring with statistically marked frequency in the corpus and collocation analysis to analyze their discursive context. RESULTS: Our findings reveal that the subreddit contributors use a rich and varied lexicon to describe crystal meth and other substances, ranging from a neuroscientific register (eg, methamphetamine and dopamine) to informal vernacular (eg, meth, dope, and fent) and commercial appellations (eg, Adderall and Seroquel). They also use linguistic resources to construct symbolic boundaries between different types of methamphetamine users, differentiating between the esteemed category of "functional addicts" and relegating others to the stigmatized category of "tweakers." In addition, contributors contest the dominant view that methamphetamine use inevitably leads to psychosis, arguing instead for a more nuanced understanding that considers the interplay of factors such as sleep deprivation, poor nutrition, and neglected hygiene. 
CONCLUSIONS: The subreddit contributors' discourse offers a "set and setting" perspective, which provides a fresh viewpoint on drug-induced psychosis and can guide future harm reduction strategies and research. In contrast to this view, many previous studies overlook the real-world complexities of methamphetamine use, perhaps due to the use of controlled experimental settings. Actual drug use, intoxication, and addiction are complex, multifaceted, and elusive phenomena that defy straightforward characterization.


Subject(s)
Amphetamine-Related Disorders , Central Nervous System Stimulants , Methamphetamine , Humans , Methamphetamine/adverse effects , Central Nervous System Stimulants/adverse effects , Smoking , Tobacco Smoking
6.
Curr Psychol ; 42(19): 16176-16190, 2023.
Article in English | MEDLINE | ID: mdl-37554948

ABSTRACT

This interdisciplinary study examined the structure of humor creation in the specific context of efforts to positively reappraise stressful situations for effective coping. In a sample of n = 101 participants, a performance test was used to assess the quantity (fluency, number of generated ideas that qualified as humor) and quality (rated funniness) of humor creation in cognitive reappraisal. Linguistic mechanisms were identified and quantified using cognitive-linguistic methods of corpus analysis, and their employment was correlated with humor production performance on the level of the individual. Almost all individuals were able to come up with reappraisal ideas that qualified as humorous. Depressive symptoms, a negative mood state, and high perceptions of threat did not compromise the participants' capability to create humor. Individuals who were more serious-minded as a trait produced ideas that were rated as less funny, but their basic ability to create humor was unaffected. Metonymy (a contiguity-based principle of meaning extension) emerged as by far the most prominent semantic mechanism in the creation of humorous re-interpretations. Furthermore, its use was related to good humor creation performance in terms of quantity and quality, which is in line with its assumed importance in the extension of meaning in general and the creation of humor in particular. Further effective linguistic mechanisms and conceptual phenomena were identified. The empirical data may be valuable for the development of interventions involving the creation of humorous ideas for cognitive reappraisal.

7.
Appl Corpus Linguistics ; 3(1): 100037, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37521321

ABSTRACT

Understanding the reception of public health messages in public-facing communications is of key importance to health agencies in managing crises, pandemics, and other health threats. Established public health communications strategies including self-efficacy messaging, fear appeals, and moralising messaging were all used during the Coronavirus pandemic. We explore the reception of public health messages to understand the efficacy of these established messaging strategies in the COVID-19 context. Taking a community-focussed approach, we combine a corpus linguistic analysis with methods of wider engagement, namely, a public survey and interactions with a Public Involvement Panel to analyse this type of real-world public health discourse. Our findings indicate that effective health messaging content provides manageable instructions, which inspire public confidence that following the guidance is worthwhile. Messaging that appeals to the audience's morals or fears in order to provide a rationale for compliance can be polarising and divisive, producing a strongly negative emotional response from the public and potentially undermining social cohesion. Provenance of the messaging alongside text-external political factors also have an influence on messaging uptake. In addition, our findings highlight key differences in messaging uptake by audience age, which demonstrates the importance of tailored communications and the need to seek public feedback to test the efficacy of messaging with the relevant demographics. Our study illustrates the value of corpus linguistics to public health agencies and health communications professionals, and we share our recommendations for improving the public health messaging both in the context of the ongoing pandemic and for future novel and re-emerging infectious disease outbreaks.

8.
Cogn Sci ; 47(6): e13302, 2023 06.
Article in English | MEDLINE | ID: mdl-37303285

ABSTRACT

Piantadosi, Tily, and Gibson analyzed a large-scale web-scraping corpus (the Google 1T dataset) and reported that word length is independently predicted from average information content (surprisal) calculated by a 2- to 4-gram model (hereafter, longer-span surprisal) across 11 Indo-European languages, namely, Czech, Dutch, English, French, German, Italian, Polish, Spanish, Portuguese, Romanian, and Swedish. However, a recent article by Meylan and Griffiths suggested the importance of preprocessing for studies with large-scale corpora and reanalyzed the same databases. After their preprocessing, the results in Piantadosi et al. were not replicated in Czech, Romanian, and Swedish. Additionally, a German-specific study by Koplenig, Kupietz, and Wolfer showed that the strict analysis did not replicate the result in Piantadosi et al. for that language with the preprocessing suggested by Meylan and Griffiths in a large-scale but less noisy database. These three studies provide evidence from 11 Indo-European languages and one Afro-Asiatic language, Hebrew, as relevant in this debate. However, we do not have evidence from other linguistic groups. This study provides evidence about Japanese based on a strict preprocessing of Google's web-scraping database. The results show that Japanese word length can be predicted independently by 2- to 4-gram surprisal.
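
To make the notion of "average information content (surprisal)" concrete, here is a minimal sketch that estimates it for each word from a token list. It assumes a bigram model with unsmoothed maximum-likelihood probabilities, which is a simplification: the studies discussed above use 2- to 4-gram models over very large corpora, and their preprocessing and estimation details are not given in the abstract. The function name and the toy token sequence are hypothetical.

```python
import math
from collections import Counter, defaultdict


def average_surprisal(tokens):
    """Average -log2 P(word | previous word) over every occurrence of each word.

    Probabilities are unsmoothed bigram relative frequencies
    (an assumption made for this sketch).
    """
    context_counts = Counter(tokens[:-1])            # how often each word serves as context
    bigram_counts = Counter(zip(tokens, tokens[1:]))  # (previous word, word) pairs

    total_bits = defaultdict(float)
    occurrences = Counter()
    for (previous, word), count in bigram_counts.items():
        probability = count / context_counts[previous]
        total_bits[word] += count * -math.log2(probability)
        occurrences[word] += count

    return {word: total_bits[word] / occurrences[word] for word in total_bits}


# Toy sequence: "b" follows "a" more often than "c" does, so "b" is less surprising.
toy_tokens = "a b a c a b".split()
for word, bits in sorted(average_surprisal(toy_tokens).items()):
    print(word, round(bits, 3))
```

The per-word averages could then be compared with word length (for example, by correlation), which is the relationship these studies test.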


Subject(s)
Language , Linguistics , Japan
9.
J Child Lang ; 50(2): 311-337, 2023 03.
Article in English | MEDLINE | ID: mdl-35236517

ABSTRACT

We investigate Korean-speaking children's knowledge about clause-level constructions involving a transitive event - active transitive and suffixal passive - through corpus analysis and Bayesian modelling. The analysis of Korean caregiver input and children's production in CHILDES revealed that the rates of constructional patterns produced by the children mirrored those uttered by the caregivers to a considerable degree and that the caregivers' use of case-marking was skewed towards single form-function pairings (despite the multiple form-function associations that the markers manifest). Based on these characteristics, we modelled a Bayesian learner by employing construction-based input (without considering lexical information). This simulation revealed the dominance of several constructional patterns, occupying most of the input, and their inhibitory effects on the development of the other patterns. Our findings illuminate how children shape clause-level constructional knowledge in Korean, an understudied language for this topic, as a function of input properties and domain-general learning capacities, appealing to the usage-based constructionist approach.


Subject(s)
Language Development , Language , Humans , Child , Bayes Theorem , Child Language , Republic of Korea
10.
Cogn Sci ; 46(8): e13181, 2022 08.
Article in English | MEDLINE | ID: mdl-35986665

ABSTRACT

We analyzed a Japanese lexical database to investigate the structure of the lexical environment based on the hypothesis that the lexical environment is optimized for the functioning of verbal working memory. Our prediction was that, as a consequence of the cultural transmission of language, low-imageable meanings tend to be represented by frequent phonological patterns in the current vocabulary rather than infrequent phonological patterns. This prediction was based on two findings of previous laboratory studies on verbal working memory. (1) The quality of phonological (phonemic and accent) representations in verbal working memory depends on phonological regularity knowledge; therefore, short-term phonological representations are less robust for words with infrequent phonological patterns. (2) Phonological representations are underpinned by contributions from semantic knowledge; therefore, phonological representations of highly imageable words are more robust than those for low-imageable words. Our database analyses show that nouns with less imageable meanings tend to be associated with more frequent phonological patterns in Japanese vocabulary. This lexical structure can maintain the quality of phonological representations in verbal working memory through contributions of semantic and phonological regularity knowledge. Larger semantic contributions compensate for the less robust phonological representations of infrequent phonological forms. The quality of phonological representations is preserved by phonological regularity knowledge when larger semantic contributions are not expected.


Subject(s)
Memory, Short-Term , Phonetics , Humans , Language , Semantics , Verbal Learning , Vocabulary
11.
Front Psychol ; 13: 752134, 2022.
Article in English | MEDLINE | ID: mdl-35237205

ABSTRACT

The investigation of learners' interlanguage could greatly contribute to the teaching of English as a foreign language and the development of teaching materials. The present study investigates the collocational profiles of large-scale written production by English learners with varied L1 backgrounds and different proficiency levels. Using the British National Corpus as reference corpus, learners' collocation use was extracted by corpus query language and further identified by t-score via Python programming language. The collocation list consists of 2,501 make/take + noun (the direct object) collocations. Findings show that proficient learners tend to use collocations containing more semantically complicated and abstract noun elements for varied communication tasks. Moreover, advanced learners are inclined to use collocations comprised of more difficult and longer noun elements.
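
The abstract mentions identifying collocations by t-score in Python but gives no formula or threshold. The sketch below uses the standard corpus-linguistic t-score, (observed - expected) / sqrt(observed), with the expected frequency computed under independence. The function name, parameter names, and example counts are hypothetical, and the original study may have used a different variant (for instance, one that adjusts for the search window size).

```python
import math


def t_score(pair_freq, node_freq, collocate_freq, corpus_size):
    """Standard collocation t-score: (observed - expected) / sqrt(observed).

    expected = node_freq * collocate_freq / corpus_size, i.e. the pair frequency
    we would see if the two words co-occurred only by chance.
    """
    if pair_freq <= 0:
        raise ValueError("pair_freq must be positive")
    expected = node_freq * collocate_freq / corpus_size
    return (pair_freq - expected) / math.sqrt(pair_freq)


# Hypothetical counts for a verb + noun pair in a 1,000,000-token corpus.
score = t_score(pair_freq=100, node_freq=10_000, collocate_freq=5_000,
                corpus_size=1_000_000)
print(score)
```

A cut-off of about 2 is a common convention for treating a pair as a collocation, though the abstract does not say which threshold was applied.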

12.
J Child Lang ; : 1-26, 2022 Mar 07.
Article in English | MEDLINE | ID: mdl-35249569

ABSTRACT

As written language contains more complex syntax than spoken language, exposure to written language provides opportunities for children to experience language input different from everyday speech. We investigated the distribution and nature of relative clauses in three large developmental corpora: one of child-directed speech (targeted at pre-schoolers) and two of text written for children - namely, picture books targeted at pre-schoolers for shared reading and children's own reading books. Relative clauses were more common in both types of book language. Within text, relative clause usage increased with intended age, and was more frequent in nonfiction than fiction. The types of relative clause structures in text co-occurred with specific lexical properties, such as noun animacy and pronoun use. Book language provides unique access to grammar not easily encountered in speech. This has implications for the distributional lexical-syntactic features and associated discourse functions that children experience and, from this, consequences for language development.

13.
Cognition ; 223: 105037, 2022 06.
Article in English | MEDLINE | ID: mdl-35123218

ABSTRACT

Corpus analyses have shown that turn-taking in conversation is much faster than laboratory studies of speech planning would predict. To explain fast turn-taking, Levinson and Torreira (2015) proposed that speakers are highly proactive: They begin to plan a response to their interlocutor's turn as soon as they have understood its gist, and launch this planned response when the turn-end is imminent. Thus, fast turn-taking is possible because speakers use the time while their partner is talking to plan their own utterance. In the present study, we asked how much time upcoming speakers actually have to plan their utterances. Following earlier psycholinguistic work, we used transcripts of spoken conversations in Dutch, German, and English. These transcripts consisted of segments, which are continuous stretches of speech by one speaker. In the psycholinguistic and phonetic literature, such segments have often been used as proxies for turns. We found that in all three corpora, large proportions of the segments consisted of only one or two words, which on our estimate does not give the next speaker enough time to fully plan a response. Further analyses showed that speakers indeed often did not respond to the immediately preceding segment of their partner, but continued an earlier segment of their own. More generally, our findings suggest that speech segments derived from transcribed corpora do not necessarily correspond to turns, and the gaps between speech segments therefore only provide limited information about the planning and timing of turns.


Subject(s)
Communication , Speech , Humans , Language , Phonetics , Psycholinguistics , Speech/physiology
14.
Front Psychol ; 13: 1052586, 2022.
Article in English | MEDLINE | ID: mdl-36710766

ABSTRACT

High citations most often characterize quality research that reflects the foci of the discipline. This study aims to spotlight the most recent hot topics and the trends looming from the highly cited papers (HCPs) in the Web of Science categories of linguistics and language & linguistics with bibliometric analysis. The bibliometric information of the 143 HCPs based on Essential Citation Indicators was retrieved and used to identify and analyze influential contributors at the levels of journals, authors, and countries. The most frequently explored topics were identified by corpus analysis and manual checking. The retrieved topics can be grouped into five general categories: multilingual-related, language teaching and learning-related, psycho/pathological/cognitive linguistics-related, methods and tools-related, and others. Topics such as bi/multilingual(ism), translanguaging, language/writing development, models, emotions, foreign language enjoyment (FLE), cognition, and anxiety are among the most frequently explored. Multilingual and positive trends are discerned from the investigated HCPs. The findings inform linguistic researchers of the publication characteristics of the HCPs in the linguistics field and help them pinpoint the research trends and directions to exert their efforts in future studies.

15.
Behav Res Methods ; 54(4): 1989-2000, 2022 08.
Article in English | MEDLINE | ID: mdl-34816386

ABSTRACT

This report introduces the Beijing Sentence Corpus (BSC). This is a Chinese sentence corpus of eye-tracking data with relatively clear word boundaries. In addition, we report predictability norms for each word in the corpus. Eye movement corpora are available in alphabetic scripts such as English, German, and French. However, there is no publicly available corpus for Chinese. Thus, to study predictive processes during reading in Chinese, it is necessary to establish such a corpus. Also, given the clear word boundaries in the sentences, BSC is especially useful to provide evidence relevant to the theoretical debate of saccade target selection in Chinese. With the large-scale predictability norms, we conducted new analyses based on 60 BSC readers, testing the influences of launch word and target word properties while controlling for visual and oculomotor constraints, as well as sentence and subject-level individual differences. We discuss implications for guidance of eye movements in Chinese reading.


Subject(s)
Eye Movements , Reading , Beijing , Humans , Language , Saccades
16.
Top Cogn Sci ; 14(2): 388-399, 2022 04.
Article in English | MEDLINE | ID: mdl-34914179

ABSTRACT

Over their first years of life, children learn not just the words of their native languages, but how to use them to communicate. Because manual annotation of communicative intent does not scale to large corpora, our understanding of communicative act development is limited to case studies of a few children at a few time points. We present an approach to automatic identification of communicative acts using a hidden topic Markov model, applying it to the conversations of English-learning children in the CHILDES database. We first describe qualitative changes in parent-child communication over development, and then use our method to demonstrate two large-scale features of communicative development: (a) children develop a parent-like repertoire of our model's communicative acts rapidly, their learning rate peaking around 14 months of age, and (b) this period of steep repertoire change coincides with the highest predictability between parents' acts and children's, suggesting that structured interactions play a role in learning to communicate.


Subject(s)
Communication , Learning , Humans , Language Development , Parent-Child Relations , Parents
17.
Iperception ; 12(4): 20416695211024680, 2021.
Article in English | MEDLINE | ID: mdl-34377428

ABSTRACT

Chills experienced in response to music listening have been linked to both happiness and sadness expressed by music. To investigate these conflicting effects of valence on chills, we conducted a computational analysis on a corpus of 988 tracks previously reported to elicit chills, by comparing them with a control set of tracks matched by artist, duration, and popularity. We analysed track-level audio features obtained with the Spotify Web API across the two sets of tracks, resulting in confirmatory findings that tracks which cause chills were sadder than matched tracks and exploratory findings that they were also slower, less intense, and more instrumental than matched tracks on average. We also found that the audio characteristics of chills tracks were related to the direction and magnitude of the difference in valence between the two sets of tracks. We discuss these results in light of the current literature on valence and chills in music, provide a new interpretation in terms of personality correlates of musical preference, and review the advantages and limitations of our computational approach.

18.
Dev Sci ; 24(6): e13125, 2021 11.
Article in English | MEDLINE | ID: mdl-34060184

ABSTRACT

Psycholinguistic research over the past decade has suggested that children's linguistic knowledge includes dedicated representations for frequently-encountered multiword sequences. Important evidence for this comes from studies of children's production: it has been repeatedly demonstrated that children's rate of speech errors is greater for word sequences that are infrequent and thus unfamiliar to them than for those that are frequent. In this study, we investigate whether children's knowledge of multiword sequences can explain a phenomenon that has long represented a key theoretical fault line in the study of language development: errors of subject-auxiliary non-inversion in question production (e.g., "why we can't go outside?*"). In doing so we consider a type of error that has been ignored in discussion of multiword sequences to date. Previous work has focused on errors of omission - an absence of accurate productions for infrequent phrases. However, if children make use of dedicated representations for frequent sequences of words in their productions, we might also expect to see errors of commission - the appearance of frequent phrases in children's speech even when such phrases are not appropriate. Through a series of corpus analyses, we provide the first evidence that the global input frequency of multiword sequences (e.g., "she is going" as it appears in declarative utterances) is a valuable predictor of their errorful appearance (e.g., the uninverted question "what she is going to do?*") in naturalistic speech. This finding, we argue, constitutes powerful evidence that multiword sequences can be represented as linguistic units in their own right.


Subject(s)
Linguistics , Speech , Child , Female , Humans , Language , Language Development , Psycholinguistics
19.
Front Psychol ; 12: 790710, 2021.
Article in English | MEDLINE | ID: mdl-35140659

ABSTRACT

The correct use of connectives has great influence on language learners' writing proficiency, while errors of connectives are common in foreign learners' interlanguages. This study examines the types of errors that occur in native English-speaking learners' Chinese writing, the possible causes for the errors, and the learners' consequent learning strategies. The present research adopted corpora investigation, questionnaire survey, and focus-group interviews to examine the error types, causes of identified errors, and related learning strategies. Data analysis indicated that: (1) the main error types made by native English-speaking learners from high to low are misuse, overuse, mismatch, misplacement, and underuse of connectives; (2) causes related to intralingual transfer greatly contribute to the presence of errors; and (3) memory, social, and cognitive strategies were the most preferred, followed by metacognitive and compensation strategies, and then by affective strategies which were the least preferred. These findings showed that different strategies can be employed to cope with different errors in writing. The study further suggests that teachers and educators need to help native English-speaking learners find strategies that work best for them in terms of learning Chinese connectives.

20.
Front Psychol ; 12: 779958, 2021.
Article in English | MEDLINE | ID: mdl-35283804

ABSTRACT

While some aspects of mouthings have been previously investigated, many topics in the use of this cross-modal contact phenomenon in sign languages remain un(der)studied, and not much is known about mouthings in Russian Sign Language (RSL), in particular. This article examines various aspects of mouthings as these are used by native RSL signers and aims to contribute new insights into the use and origin of mouthings in this sign language. Based on novel data from the online RSL Corpus alongside additional elicited data, we describe the distribution, forms, functions and spreading patterns of mouthings. Our findings furthermore show that sign languages exhibit more extensive variation in the use of mouthings than has previously been thought. Moreover, we - thus far uniquely - describe mouthings also as a written-language-based contact phenomenon. This study has the potential to provide a better understanding of the nature of such contact-induced features as mouthings in sign languages in general and reveals a complex interplay of the modalities of signed, spoken and written languages.
