Results 1 - 14 of 14
1.
PLoS One ; 18(4): e0284801, 2023.
Article in English | MEDLINE | ID: mdl-37093824

ABSTRACT

This study presents a Polish semantic priming dataset and semantic similarity ratings for word pairs obtained from native Polish speakers, as well as a range of semantic spaces. The word pairs include strongly related, weakly related, and semantically unrelated word pairs. The rating study (Experiment 1) confirmed that the three conditions differed in semantic relatedness. The semantic priming lexical decision study with a carefully matched subset of the stimuli (Experiment 2) revealed strong semantic priming effects for strongly related word pairs, whereas weakly related word pairs showed a smaller but still significant priming effect relative to semantically unrelated word pairs. The datasets of both experiments and those of SimLex-999 for Polish were then used in a robust semantic model selection from existing and newly trained semantic spaces. This database of semantic vectors, semantic relatedness ratings, and behavioral data collected for all word pairs enables future researchers to benchmark new vectors against this dataset. Furthermore, the new vectors are made freely available for researchers. Although similar semantically strongly and weakly related word pairs are available in other languages, this is the first freely available database for Polish that combines measures of semantic distance and human data.


Subject(s)
Language , Semantics , Humans , Reaction Time , Poland
2.
J Exp Psychol Gen ; 150(4): 792-812, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33914584

ABSTRACT

Emotions play a fundamental role in language learning, use, and processing. Words denoting positivity account for a larger part of the lexicon than words denoting negativity, and they also tend to be used more frequently, a phenomenon known as positivity bias. However, language experience changes over an individual's lifetime, making the examination of the emotion-laden lexicon an important topic not only across the life span but also across languages. Furthermore, existing theories predict a range of different age-related trajectories in processing valenced words. The present study pits all of these predictions against written productions (Facebook status updates from over 20,000 users) and behavioral data from three publicly available megastudies on different languages, namely English, Dutch, and Spanish, across adulthood. The production data demonstrated an increase in positive word types and tokens with advancing age. In terms of comprehension, the results showed a uniform and consistent effect of valence across languages and cohorts based on data from a visual word recognition task. The difference in reaction times to very positive and very negative words declined with age, with responses to positive words slowing down more strongly with age than responses to negative words. We argue that the results stem from lifelong learning and emotion regulation: Advancing age is accompanied by an increased type frequency of positive words in language production, which is mirrored as a discrimination penalty in comprehension. To our knowledge, this is the first study to simultaneously target both language production and comprehension across adulthood and in a cross-linguistic perspective. (PsycInfo Database Record (c) 2021 APA, all rights reserved).


Subject(s)
Affect , Aging , Attitude , Comprehension , Emotions , Language , Learning , Adult , Aged , Aged, 80 and over , Ethnicity , Female , Humans , Language Tests , Male , Middle Aged , Reaction Time , Social Media
3.
Behav Res Methods ; 52(5): 1867-1882, 2020 10.
Article in English | MEDLINE | ID: mdl-32072567

ABSTRACT

Vocabulary size seems to be affected by multiple factors, including those that belong to the properties of the words themselves and those that relate to the characteristics of the individuals assessing the words. In this study, we present results from a crowdsourced lexical decision megastudy in which more than 150,000 native speakers from around 20 Spanish-speaking countries performed a lexical decision task on 70 target word items selected from a list of about 45,000 Spanish words. We examined how demographic characteristics such as age, education level, and multilingualism affected participants' vocabulary size. We also explored how common word-related factors such as frequency, length, and orthographic neighbourhood influenced the knowledge of a particular item. Results indicated important contributions of age to overall vocabulary size, with vocabulary size increasing in a logarithmic fashion with this factor. Furthermore, a contrast between monolingual and bilingual communities within Spain revealed no significant vocabulary size differences between the communities. Additionally, we replicated the standard effects of the words' properties and their interactions, accurately accounting for the estimated knowledge of a particular word. These results highlight the value of crowdsourced approaches to uncover effects that are traditionally masked by small-sampled in-lab factorial experimental designs.


Subject(s)
Crowdsourcing , Multilingualism , Reading , Humans , Reaction Time , Spain , Vocabulary
4.
Neuropsychologia ; 141: 107390, 2020 04.
Article in English | MEDLINE | ID: mdl-32057934

ABSTRACT

Accumulating evidence suggests that prior usage of a second language (L2) leads to processing costs on the subsequent production of a native language (L1). However, it is unclear what mechanism underlies this effect. It has been proposed that the L1 cost reflects inhibition of L1 representation acting during L1 production; however, previous studies exploring this issue were inconclusive. It is also unsettled whether the mechanism operates on the whole-language level or is restricted to translation equivalents in the two languages. We report a study that allowed us to address both issues behaviorally and with the use of ERPs while focusing on the consequences of using L2 on the production of L1. In our experiment, native speakers of Polish (L1) and learners of English (L2) named a set of pictures in L1 following a set of pictures in either L1 or L2. Half of the pictures were repeated from the preceding block and half were new; this enabled dissociation of the effects on the level of the whole language from those specific to individual lexical items. Our results are consistent with the notion that language after-effects operate at a whole-language level. Behaviorally, we observed a clear processing cost on the whole-language level and a small facilitation on the item-specific level. The whole-language effect was accompanied by an enhanced, fronto-centrally distributed negativity in the 250-350 ms time-window which we identified as the N300 (in contrast to previous research, which probably misidentified the effect as the N2), a component that presumably reflects retrieval difficulty of relevant language representations during picture naming. As such, unlike previous studies that reported N2 for naming pictures in L1 after L2 use, we propose that the reported ERPs (N300) indicate that prior usage of L2 hampers lexical access to names in L1. Based on the literature, the after-effects could be caused by L1 inhibition and/or L2 interference, but the ERPs so far have not been informative about the causal mechanism.


Subject(s)
Multilingualism , Names , Evoked Potentials , Humans , Language , Reaction Time
5.
Behav Res Methods ; 52(2): 741-760, 2020 04.
Article in English | MEDLINE | ID: mdl-31368025

ABSTRACT

We present a new dataset of English word recognition times for a total of 62 thousand words, called the English Crowdsourcing Project. The data were collected via an internet vocabulary test in which more than one million people participated. The present dataset is limited to native English speakers. Participants were asked to indicate which words they knew. Their response times were registered, although at no point were the participants asked to respond as quickly as possible. Still, the response times correlate around .75 with the response times of the English Lexicon Project for the shared words. Also, the results of virtual experiments indicate that the new response times are a valid addition to the English Lexicon Project. This not only means that we have useful response times for some 35 thousand extra words, but we now also have data on differences in response latencies as a function of education and age.


Subject(s)
Crowdsourcing , Decision Making , Humans , Reaction Time , Recognition, Psychology , Vocabulary
6.
Psychol Belg ; 59(1): 281-300, 2019 Jul 17.
Article in English | MEDLINE | ID: mdl-31367458

ABSTRACT

We present a new database of Dutch word recognition times for a total of 54 thousand words, called the Dutch Crowdsourcing Project. The data were collected with an internet vocabulary test. The database is limited to native Dutch speakers. Participants were asked to indicate which words they knew. Their response times were registered, even though the participants were not asked to respond as fast as possible. Still, the response times correlate around .7 with the response times of the Dutch Lexicon Projects for shared words. Also, results of virtual experiments indicate that the new response times are a valid addition to the Dutch Lexicon Projects. This not only means that we have useful response times for some 20 thousand extra words, but we now also have data on differences in response latencies as a function of education and age. The new data correspond better to word use in the Netherlands.

7.
Behav Res Methods ; 51(2): 467-479, 2019 04.
Article in English | MEDLINE | ID: mdl-29967979

ABSTRACT

We present word prevalence data for 61,858 English words. Word prevalence refers to the number of people who know the word. The measure was obtained on the basis of an online crowdsourcing study involving over 220,000 people. Word prevalence data are useful for gauging the difficulty of words and, as such, for matching stimulus materials in experimental conditions or selecting stimulus materials for vocabulary tests. Word prevalence also predicts word processing times, over and above the effects of word frequency, word length, similarity to other words, and age of acquisition, in line with previous findings in the Dutch language.


Subject(s)
Knowledge , Vocabulary , Adult , Crowdsourcing , Female , Humans , Language Tests
9.
Front Psychol ; 7: 1116, 2016.
Article in English | MEDLINE | ID: mdl-27524974

ABSTRACT

Based on an analysis of the literature and a large scale crowdsourcing experiment, we estimate that an average 20-year-old native speaker of American English knows 42,000 lemmas and 4,200 non-transparent multiword expressions, derived from 11,100 word families. The numbers range from 27,000 lemmas for the lowest 5% to 52,000 for the highest 5%. Between the ages of 20 and 60, the average person learns 6,000 extra lemmas or about one new lemma every 2 days. The knowledge of the words can be as shallow as knowing that the word exists. In addition, people learn tens of thousands of inflected forms and proper nouns (names), which account for the substantially high numbers of 'words known' mentioned in other publications.
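The learning-rate figure in this abstract can be checked with simple arithmetic. The following is a minimal sketch, assuming a year of 365.25 days (an assumption; the abstract gives only the ages and the lemma count). The function name is hypothetical.

```python
# Check the "about one new lemma every 2 days" estimate from the abstract.
# Assumption: a year is 365.25 days; only the ages and lemma count come from the abstract.
DAYS_PER_YEAR = 365.25


def days_per_new_lemma(start_age: int, end_age: int, lemmas_learned: int) -> float:
    """Average number of days between newly learned lemmas."""
    days = (end_age - start_age) * DAYS_PER_YEAR
    return days / lemmas_learned


# 6,000 extra lemmas learned between the ages of 20 and 60.
rate = days_per_new_lemma(20, 60, 6000)
print(f"{rate:.1f} days per lemma")
```

Under this assumption the result is a little over 2.4 days per lemma, in line with the rounded figure of one new lemma every 2 days.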

10.
J Exp Psychol Hum Percept Perform ; 42(3): 441-58, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26501839

ABSTRACT

Keuleers, Stevens, Mandera, and Brysbaert (2015) presented a new variable, word prevalence, defined as word knowledge in the population. Some words are known to more people than others. This is particularly true for low-frequency words (e.g., screenshot vs. scourage). In the present study, we examined the impact of the measure by collecting lexical decision times for 30,000 Dutch word lemmas of various lengths (the Dutch Lexicon Project 2). Word prevalence had the second highest correlation with lexical decision times (after word frequency): Words known by everyone in the population were responded to 100 ms faster than words known to only half of the population, even after controlling for word frequency, word length, age of acquisition, similarity to other words, and concreteness. Because word prevalence has rather low correlations with the existing measures (including word frequency), the unique variance it contributes to lexical decision times is higher than that of the other variables. We consider the reasons why word prevalence has an impact on word processing times and we argue that it is likely to be the most important new variable protecting researchers against experimenter bias in selecting stimulus materials.


Subject(s)
Psycholinguistics , Recognition, Psychology , Semantics , Vocabulary , Adolescent , Adult , Aged , Decision Making , Female , Humans , Male , Middle Aged , Netherlands , Prevalence , Reaction Time , Reading , Young Adult
11.
Q J Exp Psychol (Hove) ; 68(8): 1623-42, 2015.
Article in English | MEDLINE | ID: mdl-25695623

ABSTRACT

Subjective ratings for age of acquisition, concreteness, affective valence, and many other variables are an important element of psycholinguistic research. However, even for well-studied languages, ratings usually cover just a small part of the vocabulary. A possible solution involves using corpora to build a semantic similarity space and applying machine learning techniques to extrapolate existing ratings to previously unrated words. We conduct a systematic comparison of two extrapolation techniques, k-nearest neighbours and random forest, in combination with semantic spaces built using latent semantic analysis, a topic model, a hyperspace analogue to language (HAL)-like model, and a skip-gram model. A variant of the k-nearest neighbours method used with skip-gram word vectors gives the most accurate predictions, but the random forest method has the advantage of being able to easily incorporate additional predictors. We evaluate the usefulness of the methods by exploring how much of the human performance in a lexical decision task can be explained by extrapolated ratings for age of acquisition and how precisely we can assign words to discrete categories based on extrapolated ratings. We find that at least some of the extrapolation methods may introduce artefacts to the data and produce results that could lead to different conclusions than would be reached based on the human ratings. From a practical point of view, the usefulness of ratings extrapolated with the described methods may be limited.


Subject(s)
Datasets as Topic , Language , Psycholinguistics , Verbal Learning/physiology , Vocabulary , Female , Humans , Machine Learning , Male , Predictive Value of Tests , Recognition, Psychology , Regression Analysis , Reproducibility of Results , Semantics
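The k-nearest neighbours extrapolation mentioned in this abstract can be illustrated with a minimal sketch. It is a plain, unweighted k-NN over cosine similarity, not the specific variant the authors evaluated; the toy two-dimensional vectors, the ratings, and the function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_extrapolate(target_vec, rated, k=2):
    """Estimate a rating for an unrated word as the mean rating of its
    k most similar rated words in the semantic space.

    rated maps word -> (vector, human rating).
    """
    neighbours = sorted(
        ((cosine(target_vec, vec), rating) for vec, rating in rated.values()),
        reverse=True,  # most similar first
    )[:k]
    return sum(rating for _, rating in neighbours) / len(neighbours)


# Hypothetical toy data: two-dimensional "semantic" vectors with ratings.
rated_words = {
    "dog": ([1.0, 0.0], 3.0),
    "cat": ([0.9, 0.1], 4.0),
    "theory": ([0.0, 1.0], 9.0),
}
estimate = knn_extrapolate([1.0, 0.05], rated_words, k=2)
print(estimate)
```

Here the unrated vector lies closest to "dog" and "cat", so the estimate is the mean of their two ratings. In practice the vectors would come from a trained semantic space rather than being hand-written.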
12.
Q J Exp Psychol (Hove) ; 68(8): 1665-92, 2015.
Article in English | MEDLINE | ID: mdl-25715025

ABSTRACT

We use the results of a large online experiment on word knowledge in Dutch to investigate variables influencing vocabulary size in a large population and to examine the effect of word prevalence (the percentage of a population knowing a word) as a measure of word occurrence. Nearly 300,000 participants were presented with about 70 word stimuli (selected from a list of 53,000 words) in an adapted lexical decision task. We identify age, education, and multilingualism as the most important factors influencing vocabulary size. The results suggest that the accumulation of vocabulary throughout life and in multiple languages mirrors the logarithmic growth of number of types with number of tokens observed in text corpora (Herdan's law). Moreover, the vocabulary that multilinguals acquire in related languages seems to increase their first language (L1) vocabulary size and outweighs the loss caused by decreased exposure to L1. In addition, we show that corpus word frequency and prevalence are complementary measures of word occurrence covering a broad range of language experiences. Prevalence is shown to be the strongest independent predictor of word processing times in the Dutch Lexicon Project, making it an important variable for psycholinguistic research.


Subject(s)
Crowdsourcing , Knowledge , Online Systems , Pattern Recognition, Visual/physiology , Vocabulary , Adult , Aging , Educational Status , Female , Functional Laterality , Humans , Male , Middle Aged , Multilingualism , Prevalence , Psycholinguistics , Reaction Time/physiology , Sex Factors , Young Adult
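Herdan's law, referred to in this abstract, is commonly written as a power law relating the number of distinct word types V to the number of tokens N, V = K * N**beta with 0 < beta < 1, so that log V grows linearly with log N. The sketch below only illustrates that shape; the values of K and beta are hypothetical and are not estimates from the study.

```python
def herdan_types(tokens: float, k: float, beta: float) -> float:
    """Expected number of word types after `tokens` tokens under
    Herdan's law, V = K * N**beta."""
    return k * tokens ** beta


# Hypothetical parameters, chosen only to make the arithmetic easy to follow.
K, BETA = 10.0, 0.5

small = herdan_types(10_000, K, BETA)     # vocabulary after 10,000 tokens
large = herdan_types(1_000_000, K, BETA)  # vocabulary after 1,000,000 tokens
print(small, large, large / small)
```

With beta = 0.5, a 100-fold increase in tokens yields only a 10-fold increase in types, which is the kind of decelerating growth the abstract compares to vocabulary accumulation over a lifetime.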
13.
Behav Res Methods ; 47(2): 471-83, 2015 Jun.
Article in English | MEDLINE | ID: mdl-24942246

ABSTRACT

We present SUBTLEX-PL, Polish word frequencies based on movie subtitles. In two lexical decision experiments, we compare the new measures with frequency estimates derived from another Polish text corpus that includes predominantly written materials. We show that the frequencies derived from the two corpora perform best in predicting human performance in a lexical decision task if used in a complementary way. Our results suggest that the two corpora may have unequal potential for explaining human performance for words in different frequency ranges and that corpora based on written materials severely overestimate frequencies for formal words. We discuss some of the implications of these findings for future studies comparing different frequency estimates. In addition to frequencies for word forms, SUBTLEX-PL includes measures of contextual diversity, part-of-speech-specific word frequencies, frequencies of associated lemmas, and word bigrams, providing researchers with necessary tools for conducting psycholinguistic research in Polish. The database is freely available for research purposes and may be downloaded from the authors' university Web site at http://crr.ugent.be/subtlex-pl .


Subject(s)
Verbal Behavior , Vocabulary , Writing , Behavioral Research/methods , Databases, Factual , Humans , Poland , Psycholinguistics/methods , Speech
14.
Q J Exp Psychol (Hove) ; 67(6): 1176-90, 2014.
Article in English | MEDLINE | ID: mdl-24417251

ABSTRACT

We present word frequencies based on subtitles of British television programmes. We show that the SUBTLEX-UK word frequencies explain more of the variance in the lexical decision times of the British Lexicon Project than the word frequencies based on the British National Corpus and the SUBTLEX-US frequencies. In addition to the word form frequencies, we also present measures of contextual diversity, part-of-speech specific word frequencies, word frequencies in children's programmes, and word bigram frequencies, giving researchers of British English access to the full range of norms recently made available for other languages. Finally, we introduce a new measure of word frequency, the Zipf scale, which we hope will stop the current misunderstandings of the word frequency effect.


Subject(s)
Databases, Factual/statistics & numerical data , Decision Making , Language , Recognition, Psychology/physiology , Vocabulary , Humans , Reaction Time/physiology , Statistics as Topic , United Kingdom
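The Zipf scale introduced in this abstract is commonly described as the base-10 logarithm of a word's frequency per billion words, which equals log10(frequency per million words) + 3. The sketch below assumes that definition and omits any smoothing for words with zero corpus counts; the function name is hypothetical.

```python
import math


def zipf_value(freq_per_million: float) -> float:
    """Convert a frequency per million words to the Zipf scale,
    assuming Zipf = log10(frequency per million words) + 3."""
    if freq_per_million <= 0:
        raise ValueError("frequency must be positive; smoothing is not handled here")
    return math.log10(freq_per_million) + 3.0


# A word occurring once per million words, and one occurring 100 times per million.
low = zipf_value(1.0)
high = zipf_value(100.0)
print(low, high)
```

On this scale a word seen once per million words scores 3 and a word seen 100 times per million scores 5, so typical values fall in a small positive range rather than spanning several orders of magnitude.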