1.
Behav Res Methods ; 56(4): 3794-3813, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38724878

ABSTRACT

The use of taboo words represents one of the most common and arguably universal linguistic behaviors, fulfilling a wide range of psychological and social functions. However, in the scientific literature, taboo language is poorly characterized, and how it is realized in different languages and populations remains largely unexplored. Here we provide a database of taboo words, collected from different linguistic communities (Study 1, N = 1046), along with their speaker-centered semantic characterization (Study 2, N = 455 for each of six rating dimensions), covering 13 languages and 17 countries from all five permanently inhabited continents. Our results show that, in all languages, taboo words are mainly characterized by extremely low valence and high arousal, and very low written frequency. However, a significant amount of cross-country variability in words' tabooness and offensiveness proves the importance of community-specific sociocultural knowledge in the study of taboo language.


Subjects
Language, Taboo, Humans, Semantics, Cross-Cultural Comparison
2.
Stud Health Technol Inform ; 302: 743-744, 2023 May 18.
Article in English | MEDLINE | ID: mdl-37203482

ABSTRACT

In this communication, we demonstrate that the bias observed in domain-general training sets with health-related content is not improved in domain-specific health-communication corpora, contra.


Subjects
Language, Natural Language Processing, Bias
3.
Stud Health Technol Inform ; 295: 221-225, 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35773848

ABSTRACT

This paper explores a methodology for bias quantification in transformer-based deep neural network language models for Chinese, English, and French. When the models are queried with health-related mythbusters on COVID-19, we observe a bias that is not semantic or encyclopaedic in nature but syntactic, as predicted by theoretical insights into structural complexity. Our results highlight the need to create health-communication corpora as training sets for deep learning.


Subjects
COVID-19, Language, Humans, Linguistics, Computer Neural Networks, Semantics
4.
Stud Health Technol Inform ; 289: 196-199, 2022 Jan 14.
Article in English | MEDLINE | ID: mdl-35062126

ABSTRACT

The ability to assess any type of linguistic complexity in any given content could potentially improve knowledge reproduction, especially of tacit knowledge, which can be expensive during a pandemic. In this paper, we develop a simple, cross-linguistic model of complexity that takes into account formal approaches to the study of linguistic systems but can be easily implemented by groups of non-linguists, e.g., communication experts and policymakers. To test our model, we conduct a study on a corpus extracted from the World Health Organization (WHO)'s emergency learning platform in 6 languages. Data extracted from open-access encyclopaedic entries act as control groups. The results show that the adopted measurements signal a trend towards minimization of complexity and can be exploited as features for (automatic) text classification.


Subjects
Multilingualism, Language, Linguistics, Pandemics, World Health Organization
5.
Stud Health Technol Inform ; 281: 516-517, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042628

ABSTRACT

Reproduction of knowledge, especially tacit knowledge, can be expensive during a pandemic. One of the most common causes is reduced information accessibility during the translation process. The ability to assess the linguistic complexity of any given content could potentially improve knowledge reproduction. The authors conduct two cross-linguistic studies on the World Health Organization (WHO)'s emergency learning platform to assess the linguistic complexity of two online courses in 10 languages. Morpho-syntactically annotated treebanks, unannotated materials from Wikipedia, and language-specific corpora are set as control groups. Preliminary findings reveal a clearly reduced complexity of learning content in most candidate languages while retaining the maximum amount of information. Creating a baseline study on low-resourced languages in the learning genre could be useful for measuring the impact of normative products at the country and local level.


Subjects
Linguistics, Multilingualism, Language, Pandemics, World Health Organization