Results 1 - 20 of 4,360
1.
Article in English | MEDLINE | ID: mdl-38980524

ABSTRACT

OBJECTIVE: Language used by providers in medical documentation may reveal evidence of race-related implicit bias. We aimed to use natural language processing (NLP) to examine whether the prevalence of stigmatizing language in emergency medicine (EM) encounter notes differs across patient race/ethnicity. METHODS: In a retrospective cohort of EM encounters, NLP techniques identified stigmatizing and positive themes. Logistic regression models analyzed the association of race/ethnicity and themes within notes. Outcomes were the presence (or absence) of 7 different themes: 5 stigmatizing (difficult, non-compliant, skepticism, substance abuse/seeking, and financial difficulty) and 2 positive (compliment and compliant). RESULTS: The sample included notes from 26,363 unique patients. Non-Hispanic (NH) Black patient notes were less likely to contain difficult (odds ratio (OR) 0.80, 95% confidence interval (CI), 0.73-0.88), skepticism (OR 0.87, 95% CI, 0.79-0.96), and substance abuse/seeking (OR 0.62, 95% CI, 0.56-0.70) themes compared with NH White patient notes, but more likely to contain non-compliant (OR 1.26, 95% CI, 1.17-1.36) and financial difficulty (OR 1.14, 95% CI, 1.04-1.25) themes. Hispanic patient notes were less likely to contain difficult (OR 0.68, 95% CI, 0.58-0.80) and substance abuse/seeking (OR 0.78, 95% CI, 0.66-0.93) themes. NH NA/AI patient notes had twice the odds of containing a stigmatizing theme compared with NH White patient notes (OR 2.02, 95% CI, 1.64-2.49). CONCLUSIONS: Using an NLP model to analyze themes in EM notes across racial groups, we identified several inequities in the usage of positive and stigmatizing language. Interventions to minimize race-related implicit bias should be undertaken.
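The per-theme association analysis described in record 1 can be illustrated with a logistic regression whose exponentiated coefficients give odds ratios and confidence intervals. A minimal sketch (not the authors' code), using synthetic data and hypothetical column names "difficult" and "race_ethnicity":

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Synthetic stand-in: one row per note, a binary theme flag and a race/ethnicity label
    rng = np.random.default_rng(0)
    groups = rng.choice(["NH White", "NH Black", "Hispanic"], size=500)
    base_rate = {"NH White": 0.20, "NH Black": 0.25, "Hispanic": 0.15}
    notes = pd.DataFrame({
        "race_ethnicity": groups,
        "difficult": rng.binomial(1, [base_rate[g] for g in groups]),
    })

    # Logistic regression of the theme flag on race/ethnicity (NH White as reference)
    model = smf.logit(
        "difficult ~ C(race_ethnicity, Treatment(reference='NH White'))",
        data=notes,
    ).fit(disp=0)

    # Exponentiate coefficients to obtain odds ratios with 95% confidence intervals
    odds_ratios = np.exp(model.params).rename("OR")
    conf_int = np.exp(model.conf_int())
    print(pd.concat([odds_ratios, conf_int], axis=1))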

2.
BMJ Open ; 14(7): e084124, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38969371

ABSTRACT

BACKGROUND: Systematic reviews (SRs) are being published at an accelerated rate. Decision-makers may struggle with comparing and choosing between multiple SRs on the same topic. We aimed to understand how healthcare decision-makers (eg, practitioners, policymakers, researchers) use SRs to inform decision-making and to explore the potential role of a proposed artificial intelligence (AI) tool to assist in critical appraisal and choosing among SRs. METHODS: We developed a survey with 21 open and closed questions. We followed a knowledge translation plan to disseminate the survey through social media and professional networks. RESULTS: Our survey response rate was lower than expected (7.9% of distributed emails). Of the 684 respondents, 58.2% identified as researchers, 37.1% as practitioners, 19.2% as students and 13.5% as policymakers. Respondents frequently sought out SRs (97.1%) as a source of evidence to inform decision-making. They frequently (97.9%) found more than one SR on a given topic of interest to them. Just over half (50.8%) struggled to choose the most trustworthy SR among multiple. These difficulties related to lack of time (55.2%), difficulty comparing SRs of varying methodological quality (54.2%), differences in results and conclusions (49.7%), or variation in the included studies (44.6%). Respondents compared SRs based on relevance to their question of interest, methodological quality, and recency of the SR search. Most respondents (87.0%) were interested in an AI tool to help appraise and compare SRs. CONCLUSIONS: Given the identified barriers to using SR evidence, an AI tool that facilitates comparison of the relevance, search, and methodological quality of SRs could help users efficiently choose among SRs and make healthcare decisions.


Subject(s)
Artificial Intelligence , Decision Making , Systematic Reviews as Topic , Humans , Systematic Reviews as Topic/methods , Surveys and Questionnaires , Decision Support Techniques , Delivery of Health Care
3.
Int Urol Nephrol ; 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982018

ABSTRACT

BACKGROUND: Artificial intelligence (AI) has emerged as a promising avenue for improving patient care and surgical outcomes in urological surgery. However, the extent of AI's impact in predicting and managing complications is not fully elucidated. OBJECTIVES: We review the application of AI to foresee and manage complications in urological surgery, assess its efficacy, and discuss challenges to its use. METHODS AND MATERIALS: A targeted non-systematic literature search was conducted using the PubMed and Google Scholar databases to identify studies on AI in urological surgery and its complications. Evidence from the studies was synthesised. RESULTS: Incorporating AI into various facets of urological surgery has shown promising advancements. From preoperative planning to intraoperative guidance, AI is revolutionising the field, demonstrating remarkable proficiency in tasks such as image analysis, decision-making support, and complication prediction. Studies show that AI programmes are highly accurate, increase surgical precision and efficiency, and reduce complications. However, implementation challenges remain, including AI errors, human errors, and ethical issues. CONCLUSION: AI has great potential in predicting and managing surgical complications of urological surgery. Advancements have been made, but challenges and ethical considerations must be addressed before widespread AI implementation.

4.
BMC Med Inform Decis Mak ; 24(1): 192, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982465

ABSTRACT

BACKGROUND: As global aging intensifies, the prevalence of ocular fundus diseases continues to rise. In China, the strained doctor-to-patient ratio poses numerous challenges for the early diagnosis and treatment of ocular fundus diseases. To reduce the high risk of missed or misdiagnosed cases, avoid irreversible visual impairment for patients, and ensure good visual prognosis for patients with ocular fundus diseases, it is particularly important to enhance the growth and diagnostic capabilities of junior doctors. This study aims to leverage the value of electronic medical record data to develop a diagnostic intelligent decision support platform. This platform aims to assist junior doctors in diagnosing ocular fundus diseases quickly and accurately, expedite their professional growth, and prevent delays in patient treatment. An empirical evaluation will assess the platform's effectiveness in enhancing doctors' diagnostic efficiency and accuracy. METHODS: In this study, eight Chinese Named Entity Recognition (NER) models were compared, and the SoftLexicon-Glove-Word2vec model, achieving a high F1 score of 93.02%, was selected as the optimal recognition tool. This model was then used to extract key information from electronic medical records (EMRs) and generate feature variables based on diagnostic rule templates. Subsequently, an XGBoost algorithm was employed to construct an intelligent decision support platform for diagnosing ocular fundus diseases. The effectiveness of the platform in improving diagnostic efficiency and accuracy was evaluated through a controlled experiment comparing experienced and junior doctors. RESULTS: The use of the diagnostic intelligent decision support platform resulted in significant improvements in both diagnostic efficiency and accuracy for both experienced and junior doctors (P < 0.05). Notably, the gap in diagnostic speed and precision between junior doctors and experienced doctors narrowed considerably when the platform was used. Although the platform also provided some benefits to experienced doctors, the improvement was less pronounced compared to junior doctors. CONCLUSION: The diagnostic intelligent decision support platform established in this study, based on the XGBoost algorithm and NER, effectively enhances the diagnostic efficiency and accuracy of junior doctors in ocular fundus diseases. This has significant implications for optimizing clinical diagnosis and treatment.


Subject(s)
Ophthalmologists , Humans , Clinical Decision-Making , Electronic Health Records/standards , Artificial Intelligence , China , Decision Support Systems, Clinical
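Record 4 above pairs NER-derived feature variables with an XGBoost classifier. A minimal sketch of that classification step (an illustration on synthetic stand-in features, not the authors' pipeline), where X plays the role of rule-template feature variables extracted from EMR text and y the diagnosis labels:

    from sklearn.datasets import make_classification
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # Synthetic stand-in for feature variables built from NER output and diagnostic rule templates
    X, y = make_classification(n_samples=1000, n_features=30, n_informative=12,
                               n_classes=4, n_clusters_per_class=1, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Multiclass XGBoost model acting as the diagnostic decision-support classifier
    clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                        objective="multi:softprob", eval_metric="mlogloss")
    clf.fit(X_train, y_train)
    print(classification_report(y_test, clf.predict(X_test)))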
5.
PeerJ Comput Sci ; 10: e2122, 2024.
Article in English | MEDLINE | ID: mdl-38983192

ABSTRACT

Grammar error correction systems are pivotal in the field of natural language processing (NLP), with a primary focus on identifying and correcting grammatical errors in written text. This is crucial for both language learning and formal communication. Recently, neural machine translation (NMT) has emerged as a promising and in-demand approach. However, this approach faces significant challenges, particularly the scarcity of training data and the complexity of grammar error correction (GEC), especially for low-resource languages such as Indonesian. To address these challenges, we propose InSpelPoS, a confusion method that combines two synthetic data generation methods: the Inverted Spellchecker and Patterns+POS. Furthermore, we introduce an adapted seq2seq framework equipped with a dynamic decoding method and state-of-the-art Transformer-based neural language models to enhance the accuracy and efficiency of GEC. The dynamic decoding method is capable of navigating the complexities of GEC and correcting a wide range of errors, including contextual and grammatical errors. The proposed model leverages the contextual information of words and sentences to generate a corrected output. To assess the effectiveness of our proposed framework, we conducted experiments using synthetic data and compared its performance with existing GEC systems. The results demonstrate a significant improvement in the accuracy of Indonesian GEC compared to existing methods.
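Record 5 describes a Transformer-based seq2seq GEC system. A minimal inference sketch with Hugging Face Transformers, using a hypothetical fine-tuned checkpoint name (the authors' InSpelPoS data generation and dynamic decoding are not reproduced here):

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Placeholder id for a checkpoint fine-tuned on Indonesian GEC; not a real model id
    MODEL_NAME = "your-org/indonesian-gec-seq2seq"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def correct(sentence: str) -> str:
        """Generate a corrected version of a possibly ungrammatical sentence."""
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=128)
        outputs = model.generate(**inputs, num_beams=4, max_length=128)
        return tokenizer.decode(outputs[0], skip_special_tokens=True)

    print(correct("saya pergi ke pasar kemarin membeli buah-buah"))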

6.
PeerJ Comput Sci ; 10: e2063, 2024.
Article in English | MEDLINE | ID: mdl-38983191

ABSTRACT

Lack of an effective early sign language learning framework for a hard-of-hearing population can have traumatic consequences, causing social isolation and unfair treatment in workplaces. Alphabet and digit detection methods have been the basic framework for early sign language learning but are limited in performance and accuracy, making it difficult to detect signs in real life. This article proposes an improved sign language detection method for early sign language learners based on the You Only Look Once version 8.0 (YOLOv8) algorithm, referred to as the intelligent sign language detection system (iSDS), which exploits the power of deep learning to detect sign language-distinct features. The iSDS method could reduce false positive rates and improve both the accuracy and the speed of sign language detection. The proposed iSDS framework for early sign language learners consists of three basic steps: (i) image pixel processing to extract features that are underrepresented in the frame, (ii) inter-dependence pixel-based feature extraction using YOLOv8, (iii) web-based signer independence validation. The proposed iSDS enables faster response times and reduces misinterpretation and inference delay time. The iSDS achieved state-of-the-art performance of over 97% for precision, recall, and F1-score with the best mAP of 87%. The proposed iSDS method has several potential applications, including continuous sign language detection systems and intelligent web-based sign recognition systems.
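Record 6 builds its detector on YOLOv8. A minimal training/inference sketch with the ultralytics package, assuming a hypothetical dataset config sign_language.yaml listing alphabet/digit sign classes and a hypothetical test image:

    from ultralytics import YOLO

    # Start from a pretrained YOLOv8 nano checkpoint and fine-tune on sign images
    model = YOLO("yolov8n.pt")
    model.train(data="sign_language.yaml", epochs=50, imgsz=640)  # hypothetical dataset config

    # Run detection on a new frame and inspect predicted sign classes with confidences
    results = model("hand_sign.jpg")  # hypothetical test image
    for box in results[0].boxes:
        print(results[0].names[int(box.cls)], float(box.conf))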

7.
PeerJ Comput Sci ; 10: e2092, 2024.
Article in English | MEDLINE | ID: mdl-38983225

ABSTRACT

More sophisticated data access is possible with artificial intelligence (AI) techniques such as question answering (QA), but regulations and privacy concerns have limited their use. Federated learning (FL) deals with these problems, and QA is a viable substitute for AI. The utilization of hierarchical FL systems is examined in this research, along with an ideal method for developing client-specific adapters. The User Modified Hierarchical Federated Learning Model (UMHFLM) selects local models for users' tasks. The article suggests employing a recurrent neural network (RNN) as a neural network (NN) technique for learning automatically and categorizing questions based on natural language into the appropriate templates. Together, local and global models are developed, with the worldwide model influencing local models, which are, in turn, combined for personalization. The method is applied in natural language processing pipelines for phrase matching employing template exact match, segmentation, and answer type detection. SQuAD-2.0, a DL-based QA dataset of complicated SPARQL test questions and their accompanying SPARQL queries over the DBpedia dataset, was used to train and assess the model. The model, evaluated on the SQuAD-2.0 dataset, identifies 38 distinct templates. Considering the top two most likely templates, the RNN model achieves template classification accuracies of 92.8% and 61.8% on the SQuAD-2.0 and QALD-7 datasets, respectively. A study on data scarcity among participants found that FL Match outperformed BERT significantly. A MAP margin of 2.60% exists between BERT and FL Match at a 100% data ratio and an MRR margin of 7.23% at a 20% data ratio.
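Record 7's local models classify natural-language questions into query templates with an RNN. A minimal PyTorch sketch of such a classifier (vocabulary size and architecture details are assumptions; the federated aggregation itself is not shown):

    import torch
    import torch.nn as nn

    class TemplateClassifier(nn.Module):
        """GRU-based classifier mapping a tokenized question to one of N query templates."""

        def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=256, n_templates=38):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
            self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            self.classifier = nn.Linear(hidden_dim, n_templates)

        def forward(self, token_ids):             # token_ids: (batch, seq_len)
            embedded = self.embedding(token_ids)  # (batch, seq_len, embed_dim)
            _, hidden = self.rnn(embedded)        # hidden: (1, batch, hidden_dim)
            return self.classifier(hidden[-1])    # template logits: (batch, n_templates)

    model = TemplateClassifier()
    dummy_batch = torch.randint(1, 20000, (4, 16))  # 4 questions, 16 token ids each
    print(model(dummy_batch).shape)                 # torch.Size([4, 38])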

8.
PeerJ Comput Sci ; 10: e2138, 2024.
Article in English | MEDLINE | ID: mdl-38983234

ABSTRACT

The recent rapid growth in the number of Saudi female athletes and sports enthusiasts' presence on social media has exposed them to gender-hate speech and discrimination. Hate speech, a harmful worldwide phenomenon, can have severe consequences. Its prevalence in sports has surged alongside the growing influence of social media, with X serving as a prominent platform for the expression of hate speech and discriminatory comments, often targeting women in sports. This research combines two studies that explore online hate speech and gender biases in the context of sports, proposing an automated solution for detecting hate speech targeting women in sports on platforms like X, with a particular focus on Arabic, a challenging domain with limited prior research. In Study 1, semi-structured interviews with 33 Saudi female athletes and sports fans revealed common forms of hate speech, including gender-based derogatory comments, misogyny, and appearance-related discrimination. Building upon the foundations laid by Study 1, Study 2 addresses the pressing need for effective interventions to combat hate speech against women in sports on social media by evaluating machine learning (ML) models for identifying hate speech targeting women in sports in Arabic. A dataset of 7,487 Arabic tweets was collected, annotated, and pre-processed. Term frequency-inverse document frequency (TF-IDF) and part-of-speech (POS) feature extraction techniques were used, and various ML algorithms were trained. Random Forest consistently outperformed the other methods, achieving accuracies of 85% and 84% using TF-IDF and POS features, respectively, demonstrating the effectiveness of both feature sets in identifying Arabic hate speech. The research contribution advances the understanding of online hate targeting Arabic women in sports by identifying various forms of such hate. The systematic creation of a meticulously annotated Arabic hate speech dataset, specifically focused on women's sports, enhances the dataset's reliability and provides valuable insights for future research in countering hate speech against women in sports. This dataset forms a strong foundation for developing effective strategies to address online hate within the unique context of women's sports. The research findings contribute to the ongoing efforts to combat hate speech against women in sports on social media, aligning with the objectives of Saudi Arabia's Vision 2030 and recognizing the significance of female participation in sports.
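Record 8's Study 2 trains classical classifiers on TF-IDF features. A minimal scikit-learn sketch of the TF-IDF + Random Forest pipeline (toy English examples stand in for the annotated Arabic tweet dataset; hyperparameters are assumptions):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline

    # Toy stand-ins for pre-processed tweets and binary hate-speech labels
    tweets = ["example hateful tweet", "example neutral tweet about a match"] * 50
    labels = [1, 0] * 50

    X_train, X_test, y_train, y_test = train_test_split(
        tweets, labels, test_size=0.2, random_state=42, stratify=labels)

    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
    ])
    pipeline.fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, pipeline.predict(X_test)))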

9.
Front Digit Health ; 6: 1387139, 2024.
Article in English | MEDLINE | ID: mdl-38983792

ABSTRACT

Introduction: Patient-reported outcomes measures (PROMs) are valuable tools for assessing health-related quality of life and treatment effectiveness in individuals with traumatic brain injuries (TBIs). Understanding the experiences of individuals with TBIs in completing PROMs is crucial for improving their utility and relevance in clinical practice. Methods: Sixteen semi-structured interviews were conducted with a sample of individuals with TBIs. The interviews were transcribed verbatim and analysed using Thematic Analysis (TA) and Natural Language Processing (NLP) techniques to identify themes and emotional connotations related to the experiences of completing PROMs. Results: The TA of the data revealed six key themes regarding the experiences of individuals with TBIs in completing PROMs. Participants expressed varying levels of understanding and engagement with PROMs, with factors such as cognitive impairments and communication difficulties influencing their experiences. Additionally, insightful suggestions emerged on the barriers to completing PROMs, the factors facilitating completion, and ways of improving their content and delivery methods. The sentiment analyses performed using NLP techniques allowed retrieval of the general sentiment and emotional "tones" in the participants' narratives of their experiences with PROMs, which were mainly characterised by low positive sentiment connotations. Although mostly neutral, participants' narratives also revealed the presence of emotions such as fear and, to a lesser extent, anger. The combination of semantic and sentiment analysis of the experiences of people with TBIs yielded valuable information on the views and emotional responses to different aspects of the PROMs. Discussion: The findings highlighted the complexities involved in administering PROMs to individuals with TBIs and underscored the need for tailored approaches to accommodate their unique challenges. Integrating TA-based and NLP techniques can offer valuable insights into the experiences of individuals with TBIs and enhance the interpretation of qualitative data in this population.
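Record 9 combines thematic analysis with NLP-based sentiment scoring of interview transcripts. A minimal sketch of the sentiment step using a general-purpose Transformers pipeline (the specific model and lexicon the authors used are not specified; the excerpts are invented for illustration):

    from transformers import pipeline

    # General-purpose sentiment model; the authors' exact tooling may differ
    sentiment = pipeline("sentiment-analysis")

    excerpts = [
        "Filling in the questionnaire took a long time and some questions were confusing.",
        "The staff explained the form clearly and that made it much easier for me.",
    ]
    for excerpt, result in zip(excerpts, sentiment(excerpts)):
        print(f"{result['label']} ({result['score']:.2f}): {excerpt}")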

10.
Age Ageing ; 53(7)2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38970549

ABSTRACT

BACKGROUND: Recording and coding of ageing syndromes in hospital records is known to be suboptimal. Natural Language Processing algorithms may be useful to identify diagnoses in electronic healthcare records to improve the recording and coding of these ageing syndromes, but the feasibility and diagnostic accuracy of such algorithms are unclear. METHODS: We conducted a systematic review according to a predefined protocol and in line with Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Searches were run from the inception of each database to the end of September 2023 in PubMed, Medline, Embase, CINAHL, ACM digital library, IEEE Xplore and Scopus. Eligible studies were identified via independent review of search results by two coauthors and data extracted from each study to identify the computational method, source of text, testing strategy and performance metrics. Data were synthesised narratively by ageing syndrome and computational method in line with the Studies Without Meta-analysis guidelines. RESULTS: From 1030 titles screened, 22 studies were eligible for inclusion. One study focussed on identifying sarcopenia, one frailty, twelve falls, five delirium, five dementia and four incontinence. Sensitivity (57.1%-100%) of algorithms compared with a reference standard was reported in 20 studies, and specificity (84.0%-100%) was reported in only 12 studies. Study design quality was variable with results relevant to diagnostic accuracy not always reported, and few studies undertaking external validation of algorithms. CONCLUSIONS: Current evidence suggests that Natural Language Processing algorithms can identify ageing syndromes in electronic health records. However, algorithms require testing in rigorously designed diagnostic accuracy studies with appropriate metrics reported.


Subject(s)
Accidental Falls , Aging , Electronic Health Records , Frailty , Natural Language Processing , Sarcopenia , Humans , Sarcopenia/diagnosis , Sarcopenia/epidemiology , Sarcopenia/physiopathology , Frailty/diagnosis , Aged , Syndrome , Algorithms , Geriatric Assessment/methods
11.
Transl Clin Pharmacol ; 32(2): 73-82, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38974344

ABSTRACT

Large language models (LLMs) have emerged as a powerful tool for biomedical researchers, demonstrating remarkable capabilities in understanding and generating human-like text. ChatGPT with its Code Interpreter functionality, an LLM coupled with the ability to write and execute code, streamlines data analysis workflows by enabling natural language interactions. Using materials from a previously published tutorial, similar analyses can be performed through conversational interactions with the chatbot, covering data loading and exploration, model development and comparison, permutation feature importance, partial dependence plots, and additional analyses and recommendations. The findings highlight the significant potential of LLMs in assisting researchers with data analysis tasks, allowing them to focus on higher-level aspects of their work. However, there are limitations and potential concerns associated with the use of LLMs, including the need for critical thinking, privacy, security, and equitable access to these tools. As LLMs continue to improve and integrate with available tools, data science may experience a transformation similar to the shift from manual to automatic transmission in driving. The advancements in LLMs call for considering the future directions of data science and its education, ensuring that the benefits of these powerful tools are utilized with proper human supervision and responsibility.
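Record 11 mentions permutation feature importance and partial dependence plots among the Code Interpreter-assisted analyses. A minimal scikit-learn sketch of those two steps on a fitted model (synthetic data and model choice are assumptions, not the tutorial's dataset):

    import matplotlib.pyplot as plt
    from sklearn.datasets import make_regression
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.inspection import PartialDependenceDisplay, permutation_importance
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=500, n_features=6, noise=10.0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

    # Permutation feature importance on held-out data
    importance = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
    print(importance.importances_mean)

    # Partial dependence plot for the first two features
    PartialDependenceDisplay.from_estimator(model, X_test, features=[0, 1])
    plt.show()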

12.
J Med Internet Res ; 26: e56110, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38976865

ABSTRACT

BACKGROUND: OpenAI's ChatGPT is a pioneering artificial intelligence (AI) in the field of natural language processing, and it holds significant potential in medicine for providing treatment advice. Additionally, recent studies have demonstrated promising results using ChatGPT for emergency medicine triage. However, its diagnostic accuracy in the emergency department (ED) has not yet been evaluated. OBJECTIVE: This study compares the diagnostic accuracy of ChatGPT with GPT-3.5 and GPT-4 and primary treating resident physicians in an ED setting. METHODS: Among 100 adults admitted to our ED in January 2023 with internal medicine issues, the diagnostic accuracy was assessed by comparing the diagnoses made by ED resident physicians and those made by ChatGPT with GPT-3.5 or GPT-4 against the final hospital discharge diagnosis, using a point system for grading accuracy. RESULTS: The study enrolled 100 patients with a median age of 72 (IQR 58.5-82.0) years who were admitted to our internal medicine ED primarily for cardiovascular, endocrine, gastrointestinal, or infectious diseases. GPT-4 outperformed both GPT-3.5 (P<.001) and ED resident physicians (P=.01) in diagnostic accuracy for internal medicine emergencies. Furthermore, across various disease subgroups, GPT-4 consistently outperformed GPT-3.5 and resident physicians. It demonstrated significant superiority in cardiovascular (GPT-4 vs ED physicians: P=.03) and endocrine or gastrointestinal diseases (GPT-4 vs GPT-3.5: P=.01). However, in other categories, the differences were not statistically significant. CONCLUSIONS: In this study, which compared the diagnostic accuracy of GPT-3.5, GPT-4, and ED resident physicians against a discharge diagnosis gold standard, GPT-4 outperformed both the resident physicians and its predecessor, GPT-3.5. Despite the retrospective design of the study and its limited sample size, the results underscore the potential of AI as a supportive diagnostic tool in ED settings.


Subject(s)
Emergency Service, Hospital , Humans , Emergency Service, Hospital/statistics & numerical data , Retrospective Studies , Aged , Female , Middle Aged , Male , Aged, 80 and over , Artificial Intelligence , Physicians/statistics & numerical data , Natural Language Processing , Triage/methods
13.
Arch Craniofac Surg ; 25(3): 116-122, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38977396

ABSTRACT

BACKGROUND: Due to the importance of evidence-based research in plastic surgery, the authors of this study aimed to assess the accuracy of ChatGPT in generating novel systematic review ideas within the field of craniofacial surgery. METHODS: ChatGPT was prompted to generate 20 novel systematic review ideas for 10 different subcategories within the field of craniofacial surgery. For each topic, the chatbot was told to give 10 "general" and 10 "specific" ideas that were related to the concept. In order to determine the accuracy of ChatGPT, a literature review was conducted using PubMed, CINAHL, Embase, and Cochrane. RESULTS: In total, 200 systematic review research ideas were generated by ChatGPT. We found that the algorithm had an overall 57.5% accuracy at identifying novel systematic review ideas. ChatGPT was found to be 39% accurate for general topics and 76% accurate for specific topics. CONCLUSION: Craniofacial surgeons should use ChatGPT as a tool. We found that ChatGPT provided more precise answers with specific research questions than with general questions and helped narrow down the search scope, leading to a more relevant and accurate response. Beyond research purposes, ChatGPT can augment patient consultations, improve healthcare equity, and assist in clinical decision-making. With rapid advancements in artificial intelligence (AI), it is important for plastic surgeons to consider using AI in their clinical practice to improve patient-centered outcomes.

14.
Syst Rev ; 13(1): 174, 2024 Jul 08.
Article in English | MEDLINE | ID: mdl-38978132

ABSTRACT

BACKGROUND: The demand for high-quality systematic literature reviews (SRs) for evidence-based medical decision-making is growing. SRs are costly and require the scarce resource of highly skilled reviewers. Automation technology has been proposed to save workload and expedite the SR workflow. We aimed to provide a comprehensive overview of SR automation studies indexed in PubMed, focusing on the applicability of these technologies in real world practice. METHODS: In November 2022, we extracted, combined, and ran an integrated PubMed search for SRs on SR automation. Full-text English peer-reviewed articles were included if they reported studies on SR automation methods (SSAM) or automated SRs (ASR). Bibliographic analyses and knowledge-discovery studies were excluded. Record screening was performed by single reviewers, and the selection of full-text papers was performed in duplicate. We summarized the publication details, automated review stages, automation goals, applied tools, data sources, methods, results, and Google Scholar citations of SR automation studies. RESULTS: From 5321 records screened by title and abstract, we included 123 full-text articles, of which 108 were SSAM and 15 ASR. Automation was applied for search (19/123, 15.4%), record screening (89/123, 72.4%), full-text selection (6/123, 4.9%), data extraction (13/123, 10.6%), risk of bias assessment (9/123, 7.3%), evidence synthesis (2/123, 1.6%), assessment of evidence quality (2/123, 1.6%), and reporting (2/123, 1.6%). Multiple SR stages were automated by 11 (8.9%) studies. The performance of automated record screening varied largely across SR topics. In published ASRs, we found examples of automated search, record screening, full-text selection, and data extraction. In some ASRs, automation fully complemented manual reviews to increase sensitivity rather than to save workload. Reporting of automation details was often incomplete in ASRs. CONCLUSIONS: Automation techniques are being developed for all SR stages, but with limited real-world adoption. Most SR automation tools target single SR stages, with modest time savings for the entire SR process and varying sensitivity and specificity across studies. Therefore, the real-world benefits of SR automation remain uncertain. Standardizing the terminology, reporting, and metrics of study reports could enhance the adoption of SR automation techniques in real-world practice.


Subject(s)
Automation , PubMed , Systematic Reviews as Topic , Humans
15.
Health Aff Sch ; 2(7): qxae082, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38979103

ABSTRACT

Designing effective childhood vaccination counseling guidelines, public health campaigns, and school-entry mandates requires a nuanced understanding of the information ecology in which parents make vaccination decisions. However, evidence is lacking on how best to "catch the signal" about the public's attitudes, beliefs, and misperceptions. In this study, we characterize public sentiment and discourse about vaccinating children against SARS-CoV-2 with mRNA vaccines to identify prevalent concerns about the vaccine and to understand anti-vaccine rhetorical strategies. We applied computational topic modeling to 149,897 comments submitted to regulations.gov in October 2021 and February 2022 regarding the Food and Drug Administration's Vaccines and Related Biological Products Advisory Committee's emergency use authorization of the COVID-19 vaccines for children. We used a latent Dirichlet allocation topic modeling algorithm to generate topics and then used iterative thematic and discursive analysis to identify relevant domains, themes, and rhetorical strategies. Three domains emerged: (1) specific concerns about the COVID-19 vaccines; (2) foundational beliefs shaping vaccine attitudes; and (3) rhetorical strategies deployed in anti-vaccine arguments. Computational social listening approaches can contribute to misinformation surveillance and evidence-based guidelines for vaccine counseling and public health promotion campaigns.
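Record 15 applies latent Dirichlet allocation to public comments. A minimal scikit-learn sketch of the topic-modeling step (toy comments stand in for the regulations.gov corpus; the topic count is an assumption):

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    comments = [
        "I worry about long term side effects of the vaccine for my child",
        "School mandates should respect parental choice",
        "The clinical trial data for children looks reassuring to me",
    ] * 20  # toy stand-in for the 149,897 public comments

    vectorizer = CountVectorizer(stop_words="english", min_df=2)
    doc_term = vectorizer.fit_transform(comments)

    lda = LatentDirichletAllocation(n_components=3, random_state=0)  # assumed topic count
    lda.fit(doc_term)

    # Print the top words characterizing each topic
    terms = vectorizer.get_feature_names_out()
    for topic_idx, weights in enumerate(lda.components_):
        top_terms = [terms[i] for i in weights.argsort()[-5:][::-1]]
        print(f"topic {topic_idx}: {', '.join(top_terms)}")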

16.
JMIR Med Inform ; 12: e59680, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38954456

ABSTRACT

BACKGROUND: Named entity recognition (NER) is a fundamental task in natural language processing. However, it is typically preceded by named entity annotation, which poses several challenges, especially in the clinical domain. For instance, determining entity boundaries is one of the most common sources of disagreements between annotators due to questions such as whether modifiers or peripheral words should be annotated. If unresolved, these can induce inconsistency in the produced corpora, yet, on the other hand, strict guidelines or adjudication sessions can further prolong an already slow and convoluted process. OBJECTIVE: The aim of this study is to address these challenges by evaluating 2 novel annotation methodologies, lenient span and point annotation, aiming to mitigate the difficulty of precisely determining entity boundaries. METHODS: We evaluate their effects through an annotation case study on a Japanese medical case report data set. We compare annotation time, annotator agreement, and the quality of the produced labeling and assess the impact on the performance of an NER system trained on the annotated corpus. RESULTS: We saw significant improvements in the labeling process efficiency, with up to a 25% reduction in overall annotation time and even a 10% improvement in annotator agreement compared to the traditional boundary-strict approach. However, even the best-achieved NER model presented some drop in performance compared to the traditional annotation methodology. CONCLUSIONS: Our findings demonstrate a balance between annotation speed and model performance. Although disregarding boundary information affects model performance to some extent, this is counterbalanced by significant reductions in the annotator's workload and notable improvements in the speed of the annotation process. These benefits may prove valuable in various applications, offering an attractive compromise for developers and researchers.
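Record 16 compares annotation schemes partly by annotator agreement. A minimal sketch of one common agreement measure, Cohen's kappa over token-level entity tags from two annotators (the tags are invented for illustration and this is not necessarily the metric the authors used):

    from sklearn.metrics import cohen_kappa_score

    # Token-level entity tags from two annotators over the same sentence (illustrative)
    annotator_a = ["O", "B-SYMPTOM", "I-SYMPTOM", "O", "B-DRUG", "O"]
    annotator_b = ["O", "B-SYMPTOM", "O",         "O", "B-DRUG", "O"]

    print("Cohen's kappa:", cohen_kappa_score(annotator_a, annotator_b))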

17.
JMIR Ment Health ; 11: e49879, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38959061

ABSTRACT

BACKGROUND: Suicide is a leading cause of death worldwide. Journalistic reporting guidelines were created to curb the impact of unsafe reporting; however, how suicide is framed in news reports may differ by important characteristics such as the circumstances and the decedent's gender. OBJECTIVE: This study aimed to examine the degree to which news media reports of suicides are framed using stigmatized or glorified language and differences in such framing by gender and circumstance of suicide. METHODS: We analyzed 200 news articles regarding suicides and applied the validated Stigma of Suicide Scale to identify stigmatized and glorified language. We assessed linguistic similarity with 2 widely used metrics, cosine similarity and mutual information scores, using a machine learning-based large language model. RESULTS: News reports of male suicides were framed more similarly to stigmatizing (P<.001) and glorifying (P=.005) language than reports of female suicides. Considering the circumstances of suicide, mutual information scores indicated that differences in the use of stigmatizing or glorifying language by gender were most pronounced for articles attributing legal (0.155), relationship (0.268), or mental health problems (0.251) as the cause. CONCLUSIONS: Linguistic differences, by gender, in stigmatizing or glorifying language when reporting suicide may exacerbate suicide disparities.


Subject(s)
Mass Media , Social Stigma , Suicide , Humans , Female , Male , Suicide/psychology , Suicide/statistics & numerical data , Mass Media/statistics & numerical data , Sex Factors , Adult
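Record 17 above measures linguistic similarity between news reports and stigmatizing or glorifying language using embedding-based cosine similarity. A minimal sketch with sentence-transformers (the checkpoint and example phrases are assumptions, not the authors' setup):

    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed general-purpose encoder

    article_text = "Text of a news report describing the circumstances of the death goes here."
    stigma_phrases = ["an act of weakness", "selfish and irresponsible"]   # illustrative
    glorify_phrases = ["a brave escape", "a noble way out"]                # illustrative

    article_vec = model.encode([article_text])
    stigma_vecs = model.encode(stigma_phrases)
    glorify_vecs = model.encode(glorify_phrases)

    print("stigma similarity:", cosine_similarity(article_vec, stigma_vecs).mean())
    print("glorify similarity:", cosine_similarity(article_vec, glorify_vecs).mean())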
19.
IEEE Open J Signal Process ; 5: 738-749, 2024.
Article in English | MEDLINE | ID: mdl-38957540

ABSTRACT

The ADReSS-M Signal Processing Grand Challenge was held at the 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023. The challenge targeted difficult automatic prediction problems of great societal and medical relevance, namely, the detection of Alzheimer's Dementia (AD) and the estimation of cognitive test scores. Participants were invited to create models for the assessment of cognitive function based on spontaneous speech data. Most of these models employed signal processing and machine learning methods. The ADReSS-M challenge was designed to assess the extent to which predictive models built based on speech in one language generalise to another language. The language data compiled and made available for ADReSS-M comprised English, for model training, and Greek, for model testing and validation. To the best of our knowledge, no previous shared research task has investigated acoustic features of the speech signal or linguistic characteristics in the context of multilingual AD detection. This paper describes the context of the ADReSS-M challenge, its data sets, its predictive tasks, the evaluation methodology we employed, our baseline models and results, and the top five submissions. The paper concludes with a summary discussion of the ADReSS-M results, and our critical assessment of the future outlook in this field.
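Record 19 (the ADReSS-M challenge) centers on acoustic and linguistic features of spontaneous speech. A minimal sketch of a typical acoustic baseline step, extracting MFCC summary statistics from a recording for a downstream classifier (file names, labels, and classifier choice are assumptions, not the challenge baseline):

    import librosa
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def mfcc_features(path: str) -> np.ndarray:
        """Summarize a speech recording as mean/std of 13 MFCC coefficients."""
        audio, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Hypothetical file list and labels (1 = AD, 0 = control)
    train_files, train_labels = ["spk01.wav", "spk02.wav"], [1, 0]
    X_train = np.vstack([mfcc_features(f) for f in train_files])

    clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
    print(clf.predict(X_train))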

20.
Artif Intell Med ; 154: 102924, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38964194

ABSTRACT

BACKGROUND: Radiology reports are typically written in a free-text format, making clinical information difficult to extract and use. Recently, the adoption of structured reporting (SR) has been recommended by various medical societies thanks to the advantages it offers, e.g. standardization, completeness, and information retrieval. We propose a pipeline to extract information from Italian free-text radiology reports that fits with the items of the reference SR registry proposed by a national society of interventional and medical radiology, focusing on CT staging of patients with lymphoma. METHODS: Our work aims to leverage the potential of Natural Language Processing and Transformer-based models to deal with automatic SR registry filling. With the availability of 174 Italian radiology reports, we investigate a rule-free generative Question Answering approach based on the Italian-specific version of T5: IT5. To address information content discrepancies, we focus on the six most frequently filled items in the annotations made on the reports: three categorical (multichoice), one free-text (free-text), and two continuous numerical (factual). In the preprocessing phase, we also encode information that is not supposed to be entered. Two strategies (batch-truncation and ex-post combination) are implemented to comply with the IT5 context length limitations. Performance is evaluated in terms of strict accuracy, f1, and format accuracy, and compared with the widely used GPT-3.5 Large Language Model. Unlike multichoice and factual, free-text answers do not have 1-to-1 correspondence with their reference annotations. For this reason, we collect human-expert feedback on the similarity between medical annotations and generated free-text answers, using a 5-point Likert scale questionnaire (evaluating the criteria of correctness and completeness). RESULTS: The combination of fine-tuning and batch splitting allows IT5 ex-post combination to achieve notable results in terms of information extraction of different types of structured data, performing on par with GPT-3.5. Human-based assessment scores of free-text answers show a high correlation with the AI performance metric f1 (Spearman's correlation coefficients > 0.5, p-values < 0.001) for both IT5 ex-post combination and GPT-3.5. The latter is better at generating plausible, human-like statements, even though it systematically provides answers even when they are not supposed to be given. CONCLUSIONS: In our experimental setting, a fine-tuned Transformer-based model with a modest number of parameters (i.e., IT5, 220 M) performs well as a clinical information extraction system for the automatic SR registry filling task. It can extract information from more than one place in the report, elaborating it in a manner that complies with the response specifications provided by the SR registry (for multichoice and factual items), or that closely approximates the work of a human expert (free-text items); with the ability to discern when an answer is supposed to be given or not to a user query.
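Record 20 frames registry filling as generative question answering with IT5. A minimal inference sketch with Hugging Face Transformers, using a hypothetical fine-tuned checkpoint and an illustrative prompt format (the batch-truncation and ex-post combination strategies are not reproduced):

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    MODEL_NAME = "your-org/it5-sr-registry-qa"  # placeholder for a fine-tuned IT5 checkpoint

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    report = "Referto TC: multiple linfoadenopatie sopra e sottodiaframmatiche ..."  # free text
    question = "Qual e' lo stadio di Ann Arbor?"  # one registry item phrased as a question

    # The "domanda/contesto" prompt layout is illustrative, not the authors' exact format
    inputs = tokenizer(f"domanda: {question} contesto: {report}",
                       return_tensors="pt", truncation=True, max_length=512)
    outputs = model.generate(**inputs, max_length=32, num_beams=4)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))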
