Results 1 - 20 of 6,752
1.
Clin Orthop Surg ; 16(3): 347-356, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38827766

ABSTRACT

Artificial intelligence (AI) has rapidly transformed various aspects of life, and the launch of the chatbot "ChatGPT" by OpenAI in November 2022 has garnered significant attention and user appreciation. ChatGPT utilizes natural language processing based on a "generative pre-trained transformer" (GPT) model, specifically the transformer architecture, to generate human-like responses to a wide range of questions and topics. Trained on approximately 57 billion words and equipped with 175 billion parameters derived from online data, ChatGPT has potential applications in medicine and orthopedics. One of its key strengths is its personalized, easy-to-understand, and adaptive responses, which allow it to learn continuously through user interaction. This article discusses how AI, especially ChatGPT, presents numerous opportunities in orthopedics, ranging from preoperative planning and surgical techniques to patient education and medical support. Although ChatGPT's user-friendly responses and adaptive capabilities are laudable, its limitations, including biased responses and ethical concerns, necessitate its cautious and responsible use. Surgeons and healthcare providers should leverage the strengths of ChatGPT while recognizing its current limitations and verifying critical information through independent research and expert opinions. As AI technology continues to evolve, ChatGPT may become a valuable tool in orthopedic education and patient care, leading to improved outcomes and efficiency in healthcare delivery. The integration of AI into orthopedics offers substantial benefits but requires careful consideration and continuous improvement.


Subject(s)
Artificial Intelligence , Orthopedic Procedures , Humans , Natural Language Processing , Patient Care
2.
Med Ref Serv Q ; 43(2): 196-202, 2024.
Article in English | MEDLINE | ID: mdl-38722609

ABSTRACT

Named entity recognition (NER) is a powerful computational technique that has been used since the early 1990s to extract information from raw text input. With rapid advancement in AI and computing, NER models have gained significant attention and serve as foundational tools across numerous professional domains, organizing unstructured data for research and practical applications. This is particularly evident in the medical and healthcare fields, where NER models are essential for efficiently extracting critical information from complex documents that are challenging to review manually. Despite these successes, NER models present limitations in fully comprehending natural language nuances. However, the development of more advanced and user-friendly models promises to significantly improve the work experience of professional users.


Subject(s)
Information Storage and Retrieval , Natural Language Processing , Information Storage and Retrieval/methods , Humans , Artificial Intelligence
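The NER approach described in the record above can be illustrated with a minimal dictionary- and pattern-based sketch. Note that this is a toy illustration only: the lexicon, patterns, and note text below are invented, and the systems the abstract surveys are typically statistical or neural rather than rule-based.

```python
import re

# Toy lexicon mapping surface forms to entity labels (invented for illustration).
LEXICON = {
    "metformin": "MEDICATION",
    "lisinopril": "MEDICATION",
    "type 2 diabetes": "DIAGNOSIS",
    "hypertension": "DIAGNOSIS",
}
# Simple pattern for dosage mentions such as "500 mg".
DOSE_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|mcg|ml)\b", re.IGNORECASE)

def extract_entities(text):
    """Return (span_text, label) pairs found in free text."""
    found = []
    lowered = text.lower()
    for term, label in LEXICON.items():
        if term in lowered:
            found.append((term, label))
    for match in DOSE_PATTERN.finditer(text):
        found.append((match.group(), "DOSE"))
    return found

note = "Patient with type 2 diabetes and hypertension, started on metformin 500 mg."
print(extract_entities(note))
```

Real clinical NER additionally handles abbreviations, negation, and spelling variation, which is where the "natural language nuances" the abstract mentions become difficult.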
3.
JMIR Ment Health ; 11: e53730, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38722220

ABSTRACT

Background: There is growing concern around the use of sodium nitrite (SN) as an emerging means of suicide, particularly among younger people. Given the limited information on the topic from traditional public health surveillance sources, we studied posts made to an online suicide discussion forum, "Sanctioned Suicide," which is a primary source of information on the use and procurement of SN. Objective: This study aims to determine the trends in SN purchase and use, as obtained via data mining from subscriber posts on the forum. We also aim to determine the substances and topics commonly co-occurring with SN, as well as the geographical distribution of users and sources of SN. Methods: We collected all publicly available posts from the site's inception in March 2018 to October 2022. Using data-driven methods, including natural language processing and machine learning, we analyzed the trends in SN mentions over time, including the locations of SN consumers and the sources from which SN is procured. We developed a transformer-based source and location classifier to determine the geographical distribution of the sources of SN. Results: Posts pertaining to SN show a rise in popularity, and there were statistically significant correlations between real-life use of SN and suicidal intent when compared to data from the Centers for Disease Control and Prevention (CDC) Wide-Ranging Online Data for Epidemiologic Research (ρ=0.727; P<.001) and the National Poison Data System (ρ=0.866; P=.001). We observed frequent co-mentions of antiemetics, benzodiazepines, and acid regulators with SN. Our proposed machine learning-based source and location classifier can detect potential sources of SN with an accuracy of 72.92% and showed consumption in the United States and elsewhere. Conclusions: Vital information about SN and other emerging mechanisms of suicide can be obtained from online forums.


Subject(s)
Natural Language Processing , Self-Injurious Behavior , Sodium Nitrite , Humans , Self-Injurious Behavior/epidemiology , Suicide/trends , Suicide/psychology , Adult , Internet , Male , Female , Social Media , Young Adult
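The trend comparisons in the record above report rank correlations (ρ). Assuming these are Spearman coefficients, the statistic can be computed by ranking both series and applying the Pearson formula to the ranks. A minimal sketch, without handling of tied ranks:

```python
def spearman(xs, ys):
    """Spearman rank correlation (no tie handling; illustrative only)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = rank + 1.0  # ranks start at 1
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mean = (n + 1) / 2  # mean of ranks 1..n
    num = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    den = (sum((a - mean) ** 2 for a in rx) * sum((b - mean) ** 2 for b in ry)) ** 0.5
    return num / den

print(spearman([1, 2, 3, 4], [10, 20, 30, 40]))  # 1.0 (perfectly concordant)
```

Production analyses would use a library implementation that handles ties and reports the P value alongside ρ.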
4.
PLoS One ; 19(5): e0303519, 2024.
Article in English | MEDLINE | ID: mdl-38723044

ABSTRACT

OBJECTIVE: To establish whether or not a natural language processing technique could identify two common inpatient neurosurgical comorbidities using only text reports of inpatient head imaging. MATERIALS AND METHODS: A training and testing dataset of reports of 979 CT or MRI scans of the brain for patients admitted to the neurosurgery service of a single hospital in June 2021 or to the Emergency Department between July 1-8, 2021, was identified. A variety of machine learning and deep learning algorithms utilizing natural language processing were trained on the training set (84% of the total cohort) and tested on the remaining reports. A subset comparison cohort (n = 76) was then assessed to compare output of the best algorithm against real-life inpatient documentation. RESULTS: For "brain compression", a random forest classifier outperformed other candidate algorithms with an accuracy of 0.81 and area under the curve of 0.90 in the testing dataset. For "brain edema", a random forest classifier again outperformed other candidate algorithms with an accuracy of 0.92 and AUC of 0.94 in the testing dataset. In the provider comparison dataset, for "brain compression," the random forest algorithm demonstrated better accuracy (0.76 vs 0.70) and sensitivity (0.73 vs 0.43) than provider documentation. For "brain edema," the algorithm again demonstrated better accuracy (0.92 vs 0.84) and AUC (0.45 vs 0.09) than provider documentation. DISCUSSION: A natural language processing-based machine learning algorithm can reliably and reproducibly identify selected common neurosurgical comorbidities from radiology reports. CONCLUSION: This result may justify the use of machine learning-based decision support to augment provider documentation.


Subject(s)
Comorbidity , Natural Language Processing , Humans , Algorithms , Inpatients/statistics & numerical data , Female , Male , Machine Learning , Magnetic Resonance Imaging/methods , Documentation , Middle Aged , Tomography, X-Ray Computed , Neurosurgical Procedures , Aged , Deep Learning
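Before a classifier such as the random forest in the study above can consume free-text radiology reports, the text must be converted to fixed-length feature vectors. A minimal bag-of-words sketch of that step (the vocabulary and report text here are invented for illustration; the study's actual feature pipeline is not described in this abstract):

```python
# Fixed clinical vocabulary (invented; a real pipeline would learn this from data).
VOCAB = ["edema", "compression", "midline", "shift", "herniation", "normal"]

def featurize(report):
    """Binary bag-of-words vector over a fixed clinical vocabulary."""
    words = set(report.lower().replace(",", " ").replace(".", " ").split())
    return [1 if term in words else 0 for term in VOCAB]

report = "Diffuse cerebral edema with midline shift. No herniation."
print(featurize(report))  # [1, 0, 1, 1, 1, 0]
```

Note the limitation visible even in this toy example: "No herniation" still sets the `herniation` feature to 1, which is why clinical NLP pipelines typically add negation detection.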
5.
Sci Rep ; 14(1): 10785, 2024 05 11.
Article in English | MEDLINE | ID: mdl-38734712

ABSTRACT

Large language models (LLMs), like ChatGPT, Google's Bard, and Anthropic's Claude, showcase remarkable natural language processing capabilities. Evaluating their proficiency in specialized domains such as neurophysiology is crucial in understanding their utility in research, education, and clinical applications. This study aims to assess and compare the effectiveness of LLMs in answering neurophysiology questions in both English and Persian (Farsi), covering a range of topics and cognitive levels. Twenty questions covering four topics (general, sensory system, motor system, and integrative) and two cognitive levels (lower-order and higher-order) were posed to the LLMs. Physiologists scored the essay-style answers on a scale of 0-5 points. Statistical analysis compared the scores across different levels such as model, language, topic, and cognitive levels. Qualitative analysis identified reasoning gaps. In general, the models demonstrated good performance (mean score = 3.87/5), with no significant difference between languages or cognitive levels. Performance was strongest in the motor system (mean = 4.41), while the weakest was observed in integrative topics (mean = 3.35). Detailed qualitative analysis uncovered deficiencies in reasoning, discerning priorities, and knowledge integration. This study offers valuable insights into LLMs' capabilities and limitations in the field of neurophysiology. The models demonstrate proficiency in general questions but face challenges in advanced reasoning and knowledge integration. Targeted training could address gaps in knowledge and causal reasoning. As LLMs evolve, rigorous domain-specific assessments will be crucial for evaluating advancements in their performance.


Subject(s)
Language , Neurophysiology , Humans , Neurophysiology/methods , Natural Language Processing , Cognition/physiology
6.
Health Informatics J ; 30(2): 14604582241240680, 2024.
Article in English | MEDLINE | ID: mdl-38739488

ABSTRACT

Objective: This study examined major themes and sentiments and their trajectories and interactions over time using subcategories of Reddit data. The aim was to facilitate decision-making for psychosocial rehabilitation. Materials and Methods: We utilized natural language processing techniques, including topic modeling and sentiment analysis, on a dataset consisting of more than 38,000 topics, comments, and posts collected from a subreddit dedicated to the experiences of people who tested positive for COVID-19. In this longitudinal exploratory analysis, we studied the dynamics between the most dominant topics and subjects' emotional states over an 18-month period. Results: Our findings highlight the evolution of the textual and sentimental status of major topics discussed by COVID survivors over an extended period of time during the pandemic. We particularly studied pre- and post-vaccination eras as a turning point in the timeline of the pandemic. The results show that not only does the relevance of topics change over time, but the emotions attached to them also vary. Major social events, such as the administration of vaccines or enforcement of nationwide policies, are also reflected through the discussions and inquiries of social media users. This is particularly true of the emotional state (i.e., the sentiment and polarity of feelings) of those who have experienced COVID-19 personally. Discussion: Cumulative societal knowledge regarding the COVID-19 pandemic impacts the patterns with which people discuss their experiences, concerns, and opinions. The subjects' emotional state with respect to different topics was also impacted by extraneous factors and events, such as vaccination.
Conclusion: By mining major topics, sentiments, and trajectories demonstrated in COVID-19 survivors' interactions on Reddit, this study contributes to the emerging body of scholarship on COVID-19 survivors' mental health outcomes, providing insights into the design of mental health support and rehabilitation services for COVID-19 survivors.


Subject(s)
COVID-19 , SARS-CoV-2 , Survivors , Humans , COVID-19/psychology , COVID-19/epidemiology , Survivors/psychology , Data Mining/methods , Pandemics , Natural Language Processing , Social Media/trends , Longitudinal Studies
7.
Syst Rev ; 13(1): 135, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38755704

ABSTRACT

We aimed to compare the concordance of information extracted and the time taken between a large language model (OpenAI's GPT-3.5 Turbo via API) against conventional human extraction methods in retrieving information from scientific articles on diabetic retinopathy (DR). The extraction was done using GPT-3.5 Turbo as of October 2023. OpenAI's GPT-3.5 Turbo significantly reduced the time taken for extraction. Concordance was highest at 100% for the extraction of the country of study, 64.7% for significant risk factors of DR, 47.1% for exclusion and inclusion criteria, and lastly 41.2% for odds ratio (OR) and 95% confidence interval (CI). The concordance levels appeared to reflect the complexity associated with each prompt. This suggests that OpenAI's GPT-3.5 Turbo may be adopted to extract simple information that is easily located in the text, leaving more complex information to be extracted by the researcher. It is crucial to note that the foundation model is constantly improving significantly, with new versions being released quickly. Subsequent work can focus on retrieval-augmented generation (RAG), embedding, chunking PDF into useful sections, and prompting to improve the accuracy of extraction.


Subject(s)
Diabetic Retinopathy , Humans , Information Storage and Retrieval/methods , Natural Language Processing , Data Mining/methods
8.
J Orthop Surg Res ; 19(1): 287, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38725085

ABSTRACT

BACKGROUND: The Center for Medicare and Medicaid Services (CMS) imposes payment penalties for readmissions following total joint replacement surgeries. This study focuses on total hip, knee, and shoulder arthroplasty procedures as they account for most joint replacement surgeries. Apart from being a burden to healthcare systems, readmissions are also troublesome for patients. Several previous studies utilized only structured data from Electronic Health Records (EHRs) without considering any gender or payor bias adjustments. METHODS: For this study, a dataset of 38,581 total knee, hip, and shoulder replacement surgeries performed from 2015 to 2021 at Novant Health was gathered. This data was used to train a random forest machine learning model to predict the combined endpoint of emergency department (ED) visit or unplanned readmission within 30 days of discharge, or discharge to a Skilled Nursing Facility (SNF) following the surgery. Ninety-eight features covering laboratory results, diagnoses, vitals, medications, and utilization history were extracted. A natural language processing (NLP) model finetuned from Clinical BERT was used to generate an NLP risk score feature for each patient based on their clinical notes. To address societal biases, a feature bias analysis was performed in conjunction with propensity score matching. A threshold optimization algorithm from the Fairlearn toolkit was used to mitigate gender and payor biases to promote fairness in predictions. RESULTS: The model achieved an Area Under the Receiver Operating characteristic Curve (AUROC) of 0.738 (95% confidence interval, 0.724 to 0.754) and an Area Under the Precision-Recall Curve (AUPRC) of 0.406 (95% confidence interval, 0.384 to 0.433).
Considering an outcome prevalence of 16%, these metrics indicate the model's ability to accurately discriminate between readmission and non-readmission cases within the context of total arthroplasty surgeries while adjusting patient scores in the model to mitigate bias based on patient gender and payor. CONCLUSION: This work culminated in a model that identifies the most predictive and protective features associated with the combined endpoint. This model serves as a tool to empower healthcare providers to proactively intervene based on these influential factors without introducing bias towards protected patient classes, effectively mitigating the risk of negative outcomes and ultimately improving quality of care regardless of socioeconomic factors.


Subject(s)
Cost-Benefit Analysis , Machine Learning , Patient Readmission , Humans , Patient Readmission/economics , Patient Readmission/statistics & numerical data , Female , Male , Aged , Natural Language Processing , Middle Aged , Arthroplasty, Replacement, Knee/economics , Arthroplasty, Replacement, Hip/economics , Arthroplasty, Replacement/economics , Arthroplasty, Replacement/adverse effects , Risk Assessment/methods , Preoperative Period , Aged, 80 and over , Quality Improvement , Random Forest
9.
J Med Internet Res ; 26: e52499, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696245

ABSTRACT

This study explores the potential of using large language models to assist content analysis by conducting a case study to identify adverse events (AEs) in social media posts. The case study compares ChatGPT's performance with human annotators' in detecting AEs associated with delta-8-tetrahydrocannabinol, a cannabis-derived product. Using the identical instructions given to human annotators, ChatGPT closely approximated human results, with a high degree of agreement noted: 94.4% (9436/10,000) for any AE detection (Fleiss κ=0.95) and 99.3% (9931/10,000) for serious AEs (κ=0.96). These findings suggest that ChatGPT has the potential to replicate human annotation accurately and efficiently. The study recognizes possible limitations, including concerns about the generalizability due to ChatGPT's training data, and prompts further research with different models, data sources, and content analysis tasks. The study highlights the promise of large language models for enhancing the efficiency of biomedical research.


Subject(s)
Social Media , Humans , Social Media/statistics & numerical data , Dronabinol/adverse effects , Natural Language Processing
10.
J Med Syst ; 48(1): 51, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38753223

ABSTRACT

Reports from spontaneous reporting systems (SRS) are hypothesis generating. Additional evidence such as more reports is required to determine whether the generated drug-event associations are in fact safety signals. However, underreporting of adverse drug reactions (ADRs) delays signal detection. Through the use of natural language processing, different sources of real-world data can be used to proactively collect additional evidence for potential safety signals. This study aims to explore the feasibility of using Electronic Health Records (EHRs) to identify additional cases based on initial indications from spontaneous ADR reports, with the goal of strengthening the evidence base for potential safety signals. For two confirmed and two potential signals generated by the SRS of the Netherlands Pharmacovigilance Centre Lareb, targeted searches in the EHR of the Leiden University Medical Centre were performed using a text-mining based tool, CTcue. The search for additional cases was done by constructing and running queries in the structured and free-text fields of the EHRs. We identified at least five additional cases for the confirmed signals and one additional case for each potential safety signal. The majority of the identified cases for the confirmed signals were documented in the EHRs before signal detection by the Dutch Medicines Evaluation Board. The identified cases for the potential signals were reported to Lareb as further evidence for signal detection. Our findings highlight the feasibility of performing targeted searches in the EHR based on an underlying hypothesis to provide further evidence for signal generation.


Subject(s)
Adverse Drug Reaction Reporting Systems , Electronic Health Records , Pharmacovigilance , Electronic Health Records/organization & administration , Humans , Adverse Drug Reaction Reporting Systems/organization & administration , Netherlands , Natural Language Processing , Drug-Related Side Effects and Adverse Reactions/prevention & control , Data Mining/methods
11.
PLoS One ; 19(5): e0302502, 2024.
Article in English | MEDLINE | ID: mdl-38743773

ABSTRACT

ChatGPT has demonstrated impressive abilities and impacted various aspects of human society since its creation, gaining widespread attention from different social spheres. This study aims to comprehensively assess public perception of ChatGPT on Reddit. The dataset was collected via Reddit, a social media platform, and includes 23,733 posts and comments related to ChatGPT. Firstly, to examine public attitudes, this study conducts content analysis utilizing topic modeling with the Latent Dirichlet Allocation (LDA) algorithm to extract pertinent topics. Furthermore, sentiment analysis categorizes user posts and comments as positive, negative, or neutral using TextBlob and VADER in natural language processing. The result of topic modeling shows that seven topics regarding ChatGPT are identified, which can be grouped into three themes: user perception, technical methods, and impacts on society. Results from the sentiment analysis show that 61.6% of the posts and comments hold favorable opinions on ChatGPT. They emphasize ChatGPT's ability to respond to prompts and engage in natural conversations with users, without requiring users to master complex natural language processing. The study provides suggestions for ChatGPT developers to enhance its usability design and functionality. Meanwhile, stakeholders, including users, should comprehend the advantages and disadvantages of ChatGPT in human society to promote ethical and regulated implementation of the system.


Subject(s)
Public Opinion , Social Media , Humans , Natural Language Processing , Unsupervised Machine Learning , Attitude , Algorithms
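The lexicon-based sentiment step described in the record above can be sketched in miniature. The word lists below are invented for illustration; the actual tools (TextBlob, VADER) use far richer lexicons plus heuristics for negation, intensifiers, and punctuation:

```python
# Toy sentiment lexicons (invented; real tools ship curated lexicons).
POSITIVE = {"great", "helpful", "impressive", "useful", "love"}
NEGATIVE = {"wrong", "biased", "useless", "hate", "worried"}

def sentiment(text):
    """Classify text as 'positive', 'negative', or 'neutral' by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("ChatGPT is impressive and very helpful"))  # positive
print(sentiment("its answers are often wrong and biased"))  # negative
```

Applied over tens of thousands of posts, per-post labels like these aggregate into the 61.6%-favorable figure the study reports.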
12.
PLoS One ; 19(5): e0301682, 2024.
Article in English | MEDLINE | ID: mdl-38768143

ABSTRACT

AIMS: Alcohol cravings are considered a major factor in relapse among individuals with alcohol use disorder (AUD). This study aims to investigate the frequency and triggers of cravings in the daily lives of people with alcohol-related issues. Large amounts of data are analyzed with Artificial Intelligence (AI) methods to identify possible groupings and patterns. METHODS: For the analysis, posts from the online forum "stopdrinking" on the Reddit platform from April 2017 to April 2022 were used as the dataset. The posts were filtered for craving content and processed using the word2vec method to map them into a multi-dimensional vector space. Statistical analyses were conducted to calculate the nature and frequency of craving contexts and triggers (location, time, social environment, and emotions) using word similarity scores. Additionally, the themes of the craving-related posts were semantically grouped using a Latent Dirichlet Allocation (LDA) topic model. The accuracy of the results was evaluated using two manually created test datasets. RESULTS: Approximately 16% of the forum posts discuss cravings. The number of craving-related posts decreases exponentially with the number of days since the author's last alcoholic drink. The topic model confirms that the majority of posts involve individual factors and triggers of cravings. The context analysis aligns with previous craving trigger findings related to the social environment, locations, and emotions. Strong semantic craving similarities were found for the emotions boredom and stress and for the location airport. The results for each method were successfully validated on test datasets. CONCLUSIONS: This exploratory approach is the first to analyze alcohol cravings in the daily lives of over 24,000 individuals, providing a foundation for further AI-based craving analyses. The analysis confirms commonly known craving triggers and even discovers new important craving contexts.


Subject(s)
Behavior, Addictive , Craving , Natural Language Processing , Humans , Craving/physiology , Behavior, Addictive/psychology , Alcoholism/psychology , Emotions/physiology , Artificial Intelligence , Social Media
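The word-similarity scores at the heart of the word2vec analysis above are cosine similarities between word vectors. A minimal sketch with toy 3-dimensional vectors (real word2vec embeddings are learned from the corpus and have hundreds of dimensions; the values below are invented to make the ranking visible):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Toy embeddings for three candidate trigger words.
vectors = {
    "boredom": [0.9, 0.1, 0.0],
    "stress":  [0.8, 0.3, 0.1],
    "airport": [0.1, 0.9, 0.2],
}
craving = [0.85, 0.2, 0.05]  # toy vector for the word "craving"

# Rank candidate triggers by similarity to "craving".
ranked = sorted(vectors, key=lambda w: cosine(vectors[w], craving), reverse=True)
print(ranked)  # ['boredom', 'stress', 'airport']
```

In the study, high cosine scores between "craving" and context words such as boredom or airport are what flag those words as likely triggers.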
13.
Int J Public Health ; 69: 1606855, 2024.
Article in English | MEDLINE | ID: mdl-38770181

ABSTRACT

Objectives: Suicide risk is elevated in lesbian, gay, bisexual, and transgender (LGBT) individuals. Limited data on LGBT status in healthcare systems hinder our understanding of this risk. This study used natural language processing to extract LGBT status and a deep neural network (DNN) to examine suicidal death risk factors among US Veterans. Methods: Data on 8.8 million veterans with visits between 2010 and 2017 was used. A case-control study was performed, and suicide death risk was analyzed by a DNN. Feature impacts and interactions on the outcome were evaluated. Results: The crude suicide mortality rate was higher in LGBT patients. However, after adjusting for over 200 risk and protective factors, known LGBT status was associated with reduced risk compared to LGBT-Unknown status. Among LGBT patients, black, female, married, and older Veterans have a higher risk, while Veterans of various religions have a lower risk. Conclusion: Our results suggest that disclosed LGBT status is not directly associated with an increase suicide death risk, however, other factors (e.g., depression and anxiety caused by stigma) are associated with suicide death risks.


Subject(s)
Artificial Intelligence , Sexual and Gender Minorities , Suicide , Veterans , Humans , Male , Female , Sexual and Gender Minorities/statistics & numerical data , Sexual and Gender Minorities/psychology , Middle Aged , Case-Control Studies , Suicide/statistics & numerical data , Veterans/psychology , Veterans/statistics & numerical data , United States/epidemiology , Adult , Risk Factors , Aged , Natural Language Processing
14.
Stud Health Technol Inform ; 314: 98-102, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38785011

ABSTRACT

This paper explores the potential of leveraging electronic health records (EHRs) for personalized health research through the application of artificial intelligence (AI) techniques, specifically Named Entity Recognition (NER). By extracting crucial patient information from clinical texts, including diagnoses, medications, symptoms, and lab tests, AI facilitates the rapid identification of relevant data, paving the way for future care paradigms. The study focuses on Non-small cell lung cancer (NSCLC) in Italian clinical notes, introducing a novel set of 29 clinical entities that include both presence or absence (negation) of relevant information associated with NSCLC. Using a state-of-the-art model pretrained on Italian biomedical texts, we achieve promising results (average F1-score of 80.8%), demonstrating the feasibility of employing AI for extracting biomedical information in the Italian language.


Subject(s)
Artificial Intelligence , Electronic Health Records , Lung Neoplasms , Natural Language Processing , Italy , Humans , Lung Neoplasms/diagnosis , Carcinoma, Non-Small-Cell Lung/diagnosis , Data Mining/methods
15.
Stud Health Technol Inform ; 314: 93-97, 2024 May 23.
Article in English | MEDLINE | ID: mdl-38785010

ABSTRACT

Inconsistent disease coding standards in medicine create hurdles in data exchange and analysis. This paper proposes a machine learning system to address this challenge. The system automatically matches unstructured medical text (doctor notes, complaints) to ICD-10 codes. It leverages a unique architecture featuring a training layer for model development and a knowledge base that captures relationships between symptoms and diseases. Experiments using data from a large medical research center demonstrated the system's effectiveness in disease classification prediction. Logistic regression emerged as the optimal model due to its superior processing speed, achieving an accuracy of 81.07% with acceptable error rates during high-load testing. This approach offers a promising solution for improving healthcare informatics by overcoming coding standard incompatibility and automating code prediction from unstructured medical text.


Subject(s)
Electronic Health Records , International Classification of Diseases , Machine Learning , Natural Language Processing , Humans , Clinical Coding
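The code-matching task in the record above can be illustrated with a drastically simplified baseline: score each ICD-10 code by token overlap between the note and the code description, and return the best match. The deployed system uses a trained logistic regression model and a symptom-disease knowledge base; the three-code table and the overlap heuristic here are invented stand-ins:

```python
# Tiny ICD-10 lookup table (descriptions paraphrased; illustrative only).
ICD10 = {
    "J06.9": "acute upper respiratory infection",
    "I10": "essential primary hypertension",
    "E11.9": "type 2 diabetes mellitus without complications",
}

def best_code(note):
    """Return the ICD-10 code whose description shares the most tokens with the note."""
    note_tokens = set(note.lower().split())

    def overlap(code):
        return len(note_tokens & set(ICD10[code].split()))

    return max(ICD10, key=overlap)

print(best_code("patient presents with acute respiratory infection and cough"))  # J06.9
```

A learned classifier improves on this baseline mainly by handling synonyms, abbreviations, and misspellings that simple token overlap misses.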
16.
JMIR Ment Health ; 11: e57234, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38771256

ABSTRACT

Background: Rates of suicide have increased by over 35% since 1999. Despite concerted efforts, our ability to predict, explain, or treat suicide risk has not significantly improved over the past 50 years. Objective: The aim of this study was to use large language models to understand natural language use during public web-based discussions (on Reddit) around topics related to suicidality. Methods: We used large language model-based sentence embedding to extract the latent linguistic dimensions of user postings derived from several mental health-related subreddits, with a focus on suicidality. We then applied dimensionality reduction to these sentence embeddings, allowing them to be summarized and visualized in a lower-dimensional Euclidean space for further downstream analyses. We analyzed 2.9 million posts extracted from 30 subreddits, including r/SuicideWatch, between October 1 and December 31, 2022, and the same period in 2010. Results: Our results showed that, in line with existing theories of suicide, posters in the suicidality community (r/SuicideWatch) predominantly wrote about feelings of disconnection, burdensomeness, hopelessness, desperation, resignation, and trauma. Further, we identified distinct latent linguistic dimensions (well-being, seeking support, and severity of distress) among all mental health subreddits, and many of the resulting subreddit clusters were in line with a statistically driven diagnostic classification system, namely the Hierarchical Taxonomy of Psychopathology (HiTOP), by mapping onto the proposed superspectra. Conclusions: Overall, our findings provide data-driven support for several language-based theories of suicide, as well as dimensional classification systems for mental health disorders.
Ultimately, this novel combination of natural language processing techniques can assist researchers in gaining deeper insights about emotions and experiences shared on the web and may aid in the validation and refutation of different mental health theories.


Subject(s)
Linguistics , Mental Disorders , Social Media , Suicide , Humans , Social Media/statistics & numerical data , Suicide/psychology , Mental Disorders/psychology , Mental Disorders/epidemiology , Mental Disorders/classification , Natural Language Processing
17.
PLoS One ; 19(5): e0304057, 2024.
Article in English | MEDLINE | ID: mdl-38787837

ABSTRACT

Automatic Text Summarization (ATS) is gaining popularity as there is a growing demand for a system capable of processing extensive textual content and delivering a concise, yet meaningful, relevant, and useful summary. Manual summarization is both expensive and time-consuming, making it impractical for humans to handle vast amounts of data. Consequently, the need for ATS systems has become evident. These systems encounter challenges such as ensuring comprehensive content coverage, determining the appropriate length of the summary, addressing redundancy, and maintaining coherence in the generated summary. Researchers are actively addressing these challenges by employing Natural Language Processing (NLP) techniques. While traditional methods exist for generating summaries, they often fall short of addressing multiple aspects simultaneously. To overcome this limitation, recent advancements have introduced multi-objective evolutionary algorithms for ATS. This study proposes an enhancement to the performance of ATS through the utilization of an improved version of the Binary Multi-Objective Grey Wolf Optimizer (BMOGWO), incorporating mutation. The performance of this enhanced algorithm is assessed by comparing it with state-of-the-art algorithms using the DUC2002 dataset. Experimental results demonstrate that the proposed algorithm significantly outperforms the compared approaches.


Subject(s)
Algorithms , Natural Language Processing , Humans , Mutation
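The sentence-scoring idea underlying extractive summarization, which the evolutionary approach in the record above optimizes far more carefully (for coverage, redundancy, and coherence simultaneously), can be sketched with a simple word-frequency heuristic. The example text is invented:

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Pick the sentences whose words are most frequent in the document."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freqs = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        toks = re.findall(r"[a-z]+", sentence.lower())
        return sum(freqs[t] for t in toks) / len(toks)

    ranked = sorted(sentences, key=score, reverse=True)
    return ". ".join(ranked[:n_sentences]) + "."

text = ("Text summarization condenses documents. Summarization systems score "
        "sentences. Cats are unrelated to this text.")
print(summarize(text))  # Text summarization condenses documents.
```

Greedy frequency scoring like this tends to select redundant sentences, which is exactly the trade-off multi-objective formulations such as BMOGWO are designed to balance.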
18.
Clin Imaging ; 110: 110164, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38691911

ABSTRACT

Natural Language Processing (NLP), a form of Artificial Intelligence, allows free-text based clinical documentation to be integrated in ways that facilitate data analysis, data interpretation, and the formation of individualized medical and obstetrical care. In this cross-sectional study, we identified all births during the study period carrying the radiology-confirmed diagnosis of fibroid uterus in pregnancy (defined as a largest fibroid diameter >5 cm) by using an NLP platform, and compared the result to non-NLP derived data using ICD-10 codes for the same diagnosis. We then compared the two sets of data and stratified documentation gaps by race. Using fibroid uterus in pregnancy as a marker, we found that Black patients were more likely to have the diagnosis entered late into the patient's chart or to have missing documentation of the diagnosis. With appropriate algorithm definitions, cross referencing, and thorough validation steps, NLP can contribute to identifying areas of documentation gaps and improve quality of care.


Subject(s)
Documentation , Natural Language Processing , Uterine Neoplasms , Humans , Female , Pregnancy , Cross-Sectional Studies , Documentation/standards , Documentation/statistics & numerical data , Uterine Neoplasms/diagnostic imaging , Racism , Leiomyoma/diagnostic imaging , Adult , Obstetrics , Pregnancy Complications, Neoplastic/diagnostic imaging
19.
Sci Data ; 11(1): 455, 2024 May 04.
Article in English | MEDLINE | ID: mdl-38704422

ABSTRACT

Due to the complexity of the biomedical domain, capturing semantically meaningful representations of terms in context is a long-standing challenge. Despite important progress in recent years, no evaluation benchmark has been developed to assess how well language models represent biomedical concepts according to their corresponding context. Inspired by the Word-in-Context (WiC) benchmark, in which word sense disambiguation is reformulated as a binary classification task, we propose a novel dataset, BioWiC, to evaluate the ability of language models to encode biomedical terms in context. BioWiC comprises 20,156 instances covering over 7,400 unique biomedical terms, making it the largest WiC dataset in the biomedical domain. We evaluate BioWiC both intrinsically and extrinsically and show that it can serve as a reliable benchmark for evaluating context-dependent embeddings in biomedical corpora. In addition, we conduct several experiments using a variety of discriminative and generative large language models to establish robust baselines that can serve as a foundation for future research.
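The WiC-style binary task described above asks whether a term carries the same sense in two sentences, typically decided by comparing contextual representations. The sketch below substitutes a toy window-based bag-of-words vector for a real language-model embedding; the threshold value and all function names are illustrative assumptions, not a BioWiC baseline.

```python
import math
from collections import Counter

def context_vector(sentence, term, window=3):
    # Toy stand-in for a contextual embedding: a bag of words within a
    # window around the target term. A real baseline would use a language
    # model's hidden states for the term's token span instead.
    tokens = sentence.lower().split()
    i = tokens.index(term)
    lo, hi = max(0, i - window), i + window + 1
    return Counter(t for t in tokens[lo:hi] if t != term)

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def same_sense(sent_a, sent_b, term, threshold=0.2):
    # WiC-style binary decision: do the two sentences use the term
    # with the same meaning?
    return cosine(context_vector(sent_a, term),
                  context_vector(sent_b, term)) >= threshold
```

With real embeddings the decision boundary would be learned on the training split rather than fixed by hand.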


Subject(s)
Natural Language Processing , Semantics , Language
20.
J Med Internet Res ; 26: e55676, 2024 May 28.
Article in English | MEDLINE | ID: mdl-38805692

ABSTRACT

BACKGROUND: Clinical natural language processing (NLP) researchers need access to directly comparable evaluation results for applications such as text deidentification across a range of corpus types and the means to easily test new systems or corpora within the same framework. Current systems, reported metrics, and the personally identifiable information (PII) categories evaluated are not easily comparable. OBJECTIVE: This study presents an open-source and extensible end-to-end framework for comparing clinical NLP system performance across corpora even when the annotation categories do not align. METHODS: As a use case for this framework, we ran 6 off-the-shelf text deidentification systems (ie, CliniDeID, deid from PhysioNet, MITRE Identity Scrubber Toolkit [MIST], NeuroNER, National Library of Medicine [NLM] Scrubber, and Philter) across 3 standard clinical text corpora for the task (2 of which are publicly available) and 1 private corpus (all in English), with annotation categories that are not directly analogous. The framework is built on shell scripts that can be extended to include new systems, corpora, and performance metrics. We present this open tool, multiple means for aligning PII categories during evaluation, and our initial timing and performance metric findings. Code for running this framework, with all settings needed to run all system-corpus pairs, is available via Codeberg and GitHub. RESULTS: From this case study, we found large differences in processing speed between systems. The fastest system (ie, MIST) processed an average of 24.57 (SD 26.23) notes per second, while the slowest (ie, CliniDeID) processed an average of 1.00 notes per second. No system uniformly outperformed the others at identifying PII across corpora and categories; instead, performance trade-offs varied by PII category. CliniDeID and Philter prioritize recall over precision (with an average recall 6.9 and 11.2 points higher, respectively, for partially matching spans of text matching any PII category), while the other 4 systems consistently have higher precision (with MIST's precision scoring 20.2 points higher, NLM Scrubber scoring 4.4 points higher, NeuroNER scoring 7.2 points higher, and deid scoring 17.1 points higher). The macroaverage recall across corpora for identifying names, one of the more sensitive PII categories, included deid (48.8%) and MIST (66.9%) at the low end and NeuroNER (84.1%), NLM Scrubber (88.1%), and CliniDeID (95.9%) at the high end. A variety of metrics across categories and corpora are reported, with a wider variety (eg, F2-score) available via the tool. CONCLUSIONS: NLP systems in general, and deidentification systems and corpora in our use case, tend to be evaluated in stand-alone research articles that include only a limited set of comparators. We hold that a single evaluation pipeline across multiple systems and corpora allows for more nuanced comparisons. Our open pipeline should reduce barriers to evaluation and system advancement.
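The partial-match, category-agnostic scoring reported above can be sketched as overlap counting over (start, end) character spans. This is an illustrative reimplementation under stated assumptions (any overlap with any gold span counts as a match), not the framework's actual code.

```python
def overlaps(a, b):
    # Two (start, end) character spans partially match if they intersect.
    return a[0] < b[1] and b[0] < a[1]

def span_metrics(gold, predicted, beta=2.0):
    # Partial-match evaluation: a predicted span is a true positive if it
    # overlaps any gold span, regardless of PII category, mirroring the
    # category-agnostic alignment option described above.
    tp_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    tp_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = tp_pred / len(predicted) if predicted else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    # F-beta with beta=2 weights recall over precision, which suits
    # deidentification, where missed PII is costlier than overredaction.
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta
```

Counting predicted and gold matches separately avoids double-crediting one prediction that overlaps several gold spans, or vice versa.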


Subject(s)
Natural Language Processing , Humans