1.
bioRxiv ; 2024 May 24.
Article in English | MEDLINE | ID: mdl-38826258

ABSTRACT

This article describes the Cell Maps for Artificial Intelligence (CM4AI) project and its goals, methods, standards, current datasets, software tools, status, and future directions. CM4AI is the Functional Genomics Data Generation Project in the U.S. National Institutes of Health's (NIH) Bridge2AI program. Its overarching mission is to produce ethical, AI-ready datasets of cell architecture, inferred from multimodal data collected for human cell lines, to enable transformative biomedical AI research.

2.
J Biomed Inform ; 154: 104654, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38740316

ABSTRACT

OBJECTIVES: We evaluated methods for preparing electronic health record data to reduce bias before applying artificial intelligence (AI). METHODS: We created methods for transforming raw data into a data framework for applying machine learning and natural language processing techniques for predicting falls and fractures. Strategies such as inclusion and reporting for multiple races, mixed data sources such as outpatient, inpatient, structured codes, and unstructured notes, and addressing missingness were applied to raw data to promote a reduction in bias. The raw data was carefully curated using validated definitions to create data variables such as age, race, gender, and healthcare utilization. For the formation of these variables, clinical, statistical, and data expertise were used. The research team included a variety of experts with diverse professional and demographic backgrounds to include diverse perspectives. RESULTS: For the prediction of falls, information extracted from radiology reports was converted to a matrix for applying machine learning. The processing of the data resulted in an input of 5,377,673 reports to the machine learning algorithm, out of which 45,304 were flagged as positive and 5,332,369 as negative for falls. Processed data resulted in lower missingness and a better representation of race and diagnosis codes. For fractures, specialized algorithms extracted snippets of text around the keyword "femoral" from dual x-ray absorptiometry (DXA) scans to identify femoral neck T-scores that are important for predicting fracture risk. The natural language processing algorithms yielded 98% accuracy and a 2% error rate. The methods to prepare data for input to artificial intelligence processes are reproducible and can be applied to other studies. CONCLUSION: The life cycle of data from raw to analytic form includes data governance, cleaning, management, and analysis.
When applying artificial intelligence methods, input data must be prepared optimally to reduce algorithmic bias, as biased output is harmful. Building AI-ready data frameworks that improve efficiency can contribute to transparency and reproducibility. The roadmap for the application of AI involves applying specialized techniques to input data, some of which are suggested here. This study highlights data curation aspects to be considered when preparing data for the application of artificial intelligence to reduce bias.


Subject(s)
Accidental Falls , Algorithms , Artificial Intelligence , Electronic Health Records , Machine Learning , Natural Language Processing , Humans , Accidental Falls/prevention & control , Fractures, Bone , Female
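
The snippet-extraction step described in the abstract above (text windows around the keyword "femoral", then a T-score lookup) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the window size, the `T_SCORE` pattern, the function names, and the sample report are all assumptions made for the example.

```python
import re

def keyword_snippets(text, keyword="femoral", window=60):
    """Return text windows of +/- `window` characters around each keyword hit."""
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        snippets.append(text[start:end])
    return snippets

# Hypothetical pattern: "T-score", an optional separator, then a signed decimal.
T_SCORE = re.compile(r"T-?score\s*(?:of|is|:|=)?\s*([+-]?\d+(?:\.\d+)?)", re.IGNORECASE)

def femoral_t_scores(text, window=60):
    """Collect T-score values found inside the snippets around 'femoral'."""
    scores = []
    for snippet in keyword_snippets(text, "femoral", window):
        scores.extend(float(value) for value in T_SCORE.findall(snippet))
    return scores

# Invented report text, for illustration only.
report = "Lumbar spine BMD 0.91. Left femoral neck: BMD 0.62 g/cm2, T-score -2.7."
print(femoral_t_scores(report))
```

A fixed character window is crude: it can pick up a T-score for another site that happens to fall inside the window, and overlapping windows can report the same value twice, so a production version would need site-aware parsing and validation against annotated reports.
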
3.
Bull World Health Organ ; 102(1): 32-45, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38164328

ABSTRACT

Objective: To assess spatiotemporal trends in, and determinants of, the acceptance of coronavirus disease 2019 (COVID-19) vaccination globally, as expressed on the social media platform X (formerly Twitter). Methods: We collected over 13 million posts on the platform regarding COVID-19 vaccination made between November 2020 and March 2022 in 90 languages. Multilingual deep learning XLM-RoBERTa models annotated all posts using an annotation framework after being fine-tuned on 8125 manually annotated, English-language posts. The annotation results were used to assess spatiotemporal trends in COVID-19 vaccine acceptance and confidence as expressed by platform users in 135 countries and territories. We identified associations between spatiotemporal trends in vaccine acceptance and country-level characteristics and public policies by using univariate and multivariate regression analysis. Findings: A greater proportion of platform users in the World Health Organization's South-East Asia, Eastern Mediterranean and Western Pacific Regions expressed vaccine acceptance than users in the rest of the world. Countries in which a greater proportion of platform users expressed vaccine acceptance had higher COVID-19 vaccine coverage rates. Trust in government was also associated with greater vaccine acceptance. Internationally, vaccine acceptance and confidence declined among platform users as: (i) vaccination eligibility was extended to adolescents; (ii) vaccine supplies became sufficient; (iii) nonpharmaceutical interventions were relaxed; and (iv) global reports on adverse events following vaccination appeared. Conclusion: Social media listening could provide an effective and expeditious means of informing public health policies during pandemics, and could supplement existing public health surveillance approaches in addressing global health issues.


Subject(s)
COVID-19 , Social Media , Humans , Adolescent , COVID-19/epidemiology , COVID-19/prevention & control , COVID-19 Vaccines , Vaccination , Attitude
5.
J Am Med Inform Assoc ; 31(3): 727-731, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38146986

ABSTRACT

OBJECTIVES: Clinical text processing offers a promising avenue for improving multiple aspects of healthcare, though operational deployment remains a substantial challenge. This case report details the implementation of a national clinical text processing infrastructure within the Department of Veterans Affairs (VA). METHODS: Two foundational use cases, cancer case management and suicide and overdose prevention, illustrate how text processing can be practically implemented at scale for diverse clinical applications using shared services. RESULTS: Insights from these use cases underline both commonalities and differences, providing a replicable model for future text processing applications. CONCLUSIONS: This project enables more efficient initiation, testing, and future deployment of text processing models, streamlining the integration of these use cases into healthcare operations. This project implementation is in a large integrated health delivery system in the United States, but we expect the lessons learned to be relevant to any health system, including smaller local and regional health systems in the United States.


Subject(s)
Suicide , Veterans , Humans , United States , United States Department of Veterans Affairs , Delivery of Health Care , Case Management
6.
medRxiv ; 2023 May 30.
Article in English | MEDLINE | ID: mdl-37398113

ABSTRACT

Objectives: Evaluating methods for building data frameworks for application of AI in large scale datasets for women's health studies. Methods: We created methods for transforming raw data to a data framework for applying machine learning (ML) and natural language processing (NLP) techniques for predicting falls and fractures. Results: Prediction of falls was higher in women compared to men. Information extracted from radiology reports was converted to a matrix for applying machine learning. For fractures, by applying specialized algorithms, we extracted snippets from dual x-ray absorptiometry (DXA) scans for meaningful terms usable for predicting fracture risk. Discussion: Life cycle of data from raw to analytic form includes data governance, cleaning, management, and analysis. For applying AI, data must be prepared optimally to reduce algorithmic bias. Conclusion: Algorithmic bias is harmful for research using AI methods. Building AI ready data frameworks that improve efficiency can be especially valuable for women's health.

7.
PEC Innov ; 2: 100161, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37384151

ABSTRACT

Objective: Identify how patients and clinicians incorporate patient-centered communication (PCC) within secure messaging. Methods: A random sample of 199 secure messages from patient portal communication between patients and clinicians was collected and analyzed. Via manual annotation, the task of tagging target words/phrases in text, we identified five components of PCC: information giving, information seeking, emotional support, partnership, and shared decision-making. Textual analysis was also performed to understand the context of PCC expressions within messages. Results: Information-giving was the predominant (n = 346, 68.1%) PCC category used in secure messaging, more than double the other four PCC codes, information-seeking (n = 82, 16.1%), emotional support (n = 52, 10.2%), shared decision making (n = 5, 1.0%), combined. The textual analysis revealed that clinicians informed patients about appointment reminders and new protocols while patients reminded clinicians about upcoming procedures and outcomes of test results conducted by other clinicians. Although less common, patients expressed statements of concern, uncertainty, and fear, enabling clinicians to provide support. Conclusion: Secure messaging is mainly used for exchanging information, but other aspects of PCC emerge using this channel of communication. Innovation: Meaningful discussions can occur via secure messaging, and clinicians should be mindful of incorporating PCC when communicating with patients through secure messaging.

8.
J Integr Complement Med ; 29(6-7): 420-429, 2023.
Article in English | MEDLINE | ID: mdl-36971840

ABSTRACT

Background: Complementary and integrative health (CIH) approaches have been recommended in national and international clinical guidelines for chronic pain management. We set out to determine whether exposure to CIH approaches is associated with pain care quality (PCQ) in the Veterans Health Administration (VHA) primary care setting. Methods: We followed a cohort of 62,721 Veterans with newly diagnosed musculoskeletal disorders between October 2016 and September 2017 over 1-year. PCQ scores were derived from primary care progress notes using natural language processing. CIH exposure was defined as documentation of acupuncture, chiropractic or massage therapies by providers. Propensity scores (PSs) were used to match one control for each Veteran with CIH exposure. Generalized estimating equations were used to examine associations between CIH exposure and PCQ scores, accounting for potential selection and confounding bias. Results: CIH was documented for 14,114 (22.5%) Veterans over 16,015 primary care clinic visits during the follow-up period. The CIH exposure group and the 1:1 PS-matched control group achieved superior balance on all measured baseline covariates, with standardized differences ranging from 0.000 to 0.045. CIH exposure was associated with an adjusted rate ratio (aRR) of 1.147 (95% confidence interval [CI]: 1.142, 1.151) on PCQ total score (mean: 8.36). Sensitivity analyses using an alternative PCQ scoring algorithm (aRR: 1.155; 95% CI: 1.150-1.160) and redefining CIH exposure by chiropractic alone (aRR: 1.118; 95% CI: 1.110-1.126) derived consistent results. Discussion: Our data suggest that incorporating CIH approaches may reflect higher overall quality of care for patients with musculoskeletal pain seen in primary care settings, supporting VHA initiatives and the Declaration of Astana to build comprehensive, sustainable primary care capacity for pain management. 
Future investigation is warranted to better understand whether and to what degree the observed association may reflect the therapeutic benefits patients actually received or other factors such as empowering provider-patient education and communication about these approaches.


Subject(s)
Chronic Pain , Complementary Therapies , Humans , Veterans Health , Chronic Pain/diagnosis , Chronic Pain/drug therapy , Complementary Therapies/methods , Quality of Health Care , Primary Health Care
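
The matching and balance check described in the abstract above (1:1 propensity-score matching, then standardized differences on baseline covariates) can be illustrated with a small sketch. The abstract does not say which matching algorithm was used; greedy nearest-neighbor matching without replacement is an assumption here, as are the function names and the toy scores.

```python
from statistics import mean, variance

def greedy_match(treated_ps, control_ps):
    """Pair each treated unit with the closest unused control (1:1, no replacement).

    Assumes there are at least as many controls as treated units.
    """
    available = dict(enumerate(control_ps))
    pairs = []
    for t_idx, t_score in enumerate(treated_ps):
        c_idx = min(available, key=lambda j: abs(available[j] - t_score))
        pairs.append((t_idx, c_idx))
        del available[c_idx]
    return pairs

def standardized_difference(treated, control):
    """Absolute difference in means divided by the pooled standard deviation."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return abs(mean(treated) - mean(control)) / pooled_sd

# Toy propensity scores, invented for illustration.
treated_ps = [0.30, 0.62]
control_ps = [0.10, 0.33, 0.60, 0.90]
pairs = greedy_match(treated_ps, control_ps)
matched_controls = [control_ps[c] for _, c in pairs]
smd = standardized_difference(treated_ps, matched_controls)
print(pairs, round(smd, 3))
```

A standardized difference below 0.1 is a commonly used rule of thumb for adequate balance, which puts the reported range of 0.000 to 0.045 well inside it.
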
9.
PLoS One ; 18(1): e0279163, 2023.
Article in English | MEDLINE | ID: mdl-36598881

ABSTRACT

OBJECTIVES: Understand the continuity and changes in headache not-otherwise-specified (NOS), migraine, and post-traumatic headache (PTH) diagnoses after the transition from ICD-9-CM to ICD-10-CM in the Veterans Health Administration (VHA). BACKGROUND: Headache is one of the most commonly diagnosed chronic conditions managed within primary and specialty care clinics. The VHA transitioned from ICD-9-CM to ICD-10-CM on October 1, 2015. The effect of transitioning on coding of specific headache diagnoses is unknown. Accuracy of headache diagnosis is important since different headache types respond to different treatments. METHODS: We mapped headache diagnoses from ICD-9-CM (FY 2014/2015) onto ICD-10-CM (FY 2016/2017) and computed coding proportions two years before/after the transition in VHA. We used queries to determine the change in transition pathways. We report the odds of ICD-10-CM coding associated with ICD-9-CM controlling for provider type, and patient age, sex, and race/ethnicity. RESULTS: Only 37%, 58% and 34% of patients with ICD-9-CM coding of NOS, migraine, and PTH respectively had an ICD-10-CM headache diagnosis. Of those with an ICD-10-CM diagnosis, 73-79% had a single headache diagnosis. The odds ratios for receiving the same code in both ICD-9-CM and ICD-10-CM after adjustment for ICD-9-CM and ICD-10-CM headache comorbidities and sociodemographic factors were high (range 6-26) and statistically significant. Specifically, 75% of patients with headache NOS had received one headache diagnosis (Adjusted headache NOS-ICD-9-CM OR for headache NOS-ICD-10-CM = 6.1, 95% CI 5.89-6.32). 79% of migraineurs had one headache diagnosis, mostly migraine (Adjusted migraine-ICD-9-CM OR for migraine-ICD-10-CM = 26.43, 95% CI 25.51-27.38). The same held true for PTH (Adjusted PTH-ICD-9-CM OR for PTH-ICD-10-CM = 22.92, 95% CI: 18.97-27.68). These strong associations remained after adjustment for specialist care in ICD-10-CM follow-up period.
DISCUSSION: The majority of people with ICD-9-CM headache diagnoses did not have an ICD-10-CM headache diagnosis. However, a given diagnosis in ICD-9-CM by a primary care provider (PCP) was significantly predictive of its assignment in ICD-10-CM as was seeing either a neurologist or physiatrist (compared to a generalist) for an ICD-10-CM headache diagnosis. CONCLUSION: When a veteran had a specific diagnosis in ICD-9-CM, the odds of being coded with the same diagnosis in ICD-10-CM were significantly higher. Specialist visit during the ICD-10-CM period was independently associated with all three ICD-10-CM headaches.


Subject(s)
Migraine Disorders , Post-Traumatic Headache , Veterans , Humans , International Classification of Diseases , Veterans Health , Headache/epidemiology , Migraine Disorders/diagnosis , Migraine Disorders/epidemiology , Comorbidity
10.
J Pain ; 24(2): 273-281, 2023 02.
Article in English | MEDLINE | ID: mdl-36167230

ABSTRACT

Prior research has demonstrated disparities in general medical care for patients with mental health conditions, but little is known about disparities in pain care. The objective of this retrospective cohort study was to determine whether mental health conditions are associated with indicators of pain care quality (PCQ) as documented by primary care clinicians in the Veterans Health Administration (VHA). We used natural language processing to analyze electronic health record data from a national sample of Veterans with moderate to severe musculoskeletal pain during primary care visits in the Fiscal Year 2017. Twelve PCQ indicators were annotated from clinician progress notes as present or absent; PCQ score was defined as the sum of these indicators. Generalized estimating equation Poisson models examined associations among mental health diagnosis categories and PCQ scores. The overall mean PCQ score across 135,408 person-visits was 8.4 (SD = 2.3). In the final adjusted model, post-traumatic stress disorder was associated with higher PCQ scores (RR = 1.006, 95%CI 1.002-1.010, P = .007). Depression, alcohol use disorder, other substance use disorder, schizophrenia, and bipolar disorder diagnoses were not associated with PCQ scores. Overall, results suggest that in this patient population, presence of a mental health condition is not associated with lower quality pain care. PERSPECTIVE: This study used a natural language processing approach to analyze medical records to determine whether mental health conditions are associated with indicators of pain care quality as documented by primary care clinicians. Findings suggest that presence of a diagnosed mental health condition is not associated with lower quality pain care.


Subject(s)
Chronic Pain , Veterans , United States/epidemiology , Humans , Veterans/psychology , Veterans Health , Electronic Health Records , Retrospective Studies , Mental Health , United States Department of Veterans Affairs , Quality of Health Care , Chronic Pain/epidemiology , Primary Health Care
12.
Pain ; 163(6): e715-e724, 2022 06 01.
Article in English | MEDLINE | ID: mdl-34724683

ABSTRACT

The lack of a reliable approach to assess quality of pain care hinders quality improvement initiatives. Rule-based natural language processing algorithms were used to extract pain care quality (PCQ) indicators from documents of Veterans Health Administration primary care providers for veterans diagnosed within the past year with musculoskeletal disorders with moderate-to-severe pain intensity across 2 time periods: 2013 to 2014 (fiscal year [FY] 2013) and 2017 to 2018 (FY 2017). Patterns of documentation of PCQ indicators for 64,444 veterans and 124,408 unique visits (FY 2013) and 63,427 veterans and 146,507 visits (FY 2017) are described. The most commonly documented PCQ indicators in each cohort were presence of pain, etiology or source, and site of pain (greater than 90% of progress notes), while least commonly documented were sensation, what makes pain better or worse, and pain's impact on function (documented in fewer than 50%). A PCQ indicator score (maximum = 12) was calculated for each visit in FY 2013 (mean = 7.8, SD = 1.9) and FY 2017 (mean = 8.3, SD = 2.3) by adding one point for every indicator documented. Standardized Cronbach alpha for total PCQ scores was 0.74 in the most recent data (FY 2017). The mean PCQ indicator scores across patient characteristics and types of healthcare facilities were highly stable. Estimates of the frequency of documentation of PCQ indicators have face validity and encourage further evaluation of the reliability, validity, and utility of the measure. A reliable measure of PCQ fills an important scientific knowledge and practice gap.


Subject(s)
Veterans Health , Veterans , Humans , Pain , Primary Health Care , Quality of Health Care , Reproducibility of Results , United States , United States Department of Veterans Affairs
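
The scoring rule in the abstract above (one point per documented indicator, maximum 12) and an internal-consistency estimate can be sketched as below. Note the assumption: this computes raw Cronbach's alpha from item variances, whereas the abstract reports the standardized form, which is based on inter-item correlations. The function names and the toy rows are invented for illustration.

```python
from statistics import pvariance

def pcq_score(indicators):
    """Sum of binary indicator flags (1 = documented, 0 = absent)."""
    return sum(indicators)

def cronbach_alpha(rows):
    """Raw Cronbach's alpha for a list of equal-length item rows (one row per visit)."""
    k = len(rows[0])
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# One hypothetical visit with 12 indicator flags.
visit = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
print(pcq_score(visit))

# Toy data with 3 items and 4 visits, only to exercise the formula.
toy_rows = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]]
alpha = cronbach_alpha(toy_rows)
print(round(alpha, 2))
```

Because alpha depends only on the ratio of summed item variances to total-score variance, using population or sample variance gives the same result as long as the choice is consistent.
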
13.
Health Technol (Berl) ; 11(5): 1073-1082, 2021.
Article in English | MEDLINE | ID: mdl-34414063

ABSTRACT

The COVID-19 pandemic has presented many unique challenges to patient care, especially in emergency medicine. These challenges result in an altered patient experience. Patient experience refers to the cumulative impression made on patients during their medical visit and is measured by a standardized survey tool. Patient experience is considered a key measure of quality of care. The volume of survey data received makes it difficult to spot trends and concerns in patient comments. Topic modeling and sentiment analysis are well documented analytic techniques that can be used to gain insight into patient experience and make sense of vast quantities of data. This study examined three periods of time, pre, during and post-COVID-19 first wave in order to identify key trends in sentiment and topics related to patient experience. Previously collected, anonymized Press Ganey (PG) survey data was used from three northeastern emergency departments that make up an academic emergency department. Data was collected for three contiguous time periods: Pre-COVID-19 (12/10/2019-3/10/2020), During COVID-19 (3/11/2020-6/10/2020), and Post-first wave COVID-19 (6/11/2020-9/10/2020). Preprocessing of the data was carried out, then a sentiment label (i.e., positive, negative, neutral, mixed) was assigned by the tool. These labels were used to assess the validity of Press Ganey labels. Next, a topic modeling approach from machine learning was used to analyze the contents of the patient comments and uncover concerns and perceptions of patient experiences. Themes that emerged from the analysis of patient comments included concerns over personal safety and exposure to the virus, exclusion of family from decision making and care and high levels of scrutiny over systems issues, care, and treatment protocols. Topic modeling showed shifting priorities and concerns throughout the three periods examined.
Prior to the pandemic, patient comments were largely positive and focused on technical expertise and perceptions of competence. New topics and concerns that patients reported relevant to the pandemic were identified during-COVID-19. Comments on systems issues regarding processes to limit viral spread and concerns over family/visitor restrictions were dominant. Although there was evidence of praise and appreciation of the efforts of staff there was also a high level of scrutiny of the processes encountered during the emergency visit. Sentiment analysis and topic modeling offer a unique method for organizing and analyzing the shifting concerns of patients and families. Suggestions of interventions are made to address these evolving concerns. The automation of analysis using artificial intelligence would allow for rapid and accurate analysis of patient feedback.

14.
Comput Biol Med ; 132: 104336, 2021 05.
Article in English | MEDLINE | ID: mdl-33761419

ABSTRACT

OBJECTIVE: We sought to understand spatial-temporal factors and socioeconomic disparities that shaped U.S. residents' response to COVID-19 as it emerged. METHODS: We mined coronavirus-related tweets from January 23rd to March 25th, 2020. We classified tweets by the socioeconomic status of the county from which they originated with the Area Deprivation Index (ADI). We applied topic modeling to identify and monitor topics of concern over time. We investigated how topics varied by ADI and between hotspots and non-hotspots. RESULTS: We identified 45 topics in 269,556 unique tweets. Topics shifted from early-outbreak-related content in January, to the presidential election and governmental response in February, to lifestyle impacts in March. High-resourced areas (low ADI) were concerned with stocks and social distancing, while under-resourced areas shared negative expression and discussion of the CARES Act relief package. These differences were consistent within hotspots, with increased discussion regarding employment in high ADI hotspots. DISCUSSION: Topic modeling captures major concerns on Twitter in the early months of COVID-19. Our study extends previous Twitter-based research as it assesses how topics differ based on a marker of socioeconomic status. Comparisons between low and high-resourced areas indicate more focus on personal economic hardship in less-resourced communities and less focus on general public health messaging. CONCLUSION: Real-time social media analysis of community-based pandemic responses can uncover differential conversations correlating to local impact and income, education, and housing disparities. In future public health crises, such insights can inform messaging campaigns, which should partly focus on the interests of those most disproportionately impacted.


Subject(s)
COVID-19 , Social Media , Humans , Pandemics , SARS-CoV-2 , Socioeconomic Factors
15.
Comput Biol Med ; 129: 104132, 2021 02.
Article in English | MEDLINE | ID: mdl-33290931

ABSTRACT

BACKGROUND: Opioid misuse (OM) is a major health problem in the United States, and can lead to addiction and fatal overdose. We sought to employ natural language processing (NLP) and machine learning to categorize Twitter chatter based on the motive of OM. MATERIALS AND METHODS: We collected data from Twitter using opioid-related keywords, and manually annotated 6988 tweets into three classes (No-OM, Pain-related-OM, and Recreational-OM), with the No-OM class representing tweets indicating no use/misuse, and the Pain-related misuse and Recreational-misuse classes representing misuse for pain or recreation/addiction. We trained and evaluated multi-class classifiers, and performed term-level k-means clustering to assess whether there were terms closely associated with the three classes. RESULTS: On a held-out test set of 1677 tweets, a transformer-based classifier (XLNet) achieved the best performance with F1-score of 0.71 for the Pain-misuse class, and 0.79 for the Recreational-misuse class. Macro- and micro-averaged F1-scores over all classes were 0.82 and 0.92, respectively. Content-analysis using clustering revealed distinct clusters of terms associated with each class. DISCUSSION: While some past studies have attempted to automatically detect opioid misuse, none have further characterized the motive for misuse. Our multi-class classification approach using XLNet showed promising performance, including in detecting the subtle differences between pain-related and recreation-related misuse. The distinct clustering of class-specific keywords may help conduct targeted data collection, overcoming under-representation of minority classes. CONCLUSION: Machine learning can help identify pain-related and recreational-related OM contents on Twitter to potentially enable the study of the characteristics of individuals exhibiting such behavior.


Subject(s)
Opioid-Related Disorders , Social Media , Analgesics, Opioid/adverse effects , Humans , Machine Learning , Natural Language Processing , United States
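
The abstract above reports both macro- and micro-averaged F1, which diverge when classes are imbalanced. The sketch below shows how the two averages are computed for single-label multi-class predictions. The labels and predictions are toy values invented for illustration, not the study's data, and `f1_scores` is a hypothetical helper name.

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (per-class F1, macro-averaged F1, micro-averaged F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    macro = sum(per_class.values()) / len(labels)  # unweighted mean over classes
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))  # pooled counts
    return per_class, macro, micro

# Toy, imbalanced example: 6 "none", 2 "pain", 2 "rec".
y_true = ["none"] * 6 + ["pain", "pain", "rec", "rec"]
y_pred = ["none"] * 6 + ["pain", "rec", "rec", "rec"]
per_class, macro, micro = f1_scores(y_true, y_pred)
print(per_class, round(macro, 3), round(micro, 3))
```

Micro-averaging pools counts across classes, so a large, easy majority class pulls it up, while macro-averaging weights each class equally. That is one plausible reading of why the reported micro-average (0.92) exceeds the macro-average (0.82).
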
16.
J Biomed Inform ; 111: 103601, 2020 11.
Article in English | MEDLINE | ID: mdl-33065264

ABSTRACT

OBJECTIVES: Using Twitter, we aim to (1) define and quantify the prevalence and evolution of facets of social distancing during the COVID-19 pandemic in the US in a spatiotemporal context and (2) examine amplified tweets among social distancing facets. MATERIALS AND METHODS: We analyzed English-language, US-based tweets containing "coronavirus" between January 23-March 24, 2020 using the Twitter API. Tweets containing keywords were grouped into six social distancing facets: implementation, purpose, social disruption, adaptation, positive emotions, and negative emotions. RESULTS: A total of 259,529 unique tweets were included in the analyses. Social distancing tweets became more prevalent from late January to March but were not geographically uniform. Early facets of social distancing appeared in Los Angeles, San Francisco, and Seattle: the first cities impacted by the COVID-19 outbreak. Tweets related to the "implementation" and "negative emotions" facets largely dominated in combination with topics of "social disruption" and "adaptation", albeit to a lesser degree. Social disruptiveness tweets were most retweeted, and implementation tweets were most favorited. DISCUSSION: Social distancing can be defined by facets that respond to and represent certain events in a pandemic, including travel restrictions and rising case counts. For example, Miami initially had a low volume of social distancing tweets, but the volume grew in March, corresponding with the rise of COVID-19 cases. CONCLUSION: The evolution of social distancing facets on Twitter reflects actual events and may signal potential disease hotspots. Our facets can also be used to understand public discourse on social distancing which may inform future public health measures.


Subject(s)
COVID-19/prevention & control , Pandemics , Social Media , COVID-19/epidemiology , COVID-19/virology , Humans , SARS-CoV-2/isolation & purification
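
The keyword-based grouping of tweets into facets described in the abstract above can be sketched as a simple dictionary lookup. The facet names come from the abstract; the keyword lists, the function name, and the sample tweet are hypothetical, since the study's actual lexicon is not given here.

```python
# Hypothetical keyword lists; only the facet names are taken from the abstract.
FACET_KEYWORDS = {
    "implementation": ["lockdown", "stay at home", "quarantine"],
    "social disruption": ["cancelled", "closed", "postponed"],
    "negative emotions": ["scared", "anxious", "angry"],
}

def tag_facets(tweet):
    """Return the sorted facets whose keywords appear in the tweet (case-insensitive)."""
    text = tweet.lower()
    return sorted(
        facet
        for facet, keywords in FACET_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )

# Invented tweet text, for illustration only.
tweet = "School is CLOSED and we are in quarantine. Honestly a bit scared."
print(tag_facets(tweet))
```

Plain substring matching is deliberately naive; it will match keywords inside longer words, so a real pipeline would likely tokenize first.
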
17.
J Stroke Cerebrovasc Dis ; 29(12): 105306, 2020 Dec.
Article in English | MEDLINE | ID: mdl-33070110

ABSTRACT

INTRODUCTION: Nontraumatic intracranial hemorrhage (ICH) is a neurological emergency of research interest; however, unlike ischemic stroke, it has not been well studied in large datasets due to the lack of an established administrative claims-based definition. We aimed to evaluate both explicit diagnosis codes and machine learning methods to create a claims-based definition for this clinical phenotype. METHODS: We examined all patients admitted to our tertiary medical center with a primary or secondary International Classification of Diseases version 9 (ICD-9) or 10 (ICD-10) code for ICH in claims from any portion of the hospitalization in 2014-2015. As a gold standard, we defined the nontraumatic ICH phenotype based on manual chart review. We tested explicit definitions based on ICD-9 and ICD-10 that had been previously published in the literature as well as four machine learning classifiers including support vector machine (SVM), logistic regression with LASSO, random forest and xgboost. We report five standard measures of model performance for each approach. RESULTS: A total of 1830 patients with 2145 unique ICD-10 codes were included in the initial dataset, of which 437 (24%) were true positive based on manual review. The explicit ICD-10 definition performed best (Sensitivity = 0.89 (95% CI 0.85-0.92), Specificity = 0.83 (0.81-0.85), F-score = 0.73 (0.69-0.77)) and improves on an explicit ICD-9 definition (Sensitivity = 0.87 (0.83-0.90), Specificity = 0.77 (0.74-0.79), F-score = 0.67 (0.63-0.71)). Among machine learning classifiers, SVM performed best (Sensitivity = 0.78 (0.75-0.82), Specificity = 0.84 (0.81-0.87), AUC = 0.89 (0.87-0.92), F-score = 0.66 (0.62-0.69)). CONCLUSIONS: An explicit ICD-10 definition can be used to accurately identify patients with a nontraumatic ICH phenotype with substantially better performance than ICD-9.
An explicit ICD-10 based definition is easier to implement and quantitatively not appreciably improved with the additional application of machine learning classifiers. Future research utilizing large datasets should utilize this definition to address important research gaps.


Subject(s)
Administrative Claims, Healthcare , Data Mining , International Classification of Diseases , Intracranial Hemorrhages/diagnosis , Support Vector Machine , Aged , Aged, 80 and over , Female , Health Services Research , Humans , Intracranial Hemorrhages/classification , Male , Middle Aged , Phenotype , Predictive Value of Tests , Reproducibility of Results
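
As a quick arithmetic check on the figures in the abstract above, the F-score can be recovered from sensitivity, specificity, and prevalence. The sketch assumes that "F-score" means F1 and that the 437/1830 positive rate applies to the evaluation set; neither is confirmed by the abstract, and the function name is illustrative.

```python
def f_score_from_rates(sensitivity, specificity, prevalence):
    """F1 implied by sensitivity (recall), specificity, and the positive-class prevalence."""
    true_pos_rate = sensitivity * prevalence
    false_pos_rate = (1 - specificity) * (1 - prevalence)
    precision = true_pos_rate / (true_pos_rate + false_pos_rate)
    return 2 * precision * sensitivity / (precision + sensitivity)

prevalence = 437 / 1830  # true positives on manual review, as reported
f1_icd10 = f_score_from_rates(0.89, 0.83, prevalence)
print(round(f1_icd10, 2))
```

Under these assumptions the result rounds to 0.73, matching the F-score reported for the explicit ICD-10 definition.
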
18.
Chiropr Man Therap ; 28(1): 47, 2020 07 17.
Article in English | MEDLINE | ID: mdl-32680545

ABSTRACT

BACKGROUND: Chronic spinal pain conditions affect millions of US adults and carry a high healthcare cost burden, both direct and indirect. Conservative interventions for spinal pain conditions, including chiropractic care, have been associated with lower healthcare costs and improvements in pain status in different clinical populations, including veterans. Little is currently known about predicting healthcare service utilization in the domain of conservative interventions for spinal pain conditions, including the frequency of use of chiropractic services. The purpose of this retrospective cohort study was to explore the use of supervised machine learning approaches to predicting one-year chiropractic service utilization by veterans receiving VA chiropractic care. METHODS: We included 19,946 veterans who entered the Musculoskeletal Diagnosis Cohort between October 1, 2003 and September 30, 2013 and utilized VA chiropractic services within one year of cohort entry. The primary outcome was one-year chiropractic service utilization following index chiropractic visit, split into quartiles represented by the following classes: 1 visit, 2 to 3 visits, 4 to 6 visits, and 7 or greater visits. We compared the performance of four multiclass classification algorithms (gradient boosted classifier, stochastic gradient descent classifier, support vector classifier, and artificial neural network) in predicting visit quartile using 158 sociodemographic and clinical features. RESULTS: The selected algorithms demonstrated poor prediction capabilities. Subset accuracy was 42.1% for the gradient boosted classifier, 38.6% for the stochastic gradient descent classifier, 41.4% for the support vector classifier, and 40.3% for the artificial neural network. The micro-averaged area under the precision-recall curve for each one-versus-rest classifier was 0.43 for the gradient boosted classifier, 0.38 for the stochastic gradient descent classifier, 0.43 for the support vector classifier, and 0.42 for the artificial neural network. Performance of each model yielded only a small positive shift in prediction probability (approximately 15%) compared to naïve classification. CONCLUSIONS: Using supervised machine learning to predict chiropractic service utilization remains challenging, with only a small shift in predictive probability over naïve classification and limited clinical utility. Future work should examine mechanisms to improve model performance.


Subject(s)
Manipulation, Chiropractic/statistics & numerical data , Patient Acceptance of Health Care/statistics & numerical data , Supervised Machine Learning , Veterans Health , Adult , Algorithms , Female , Humans , Male , Manipulation, Chiropractic/methods , Middle Aged , Musculoskeletal Pain/therapy , Predictive Value of Tests , Retrospective Studies , United States
19.
Front Big Data ; 3: 19, 2020.
Article in English | MEDLINE | ID: mdl-33693393

ABSTRACT

Choosing an optimal data fusion technique is essential when performing machine learning with multimodal data. In this study, we examined deep learning-based multimodal fusion techniques for the combined classification of radiological images and associated text reports. In our analysis, we (1) compared the classification performance of three prototypical multimodal fusion techniques: Early, Late, and Model fusion; (2) assessed the performance of multimodal compared to unimodal learning; and finally (3) investigated the amount of labeled data needed by multimodal vs. unimodal models to yield comparable classification performance. Our experiments demonstrate the potential of multimodal fusion methods to yield competitive results using less training data (labeled data) than their unimodal counterparts. This was more pronounced using the Early and less so using the Model and Late fusion approaches. With increasing amounts of training data, unimodal models achieved comparable results to multimodal models. Overall, our results suggest the potential of multimodal learning to decrease the need for labeled training data, resulting in a lower annotation burden for domain experts.

20.
Article in English | MEDLINE | ID: mdl-31056516

ABSTRACT

While coronary microvascular dysfunction (CMD) is a major cause of ischemia, it is very challenging to diagnose due to the lack of CMD-specific screening measures. CMD has been identified as one of the five priority areas of investigation in a 2014 National Research Consensus Conference on Gender-Specific Research in Emergency Care. In this study, we utilized methods from machine learning that leverage structured data and unstructured narratives in clinical notes to detect patients with CMD. We have shown that structured data are not sufficient to detect CMD and that integrating unstructured data in the computational model boosts the performance significantly.


Subject(s)
Coronary Disease , Data Mining/methods , Machine Learning , Natural Language Processing , Coronary Disease/classification , Coronary Disease/diagnosis , Electronic Health Records , Female , Humans , Male , Microvessels/physiopathology