Search | VHL Regional Portal

1.

Detection of Personal and Family History of Suicidal Thoughts and Behaviors using Deep Learning and Natural Language Processing: A Multi-Site Study.

Adekkanattu, Prakash; Furmanchuk, Al'ona; Wu, Yonghui; Pathak, Aman; Patra, Braja Gopal; Bost, Sarah; Morrow, Destinee; Wang, Grace Hsin-Min; Yang, Yuyang; Forrest, Noah James; Luo, Yuan; Walunas, Theresa L; Jenny, Wei-Hsuan Lo-Ciganic; Gelad, Walid; Bian, Jiang; Bao, Yuhua; Weiner, Mark; Oslin, David; Pathak, Jyotishman.

Res Sq ; 2024 Mar 11.

Article in English | MEDLINE | ID: mdl-38559051

ABSTRACT

Objective: Personal and family history of suicidal thoughts and behaviors (PSH and FSH, respectively) are significant risk factors associated with future suicide events. These are often captured in narrative clinical notes in electronic health records (EHRs). Collaboratively, Weill Cornell Medicine (WCM), Northwestern Medicine (NM), and the University of Florida (UF) developed and validated deep learning (DL)-based natural language processing (NLP) tools to detect PSH and FSH from such notes. The tools' performance was further benchmarked against a method relying exclusively on ICD-9/10 diagnosis codes. Materials and Methods: We developed DL-based NLP tools utilizing pre-trained transformer models Bio_ClinicalBERT and GatorTron, and compared them with expert-informed, rule-based methods. The tools were initially developed and validated using manually annotated clinical notes at WCM. Their portability and performance were further evaluated using clinical notes at NM and UF. Results: The DL tools outperformed the rule-based NLP tool in identifying PSH and FSH. For detecting PSH, the rule-based system obtained an F1-score of 0.75 ± 0.07, while the Bio_ClinicalBERT and GatorTron DL tools scored 0.83 ± 0.09 and 0.84 ± 0.07, respectively. For detecting FSH, the rule-based NLP tool's F1-score was 0.69 ± 0.11, compared to 0.89 ± 0.10 for Bio_ClinicalBERT and 0.92 ± 0.07 for GatorTron. For the gold standard corpora across the three sites, only 2.2% (WCM), 9.3% (NM), and 7.8% (UF) of patients were reported to have an ICD-9/10 diagnosis code for suicidal thoughts and behaviors prior to the clinical notes report date. The best performing GatorTron DL tool identified 93.0% (WCM), 80.4% (NM), and 89.0% (UF) of patients with documented PSH, and 85.0% (WCM), 89.5% (NM), and 100% (UF) of patients with documented FSH in their notes.
Discussion: While PSH and FSH are significant risk factors for future suicide events, little effort has been made previously to identify individuals with such histories. To address this, we developed a transformer-based DL method and compared it with a conventional rule-based NLP approach. The varying effectiveness of the rule-based tools across sites suggests a need for improvement in their dictionary-based approach. In contrast, the performances of the DL tools were higher and comparable across sites. Furthermore, the DL tools were fine-tuned using only a small number of annotated notes at each site, underscoring their greater adaptability to local documentation practices and lexical variations. Conclusion: Variations in local documentation practices across health care systems pose challenges to rule-based NLP tools. In contrast, the developed DL tools can effectively extract PSH and FSH information from unstructured clinical notes. These tools will provide clinicians with crucial information for assessing and treating patients at elevated risk for suicide who have rarely been diagnosed.

2.

Identifying social determinants of health from clinical narratives: A study of performance, documentation ratio, and potential bias.

Yu, Zehao; Peng, Cheng; Yang, Xi; Dang, Chong; Adekkanattu, Prakash; Gopal Patra, Braja; Peng, Yifan; Pathak, Jyotishman; Wilson, Debbie L; Chang, Ching-Yuan; Lo-Ciganic, Wei-Hsuan; George, Thomas J; Hogan, William R; Guo, Yi; Bian, Jiang; Wu, Yonghui.

J Biomed Inform ; 153: 104642, 2024 May.

Article in English | MEDLINE | ID: mdl-38621641

ABSTRACT

OBJECTIVE: To develop a natural language processing (NLP) package to extract social determinants of health (SDoH) from clinical narratives, examine the bias among race and gender groups, test the generalizability of extracting SDoH for different disease groups, and examine population-level extraction ratio. METHODS: We developed SDoH corpora using clinical notes identified at the University of Florida (UF) Health. We systematically compared 7 transformer-based large language models (LLMs) and developed an open-source package - SODA (i.e., SOcial DeterminAnts) to facilitate SDoH extraction from clinical narratives. We examined the performance and potential bias of SODA for different race and gender groups, tested the generalizability of SODA using two disease domains including cancer and opioid use, and explored strategies for improvement. We applied SODA to extract 19 categories of SDoH from the breast (n = 7,971), lung (n = 11,804), and colorectal cancer (n = 6,240) cohorts to assess patient-level extraction ratio and examine the differences among race and gender groups. RESULTS: We developed an SDoH corpus using 629 clinical notes of cancer patients with annotations of 13,193 SDoH concepts/attributes from 19 categories of SDoH, and another cross-disease validation corpus using 200 notes from opioid use patients with 4,342 SDoH concepts/attributes. We compared 7 transformer models and the GatorTron model achieved the best mean average strict/lenient F1 scores of 0.9122 and 0.9367 for SDoH concept extraction and 0.9584 and 0.9593 for linking attributes to SDoH concepts. There is a small performance gap (~4%) between Males and Females, but a large performance gap (>16 %) among race groups. The performance dropped when we applied the cancer SDoH model to the opioid cohort; fine-tuning using a smaller opioid SDoH corpus improved the performance.
The extraction ratio varied in the three cancer cohorts, in which 10 SDoH could be extracted from over 70 % of cancer patients, but 9 SDoH could be extracted from less than 70 % of cancer patients. Individuals from the White and Black groups have a higher extraction ratio than other minority race groups. CONCLUSIONS: Our SODA package achieved good performance in extracting 19 categories of SDoH from clinical narratives. The SODA package with pre-trained transformer models is available at https://github.com/uf-hobi-informatics-lab/SODA_Docker.

Subject(s)

Narration , Natural Language Processing , Social Determinants of Health , Humans , Female , Male , Bias , Electronic Health Records , Documentation/methods , Data Mining/methods

3.

Decoding Suicide Decedent Profiles and Signs of Suicidal Intent Using Latent Class Analysis.

Xiao, Yunyu; Bi, Kaiwen; Yip, Paul Siu-Fai; Cerel, Julie; Brown, Timothy T; Peng, Yifan; Pathak, Jyotishman; Mann, J John.

JAMA Psychiatry ; 81(6): 595-605, 2024 Jun 01.

Article in English | MEDLINE | ID: mdl-38506817

ABSTRACT

Importance: Suicide rates in the US increased by 35.6% from 2001 to 2021. Given that most individuals die on their first attempt, earlier detection and intervention are crucial. Understanding modifiable risk factors is key to effective prevention strategies. Objective: To identify distinct suicide profiles or classes, associated signs of suicidal intent, and patterns of modifiable risks for targeted prevention efforts. Design, Setting, and Participants: This cross-sectional study used data from the 2003-2020 National Violent Death Reporting System Restricted Access Database for 306 800 suicide decedents. Statistical analysis was performed from July 2022 to June 2023. Exposures: Suicide decedent profiles were determined using latent class analyses of available data on suicide circumstances, toxicology, and methods. Main Outcomes and Measures: Disclosure of recent intent, suicide note presence, and known psychotropic usage. Results: Among 306 800 suicide decedents (mean [SD] age, 46.3 [18.4] years; 239 627 males [78.1%] and 67 108 females [21.9%]), 5 profiles or classes were identified. The largest class, class 4 (97 175 [31.7%]), predominantly faced physical health challenges, followed by polysubstance problems in class 5 (58 803 [19.2%]), and crisis, alcohol-related, and intimate partner problems in class 3 (55 367 [18.0%]), mental health problems (class 2, 53 928 [17.6%]), and comorbid mental health and substance use disorders (class 1, 41 527 [13.5%]). Class 4 had the lowest rates of disclosing suicidal intent (13 952 [14.4%]) and leaving a suicide note (24 351 [25.1%]). Adjusting for covariates, compared with class 1, class 4 had the highest odds of not disclosing suicide intent (odds ratio [OR], 2.58; 95% CI, 2.51-2.66) and not leaving a suicide note (OR, 1.45; 95% CI, 1.41-1.49). Class 4 also had the lowest rates of all known psychiatric illnesses and psychotropic medications among all suicide profiles.
Class 4 had more older adults (23 794 were aged 55-70 years [24.5%]; 20 100 aged ≥71 years [20.7%]), veterans (22 220 [22.9%]), widows (8633 [8.9%]), individuals with less than high school education (15 690 [16.1%]), and rural residents (23 966 [24.7%]). Conclusions and Relevance: This study identified 5 distinct suicide profiles, highlighting a need for tailored prevention strategies. Improving the detection and treatment of coexisting mental health conditions, substance and alcohol use disorders, and physical illnesses is paramount. The implementation of means restriction strategies plays a vital role in reducing suicide risks across most of the profiles, reinforcing the need for a multifaceted approach to suicide prevention.

Subject(s)

Latent Class Analysis , Humans , Male , Female , Middle Aged , Cross-Sectional Studies , Adult , United States/epidemiology , Suicidal Ideation , Aged , Suicide, Attempted/statistics & numerical data , Suicide, Attempted/psychology , Young Adult , Suicide, Completed/statistics & numerical data , Suicide, Completed/psychology , Risk Factors , Suicide/statistics & numerical data , Suicide/psychology , Adolescent , Substance-Related Disorders/epidemiology , Substance-Related Disorders/psychology

4.

Preparing for the bedside-optimizing a postpartum depression risk prediction model for clinical implementation in a health system.

Liu, Yifan; Joly, Rochelle; Reading Turchioe, Meghan; Benda, Natalie; Hermann, Alison; Beecy, Ashley; Pathak, Jyotishman; Zhang, Yiye.

J Am Med Inform Assoc ; 31(6): 1258-1267, 2024 May 20.

Article in English | MEDLINE | ID: mdl-38531676

ABSTRACT

OBJECTIVE: We developed and externally validated a machine-learning model to predict postpartum depression (PPD) using data from electronic health records (EHRs). Effort is under way to implement the PPD prediction model within the EHR system for clinical decision support. We describe the pre-implementation evaluation process that considered model performance, fairness, and clinical appropriateness. MATERIALS AND METHODS: We used EHR data from an academic medical center (AMC) and a clinical research network database from 2014 to 2020 to evaluate the predictive performance and net benefit of the PPD risk model. We used area under the curve and sensitivity as predictive performance and conducted a decision curve analysis. In assessing model fairness, we employed metrics such as disparate impact, equal opportunity, and predictive parity with the White race being the privileged value. The model was also reviewed by multidisciplinary experts for clinical appropriateness. Lastly, we debiased the model by comparing 5 different debiasing approaches of fairness through blindness and reweighing. RESULTS: We determined the classification threshold through a performance evaluation that prioritized sensitivity and decision curve analysis. The baseline PPD model exhibited some unfairness in the AMC data but had a fair performance in the clinical research network data. We revised the model by fairness through blindness, a debiasing approach that yielded the best overall performance and fairness, while considering clinical appropriateness suggested by the expert reviewers. DISCUSSION AND CONCLUSION: The findings emphasize the need for a thorough evaluation of intervention-specific models, considering predictive performance, fairness, and appropriateness before clinical implementation.

Subject(s)

Depression, Postpartum , Electronic Health Records , Machine Learning , Humans , Female , Risk Assessment/methods , Decision Support Systems, Clinical
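
Illustrative sketch (not from the article): the abstract of entry 4 names disparate impact, equal opportunity, and predictive parity as its fairness metrics but does not give formulas. The Python below uses common textbook definitions, which are an assumption here: disparate impact as the ratio of positive-prediction rates (unprivileged over privileged), equal opportunity as the difference in true-positive rates, and predictive parity as the difference in positive predictive values. The function name `fairness_metrics` and its arguments are hypothetical.

```python
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return num / den if den else float("nan")


def fairness_metrics(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    privileged: Sequence[bool],
) -> Dict[str, float]:
    """Group fairness metrics for binary labels and predictions.

    Assumed (textbook) definitions, not taken from the article:
      disparate impact        = selection rate (unprivileged) / selection rate (privileged)
      equal opportunity diff. = TPR (unprivileged) - TPR (privileged)
      predictive parity diff. = PPV (unprivileged) - PPV (privileged)
    """
    counts = {
        flag: {"n": 0, "pred_pos": 0, "actual_pos": 0, "true_pos": 0}
        for flag in (True, False)
    }
    for yt, yp, is_priv in zip(y_true, y_pred, privileged):
        group = counts[bool(is_priv)]
        group["n"] += 1
        group["pred_pos"] += int(yp == 1)
        group["actual_pos"] += int(yt == 1)
        group["true_pos"] += int(yt == 1 and yp == 1)

    selection = {k: _ratio(v["pred_pos"], v["n"]) for k, v in counts.items()}
    tpr = {k: _ratio(v["true_pos"], v["actual_pos"]) for k, v in counts.items()}
    ppv = {k: _ratio(v["true_pos"], v["pred_pos"]) for k, v in counts.items()}

    return {
        "disparate_impact": _ratio(selection[False], selection[True]),
        "equal_opportunity_difference": tpr[False] - tpr[True],
        "predictive_parity_difference": ppv[False] - ppv[True],
    }
```

Under these definitions, a disparate impact near 1 and differences near 0 indicate similar treatment of the two groups; the thresholds the authors applied are not stated in the abstract.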

5.

Characterizing atrial fibrillation symptom improvement following de novo catheter ablation.

Reading Turchioe, Meghan; Volodarskiy, Alexander; Guo, Winston; Taylor, Brittany; Hobensack, Mollie; Pathak, Jyotishman; Slotwiner, David.

Eur J Cardiovasc Nurs ; 23(3): 241-250, 2024 Apr 12.

Article in English | MEDLINE | ID: mdl-37479225

ABSTRACT

AIMS: Atrial fibrillation (AF) symptom relief is a primary indication for catheter ablation, but AF symptom resolution is not well characterized. The study objective was to describe AF symptom documentation in electronic health records (EHRs) pre- and post-ablation and identify correlates of post-ablation symptoms. METHODS AND RESULTS: We conducted a retrospective cohort study using EHRs of patients with AF (n = 1293) undergoing ablation in a large, urban health system from 2010 to 2020. We extracted symptom data from clinical notes using a natural language processing algorithm (F score: 0.81). We used Cochran's Q tests with post-hoc McNemar's tests to determine differences in symptom prevalence pre- and post-ablation. We used logistic regression models to estimate the adjusted odds of symptom resolution by personal or clinical characteristics at 6 and 12 months post-ablation. In fully adjusted models, at 12 months post-ablation, patients with heart failure had significantly lower odds of dyspnoea resolution [odds ratio (OR) 0.38, 95% confidence interval (CI) 0.25-0.57], oedema resolution (OR 0.37, 95% CI 0.25-0.56), and fatigue resolution (OR 0.54, 95% CI 0.34-0.85), but higher odds of palpitations resolution (OR 1.90, 95% CI 1.25-2.89) compared with those without heart failure. Age 65 and older, female sex, Black or African American race, smoking history, and antiarrhythmic use were also associated with lower odds of resolution of specific symptoms at 6 and 12 months. CONCLUSION: The post-ablation symptom patterns are heterogeneous. Findings warrant confirmation with larger, more representative data sets, which may be informative for patients whose primary goal for undergoing an ablation is symptom relief.

Subject(s)

Atrial Fibrillation , Catheter Ablation , Heart Failure , Humans , Female , Aged , Atrial Fibrillation/diagnosis , Retrospective Studies , Anti-Arrhythmia Agents/therapeutic use , Heart Failure/complications , Treatment Outcome

6.

Visualizing machine learning-based predictions of postpartum depression risk for lay audiences.

Desai, Pooja M; Harkins, Sarah; Rahman, Saanjaana; Kumar, Shiveen; Hermann, Alison; Joly, Rochelle; Zhang, Yiye; Pathak, Jyotishman; Kim, Jessica; D'Angelo, Deborah; Benda, Natalie C; Reading Turchioe, Meghan.

J Am Med Inform Assoc ; 31(2): 289-297, 2024 Jan 18.

Article in English | MEDLINE | ID: mdl-37847667

ABSTRACT

OBJECTIVES: To determine if different formats for conveying machine learning (ML)-derived postpartum depression risks impact patient classification of recommended actions (primary outcome) and intention to seek care, perceived risk, trust, and preferences (secondary outcomes). MATERIALS AND METHODS: We recruited English-speaking females of childbearing age (18-45 years) using an online survey platform. We created 2 exposure variables (presentation format and risk severity), each with 4 levels, manipulated within-subject. Presentation formats consisted of text only, numeric only, gradient number line, and segmented number line. For each format viewed, participants answered questions regarding each outcome. RESULTS: Five hundred four participants (mean age 31 years) completed the survey. For the risk classification question, performance was high (93%) with no significant differences between presentation formats. There were main effects of risk level (all P < .001) such that participants perceived higher risk, were more likely to agree to treatment, and were more trusting in their obstetrics team as the risk level increased, but we found inconsistencies in which presentation format corresponded to the highest perceived risk, trust, or behavioral intention. The gradient number line was the most preferred format (43%). DISCUSSION AND CONCLUSION: All formats resulted in high accuracy related to the classification outcome (primary), but there were nuanced differences in risk perceptions, behavioral intentions, and trust. Investigators should choose health data visualizations based on the primary goal they want lay audiences to accomplish with the ML risk score.

Subject(s)

Depression, Postpartum , Female , Humans , Adult , Adolescent , Young Adult , Middle Aged , Depression, Postpartum/diagnosis , Risk Factors , Surveys and Questionnaires , Data Visualization

7.

Association of opioid or other substance use disorders with health care use among patients with suicidal symptoms.

Vekaria, Veer; Patra, Braja G; Xi, Wenna; Murphy, Sean M; Avery, Jonathan; Olfson, Mark; Pathak, Jyotishman.

J Subst Use Addict Treat ; 156: 209177, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37820869

ABSTRACT

INTRODUCTION: Prior literature establishes noteworthy relationships between suicidal symptoms and substance use disorders (SUDs), particularly opioid use disorder (OUD). However, engagement with health care services among this vulnerable population remains underinvestigated. This study sought to examine patterns of health care use, identify risk factors in seeking treatment, and assess associations between outpatient service use and emergency department (ED) visits. METHODS: Using electronic health records (EHRs) derived from five health systems across New York City, the study selected 7881 adults with suicidal symptoms (including suicidal ideation, suicide attempt, or self-harm) and SUDs between 2010 and 2019. To examine the association between SUDs (including OUD) and all-cause service use (outpatient, inpatient, and ED), we performed quasi-Poisson regressions adjusted for age, gender, and chronic disease burden, and we estimated the relative risks (RR) of associated factors. Next, the study evaluated cause-specific utilization within each resource category (SUD-related, suicide-related, and other-psychiatric) and compared them using Mann-Whitney U tests. Finally, we used adjusted quasi-Poisson regression models to analyze the association between outpatient and ED utilization among different risk groups. RESULTS: Among patients with suicidal symptoms and SUD diagnoses, relative to other SUDs, a diagnosis of OUD was associated with higher all-cause outpatient visits (RR: 1.22), ED visits (RR: 1.54), and inpatient hospitalizations (RR: 1.67) (ps < 0.001). Men had a lower risk of having outpatient visits (RR: 0.80) and inpatient hospitalizations (RR: 0.90), and older age protected against ED visits (RR range: 0.59-0.69) (ps < 0.001). OUD was associated with increased SUD-related encounters across all settings, and increased suicide-related ED visits and inpatient hospitalizations (p < 0.001). 
Individuals with more mental health outpatient visits were less likely to have suicide-related ED visits (RR: 0.86, p < 0.01); however, this association was not found among younger and male patients with OUD. Although few OUD patients received medications for OUD (MOUD) treatment (9.9 %), methadone composed the majority of MOUD prescriptions (77.7 %), of which over 70 % were prescribed during an ED encounter. CONCLUSIONS: This study reinforces the importance of tailoring SUD and suicide risk interventions to different age groups and types of SUDs, and highlights missed opportunities for deploying screening and prevention resources among the male and OUD populations. Redressing underutilization of MOUD remains a priority to reduce acute health outcomes among younger patients with OUD.

Subject(s)

Analgesics, Opioid , Opioid-Related Disorders , Adult , Humans , Male , Analgesics, Opioid/adverse effects , Suicidal Ideation , Suicide, Attempted/prevention & control , Opioid-Related Disorders/epidemiology , Delivery of Health Care

8.

Using Machine Learning to Predict Antidepressant Treatment Outcome From Electronic Health Records.

Xu, Zhenxing; Vekaria, Veer; Wang, Fei; Cukor, Judith; Su, Chang; Adekkanattu, Prakash; Brandt, Pascal; Jiang, Guoqian; Kiefer, Richard C; Luo, Yuan; Rasmussen, Luke V; Xu, Jie; Xiao, Yunyu; Alexopoulos, George; Pathak, Jyotishman.

Psychiatr Res Clin Pract ; 5(4): 118-125, 2023.

Article in English | MEDLINE | ID: mdl-38077277

ABSTRACT

Objective: To evaluate if a machine learning approach can accurately predict antidepressant treatment outcome using electronic health records (EHRs) from patients with depression. Method: This study examined 808 patients with depression at a New York City-based outpatient mental health clinic between June 13, 2016 and June 22, 2020. Antidepressant treatment outcome was defined based on trend in depression symptom severity over time and was categorized as either "Recovering" or "Worsening" (i.e., non-Recovering), measured by the slope of individual-level Patient Health Questionnaire-9 (PHQ-9) score trajectory spanning 6 months following treatment initiation. A patient was designated as "Recovering" if the slope was less than 0 and as "Worsening" if the slope was no less than 0. Multiple machine learning (ML) models including L2 norm regularized Logistic Regression, Naive Bayes, Random Forest, and Gradient Boosting Decision Tree (GBDT) were used to predict treatment outcome based on additional data from EHRs, including demographics and diagnoses. Shapley Additive Explanations were applied to identify the most important predictors. Results: The GBDT achieved the best results of predicting "Recovering" (AUC: 0.7654 ± 0.0227; precision: 0.6002 ± 0.0215; recall: 0.5131 ± 0.0336). When patients with low PHQ-9 scores (<10) at baseline were excluded, the results for predicting "Recovering" were AUC: 0.7254 ± 0.0218; precision: 0.5392 ± 0.0437; recall: 0.4431 ± 0.0513. Prior diagnosis of anxiety, psychotherapy, recurrent depression, and baseline depression symptom severity were strong predictors. Conclusions: The results demonstrate the potential utility of using ML in longitudinal EHRs to predict antidepressant treatment outcome. Our predictive tool holds the promise to accelerate personalized medical management in patients with psychiatric illnesses.
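
Illustrative sketch (not from the article): the abstract of entry 8 labels a patient "Recovering" when the slope of the PHQ-9 trajectory is below 0 and "Worsening" otherwise. The abstract does not say how the slope was estimated; the Python below assumes an ordinary least-squares fit of score against days since treatment initiation. The function names `phq9_slope` and `label_outcome` are hypothetical.

```python
from typing import Sequence


def phq9_slope(days: Sequence[float], scores: Sequence[float]) -> float:
    """Least-squares slope of PHQ-9 score versus time (assumed estimator)."""
    n = len(days)
    if n != len(scores) or n < 2:
        raise ValueError("need at least two (day, score) pairs of equal length")
    mean_t = sum(days) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    if sxx == 0:
        raise ValueError("all measurements share the same time point")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(days, scores))
    return sxy / sxx


def label_outcome(days: Sequence[float], scores: Sequence[float]) -> str:
    """Apply the abstract's rule: slope < 0 is Recovering, otherwise Worsening."""
    return "Recovering" if phq9_slope(days, scores) < 0 else "Worsening"
```

Note that under the stated rule a perfectly flat trajectory (slope exactly 0) falls into the "Worsening" class, since that class is defined as non-Recovering.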

9.

The genetic contribution to the comorbidity of depression and anxiety: a multi-site electronic health records study of almost 178 000 people.

Coombes, Brandon J; Landi, Isotta; Choi, Karmel W; Singh, Kritika; Fennessy, Brian; Jenkins, Greg D; Batzler, Anthony; Pendegraft, Richard; Nunez, Nicolas A; Gao, Y Nina; Ryu, Euijung; Wickramaratne, Priya; Weissman, Myrna M; Pathak, Jyotishman; Mann, J John; Smoller, Jordan W; Davis, Lea K; Olfson, Mark; Charney, Alexander W; Biernacka, Joanna M.

Psychol Med ; 53(15): 7368-7374, 2023 Nov.

Article in English | MEDLINE | ID: mdl-38078748

ABSTRACT

BACKGROUND: Depression and anxiety are common and highly comorbid, and their comorbidity is associated with poorer outcomes posing clinical and public health concerns. We evaluated the polygenic contribution to comorbid depression and anxiety, and to each in isolation. METHODS: Diagnostic codes were extracted from electronic health records for four biobanks [N = 177 865 including 138 632 European (77.9%), 25 612 African (14.4%), and 13 621 Hispanic (7.7%) ancestry participants]. The outcome was a four-level variable representing the depression/anxiety diagnosis group: neither, depression-only, anxiety-only, and comorbid. Multinomial regression was used to test for association of depression and anxiety polygenic risk scores (PRSs) with the outcome while adjusting for principal components of ancestry. RESULTS: In total, 132 960 patients had neither diagnosis (74.8%), 16 092 depression-only (9.0%), 13 098 anxiety-only (7.4%), and 16 584 comorbid (9.3%). In the European meta-analysis across biobanks, both PRSs were higher in each diagnosis group compared to controls. Notably, depression-PRS (OR 1.20 per s.d. increase in PRS; 95% CI 1.18-1.23) and anxiety-PRS (OR 1.07; 95% CI 1.05-1.09) had the largest effect when the comorbid group was compared with controls. Furthermore, the depression-PRS was significantly higher in the comorbid group than the depression-only group (OR 1.09; 95% CI 1.06-1.12) and the anxiety-only group (OR 1.15; 95% CI 1.11-1.19) and was significantly higher in the depression-only group than the anxiety-only group (OR 1.06; 95% CI 1.02-1.09), showing a genetic risk gradient across the conditions and the comorbidity. CONCLUSIONS: This study suggests that depression and anxiety have partially independent genetic liabilities and the genetic vulnerabilities to depression and anxiety make distinct contributions to comorbid depression and anxiety.

Subject(s)

Depression , Electronic Health Records , Humans , Anxiety/epidemiology , Anxiety/genetics , Anxiety Disorders/epidemiology , Anxiety Disorders/genetics , Comorbidity , Depression/epidemiology , Depression/genetics , Multifactorial Inheritance , Risk Factors

10.

Automated classification of lay health articles using natural language processing: a case study on pregnancy health and postpartum depression.

Patra, Braja Gopal; Sun, Zhaoyi; Cheng, Zilin; Kumar, Praneet Kasi Reddy Jagadeesh; Altammami, Abdullah; Liu, Yiyang; Joly, Rochelle; Jedlicka, Caroline; Delgado, Diana; Pathak, Jyotishman; Peng, Yifan; Zhang, Yiye.

Front Psychiatry ; 14: 1258887, 2023.

Article in English | MEDLINE | ID: mdl-38053538

ABSTRACT

Objective: Evidence suggests that high-quality health education and effective communication within the framework of social support hold significant potential in preventing postpartum depression. Yet, developing trustworthy and engaging health education and communication materials requires extensive expertise and substantial resources. In light of this, we propose an innovative approach that involves leveraging natural language processing (NLP) to classify publicly accessible lay articles based on their relevance and subject matter to pregnancy and mental health. Materials and methods: We manually reviewed online lay articles from credible and medically validated sources to create a gold standard corpus. This manual review process categorized the articles based on their pertinence to pregnancy and related subtopics. To streamline and expand the classification procedure for relevance and topics, we employed advanced NLP models such as Random Forest, Bidirectional Encoder Representations from Transformers (BERT), and Generative Pre-trained Transformer model (gpt-3.5-turbo). Results: The gold standard corpus included 392 pregnancy-related articles. Our manual review process categorized the reading materials according to lifestyle factors associated with postpartum depression: diet, exercise, mental health, and health literacy. A BERT-based model performed best (F1 = 0.974) in an end-to-end classification of relevance and topics. In a two-step approach, given articles already classified as pregnancy-related, gpt-3.5-turbo performed best (F1 = 0.972) in classifying the above topics. Discussion: Utilizing NLP, we can guide patients to high-quality lay reading materials as cost-effective, readily available health education and communication sources. This approach allows us to scale the information delivery specifically to individuals, enhancing the relevance and impact of the materials provided.

11.

Effects of Geography on Risk for Future Suicidal Ideation and Attempts Among Children and Youth.

Xi, Wenna; Banerjee, Samprit; Zima, Bonnie T; Alexopoulos, George S; Olfson, Mark; Xiao, Yunyu; Pathak, Jyotishman.

JAACAP Open ; 1(3): 206-217, 2023 Nov.

Article in English | MEDLINE | ID: mdl-37946932

ABSTRACT

Objective: Geography may influence the relationships of predictors for suicidal ideation (SI) and suicide attempts (SA) in children and youth. Method: This is a nationwide retrospective cohort study of 124,424 individuals less than 25 years of age using commercial claims data (2011-2015) from the Health Care Cost Institute. Outcomes were time to SI or SA within 3 months after the indexed mental health or substance use disorder (MH/SUD) outpatient visit. Predictors included sociodemographic and clinical characteristics up to 3 years before the index event. Results: At each follow-up time period, rates of SI and SA varied by the US geographic division (p < .001), and the Mountain Division consistently had the highest rates for both SI and SA (5.44%-10.26% for SI; 0.70%-2.82% for SA). Having MH emergency department (ED) visits in the past year increased the risk of SI by 28% to 65% for individuals residing in the New England, Mid-Atlantic, East North Central, West North Central, and East South Central Divisions. The main effects of geographic divisions were significant for SA (p<0.001). Risk of SA was lower in New England, Mid-Atlantic, South Atlantic, and Pacific (hazard ratios = 0.57, 0.51, 0.67, and 0.79, respectively) and higher in the Mountain Division (hazard ratio = 1.46). Conclusion: To understand the underlying mechanisms driving the high prevalence of SI and SA in the Mountain Division and the elevated risk of SI after having MH ED visits, future research examining regional differences in risks for SI and SA should include indicators of access to MH ED care and other social determinants of health.

12.

Patterns of Social Determinants of Health and Child Mental Health, Cognition, and Physical Health.

Xiao, Yunyu; Mann, J John; Chow, Julian Chun-Chung; Brown, Timothy T; Snowden, Lonnie R; Yip, Paul Siu-Fai; Tsai, Alexander C; Hou, Yu; Pathak, Jyotishman; Wang, Fei; Su, Chang.

JAMA Pediatr ; 177(12): 1294-1305, 2023 Dec 01.

Article in English | MEDLINE | ID: mdl-37843837

ABSTRACT

Importance: Social determinants of health (SDOH) influence child health. However, most previous studies have used individual, small-set, or cherry-picked SDOH variables without examining unbiased computed SDOH patterns from high-dimensional SDOH factors to investigate associations with child mental health, cognition, and physical health. Objective: To identify SDOH patterns and estimate their associations with children's mental, cognitive, and physical developmental outcomes. Design, Setting, and Participants: This population-based cohort study included children aged 9 to 10 years at baseline and their caregivers enrolled in the Adolescent Brain Cognitive Development (ABCD) Study between 2016 and 2021. The ABCD Study includes 21 sites across 17 states. Exposures: Eighty-four neighborhood-level, geocoded variables spanning 7 domains of SDOH, including bias, education, physical and health infrastructure, natural environment, socioeconomic status, social context, and crime and drugs, were studied. Hierarchical agglomerative clustering was used to identify SDOH patterns. Main Outcomes and Measures: Associations of SDOH and child mental health (internalizing and externalizing behaviors) and suicidal behaviors, cognitive function (performance, reading skills), and physical health (body mass index, exercise, sleep disorder) were estimated using mixed-effects linear and logistic regression models. Results: Among 10 504 children (baseline median [SD] age, 9.9 [0.6] years; 5510 boys [52.5%] and 4994 girls [47.5%]; 229 Asian [2.2%], 1468 Black [14.0%], 2128 Hispanic [20.3%], 5565 White [53.0%], and 1108 multiracial [10.5%]), 4 SDOH patterns were identified: pattern 1, affluence (4078 children [38.8%]); pattern 2, high-stigma environment (2661 children [25.3%]); pattern 3, high socioeconomic deprivation (2653 children [25.3%]); and pattern 4, high crime and drug sales, low education, and high population density (1112 children [10.6%]). 
The SDOH patterns were distinctly associated with child health outcomes. Children exposed to socioeconomic deprivation (SDOH pattern 3) showed the worst health profiles, manifesting more internalizing (β = 0.75; 95% CI, 0.14-1.37) and externalizing (β = 1.43; 95% CI, 0.83-2.02) mental health problems, lower cognitive performance, and adverse physical health. Conclusions: This study shows that an unbiased quantitative analysis of multidimensional SDOH can permit the determination of how SDOH patterns are associated with child developmental outcomes. Children exposed to socioeconomic deprivation showed the worst outcomes relative to other SDOH categories. These findings suggest the need to determine whether improvement in socioeconomic conditions can enhance child developmental outcomes.

Subject(s)

Mental Health , Social Determinants of Health , Male , Female , Adolescent , Humans , Child , Cohort Studies , Child Development , Cognition

13.

Longitudinal Trajectories of Symptom Change During Antidepressant Treatment Among Managed Care Patients with Co-Occurring Depression and Anxiety.

Cukor, Judith; Xu, Zhenxing; Vekaria, Veer; Wang, Fei; Olfson, Mark; Banerjee, Samprit; Simon, Gregory; Alexopoulos, George; Pathak, Jyotishman.

medRxiv ; 2023 Sep 26.

Article in English | MEDLINE | ID: mdl-37808868

ABSTRACT

Depression and anxiety are highly correlated, yet little is known about the course of each condition when presenting concurrently. This study aimed to identify longitudinal patterns and changes in depression and anxiety symptoms during antidepressant treatment, and evaluate clinical factors associated with each response pattern. Self-report Patient Health Questionnaire-9 (PHQ-9) and Generalized Anxiety Disorder-7 (GAD-7) scores were used to track the courses of depression and anxiety respectively over a three-month window, and group-based trajectory modeling was used to derive subgroups of patients who have similar response patterns. Multinomial regression was used to associate various clinical variables with trajectory subgroup membership. Of the 577 included adults, 373 (64.6%) were women, and the mean age was 39.3 (SD: 12.9) years. Six depression and six anxiety trajectory subgroups were computationally derived; three depression subgroups demonstrated symptom improvement, and three exhibited nonresponse. Similar patterns were observed in the six anxiety subgroups. Factors associated with treatment nonresponse included higher pretreatment depression and anxiety severity and poorer sleep quality, while better overall health and younger age were associated with higher rates of remission. Synchronous and asynchronous paths to improvement were also observed between depression and anxiety. High baseline depression or anxiety severity alone may be an insufficient predictor of treatment nonresponse. These findings have the potential to motivate clinical strategies aimed at treating depression and anxiety simultaneously.

14.

Women's perspectives on the use of artificial intelligence (AI)-based technologies in mental healthcare.

Reading Turchioe, Meghan; Harkins, Sarah; Desai, Pooja; Kumar, Shiveen; Kim, Jessica; Hermann, Alison; Joly, Rochelle; Zhang, Yiye; Pathak, Jyotishman; Benda, Natalie C.

JAMIA Open ; 6(3): ooad048, 2023 Oct.

Article in English | MEDLINE | ID: mdl-37425486

ABSTRACT

This study aimed to evaluate women's attitudes towards artificial intelligence (AI)-based technologies used in mental health care. We conducted a cross-sectional, online survey of U.S. adults reporting female sex at birth focused on bioethical considerations for AI-based technologies in mental healthcare, stratifying by previous pregnancy. Survey respondents (n = 258) were open to AI-based technologies in mental healthcare but concerned about medical harm and inappropriate data sharing. They held clinicians, developers, healthcare systems, and the government responsible for harm. Most reported it was "very important" for them to understand AI output. More previously pregnant respondents than those not previously pregnant reported that being told AI played a small role in mental healthcare was "very important" (P = .03). We conclude that protections against harm, transparency around data use, preservation of the patient-clinician relationship, and patient comprehension of AI predictions may facilitate trust in AI-based technologies for mental healthcare among women.

15.

AD-BERT: Using pre-trained language model to predict the progression from mild cognitive impairment to Alzheimer's disease.

Mao, Chengsheng; Xu, Jie; Rasmussen, Luke; Li, Yikuan; Adekkanattu, Prakash; Pacheco, Jennifer; Bonakdarpour, Borna; Vassar, Robert; Shen, Li; Jiang, Guoqian; Wang, Fei; Pathak, Jyotishman; Luo, Yuan.

J Biomed Inform ; 144: 104442, 2023 08.

Article in English | MEDLINE | ID: mdl-37429512

ABSTRACT

OBJECTIVE: We develop a deep learning framework based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model using unstructured clinical notes from electronic health records (EHRs) to predict the risk of disease progression from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD). METHODS: We identified 3657 patients diagnosed with MCI together with their progress notes from Northwestern Medicine Enterprise Data Warehouse (NMEDW) between 2000 and 2020. The progress notes no later than the first MCI diagnosis were used for the prediction. We first preprocessed the notes by deidentification, cleaning and splitting into sections, and then pre-trained a BERT model for AD (named AD-BERT) based on the publicly available Bio+Clinical BERT on the preprocessed notes. All sections of a patient were embedded into a vector representation by AD-BERT and then combined by global MaxPooling and a fully connected network to compute the probability of MCI-to-AD progression. For validation, we conducted a similar set of experiments on 2563 MCI patients identified at Weill Cornell Medicine (WCM) during the same timeframe. RESULTS: Compared with the 7 baseline models, the AD-BERT model achieved the best performance on both datasets, with Area Under receiver operating characteristic Curve (AUC) of 0.849 and F1 score of 0.440 on NMEDW dataset, and AUC of 0.883 and F1 score of 0.680 on WCM dataset. CONCLUSION: The use of EHRs for AD-related research is promising, and AD-BERT shows superior predictive performance in modeling MCI-to-AD progression prediction. Our study demonstrates the utility of pre-trained language models and clinical notes in predicting MCI-to-AD progression, which could have important implications for improving early detection and intervention for AD.

Subject(s)

Alzheimer Disease , Cognitive Dysfunction , Humans , Alzheimer Disease/diagnosis , Cognitive Dysfunction/diagnosis , Disease Progression

16.

Risk stratification models for predicting preventable hospitalization in commercially insured late middle-aged adults with depression.

Evans, Lauren; Wu, Yiyuan; Xi, Wenna; Ghosh, Arnab K; Kim, Min-Hyung; Alexopoulos, George S; Pathak, Jyotishman; Banerjee, Samprit.

BMC Health Serv Res ; 23(1): 621, 2023 Jun 13.

Article in English | MEDLINE | ID: mdl-37312121

ABSTRACT

BACKGROUND: A significant number of late middle-aged adults with depression have a high illness burden resulting from chronic conditions which put them at high risk of hospitalization. Many late middle-aged adults are covered by commercial health insurance, but such insurance claims have not been used to identify the risk of hospitalization in individuals with depression. In the present study, we developed and validated a non-proprietary model to identify late middle-aged adults with depression at risk for hospitalization, using machine learning methods. METHODS: This retrospective cohort study involved 71,682 commercially insured older adults aged 55-64 years diagnosed with depression. National health insurance claims were used to capture demographics, health care utilization, and health status during the base year. Health status was captured using 70 chronic health conditions, and 46 mental health conditions. The outcomes were 1- and 2-year preventable hospitalization. For each of our two outcomes, we evaluated seven modelling approaches: four prediction models utilized logistic regression with different combinations of predictors to evaluate the relative contribution of each group of variables, and three prediction models utilized machine learning approaches - logistic regression with LASSO penalty, random forests (RF), and gradient boosting machine (GBM). RESULTS: Our predictive model for 1-year hospitalization achieved an AUC of 0.803, with a sensitivity of 72% and a specificity of 76% under the optimum threshold of 0.463, and our predictive model for 2-year hospitalization achieved an AUC of 0.793, with a sensitivity of 76% and a specificity of 71% under the optimum threshold of 0.452. For predicting both 1-year and 2-year risk of preventable hospitalization, our best performing models utilized the machine learning approach of logistic regression with LASSO penalty which outperformed more black-box machine learning models like RF and GBM. 
CONCLUSIONS: Our study demonstrates the feasibility of identifying depressed middle-aged adults at higher risk of future hospitalization due to burden of chronic illnesses using basic demographic information and diagnosis codes recorded in health insurance claims. Identifying this population may assist health care planners in developing effective screening strategies and management approaches and in efficient allocation of public healthcare resources as this population transitions to publicly funded healthcare programs, e.g., Medicare in the US.

Subject(s)

Depression , Medicare , United States/epidemiology , Middle Aged , Humans , Aged , Depression/diagnosis , Depression/epidemiology , Retrospective Studies , Hospitalization , Risk Assessment

17.

Comparing the effects of four common drug classes on the progression of mild cognitive impairment to dementia using electronic health records.

Xu, Jie; Wang, Fei; Zang, Chengxi; Zhang, Hao; Niotis, Kellyann; Liberman, Ava L; Stonnington, Cynthia M; Ishii, Makoto; Adekkanattu, Prakash; Luo, Yuan; Mao, Chengsheng; Rasmussen, Luke V; Xu, Zhenxing; Brandt, Pascal; Pacheco, Jennifer A; Peng, Yifan; Jiang, Guoqian; Isaacson, Richard; Pathak, Jyotishman.

Sci Rep ; 13(1): 8102, 2023 05 19.

Article in English | MEDLINE | ID: mdl-37208478

ABSTRACT

The objective of this study was to investigate the potential association between the use of four frequently prescribed drug classes, namely antihypertensive drugs, statins, selective serotonin reuptake inhibitors, and proton-pump inhibitors, and the likelihood of disease progression from mild cognitive impairment (MCI) to dementia using electronic health records (EHRs). We conducted a retrospective cohort study using observational EHRs from a cohort of approximately 2 million patients seen at a large, multi-specialty urban academic medical center in New York City, USA between 2008 and 2020 to automatically emulate randomized controlled trials. For each drug class, two exposure groups were identified based on the prescription orders documented in the EHRs following their MCI diagnosis. During follow-up, we measured drug efficacy based on the incidence of dementia and estimated the average treatment effect (ATE) of various drugs. To ensure the robustness of our findings, we confirmed the ATE estimates via bootstrapping and presented associated 95% confidence intervals (CIs). Our analysis identified 14,269 MCI patients, among whom 2501 (17.5%) progressed to dementia. Using average treatment effect estimation and bootstrapping confirmation, we observed that drugs including rosuvastatin (ATE = -0.0140 [-0.0191, -0.0088], p value < 0.001), citalopram (ATE = -0.1128 [-0.125, -0.1005], p value < 0.001), escitalopram (ATE = -0.0560 [-0.0615, -0.0506], p value < 0.001), and omeprazole (ATE = -0.0201 [-0.0299, -0.0103], p value < 0.001) have a statistically significant association with slower progression from MCI to dementia. The findings from this study suggest that these commonly prescribed drugs may alter the progression from MCI to dementia and warrant further investigation.

Subject(s)

Alzheimer Disease , Cognitive Dysfunction , Humans , Alzheimer Disease/diagnosis , Retrospective Studies , Electronic Health Records , Disease Progression , Cognitive Dysfunction/drug therapy , Cognitive Dysfunction/epidemiology , Cognitive Dysfunction/diagnosis , Randomized Controlled Trials as Topic

18.

A multisite comparison using electronic health records and natural language processing to identify the association between suicidality and hospital readmission amongst patients with eating disorders.

Cliffe, Charlotte; Cusick, Marika; Vellupillai, Sumithra; Shear, Matthew; Downs, Johnny; Epstein, Sophie; Pathak, Jyotishman; Dutta, Rina.

Int J Eat Disord ; 56(8): 1581-1592, 2023 08.

Article in English | MEDLINE | ID: mdl-37194359

ABSTRACT

OBJECTIVES: To describe and compare the association between suicidality and subsequent readmission for patients hospitalized for eating disorder treatment, within 2 years of discharge, at two large academic medical centers in two different countries. METHODS: Over an 8-year study window from January 2009 to March 2017, we identified all inpatient eating disorder admissions at Weill Cornell Medicine, New York, USA (WCM) and South London and Maudsley Foundation NHS Trust, London, UK (SLaM). To establish each patient's suicidality profile, we applied two natural language processing (NLP) algorithms, independently developed at the two institutions, and detected suicidality in clinical notes documented in the first week of admission. We calculated the odds ratios (OR) for any subsequent readmission within 2 years postdischarge and determined whether this was to another eating disorder unit, other psychiatric unit, a general medical hospital admission or emergency room attendance. RESULTS: We identified 1126 and 420 eating disorder inpatient admissions at WCM and SLaM, respectively. In the WCM cohort, evidence of above average suicidality during the first week of admission was significantly associated with an increased risk of noneating disorder-related psychiatric readmission (OR 3.48, 95% CI = 2.03-5.99, p < .001). A similar pattern was not observed in the SLaM cohort (OR 1.34, 95% CI = 0.75-2.37, p = .32), where there was no significant increase in risk of admission. In both cohorts, personality disorder increased the risk of any psychiatric readmission within 2 years. DISCUSSION: Patterns of increased risk of psychiatric readmission from above average suicidality detected via NLP during inpatient eating disorder admissions differed in our two patient cohorts. However, comorbid diagnoses such as personality disorder increased the risk of any psychiatric readmission across both cohorts. 
PUBLIC SIGNIFICANCE: Suicidality in eating disorders is an extremely common presentation, and it is important that we further our understanding of how to identify those most at risk. This research also provides a novel study design, comparing two NLP algorithms on electronic health record data from eating disorder inpatients in the United States and United Kingdom. Studies researching both UK and US mental health patients are sparse; therefore, this study provides novel data.

Subject(s)

Feeding and Eating Disorders , Suicide , Humans , Patient Readmission , Electronic Health Records , Natural Language Processing , Aftercare , Patient Discharge

19.

Multi-ancestry genome- and phenome-wide association studies of diverticular disease in electronic health records with natural language processing enriched phenotyping algorithm.

Joo, Yoonjung Yoonie; Pacheco, Jennifer A; Thompson, William K; Rasmussen-Torvik, Laura J; Rasmussen, Luke V; Lin, Frederick T J; Andrade, Mariza de; Borthwick, Kenneth M; Bottinger, Erwin; Cagan, Andrew; Carrell, David S; Denny, Joshua C; Ellis, Stephen B; Gottesman, Omri; Linneman, James G; Pathak, Jyotishman; Peissig, Peggy L; Shang, Ning; Tromp, Gerard; Veerappan, Annapoorani; Smith, Maureen E; Chisholm, Rex L; Gawron, Andrew J; Hayes, M Geoffrey; Kho, Abel N.

PLoS One ; 18(5): e0283553, 2023.

Article in English | MEDLINE | ID: mdl-37196047

ABSTRACT

OBJECTIVE: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique. MATERIALS AND METHODS: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs. We performed genome-wide association studies (GWAS) of DD in European, African and multi-ancestry participants, followed by phenome-wide association studies (PheWAS) of the risk variants to identify their potential comorbid/pleiotropic effects in clinical phenotypes. RESULTS: Our developed algorithm showed a significant improvement in patient classification performance for DD analysis (algorithm PPVs ≥ 0.94), with up to a 3.5-fold increase in the number of identified patients compared with the traditional method. Ancestry-stratified analyses of diverticulosis and diverticulitis of the identified subjects replicated the well-established associations between ARHGAP15 loci and DD, showing overall intensified GWAS signals in diverticulitis patients compared to diverticulosis patients. Our PheWAS analyses identified significant associations between the DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes. DISCUSSION: As the first multi-ancestry GWAS-PheWAS study, we showcased that heterogeneous EHR data can be mapped through an integrative analytical pipeline and reveal significant genotype-phenotype associations with clinical interpretation. 
CONCLUSION: A systematic framework to process unstructured EHR data with NLP could advance a deep and scalable phenotyping for better patient identification and facilitate etiological investigation of a disease with multilayered data.

Subject(s)

Diverticular Diseases , Diverticulitis , Diverticulum , Humans , Electronic Health Records , Genome-Wide Association Study/methods , Natural Language Processing , Phenotype , Algorithms , Polymorphism, Single Nucleotide

20.

An NLP approach to identify SDoH-related circumstance and suicide crisis from death investigation narratives.

Wang, Song; Dang, Yifang; Sun, Zhaoyi; Ding, Ying; Pathak, Jyotishman; Tao, Cui; Xiao, Yunyu; Peng, Yifan.

J Am Med Inform Assoc ; 30(8): 1408-1417, 2023 07 19.

Article in English | MEDLINE | ID: mdl-37040620

ABSTRACT

OBJECTIVES: Suicide presents a major public health challenge worldwide, affecting people across the lifespan. While previous studies revealed strong associations between Social Determinants of Health (SDoH) and suicide deaths, existing evidence is limited by the reliance on structured data. To resolve this, we aim to adapt a suicide-specific SDoH ontology (Suicide-SDoHO) and use natural language processing (NLP) to effectively identify individual-level SDoH-related social risks from death investigation narratives. MATERIALS AND METHODS: We used the latest National Violent Death Reporting System (NVDRS), which contains data on 267 804 suicide victims from 2003 to 2019. After adapting the Suicide-SDoHO, we developed a transformer-based model to identify SDoH-related circumstances and crises in death investigation narratives. We applied our model retrospectively to annotate narratives whose crisis variables were not coded in NVDRS. The crisis rates were calculated as the percentage of the group's total suicide population with the crisis present. RESULTS: The Suicide-SDoHO contains 57 fine-grained circumstances in a hierarchical structure. Our classifier achieves AUCs of 0.966 and 0.942 for classifying circumstances and crises, respectively. Through the crisis trend analysis, we observed that not everyone is equally affected by SDoH-related social risks. For the economic stability crisis, our result showed a significant increase in crisis rate in 2007-2009, parallel with the Great Recession. CONCLUSIONS: This is the first study curating a Suicide-SDoHO using death investigation narratives. We showcased that our model can effectively classify SDoH-related social risks through NLP approaches. We hope our study will facilitate the understanding of suicide crises and inform effective prevention strategies.

Subject(s)

Homicide , Suicide , Humans , Natural Language Processing , Retrospective Studies , Social Determinants of Health , Cause of Death , Violence , Population Surveillance
