Results 1 - 10 of 10
1.
JAMA Netw Open ; 7(5): e248895, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38713466

ABSTRACT

Importance: The introduction of large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4; OpenAI), has generated significant interest in health care, yet studies evaluating their performance in a clinical setting are lacking. Determination of clinical acuity, a measure of a patient's illness severity and level of required medical attention, is one of the foundational elements of medical reasoning in emergency medicine. Objective: To determine whether an LLM can accurately assess clinical acuity in the emergency department (ED). Design, Setting, and Participants: This cross-sectional study identified all adult ED visits from January 1, 2012, to January 17, 2023, at the University of California, San Francisco, with a documented Emergency Severity Index (ESI) acuity level (immediate, emergent, urgent, less urgent, or nonurgent) and with a corresponding ED physician note. A sample of 10 000 pairs of ED visits with nonequivalent ESI scores, balanced for each of the 10 possible pairs of 5 ESI scores, was selected at random. Exposure: The potential of the LLM to classify acuity levels of patients in the ED based on the ESI across 10 000 patient pairs. Using deidentified clinical text, the LLM was queried to identify the patient with a higher-acuity presentation within each pair based on the patients' clinical history. An earlier LLM was queried to allow comparison with this model. Main Outcomes and Measures: Accuracy score was calculated to evaluate the performance of both LLMs across the 10 000-pair sample. A 500-pair subsample was manually classified by a physician reviewer to compare performance between the LLMs and human classification. Results: From a total of 251 401 adult ED visits, a balanced sample of 10 000 patient pairs was created wherein each pair comprised patients with disparate ESI acuity scores. 
Across this sample, the LLM correctly inferred the patient with higher acuity for 8940 of 10 000 pairs (accuracy, 0.89 [95% CI, 0.89-0.90]). Performance of the comparator LLM (accuracy, 0.84 [95% CI, 0.83-0.84]) was below that of its successor. Among the 500-pair subsample that was also manually classified, LLM performance (accuracy, 0.88 [95% CI, 0.86-0.91]) was comparable with that of the physician reviewer (accuracy, 0.86 [95% CI, 0.83-0.89]). Conclusions and Relevance: In this cross-sectional study of 10 000 pairs of ED visits, the LLM accurately identified the patient with higher acuity when given pairs of presenting histories extracted from patients' first ED documentation. These findings suggest that the integration of an LLM into ED workflows could enhance triage processes while maintaining triage quality and warrants further investigation.


Subject(s)
Emergency Service, Hospital , Patient Acuity , Humans , Emergency Service, Hospital/statistics & numerical data , Cross-Sectional Studies , Adult , Male , Female , Middle Aged , Severity of Illness Index , San Francisco
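The headline accuracy in the abstract above (8940 of 10 000 pairs) can be checked with a few lines of arithmetic. This is a minimal sketch that assumes a normal-approximation (Wald) interval; the abstract does not say which interval method the authors used, and the function name `accuracy_with_ci` is hypothetical.

```python
import math


def accuracy_with_ci(correct: int, total: int, z: float = 1.96):
    """Return (accuracy, lower, upper) using a normal-approximation interval.

    Assumption: a Wald interval; the abstract does not state the method used.
    """
    p = correct / total
    se = math.sqrt(p * (1 - p) / total)
    return p, p - z * se, p + z * se


# Five ESI levels give C(5, 2) unordered pairs of nonequivalent scores.
n_score_pairs = math.comb(5, 2)

# Counts taken from the abstract: 8940 correct out of 10 000 pairs.
acc, lower, upper = accuracy_with_ci(8940, 10_000)
print(f"score pairs: {n_score_pairs}")
print(f"accuracy {acc:.2f} [95% CI, {lower:.2f}-{upper:.2f}]")
```

Under that assumption the rounded values agree with the reported accuracy of 0.89 (95% CI, 0.89-0.90).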
2.
medRxiv ; 2024 Apr 04.
Article in English | MEDLINE | ID: mdl-38633805

ABSTRACT

Importance: Large language models (LLMs) possess a range of capabilities which may be applied to the clinical domain, including text summarization. As ambient artificial intelligence scribes and other LLM-based tools begin to be deployed within healthcare settings, rigorous evaluations of the accuracy of these technologies are urgently needed. Objective: To investigate the performance of GPT-4 and GPT-3.5-turbo in generating Emergency Department (ED) discharge summaries and evaluate the prevalence and type of errors across each section of the discharge summary. Design: Cross-sectional study. Setting: University of California, San Francisco ED. Participants: We identified all adult ED visits from 2012 to 2023 with an ED clinician note. We randomly selected a sample of 100 ED visits for GPT-summarization. Exposure: We investigate the potential of two state-of-the-art LLMs, GPT-4 and GPT-3.5-turbo, to summarize the full ED clinician note into a discharge summary. Main Outcomes and Measures: GPT-3.5-turbo and GPT-4-generated discharge summaries were evaluated by two independent Emergency Medicine physician reviewers across three evaluation criteria: 1) Inaccuracy of GPT-summarized information; 2) Hallucination of information; 3) Omission of relevant clinical information. On identifying each error, reviewers were additionally asked to provide a brief explanation for their reasoning, which was manually classified into subgroups of errors. Results: From 202,059 eligible ED visits, we randomly sampled 100 for GPT-generated summarization and then expert-driven evaluation. In total, 33% of summaries generated by GPT-4 and 10% of those generated by GPT-3.5-turbo were entirely error-free across all evaluated domains. Summaries generated by GPT-4 were mostly accurate, with inaccuracies found in only 10% of cases; however, 42% of the summaries exhibited hallucinations and 47% omitted clinically relevant information.
Inaccuracies and hallucinations were most commonly found in the Plan sections of GPT-generated summaries, while clinical omissions were concentrated in text describing patients' Physical Examination findings or History of Presenting Complaint. Conclusions and Relevance: In this cross-sectional study of 100 ED encounters, we found that LLMs could generate accurate discharge summaries, but were liable to hallucination and omission of clinically relevant information. A comprehensive understanding of the location and type of errors found in GPT-generated clinical text is important to facilitate clinician review of such content and prevent patient harm.

3.
JAMIA Open ; 7(1): ooad112, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38223407

ABSTRACT

Objective: Existing research on social determinants of health (SDoH) predominantly focuses on physician notes and structured data within electronic medical records. This study posits that social work notes are an untapped, potentially rich source for SDoH information. We hypothesize that clinical notes recorded by social workers, whose role is to ameliorate social and economic factors, might provide a complementary information source of data on SDoH compared to physician notes, which primarily concentrate on medical diagnoses and treatments. We aimed to use word frequency analysis and topic modeling to identify prevalent terms and robust topics of discussion within a large cohort of social work notes including both outpatient and in-patient consultations. Materials and methods: We retrieved a diverse, deidentified corpus of 0.95 million clinical social work notes from 181 644 patients at the University of California, San Francisco. We conducted word frequency analysis related to ICD-10 chapters to identify prevalent terms within the notes. We then applied Latent Dirichlet Allocation (LDA) topic modeling analysis to characterize this corpus and identify potential topics of discussion, which was further stratified by note types and disease groups. Results: Word frequency analysis primarily identified medical-related terms associated with specific ICD10 chapters, though it also detected some subtle SDoH terms. In contrast, the LDA topic modeling analysis extracted 11 topics explicitly related to social determinants of health risk factors, such as financial status, abuse history, social support, risk of death, and mental health. The topic modeling approach effectively demonstrated variations between different types of social work notes and across patients with different types of diseases or conditions. 
Discussion: Our findings highlight LDA topic modeling's effectiveness in extracting SDoH-related themes and capturing variations in social work notes, demonstrating its potential for informing targeted interventions for at-risk populations. Conclusion: Social work notes offer a wealth of unique and valuable information on an individual's SDoH. These notes present consistent and meaningful topics of discussion that can be effectively analyzed and utilized to improve patient care and inform targeted interventions for at-risk populations.

4.
Clin Otolaryngol ; 48(3): 442-450, 2023 05.
Article in English | MEDLINE | ID: mdl-36645237

ABSTRACT

OBJECTIVE: There is a paucity of research examining patient experiences of cochlear implants. We sought to use natural language processing methods to explore patient experiences and concerns in the online cochlear implant (CI) community. MATERIALS AND METHODS: Cross-sectional study of posts on the online Reddit r/CochlearImplants forum from 1 March 2015 to 11 November 2021. Natural language processing using the BERTopic automated topic modelling technique was employed to cluster posts into semantically similar topics. Topic categorisation was manually validated by two independent reviewers and Cohen's kappa calculated to determine inter-rater reliability between machine vs human and human vs human categorisation. RESULTS: We retrieved 987 posts from 588 unique Reddit users on the r/CochlearImplants forum. Posts were initially categorised by BERTopic into 16 different Topics, which were increased to 23 Topics following manual inspection. The most popular topics related to CI connectivity (n = 112), adults considering getting a CI (n = 107), surgery-related posts (n = 89) and day-to-day living with a CI (n = 85). Cohen's kappa among all posts was 0.62 (machine vs. human) and 0.72 (human vs. human), and among categorised posts was 0.85 (machine vs. human) and 0.84 (human vs. human). CONCLUSIONS: This cross-sectional study of social media discussions among the online cochlear implant community identified common attitudes, experiences and concerns of patients living with, or seeking, a cochlear implant. Our validation of natural language processing methods to categorise topics shows that automated analysis of similar Otolaryngology-related content is a viable and accurate alternative to manual qualitative approaches.


Subject(s)
Cochlear Implantation , Cochlear Implants , Adult , Humans , Cross-Sectional Studies , Reproducibility of Results , Patient Outcome Assessment
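The inter-rater agreement statistic reported in the abstract above, Cohen's kappa, compares observed agreement with the agreement expected by chance. The sketch below is a plain-Python illustration, not the authors' code; the function name `cohens_kappa` and the four example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical topic labels for four posts (illustration only).
machine = ["surgery", "surgery", "connectivity", "connectivity"]
human = ["surgery", "connectivity", "connectivity", "connectivity"]
kappa = cohens_kappa(machine, human)
print(f"kappa = {kappa:.2f}")
```

In this toy case the raters agree on three of four posts (0.75 observed) against 0.50 expected by chance, which gives a kappa of 0.50.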
5.
Laryngoscope ; 133(3): 476-484, 2023 03.
Article in English | MEDLINE | ID: mdl-35567387

ABSTRACT

OBJECTIVES: Salivary duct carcinoma (SDC) is a rare, aggressive malignancy with a poor prognosis. These tumors frequently stain positive for HER2/ErbB2, but data on the prognostic significance of HER2 status in SDC are mixed. We sought to determine whether HER2 status affects survival outcomes in SDC. METHODS: PubMed, Embase, and Web of Science databases were searched from inception to October 2020. Eligibility was restricted to studies reporting HER2/ErbB2 overexpression in histologically confirmed de novo SDC or SDC ex pleomorphic adenoma, with corresponding overall (OS) and disease-free (DFS) survival measures. Separate multivariable and univariable meta-analyses were performed using random-effects models. Statistical heterogeneity was estimated by Cochran's Q and I² tests. Funnel plots were generated and Egger's test was used to assess for publication bias. The risk of bias was assessed with the Newcastle-Ottawa Scale. RESULTS: Of 183 unique citations, 14 studies of 663 patients were included. Most included studies determined HER2 status according to ASCO/CAP guidelines. The univariable meta-analysis did not reveal an association between HER2 status and OS (HR 1.09, 95% CI 0.84-1.42). In the multivariable analysis, HER2 positivity was associated with a HR of 1.49 for OS (95% CI 0.96-2.30). Fewer studies reported data for DFS than OS, with no relationship between HER2 status and DFS found on multivariable or univariable meta-analyses. CONCLUSION: In patients with salivary duct carcinoma, HER2 positivity was not found to be associated with worse overall survival. This information may be useful when counseling patients and considering treatment options. Laryngoscope, 133:476-484, 2023.


Subject(s)
Adenocarcinoma , Adenoma, Pleomorphic , Salivary Gland Neoplasms , Humans , Salivary Ducts/pathology , Salivary Gland Neoplasms/pathology , Prognosis , Adenoma, Pleomorphic/pathology , Adenocarcinoma/pathology
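The conclusion in the abstract above rests on both pooled confidence intervals including a hazard ratio of 1. The sketch below makes that reasoning explicit by back-calculating an approximate two-sided p-value from each reported interval. It assumes the intervals are symmetric on the log scale (a Wald interval), which the abstract does not state, so the resulting p-values are rough approximations rather than the study's own figures; the function name is hypothetical.

```python
import math


def approx_p_from_hr_ci(hr: float, lower: float, upper: float, z: float = 1.96):
    """Approximate a two-sided p-value from a hazard ratio and its 95% CI.

    Assumption: the CI is symmetric on the log scale, which may not hold
    exactly for the pooled estimates reported in the abstract.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    z_stat = math.log(hr) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    includes_null = lower <= 1.0 <= upper
    return p_value, includes_null


# Estimates as reported in the abstract (overall survival).
p_uni, null_uni = approx_p_from_hr_ci(1.09, 0.84, 1.42)
p_multi, null_multi = approx_p_from_hr_ci(1.49, 0.96, 2.30)
print(f"univariable:   p ~ {p_uni:.2f}, CI includes 1: {null_uni}")
print(f"multivariable: p ~ {p_multi:.2f}, CI includes 1: {null_multi}")
```

Both intervals include 1, so neither estimate reaches the conventional 0.05 threshold, although the multivariable estimate comes much closer to it than the univariable one.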
6.
Prev Med Rep ; 23: 101493, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34367886

ABSTRACT

There has been conflicting public messaging from government and state officials about recommended health behaviours during the COVID-19 pandemic. We examined whether differences in political affiliation influence the public's interest in infection prevention measures in the United States. State-specific data on public search interest in four key infection prevention measures (Quarantine, Social distancing, Hand washing and Masks) were obtained from Google Trends for the period 1 January 2020 to 12 December 2020. Political affiliation was ascertained based on the 2020 U.S. Presidential election results and 2017 Cook Partisan Voting Index. Spearman's rank, partial correlation, and multiple regression analyses were conducted to compare political partisanship with public interest in infection prevention measures and overall case rate per 100 000 population. Statistical analysis was performed in R version 4.0.3. The COVID-19 pandemic has led to significantly increased public interest in infection prevention measures. The greater the support for the Democratic Party, the greater the search interest in all four measures analysed. Political partisanship was most highly correlated with searches relating to Quarantine (ρ = 0.79, p < 0.001), followed by Social distancing (ρ = 0.71, p < 0.001), Hand washing (ρ = 0.69, p < 0.001), and Masks (ρ = 0.66, p < 0.001). These findings were robust to using two different partisanship measures, controlling for state-level demographic variables, different pandemic onset dates, and using exact rather than Topic search methods. This partisan divide among the American people has important health implications that must be better addressed. We call for clear, bipartisan support of simple public health advice to combat the continued SARS-CoV-2 spread across the USA.
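Spearman's rank correlation, the statistic reported as ρ in the abstract above, is the Pearson correlation computed on ranks. The study's analysis was run in R; the plain-Python sketch below only illustrates the calculation. The function names and the five state-level values are hypothetical and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five states (illustration only).
democratic_vote_share = [35.0, 42.5, 49.0, 55.5, 63.0]
search_interest = [40, 38, 55, 70, 90]
rho = spearman_rho(democratic_vote_share, search_interest)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, any monotonic relationship yields a ρ of 1 or -1 regardless of how nonlinear the raw values are.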

7.
J Virol ; 95(10)2021 04 26.
Article in English | MEDLINE | ID: mdl-33658340

ABSTRACT

HIV-1 infection persists in humans despite expression of antiviral type 1 interferons (IFN). Even exogenous administration of IFNα only marginally reduces HIV-1 abundance, raising the hypothesis that people living with HIV-1 (PLWH) are refractory to type 1 IFN. We demonstrated type 1 IFN refractoriness in CD4+ and CD8+ T cells isolated from HIV-1 infected persons by detecting diminished STAT1 phosphorylation (pSTAT1) and interferon-stimulated gene (ISG) induction upon type 1 IFN stimulation compared to healthy controls. Importantly, HIV-1 infected people who were virologically suppressed with antiretrovirals also showed type 1 IFN refractoriness. We found that USP18 levels were elevated in people with refractory pSTAT1 and ISG induction and confirmed this finding ex vivo in CD4+ T cells from another cohort of HIV-HCV coinfected persons who received exogenous pegylated interferon-α2b in a clinical trial. We used a cell culture model to recapitulate type 1 IFN refractoriness in uninfected CD4+ T cells that were conditioned with media from HIV-1 inoculated PBMCs, inhibiting de novo infection with antiretroviral agents. In this model, RNA interference against USP18 partly restored type 1 IFN responses in CD4+ T cells. We found evidence of type 1 IFN refractoriness in PLWH irrespective of virologic suppression that was associated with upregulated USP18, a process that might be therapeutically targeted to improve endogenous control of infection. Importance: People living with HIV-1 (PLWH) have elevated constitutive expression of type 1 interferons (IFN). However, it is unclear whether this impacts downstream innate immune responses. We identified refractory responses to type 1 IFN stimulation in T cells from PLWH, independent of antiretroviral treatment. Type 1 IFN refractoriness was linked to elevated USP18 levels in the same cells. Moreover, we found that USP18 levels predicted the anti-HIV-1 effect of type 1 IFN-based therapy on PLWH.
In vitro, we demonstrated that refractory type 1 IFN responses were transferrable to HIV-1 uninfected target CD4+ T cells, and this phenomenon was mediated by type 1 IFN from HIV-1 infected cells. Type 1 IFN responses were partially restored by USP18 knockdown. Our findings illuminate a new mechanism by which HIV-1 contributes to innate immune dysfunction in PLWH, through the continuous production of type 1 IFN that induces a refractory state of responsiveness.

8.
PLoS One ; 16(2): e0247139, 2021.
Article in English | MEDLINE | ID: mdl-33596273

ABSTRACT

BACKGROUND: A significant proportion of the worldwide population is at risk of social isolation and loneliness as a result of the COVID-19 pandemic. We aimed to identify effective interventions to reduce social isolation and loneliness that are compatible with COVID-19 shielding and social distancing measures. METHODS AND FINDINGS: In this rapid systematic review, we searched six electronic databases (Medline, Embase, Web of Science, PsycINFO, Cochrane Database of Systematic Reviews and SCOPUS) from inception to April 2020 for systematic reviews appraising interventions for loneliness and/or social isolation. Primary studies from those reviews were eligible if they included: 1) participants in a non-hospital setting; 2) interventions to reduce social isolation and/or loneliness that would be feasible during COVID-19 shielding measures; 3) a relevant control group; and 4) quantitative measures of social isolation, social support or loneliness. At least two authors independently screened studies, extracted data, and assessed risk of bias using the Downs and Black checklist. Study registration: PROSPERO CRD42020178654. We identified 45 RCTs and 13 non-randomised controlled trials; none were conducted during the COVID-19 pandemic. The nature, type, and potential effectiveness of interventions varied greatly. Effective interventions for loneliness include psychological therapies such as mindfulness, lessons on friendship, robotic pets, and social facilitation software. Few interventions improved social isolation. Overall, 37 of 58 studies were of "Fair" quality, as measured by the Downs & Black checklist. The main study limitations identified were the inclusion of studies of variable quality; the applicability of our findings to the entire population; and the current poor understanding of the types of loneliness and isolation experienced by different groups affected by the COVID-19 pandemic. 
CONCLUSIONS: Many effective interventions involved cognitive or educational components, or facilitated communication between peers. These interventions may require minor modifications to align with COVID-19 shielding/social distancing measures. Future high-quality randomised controlled trials conducted under shielding/social distancing constraints are urgently needed.


Subject(s)
COVID-19/psychology , Quarantine/psychology , Social Isolation/psychology , COVID-19/epidemiology , COVID-19/immunology , Data Management , Female , Humans , Loneliness/psychology , Male , Mental Health/statistics & numerical data , Pandemics , Physical Distancing , Quarantine/trends , SARS-CoV-2/isolation & purification , Social Support
10.
Otol Neurotol ; 41(3): e349-e356, 2020 03.
Article in English | MEDLINE | ID: mdl-31821257

ABSTRACT

OBJECTIVE: To explore the Nijmegen Questionnaire (NQ) and its relationship to vestibular function tests and symptoms in patients with dizziness; to compare patient characteristics between those with a positive Nijmegen score and patients clinically diagnosed with hyperventilation syndrome (HVS). STUDY DESIGN: Retrospective case series. SETTING: Tertiary neurotology referral center. PATIENTS: Patients seen at vestibular assessment were grouped according to positive (≥24) or negative (<24) Nijmegen scores; secondary analysis was performed on patients grouped by a clinical diagnosis of hyperventilation syndrome. INTERVENTION(S): NQ, vestibular function tests, hospital anxiety and depression scale (HADS), vestibular rehabilitation benefit questionnaire (VRBQ). MAIN OUTCOME MEASURE(S): Medical records of patients presenting for vestibular assessment from January to December 2017 were retrospectively reviewed. Demographic data, self-reported questionnaire results, HVS diagnosis, vestibular test results, and reported symptoms were recorded. RESULTS: In total, 359 patients presented for vestibular assessment with completed NQ. One hundred thirty-nine patients (39%) had a positive (≥24) Nijmegen score. In 34 patients, a diagnosis of hyperventilation syndrome was recorded; 10 of these patients did not have a positive Nijmegen score. There was no significant difference found in either vestibular lesion type or compensation status between patients with positive and negative Nijmegen scores (p > 0.05). Symptoms commonly described by patients with positive Nijmegen scores include "blurred vision," "tingling," "anxiety," "shortness of breath," "palpitations," "panic," "numbness," "chest pain," and "chest tightness." In contrast, when grouped by HVS diagnosis, patients with HVS were significantly more likely to have No Lesion detected on vestibular function testing (p = 0.0366).
"Panic," "anxiety," and "tingling" were the only significant symptoms reported more often in the HVS diagnosis group, while "nausea/vomiting" and "vertigo" were reported significantly less frequently compared with the non-HVS diagnosis group. CONCLUSIONS: Hyperventilation is a complex stimulus, with some effects manifesting in neurotology clinics. This study reveals discrepancies in both vestibular assessment findings and symptom profiles between patients with a positive screening score in the NQ and patients clinically diagnosed with hyperventilation syndrome. This data will inform clinicians' interpretation of the NQ in the neurotologic setting.


Subject(s)
Hyperventilation , Vestibular Function Tests , Anxiety , Humans , Retrospective Studies , Surveys and Questionnaires
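The grouping rule and the counts in the abstract above (cutoff of 24, 139 of 359 positive, 10 of 34 clinically diagnosed patients not positive) can be restated as a short calculation. This is only an arithmetic sketch; the names `NQ_CUTOFF`, `is_positive`, and `percent` are hypothetical, and the second percentage is derived from the reported counts rather than quoted from the abstract.

```python
NQ_CUTOFF = 24  # a Nijmegen score of 24 or more counts as positive


def is_positive(nq_score: int) -> bool:
    """Apply the screening cutoff described in the abstract."""
    return nq_score >= NQ_CUTOFF


def percent(part: int, whole: int) -> float:
    return 100 * part / whole


# Counts as reported in the abstract.
positive_share = percent(139, 359)

# Of 34 patients with a recorded HVS diagnosis, 10 did not screen positive,
# so 24 did. This share is derived here, not stated in the abstract.
hvs_total, hvs_not_positive = 34, 10
hvs_positive_share = percent(hvs_total - hvs_not_positive, hvs_total)

print(f"positive NQ overall: {positive_share:.0f}%")
print(f"positive NQ among diagnosed HVS: {hvs_positive_share:.0f}%")
```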