1.
PLoS One ; 19(5): e0303519, 2024.
Article in English | MEDLINE | ID: mdl-38723044

ABSTRACT

OBJECTIVE: To establish whether a natural language processing technique could identify two common inpatient neurosurgical comorbidities using only text reports of inpatient head imaging. MATERIALS AND METHODS: A training and testing dataset of reports of 979 CT or MRI scans of the brain was identified for patients admitted to the neurosurgery service of a single hospital in June 2021 or to the Emergency Department between July 1-8, 2021. A variety of machine learning and deep learning algorithms utilizing natural language processing were trained on the training set (84% of the total cohort) and tested on the remaining reports. A subset comparison cohort (n = 76) was then assessed to compare the output of the best algorithm against real-life inpatient documentation. RESULTS: For "brain compression", a random forest classifier outperformed other candidate algorithms with an accuracy of 0.81 and area under the curve (AUC) of 0.90 in the testing dataset. For "brain edema", a random forest classifier again outperformed other candidate algorithms with an accuracy of 0.92 and AUC of 0.94 in the testing dataset. In the provider comparison dataset, for "brain compression," the random forest algorithm demonstrated better accuracy (0.76 vs 0.70) and sensitivity (0.73 vs 0.43) than provider documentation. For "brain edema," the algorithm again demonstrated better accuracy (0.92 vs 0.84) and sensitivity (0.45 vs 0.09) than provider documentation. DISCUSSION: A natural language processing-based machine learning algorithm can reliably and reproducibly identify selected common neurosurgical comorbidities from radiology reports. CONCLUSION: This result may justify the use of machine learning-based decision support to augment provider documentation.
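
As a rough illustration of the approach described above, the sketch below (Python, scikit-learn) trains a TF-IDF plus random forest text classifier on radiology report text and evaluates accuracy and AUC on a held-out 16% split, mirroring the 84% training fraction. The file name, column names, and hyperparameters are assumptions for illustration; the abstract does not specify the study's exact preprocessing or feature representation.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics import accuracy_score, roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline

    # Hypothetical input: one row per head-imaging report, with the free-text
    # report and a 0/1 label for the comorbidity of interest (e.g., brain edema).
    df = pd.read_csv("head_imaging_reports.csv")
    X_train, X_test, y_train, y_test = train_test_split(
        df["report_text"], df["brain_edema"], test_size=0.16,
        stratify=df["brain_edema"], random_state=0)

    clf = Pipeline([
        ("tfidf", TfidfVectorizer(stop_words="english", ngram_range=(1, 2))),
        ("rf", RandomForestClassifier(n_estimators=500, random_state=0)),
    ])
    clf.fit(X_train, y_train)

    pred = clf.predict(X_test)
    prob = clf.predict_proba(X_test)[:, 1]
    print("accuracy:", accuracy_score(y_test, pred))
    print("AUC:", roc_auc_score(y_test, prob))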


Subject(s)
Comorbidity , Natural Language Processing , Humans , Algorithms , Inpatients/statistics & numerical data , Female , Male , Machine Learning , Magnetic Resonance Imaging/methods , Documentation , Middle Aged , Tomography, X-Ray Computed , Neurosurgical Procedures , Aged , Deep Learning
3.
NPJ Digit Med ; 7(1): 63, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459205

ABSTRACT

Despite the importance of informed consent in healthcare, the readability and specificity of consent forms often impede patients' comprehension. This study investigates the use of GPT-4 to simplify surgical consent forms and introduces an AI-human expert collaborative approach to validate content appropriateness. Consent forms from multiple institutions were assessed for readability and simplified using GPT-4, with pre- and post-simplification readability metrics compared using nonparametric tests. Independent reviews by medical authors and a malpractice defense attorney were conducted. Finally, GPT-4's potential for generating de novo procedure-specific consent forms was assessed, with forms evaluated using a validated 8-item rubric and expert subspecialty surgeon review. Analysis of 15 academic medical centers' consent forms revealed significant reductions in average reading time, word rarity, and passive sentence frequency (all P < 0.05) following GPT-4-facilitated simplification. Readability improved from an average college freshman to an 8th-grade level (P = 0.004), matching the average American's reading level. Medical and legal sufficiency was confirmed on independent review. GPT-4 generated procedure-specific consent forms for five varied surgical procedures at an average 6th-grade reading level. These forms received perfect scores on a standardized consent form rubric and withstood scrutiny upon expert subspecialty surgeon review. This study demonstrates the first AI-human expert collaboration to enhance surgical consent forms, significantly improving readability without sacrificing clinical detail. Our framework could be extended to other patient communication materials, emphasizing clear communication and mitigating disparities related to health literacy barriers.
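
A minimal sketch of the readability comparison described above, assuming paired original and GPT-4-simplified consent forms stored as text files in two hypothetical directories: it computes a Flesch-Kincaid grade level with the textstat package and compares the paired scores with a Wilcoxon signed-rank test. The study's exact readability metrics and nonparametric test are not identified in the abstract.

    from pathlib import Path
    from statistics import median

    import textstat
    from scipy.stats import wilcoxon

    # Hypothetical layout: originals/ and simplified/ hold matching *.txt files.
    original_forms = [p.read_text() for p in sorted(Path("originals").glob("*.txt"))]
    simplified_forms = [p.read_text() for p in sorted(Path("simplified").glob("*.txt"))]

    pre = [textstat.flesch_kincaid_grade(t) for t in original_forms]
    post = [textstat.flesch_kincaid_grade(t) for t in simplified_forms]

    stat, p = wilcoxon(pre, post)  # paired nonparametric comparison
    print(f"median grade level {median(pre):.1f} -> {median(post):.1f}, p = {p:.4f}")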

4.
J Neurosurg Case Lessons ; 7(4)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38252936

ABSTRACT

BACKGROUND: Suprasellar masses commonly include craniopharyngiomas and pituitary adenomas. Suprasellar glioblastoma is exceedingly rare with only a few prior case reports in the literature. Suprasellar glioblastoma can mimic craniopharyngioma or other more common suprasellar etiologies preoperatively. OBSERVATIONS: A 65-year-old male with no significant history presented to the emergency department with a subacute decline in mental status. Work-up revealed a large suprasellar mass with extension to the right inferior medial frontal lobe and right lateral ventricle, associated with significant vasogenic edema. The patient underwent an interhemispheric transcallosal approach subtotal resection of the interventricular portion of the mass. Pathological analysis revealed glioblastoma, MGMT partially methylated, with a BRAF V600E mutation. LESSONS: Malignant glioblastomas can mimic benign suprasellar masses and should remain on the differential for a diverse set of brain masses with a broad range of radiological and clinical features. For complex cases accessible from the ventricle where the pituitary complex cannot be confidently preserved via a transsphenoidal approach, an interhemispheric approach is also a practical initial surgical option. In addition to providing diagnostic value, molecular profiling may also reveal therapeutically significant gene alterations such as BRAF mutations.

5.
JAMA Surg ; 159(1): 87-95, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-37966807

ABSTRACT

Importance: The progression of artificial intelligence (AI) text-to-image generators raises concerns about perpetuating societal biases, including profession-based stereotypes. Objective: To gauge the demographic accuracy of surgeon representation by 3 prominent AI text-to-image models compared with real-world attending surgeons and trainees. Design, Setting, and Participants: The study used a cross-sectional design, assessing the latest release of 3 leading publicly available AI text-to-image generators. Seven independent reviewers categorized AI-produced images. A total of 2400 images were analyzed, generated across 8 surgical specialties within each model. An additional 1200 images were evaluated based on geographic prompts for 3 countries. The study was conducted in May 2023. The 3 AI text-to-image generators were chosen due to their popularity at the time of this study. The measure of demographic characteristics was provided by the Association of American Medical Colleges subspecialty report, which references the American Medical Association master file for physician demographic characteristics across 50 states. Given changing demographic characteristics in trainees compared with attending surgeons, both groups were examined separately. Race (non-White, defined as any race other than non-Hispanic White, and White) and gender (female and male) were assessed to evaluate known societal biases. Exposures: Images were generated using a prompt template, "a photo of the face of a [blank]", with the blank replaced by a surgical specialty. Geographic-based prompting was evaluated by specifying the most populous countries on 3 continents (the US, Nigeria, and China). Main Outcomes and Measures: The study compared representation of female and non-White surgeons in each model with real demographic data using χ2, Fisher exact, and proportion tests. Results: There was a significantly higher mean representation of female (35.8% vs 14.7%; P < .001) and non-White (37.4% vs 22.8%; P < .001) surgeons among trainees than attending surgeons. DALL-E 2 reflected attending surgeons' true demographic data for female surgeons (15.9% vs 14.7%; P = .39) and non-White surgeons (22.6% vs 22.8%; P = .92) but underestimated trainees' representation for both female (15.9% vs 35.8%; P < .001) and non-White (22.6% vs 37.4%; P < .001) surgeons. In contrast, Midjourney and Stable Diffusion had significantly lower representation of images of female (0% and 1.8%, respectively; P < .001) and non-White (0.5% and 0.6%, respectively; P < .001) surgeons than DALL-E 2 or true demographic data. Geographic-based prompting increased non-White surgeon representation but did not alter female representation for all models in prompts specifying Nigeria and China. Conclusions and Relevance: In this study, 2 leading publicly available text-to-image generators amplified societal biases, depicting over 98% of surgeons as White and male. While 1 of the models depicted demographic characteristics comparable to those of real attending surgeons, all 3 models underestimated trainee representation. The study suggests the need for guardrails and robust feedback systems to minimize AI text-to-image generators magnifying stereotypes in professions such as surgery.
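
The proportion testing described above can be sketched as a one-sample test of a model's observed share of female-presenting surgeon images against the real-world attending benchmark (14.7%). The image counts below are hypothetical placeholders, not the study's data; the study also applied chi-square and Fisher exact tests.

    from statsmodels.stats.proportion import proportions_ztest

    # Hypothetical counts for one text-to-image model.
    model_female, model_total = 127, 800
    benchmark_rate = 0.147  # attending surgeons, per the AAMC data cited above

    stat, p = proportions_ztest(count=model_female, nobs=model_total,
                                value=benchmark_rate, prop_var=benchmark_rate)
    print(f"observed {model_female / model_total:.1%} vs benchmark "
          f"{benchmark_rate:.1%}: z = {stat:.2f}, p = {p:.3f}")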


Subject(s)
Specialties, Surgical , Surgeons , United States , Humans , Male , Female , Cross-Sectional Studies , Artificial Intelligence , Demography
6.
Neurosurgery ; 93(6): 1353-1365, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37581444

ABSTRACT

BACKGROUND AND OBJECTIVES: Interest surrounding generative large language models (LLMs) has rapidly grown. Although ChatGPT (GPT-3.5), a general LLM, has shown near-passing performance on medical student board examinations, the performance of ChatGPT or its successor GPT-4 on specialized examinations and the factors affecting accuracy remain unclear. This study aims to assess the performance of ChatGPT and GPT-4 on a 500-question mock neurosurgical written board examination. METHODS: The Self-Assessment Neurosurgery Examinations (SANS) American Board of Neurological Surgery Self-Assessment Examination 1 was used to evaluate ChatGPT and GPT-4. Questions were in single best answer, multiple-choice format. χ2, Fisher exact, and univariable logistic regression tests were used to assess performance differences in relation to question characteristics. RESULTS: ChatGPT (GPT-3.5) and GPT-4 achieved scores of 73.4% (95% CI: 69.3%-77.2%) and 83.4% (95% CI: 79.8%-86.5%), respectively, relative to the user average of 72.8% (95% CI: 68.6%-76.6%). Both LLMs exceeded last year's passing threshold of 69%. Although scores between ChatGPT and question bank users were equivalent (P = .963), GPT-4 outperformed both (both P < .001). GPT-4 correctly answered every question that ChatGPT answered correctly, as well as 37.6% (50/133) of the questions that ChatGPT missed. Among 12 question categories, GPT-4 significantly outperformed users in each but performed comparably with ChatGPT in 3 (functional, other general, and spine) and outperformed both users and ChatGPT for tumor questions. Increased word count (odds ratio = 0.89 for answering a question correctly per 10 additional words) and higher-order problem-solving (odds ratio = 0.40, P = .009) were associated with lower accuracy for ChatGPT, but not for GPT-4 (both P > .005). Multimodal input was not available at the time of this study; hence, on questions with image content, ChatGPT and GPT-4 answered 49.5% and 56.8% of questions correctly based on contextual clues alone. CONCLUSION: LLMs achieved passing scores on a mock 500-question neurosurgical written board examination, with GPT-4 significantly outperforming ChatGPT.
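
The univariable logistic regression described above (an odds ratio for a correct answer per 10 additional words) can be sketched as follows, assuming a hypothetical per-question results file with a binary correctness column and a word count column.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical columns: correct (0/1), word_count (integer).
    df = pd.read_csv("sans_results.csv")
    df["words_per_10"] = df["word_count"] / 10  # so the OR is per 10 additional words

    model = smf.logit("correct ~ words_per_10", data=df).fit(disp=0)
    or_per_10 = np.exp(model.params["words_per_10"])
    ci_low, ci_high = np.exp(model.conf_int().loc["words_per_10"])
    print(f"OR per +10 words: {or_per_10:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f}), "
          f"p = {model.pvalues['words_per_10']:.3f}")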


Subject(s)
Neurosurgery , Humans , Neurosurgical Procedures , Odds Ratio , Self-Assessment , Spine
7.
Neurosurgery ; 93(5): 1090-1098, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37306460

ABSTRACT

BACKGROUND AND OBJECTIVES: General large language models (LLMs), such as ChatGPT (GPT-3.5), have demonstrated the capability to pass multiple-choice medical board examinations. However, the comparative accuracy of different LLMs and LLM performance on assessments of predominantly higher-order management questions are poorly understood. We aimed to assess the performance of 3 LLMs (GPT-3.5, GPT-4, and Google Bard) on a question bank designed specifically for neurosurgery oral boards examination preparation. METHODS: The 149-question Self-Assessment Neurosurgery Examination Indications Examination was used to query LLM accuracy. Questions were entered in a single best answer, multiple-choice format. χ2, Fisher exact, and univariable logistic regression tests assessed differences in performance by question characteristics. RESULTS: On a question bank with predominantly higher-order questions (85.2%), ChatGPT (GPT-3.5) and GPT-4 answered 62.4% (95% CI: 54.1%-70.1%) and 82.6% (95% CI: 75.2%-88.1%) of questions correctly, respectively. By contrast, Bard scored 44.2% (66/149, 95% CI: 36.2%-52.6%). GPT-3.5 and GPT-4 demonstrated significantly higher scores than Bard (both P < .01), and GPT-4 outperformed GPT-3.5 (P = .023). Among 6 subspecialties, GPT-4 had significantly higher accuracy in the Spine category relative to GPT-3.5 and in 4 categories relative to Bard (all P < .01). Incorporation of higher-order problem solving was associated with lower question accuracy for GPT-3.5 (odds ratio [OR] = 0.80, P = .042) and Bard (OR = 0.76, P = .014), but not GPT-4 (OR = 0.86, P = .085). GPT-4's performance on imaging-related questions surpassed GPT-3.5's (68.6% vs 47.1%, P = .044) and was comparable with Bard's (68.6% vs 66.7%, P = 1.000). However, GPT-4 demonstrated significantly lower rates of "hallucination" on imaging-related questions than both GPT-3.5 (2.3% vs 57.1%, P < .001) and Bard (2.3% vs 27.3%, P = .002). Absence of a text description for a question predicted significantly higher odds of hallucination for GPT-3.5 (OR = 1.45, P = .012) and Bard (OR = 2.09, P < .001). CONCLUSION: On a question bank of predominantly higher-order management case scenarios for neurosurgery oral boards preparation, GPT-4 achieved a score of 82.6%, outperforming ChatGPT and Google Bard.
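
A pairwise accuracy comparison of the kind reported above can be sketched with a Fisher exact test on a 2x2 table of correct/incorrect counts per model. The counts below are reconstructed from the reported percentages (149 questions per model) and are illustrative only.

    from scipy.stats import fisher_exact

    n = 149
    gpt4_correct, bard_correct = 123, 66  # ~82.6% and 44.2% of 149 questions
    table = [[gpt4_correct, n - gpt4_correct],
             [bard_correct, n - bard_correct]]

    odds_ratio, p = fisher_exact(table)
    print(f"GPT-4 vs Bard: OR = {odds_ratio:.2f}, p = {p:.4g}")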


Subject(s)
Neurosurgery , Humans , Neurosurgical Procedures , Odds Ratio , Search Engine , Self-Assessment , Natural Language Processing
10.
Neurosurg Focus ; 51(5): E11, 2021 11.
Article in English | MEDLINE | ID: mdl-34724645

ABSTRACT

OBJECTIVE: Accurate clinical documentation is foundational to any quality improvement endeavor, as it is ultimately the medical record that is measured in assessing change. Literature on high-yield interventions to improve the accuracy and completeness of clinical documentation by neurosurgical providers is limited. Therefore, the authors sought to share a single-institution experience of a two-part intervention to enhance clinical documentation by a neurosurgery inpatient service. METHODS: At an urban, level I trauma, academic teaching hospital, a two-part intervention was implemented to enhance the accuracy of clinical documentation of neurosurgery inpatients by residents and advanced practice providers (APPs). Residents and APPs were instructed on the most common neurosurgical complications or comorbidities (CCs) and major complications or comorbidities (MCCs), as defined by Medicare. Additionally, a "system-based" progress note template was changed to a "problem-based" progress note template. Pre-post analysis was performed to compare the CC/MCC capture rates for the 12 months prior to the intervention with those for the 3 months after the intervention. RESULTS: The CC/MCC capture rate for the neurosurgery service line rose from 62% in the 12 months preintervention to 74% in the 3 months after intervention, representing a significant change (p = 0.00002). CONCLUSIONS: Existing clinical documentation habits of neurosurgical residents and APPs may fail to capture the full extent of CCs/MCCs among neurosurgical inpatients. An intervention that focuses on the most common CCs/MCCs and utilizes a problem-based progress note template may lead to more accurate appraisals of neurosurgical patient acuity.
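
The pre-post comparison described above amounts to a two-proportion test on CC/MCC capture rates before and after the intervention, sketched below. Only the capture rates (62% and 74%) come from the abstract; the case counts are hypothetical placeholders chosen to match those rates.

    import numpy as np
    from statsmodels.stats.proportion import proportions_ztest

    captured = np.array([740, 310])  # cases with a captured CC/MCC: pre, post (hypothetical)
    totals = np.array([1194, 419])   # total cases: 12 months pre, 3 months post (hypothetical)

    stat, p = proportions_ztest(captured, totals)
    print(f"capture rate {captured[0] / totals[0]:.0%} -> {captured[1] / totals[1]:.0%}, "
          f"z = {stat:.2f}, p = {p:.5f}")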


Subject(s)
Documentation , Medicare , Academic Medical Centers , Aged , Comorbidity , Humans , Quality Improvement , United States
11.
Clin Neurol Neurosurg ; 207: 106770, 2021 08.
Article in English | MEDLINE | ID: mdl-34182238

ABSTRACT

OBJECTIVES: Opioids are frequently used for analgesia in patients with acute subarachnoid hemorrhage (SAH) due to a high prevalence of headache and neck pain. However, it is unclear whether this practice may pose a risk for opioid dependence, as long-term opioid use in this population remains unknown. We sought to determine the prevalence of opioid use in SAH survivors and to identify potential risk factors for opioid utilization. METHODS: We analyzed a cohort of consecutive patients admitted with non-traumatic and suspected aneurysmal SAH to an academic referral center. We included patients who survived hospitalization and excluded those who were not opioid-naïve. Potential risk factors for opioid prescription at discharge and at 3 and 12 months post-discharge were assessed. RESULTS: Of 240 SAH patients who met our inclusion criteria (mean age 58.4 years [SD 14.8], 58% women), 233 (97%) received opioids during hospitalization and 152 (63%) received an opioid prescription at discharge. Twenty-eight patients (12%) continued to use opioids at 3 months post-discharge, and 13 (6%) at 12-month follow-up. Although patients with poor Hunt and Hess grades (odds ratio 0.19, 95% CI 0.06-0.57) and those with intraventricular hemorrhage (odds ratio 0.38, 95% CI 0.18-0.87) were less likely to receive opioid prescriptions at discharge, we did not find significant differences between patients who had long-term opioid use and those who did not. CONCLUSION: Opioids are regularly used in both the acute SAH setting and immediately after discharge. A considerable number of patients also continue to use opioids in the long term. Opioid-sparing pain control strategies should be explored in the future.
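
A risk-factor estimate of the kind reported above (e.g., an odds ratio for discharge opioid prescription among patients with poor Hunt and Hess grades) can be sketched from a 2x2 table as below. The cell counts are hypothetical placeholders, not the study's data.

    import numpy as np
    from statsmodels.stats.contingency_tables import Table2x2

    #                  opioid Rx   no opioid Rx
    table = np.array([[12,         28],    # poor-grade patients (hypothetical)
                      [140,        60]])   # good-grade patients (hypothetical)

    t = Table2x2(table)
    lo, hi = t.oddsratio_confint()
    print(f"OR = {t.oddsratio:.2f}, 95% CI {lo:.2f}-{hi:.2f}, p = {t.oddsratio_pvalue():.3f}")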


Subject(s)
Analgesics, Opioid/therapeutic use , Opioid-Related Disorders/epidemiology , Pain/drug therapy , Subarachnoid Hemorrhage/psychology , Survivors , Adult , Aged , Cohort Studies , Female , Humans , Male , Middle Aged , Pain/etiology , Pain/psychology , Risk Factors , Subarachnoid Hemorrhage/complications , Subarachnoid Hemorrhage/therapy
13.
Front Neurol ; 5: 113, 2014.
Article in English | MEDLINE | ID: mdl-25071701

ABSTRACT

Penetrating cranial injuries by mechanisms other than gunshots are exceedingly rare, so strategies and guidelines for the management of penetrating brain injury (PBI) are largely informed by data from higher-velocity penetrating injuries. Here, we present a case of PBI by the low-velocity mechanism of a harpoon from an underwater fishing speargun in an attempted suicide by a 56-year-old Caucasian male. The case raised a number of interesting points in the management of low-velocity penetrating brain injury (LVPBI), including the benefit of delaying foreign body removal to allow for tamponade; the importance of history-taking in establishing the social/legal significance of the events surrounding the injury; the use of cerebral angiography in all cases of PBI; the advantages of using dual-energy CT to reduce artifact when available; and antibiotic prophylaxis informed by the penetrating object's idiosyncratic history of use before it came into contact with the intracranial environment. We present here the management of the case in full, along with an extended discussion and review of the existing literature regarding key points in the management of LVPBI vs. higher-velocity forms of intracranial injury.

14.
Clin Neurol Neurosurg ; 114(2): 108-11, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21996584

ABSTRACT

BACKGROUND: Carotid endarterectomy (CEA) is one of the most commonly performed and studied surgical procedures for extracranial ischemic disease. OBJECTIVE: The authors reviewed the outcomes of 39 consecutive carotid endarterectomy procedures performed by a single surgeon, with emphasis on the safety of discharging patients the same day as the procedure. METHODS: Retrospective analysis was performed over a two-year period on patients who were admitted as outpatients and underwent CEA. Following CEA, patients were observed for 4-6 hours in the recovery room, and duplex ultrasonography was completed to assess the endarterectomy repair. A determination was then made whether patients could be safely discharged home. RESULTS: Over the two-year period, CEA was performed 39 times in 37 outpatients. Twenty-five patients (64%) were discharged within 6 hours of surgery completion. The remaining 14 patients (36%) were admitted to the hospital for varying reasons. Six patients (43%) stayed either due to personal preference or the lack of supervision at home, and six other patients (43%) stayed because of mild hemodynamic instability. Of the two remaining patients, one was admitted for chest pain and the other for a small wound hematoma. No patients developed postoperative neurologic deficits. Two-tailed Fisher test analysis of collected variables revealed that patients who had general anesthesia were more likely to be admitted (p < 0.02). CONCLUSION: Patients undergoing CEA can be safely discharged the same day after a brief period of postoperative observation. One factor that may predict the need for postoperative admission is the use of general anesthesia.
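
The association test described above can be sketched as a two-tailed Fisher exact test of anesthesia type against same-day discharge. Only the column totals (14 admitted, 25 discharged) come from the abstract; the split by anesthesia type is a hypothetical placeholder.

    from scipy.stats import fisher_exact

    #          admitted  discharged same day
    table = [[10, 6],    # general anesthesia (hypothetical split)
             [4, 19]]    # regional/local anesthesia (hypothetical split)

    odds_ratio, p = fisher_exact(table, alternative="two-sided")
    print(f"general anesthesia vs admission: OR = {odds_ratio:.2f}, p = {p:.3f}")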


Subject(s)
Ambulatory Surgical Procedures/adverse effects , Ambulatory Surgical Procedures/methods , Endarterectomy, Carotid/adverse effects , Endarterectomy, Carotid/methods , Aged , Aged, 80 and over , Anesthesia, Conduction , Anesthesia, General , Blood Pressure , Carotid Stenosis/diagnostic imaging , Carotid Stenosis/physiopathology , Carotid Stenosis/surgery , Constriction , Electroencephalography , Feasibility Studies , Female , Heart Rate , Humans , Length of Stay , Male , Middle Aged , Monitoring, Intraoperative , Oximetry , Patient Discharge , Patient Safety , Postoperative Care , Retrospective Studies , Treatment Outcome , Ultrasonography, Doppler, Duplex
17.
Radiology ; 230(2): 510-8, 2004 Feb.
Article in English | MEDLINE | ID: mdl-14699177

ABSTRACT

PURPOSE: To prospectively compare the effectiveness of multi-detector row computed tomographic (CT) angiography with that of conventional intraarterial digital subtraction angiography (DSA) used to detect intracranial aneurysms in patients with nontraumatic acute subarachnoid hemorrhage. MATERIALS AND METHODS: Thirty-five consecutive adult patients with acute subarachnoid hemorrhage were recruited into the institutional review board-approved study and gave informed consent. All patients underwent both multi-detector row CT angiography and DSA no more than 12 hours apart. CT angiography was performed with a multi-detector row scanner (four detector rows) by using collimation of 1.25 mm and pitch of 3. Images were interpreted at computer workstations in a blinded fashion. Two radiologists independently reviewed the CT images, and two other radiologists independently reviewed the DSA images. The presence and location of aneurysms were rated on a five-point scale for certainty. Sensitivity and specificity were calculated independently for image interpretation performed by the two CT image readers and the second DSA image reader by using the first DSA reader's interpretation as the reference standard. RESULTS: A total of 26 aneurysms were detected at DSA in 21 patients, and no aneurysms were detected in 14 patients. Sensitivity and specificity for CT angiography were, respectively, 90% and 93% for reader 1 and 81% and 93% for reader 2. The mean diameter of aneurysms detected on CT angiographic images was 4.4 mm, and the smallest aneurysm detected was 2.2 mm in diameter. Aneurysms that were missed at initial interpretation of CT angiographic images were identified at retrospective reading. CONCLUSION: Multi-detector row CT angiography has high sensitivity and specificity for detection of intracranial aneurysms, including small aneurysms, in patients with nontraumatic acute subarachnoid hemorrhage.
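
The reader-performance calculation described above reduces to sensitivity and specificity computed against the first DSA reader's interpretation as the reference standard, sketched below with hypothetical per-finding labels.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    reference = np.array([1, 1, 0, 1, 0, 0, 1, 0])  # DSA reader 1 (1 = aneurysm present)
    ct_reader = np.array([1, 0, 0, 1, 0, 1, 1, 0])  # CT angiography reader

    tn, fp, fn, tp = confusion_matrix(reference, ct_reader).ravel()
    print(f"sensitivity = {tp / (tp + fn):.0%}, specificity = {tn / (tn + fp):.0%}")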


Subject(s)
Angiography, Digital Subtraction , Cerebral Angiography , Intracranial Aneurysm/diagnostic imaging , Tomography, Spiral Computed , Acute Disease , Adult , Aged , Female , Humans , Male , Middle Aged , Observer Variation , Prospective Studies , Sensitivity and Specificity , Subarachnoid Hemorrhage/diagnostic imaging
18.
AJNR Am J Neuroradiol ; 24(7): 1338-40, 2003 Aug.
Article in English | MEDLINE | ID: mdl-12917124

ABSTRACT

A 62-year-old woman had sudden-onset headache and posterior neck pain, and a subarachnoid hemorrhage was revealed by unenhanced CT. Both multi-detector CT angiography and digital subtraction angiography were performed and revealed a small intracanalicular aneurysm of the left anterior inferior cerebellar artery. The patient underwent successful retrosigmoid craniectomy and trapping of the aneurysm. This case shows the ability of multi-detector CT angiography to indicate bony landmarks that can alter the surgical approach.


Subject(s)
Cerebellum/blood supply , Cerebral Angiography/methods , Intracranial Aneurysm/diagnosis , Tomography, X-Ray Computed , Angiography, Digital Subtraction , Arteries/pathology , Cerebellum/diagnostic imaging , Cerebellum/pathology , Cranial Nerve Diseases/diagnosis , Female , Humans , Middle Aged