1.
JCO Clin Cancer Inform ; 8: e2400051, 2024 May.
Article in English | MEDLINE | ID: mdl-38713889

ABSTRACT

This editorial discusses the promise and challenges of successfully integrating natural language processing methods into electronic health records for timely, robust, and fair oncology pharmacovigilance.


Subject(s)
Artificial Intelligence , Electronic Health Records , Medical Oncology , Natural Language Processing , Pharmacovigilance , Humans , Medical Oncology/methods , Data Collection/methods , Neoplasms/drug therapy , Adverse Drug Reaction Reporting Systems
3.
medRxiv ; 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38585973

ABSTRACT

Natural Language Processing (NLP) is the study of automated processing of text data. Applying NLP in the clinical domain is important because of the rich unstructured information embedded in clinical documents, which often remains inaccessible in structured data. Empowered by recent advances in language models (LMs), there is growing interest in their application within the clinical domain. When applying NLP methods to a given domain, benchmark datasets play a crucial role: they not only guide the selection of best-performing models but also enable assessment of the reliability of the generated outputs. Despite the recent availability of LMs capable of handling longer context, benchmark datasets targeting long clinical document classification tasks are absent. To address this issue, we propose the LCD benchmark, a benchmark for the task of predicting 30-day out-of-hospital mortality using discharge notes from MIMIC-IV and statewide death data. Our notes have a median word count of 1,687 and an interquartile range of 1,308 to 2,169. We evaluated this benchmark dataset using baseline models, from bag-of-words and CNN to a Hierarchical Transformer and an open-source instruction-tuned large language model. Additionally, we provide a comprehensive analysis of the model outputs, including manual review and visualization of model weights, to offer insight into their predictive capabilities and limitations. We expect the LCD benchmark to become a resource for the development of advanced supervised models, prompting methods, and foundation models tailored for clinical text. The benchmark dataset is available at https://github.com/Machine-Learning-for-Medical-Language/long-clinical-doc.
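The bag-of-words baseline family mentioned in this abstract can be illustrated with a minimal sketch. The snippet below is a hypothetical TF-IDF plus logistic regression mortality classifier using scikit-learn, not the LCD authors' code; the toy notes and labels are invented stand-ins for MIMIC-IV discharge notes.

```python
# Hypothetical sketch of a bag-of-words baseline for binary 30-day
# mortality prediction from discharge notes (not the LCD authors' code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for discharge notes and 30-day mortality labels.
notes = [
    "patient stable at discharge, follow up in clinic",
    "metastatic disease, transitioned to comfort measures",
    "tolerated chemotherapy well, discharged home",
    "declining functional status, hospice referral placed",
]
labels = [0, 1, 0, 1]  # 1 = died within 30 days of discharge

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
baseline.fit(notes, labels)

# Predict risk for a new note; a real evaluation would use held-out data.
probs = baseline.predict_proba(["hospice referral, comfort measures"])[0]
```

A real baseline would of course be trained on the full benchmark split; the point of the sketch is only the shape of the pipeline that long-document models are compared against.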

4.
J Clin Oncol ; 42(14): 1607-1611, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38452323

ABSTRACT

A call to action to bring stakeholders together to plan for the future of LLM-enhanced cancer survivorship.


Subject(s)
Cancer Survivors , Neoplasms , Humans , Neoplasms/therapy , Neoplasms/mortality , Neoplasms/psychology , Survivorship
6.
JAMA Oncol ; 10(4): 538-539, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38358777

Subject(s)
Language , Humans
7.
NPJ Digit Med ; 7(1): 6, 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38200151

ABSTRACT

Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The effect of adding LLM-generated synthetic data to training varied across models and architectures, but it improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in zero- and few-shot settings, except GPT-4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their prediction when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs in improving real-world evidence on SDoH and assisting in identifying patients who could benefit from resource support.

8.
J Am Med Inform Assoc ; 31(4): 940-948, 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38261400

ABSTRACT

OBJECTIVE: Large language models (LLMs) have shown impressive ability in biomedical question-answering, but have not been adequately investigated for more specific biomedical applications. This study investigates the ChatGPT family of models (GPT-3.5, GPT-4) in biomedical tasks beyond question-answering. MATERIALS AND METHODS: We evaluated model performance with 11 122 samples for two fundamental tasks in the biomedical domain: classification (n = 8676) and reasoning (n = 2446). The first task involves classifying health advice in scientific literature, while the second involves detecting causal relations in biomedical literature. We used 20% of the dataset for prompt development, including zero- and few-shot settings with and without chain-of-thought (CoT). We then evaluated the best prompts from each setting on the remaining dataset, comparing them to models using simple features (BoW with logistic regression) and fine-tuned BioBERT models. RESULTS: Fine-tuning BioBERT produced the best classification (F1: 0.800-0.902) and reasoning (F1: 0.851) results. Among LLM approaches, few-shot CoT achieved the best classification (F1: 0.671-0.770) and reasoning (F1: 0.682) results, comparable to the BoW model (F1: 0.602-0.753 and 0.675 for classification and reasoning, respectively). It took 78 h to obtain the best LLM results, compared to 0.078 and 0.008 h for the top-performing BioBERT and BoW models, respectively. DISCUSSION: The simple BoW model performed similarly to the most complex LLM prompting. Prompt engineering required significant investment. CONCLUSION: Despite the excitement around viral ChatGPT, fine-tuning for two fundamental biomedical natural language processing tasks remained the best strategy.


Subject(s)
Language , Natural Language Processing
9.
JCO Clin Cancer Inform ; 7: e2300136, 2023 Sep.
Article in English | MEDLINE | ID: mdl-38055914

ABSTRACT

In August 2022, the Cancer Informatics for Cancer Centers brought together cancer informatics leaders for its biannual symposium, Precision Medicine Applications in Radiation Oncology, co-chaired by Quynh-Thu Le, MD (Stanford University), and Walter J. Curran, MD (GenesisCare). Over the course of 3 days, presenters discussed a range of topics relevant to radiation oncology and the cancer informatics community more broadly, including biomarker development, decision support algorithms, novel imaging tools, theranostics, and artificial intelligence (AI) for the radiotherapy workflow. Since the symposium, there has been an impressive shift in the promise and potential for integration of AI in clinical care, accelerated in large part by major advances in generative AI. AI is now poised more than ever to revolutionize cancer care. Radiation oncology is a field that uses and generates a large amount of digital data and is therefore likely to be one of the first fields to be transformed by AI. As experts in the collection, management, and analysis of these data, the informatics community will take a leading role in ensuring that radiation oncology is prepared to take full advantage of these technological advances. In this report, we provide highlights from the symposium, which took place in Santa Barbara, California, from August 29 to 31, 2022. We discuss lessons learned from the symposium for data acquisition, management, representation, and sharing, and put these themes into context to prepare radiation oncology for the successful and safe integration of AI and informatics technologies.


Subject(s)
Neoplasms , Radiation Oncology , Humans , Artificial Intelligence , Informatics , Neoplasms/diagnosis , Neoplasms/radiotherapy
10.
medRxiv ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37745558

ABSTRACT

Because humans age at different rates, a person's physical appearance may yield insights into their biological age and physiological health more reliably than their chronological age. In medicine, however, appearance is incorporated into medical judgments in a subjective and non-standardized fashion. In this study, we developed and validated FaceAge, a deep learning system to estimate biological age from easily obtainable and low-cost face photographs. FaceAge was trained on data from 58,851 healthy individuals, and clinical utility was evaluated on data from 6,196 patients with cancer diagnoses from two institutions in the United States and The Netherlands. To assess the prognostic relevance of FaceAge estimation, we performed Kaplan-Meier survival analysis. To test a relevant clinical application of FaceAge, we assessed the performance of FaceAge in end-of-life patients with metastatic cancer who received palliative treatment by incorporating FaceAge into clinical prediction models. We found that, on average, cancer patients look older than their chronological age, and looking older is correlated with worse overall survival. FaceAge demonstrated significant independent prognostic performance in a range of cancer types and stages. We found that FaceAge can improve physicians' survival predictions in incurable patients receiving palliative treatments, highlighting the clinical utility of the algorithm to support end-of-life decision-making. FaceAge was also significantly associated with molecular mechanisms of senescence through gene analysis, while chronological age was not. These findings may extend to diseases beyond cancer, motivating the use of deep learning algorithms to translate a patient's visual appearance into objective, quantitative, and clinically useful measures.

11.
Front Oncol ; 13: 1135400, 2023.
Article in English | MEDLINE | ID: mdl-37746299

ABSTRACT

Introduction: Approximately 1.6 million people in the US identify as transgender, many of whom undergo gender-affirming medical or surgical therapies. While transgender individuals are diagnosed with cancer at similar rates as those who are cisgender, the impacts of radiation therapy on outcomes of gender-affirming care in transgender, nonbinary, and gender-expansive people with cancer are understudied. We report on the experiences and outcomes of transgender and gender-expansive patients receiving radiation therapy for cancer treatment. Methods: This study is a multi-institutional retrospective review of patients evaluated from 2005 to 2019 who were identified as transgender or gender-expansive in the medical record and treated with radiation therapy. Results: We identified 23 patients who received radiation to 32 sites, including 12 (38%) to the brain, head, or neck, 8 (25%) to the thorax, and 7 (22%) to the pelvis. Seventeen patients (74%) received gender-affirming hormone therapy and 13 patients (57%) underwent gender-affirming surgery. Four patients had pelvic radiation before or after gender-affirming pelvic surgery, including two trans women who had pelvic radiation after vaginoplasty. Four patients had radiation to the chest or thorax and gender-affirming chest or breast surgery, including two trans men with breast cancer. Two pediatric patients developed hypopituitarism and hypogonadism secondary to radiation therapy and, as adults, changed their hormone replacement therapy to affirm their transgender identities. Discussion: Transgender people with cancer undergo radiation therapy for a wide range of cancers. Understanding their prior gender-affirming medical or surgical treatments and future gender affirmation goals may identify important considerations for their oncologic care.

12.
JAMA Oncol ; 9(10): 1459-1462, 2023 Oct 01.
Article in English | MEDLINE | ID: mdl-37615976

ABSTRACT

This survey study examines the performance of a large language model chatbot in providing cancer treatment recommendations that are concordant with National Comprehensive Cancer Network guidelines.


Subject(s)
Artificial Intelligence , Neoplasms , Humans , Neoplasms/therapy
13.
JCO Clin Cancer Inform ; 7: e2300048, 2023 07.
Article in English | MEDLINE | ID: mdl-37506330

ABSTRACT

PURPOSE: Radiotherapy (RT) toxicities can impair survival and quality of life, yet remain understudied. Real-world evidence holds potential to improve our understanding of toxicities, but toxicity information is often only in clinical notes. We developed natural language processing (NLP) models to identify the presence and severity of esophagitis from notes of patients treated with thoracic RT. METHODS: Our corpus consisted of a gold-labeled data set of 1,524 clinical notes from 124 patients with lung cancer treated with RT, manually annotated for Common Terminology Criteria for Adverse Events (CTCAE) v5.0 esophagitis grade, and a silver-labeled data set of 2,420 notes from 1,832 patients from whom toxicity grades had been collected as structured data during clinical care. We fine-tuned statistical and pretrained Bidirectional Encoder Representations from Transformers-based models for three esophagitis classification tasks: task 1, no esophagitis versus grade 1-3; task 2, grade ≤1 versus >1; and task 3, no esophagitis versus grade 1 versus grade 2-3. Transferability was tested on 345 notes from patients with esophageal cancer undergoing RT. RESULTS: Fine-tuning of PubMedBERT yielded the best performance. The best macro-F1 was 0.92, 0.82, and 0.74 for tasks 1, 2, and 3, respectively. Selecting the most informative note sections during fine-tuning improved macro-F1 by ≥2% for all tasks. Silver-labeled data improved the macro-F1 by ≥3% across all tasks. For the esophageal cancer notes, the best macro-F1 was 0.73, 0.74, and 0.65 for tasks 1, 2, and 3, respectively, without additional fine-tuning. CONCLUSION: To our knowledge, this is the first effort to automatically extract esophagitis toxicity severity according to CTCAE guidelines from clinical notes. This provides proof of concept for NLP-based automated detailed toxicity monitoring in expanded domains.
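The macro-F1 metric reported throughout this abstract averages the per-class F1 scores, which keeps rare severe grades from being swamped by the common "no esophagitis" class. A minimal illustration with scikit-learn follows; the gold and predicted grade labels are invented, not the study's data.

```python
# Illustrative macro-F1 computation for a 3-class esophagitis task
# (none vs grade 1 vs grade 2-3); labels are invented, not study data.
from sklearn.metrics import f1_score

gold = ["none", "none", "g1", "g1", "g2-3", "g2-3"]
pred = ["none", "g1",   "g1", "g1", "g2-3", "none"]

# Macro averaging computes F1 per class, then takes the unweighted mean,
# so the rare severe grades count as much as the majority class.
macro = f1_score(gold, pred, average="macro")
```

Here the per-class F1 scores are 0.5 ("none"), 0.8 ("g1"), and 0.667 ("g2-3"), so the macro-F1 is their unweighted mean, about 0.656.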


Subject(s)
Esophageal Neoplasms , Esophagitis , Humans , Natural Language Processing , Quality of Life , Silver , Esophagitis/diagnosis , Esophagitis/etiology
14.
JCO Clin Cancer Inform ; 7: e2200196, 2023 05.
Article in English | MEDLINE | ID: mdl-37235847

ABSTRACT

PURPOSE: There is an unmet need to empirically explore and understand drivers of cancer disparities, particularly social determinants of health. We explored natural language processing methods to automatically and empirically extract clinical documentation of social contexts and needs that may underlie disparities. METHODS: This was a retrospective analysis of 230,325 clinical notes from 5,285 patients treated with radiotherapy from 2007 to 2019. We compared linguistic features among White versus non-White, low-income insurance versus other insurance, and male versus female patients' notes. Log odds ratios with an informative Dirichlet prior were calculated to compare words over-represented in each group. A variational autoencoder topic model was applied, and topic probability was compared between groups. The presence of machine-learnable bias was explored by developing statistical and neural demographic group classifiers. RESULTS: Terms associated with varied social contexts and needs were identified for all demographic group comparisons. For example, notes of non-White and low-income insurance patients were over-represented with terms associated with housing and transportation, whereas notes of White and other insurance patients were over-represented with terms related to physical activity. Topic models identified a social history topic, and topic probability varied significantly between the demographic group comparisons. Classification models performed poorly at classifying notes of non-White and low-income insurance patients (F1 of 0.30 and 0.23, respectively). CONCLUSION: Exploration of linguistic differences in clinical notes between patients of different race/ethnicity, insurance status, and sex identified social contexts and needs in patients with cancer and revealed high-level differences in notes. Future work is needed to validate whether these findings may play a role in cancer disparities.
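The "log odds ratios with an informative Dirichlet prior" mentioned in the methods can be sketched as below. This follows the common Monroe-style "fightin' words" formulation rather than the study's exact implementation, and the word counts and prior values are invented toy numbers.

```python
# Sketch of log odds ratios with an informative Dirichlet prior
# (Monroe-style formulation; not the study's exact implementation).
import math

def dirichlet_log_odds(y1, y2, prior):
    """Return an approximate z-score per word: positive means the word
    is over-represented in group 1, negative in group 2."""
    n1, n2, a0 = sum(y1.values()), sum(y2.values()), sum(prior.values())
    scores = {}
    for word, a in prior.items():
        c1, c2 = y1.get(word, 0), y2.get(word, 0)
        # Difference of smoothed log odds of the word in each group.
        delta = (math.log((c1 + a) / (n1 + a0 - c1 - a))
                 - math.log((c2 + a) / (n2 + a0 - c2 - a)))
        var = 1.0 / (c1 + a) + 1.0 / (c2 + a)  # approximate variance
        scores[word] = delta / math.sqrt(var)
    return scores

# Toy word counts for two groups of notes (invented numbers).
group_a = {"housing": 30, "transport": 25, "exercise": 5}
group_b = {"housing": 5, "transport": 6, "exercise": 40}
prior = {"housing": 1.0, "transport": 1.0, "exercise": 1.0}

z = dirichlet_log_odds(group_a, group_b, prior)
```

With these toy counts, "housing" and "transport" score positive (over-represented in group A) and "exercise" scores negative, mirroring how the study surfaced terms over-represented in each demographic group's notes.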


Subject(s)
Natural Language Processing , Neoplasms , Humans , Male , Female , Retrospective Studies , Social Environment , Neoplasms/diagnosis , Neoplasms/epidemiology , Neoplasms/therapy
16.
Int J Radiat Oncol Biol Phys ; 117(1): 262-273, 2023 09 01.
Article in English | MEDLINE | ID: mdl-36990288

ABSTRACT

PURPOSE: Real-world evidence for radiation therapy (RT) is limited because it is often documented only in the clinical narrative. We developed a natural language processing system for automated extraction of detailed RT events from text to support clinical phenotyping. METHODS AND MATERIALS: A multi-institutional data set of 96 clinician notes, 129 North American Association of Central Cancer Registries cancer abstracts, and 270 RT prescriptions from HemOnc.org was used and divided into train, development, and test sets. Documents were annotated for RT events and associated properties: dose, fraction frequency, fraction number, date, treatment site, and boost. Named entity recognition models for properties were developed by fine-tuning BioClinicalBERT and RoBERTa transformer models. A multiclass RoBERTa-based relation extraction model was developed to link each dose mention with each property in the same event. Models were combined with symbolic rules to create a hybrid end-to-end pipeline for comprehensive RT event extraction. RESULTS: Named entity recognition models were evaluated on the held-out test set with F1 results of 0.96, 0.88, 0.94, 0.88, 0.67, and 0.94 for dose, fraction frequency, fraction number, date, treatment site, and boost, respectively. The relation model achieved an average F1 of 0.86 when the input was gold-labeled entities. The end-to-end system F1 result was 0.81. The end-to-end system performed best on North American Association of Central Cancer Registries abstracts (average F1 0.90), which are mostly copy-paste content from clinician notes. CONCLUSIONS: We developed methods and a hybrid end-to-end system for RT event extraction, which is the first natural language processing system for this task. This system provides proof-of-concept for real-world RT data collection for research and is promising for the potential of natural language processing methods to support clinical care.


Subject(s)
Natural Language Processing , Neoplasms , Humans , Neoplasms/radiotherapy , Electronic Health Records
18.
Cancer Med ; 12(4): 4715-4724, 2023 02.
Article in English | MEDLINE | ID: mdl-36398619

ABSTRACT

BACKGROUND: Cancer trial accrual is a national priority, yet up to 20% of trials fail to accrue. Trial eligibility criteria growth may be associated with accrual failure. We sought to quantify eligibility criteria growth within National Cancer Institute (NCI)-affiliated trials and determine impact on accrual. METHODS: Utilizing the Aggregated Analysis of ClinicalTrials.gov, we analyzed phase II/III interventional NCI-affiliated trials initiated between 2008 and 2018. Eligibility criteria growth was assessed via number of unique content words within combined inclusion and exclusion criteria. Association between unique word count and accrual failure was evaluated with multivariable logistic regression, adjusting for known predictors of failure. Medical terms associated with accrual failure were identified via natural language processing and categorized. RESULTS: Of 1197 trials, 231 (19.3%) failed due to low accrual. Accrual failure rate increased with eligibility criteria growth, from 11.8% in the lowest decile (12-112 words) to 29.4% in the highest decile (445-750 words). Median eligibility criteria increased over time, from 214 (IQR [23, 282]) unique content words in 2008 to 417 (IQR [289, 514]) in 2018 (r² = 0.73, p < 0.001). Eligibility criteria growth was independently associated with accrual failure (OR: 1.09 per decile, 95% CI [1.03-1.15], p = 0.004). Eighteen exclusion criteria categories were significantly associated with accrual failure, including renal, pulmonary, and diabetic, among others (Bonferroni-corrected p < 0.001). CONCLUSIONS: Eligibility criteria content growth is increasing dramatically among NCI-affiliated trials and is strongly associated with accrual failure. These findings support national initiatives to simplify eligibility criteria and suggest that further efforts are warranted to improve cancer trial accrual.
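The unique-content-word predictor described in the methods can be sketched as follows. This is a hypothetical illustration with invented eligibility text and outcomes, using a plain scikit-learn logistic regression rather than the study's multivariable model with adjustment covariates.

```python
# Hypothetical sketch: unique content words in eligibility criteria as a
# predictor of accrual failure (invented data; not the study's model).
import re
from sklearn.linear_model import LogisticRegression

def unique_content_words(criteria_text):
    """Count unique alphabetic word tokens in eligibility criteria."""
    return len(set(re.findall(r"[a-z]+", criteria_text.lower())))

# Invented toy trials: (eligibility text, 1 = failed due to low accrual).
trials = [
    ("age over 18 histologically confirmed disease", 0),
    ("age over 18 adequate organ function no prior therapy", 0),
    ("age over 18 adequate renal hepatic cardiac pulmonary function "
     "no diabetes no prior malignancy controlled hypertension", 1),
    ("age over 18 extensive exclusions renal pulmonary diabetic "
     "psychiatric cardiac hepatic endocrine prior therapy washout", 1),
]
X = [[unique_content_words(text)] for text, _ in trials]
y = [label for _, label in trials]

model = LogisticRegression().fit(X, y)
# With a positive coefficient, longer criteria yield a higher
# predicted probability of accrual failure.
p_short = model.predict_proba([[50]])[0][1]
p_long = model.predict_proba([[500]])[0][1]
```

The study additionally adjusted for known predictors of failure and reported odds ratios per word-count decile; this sketch only shows the core word-count-to-failure association.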


Subject(s)
Neoplasms , United States , Humans , National Cancer Institute (U.S.) , Neoplasms/therapy , Neoplasms/drug therapy , Research Design , Patient Selection , Logistic Models
19.
Front Oncol ; 13: 1305511, 2023.
Article in English | MEDLINE | ID: mdl-38239639

ABSTRACT

Introduction: Artificial intelligence (AI)-based technologies embody countless solutions in radiation oncology, yet translation of AI-assisted software tools to actual clinical environments remains unrealized. We present the Deep Learning On-Demand Assistant (DL-ODA), a fully automated, end-to-end clinical platform that enables AI interventions for any disease site, featuring an automated model-training pipeline, auto-segmentations, and QA reporting. Materials and methods: We developed, tested, and prospectively deployed the DL-ODA system at a large university-affiliated hospital center. Medical professionals activate the DL-ODA via two pathways: (1) On-Demand, used for immediate AI decision support for a patient-specific treatment plan, and (2) Ambient, in which QA is provided for all daily radiotherapy (RT) plans by comparing DL segmentations with manual delineations and calculating the dosimetric impact. To demonstrate the implementation of a new anatomy segmentation, we used the model-training pipeline to generate a breast segmentation model based on a large clinical dataset. Additionally, the contour QA functionality of existing models was assessed using a retrospective cohort of 3,399 lung and 885 spine RT cases. Ambient QA was performed for various disease sites, including spine RT and heart for dosimetric sparing. Results: Successful training of the breast model was completed in less than a day and resulted in clinically viable whole-breast contours. For the retrospective analysis, we evaluated manual-versus-AI similarity for the ten most common structures. The DL-ODA detected high similarity in heart, lung, liver, and kidney delineations but lower similarity for esophagus, trachea, stomach, and small bowel, due largely to incomplete manual contouring. The deployed Ambient QAs for heart and spine sites have prospectively processed over 2,500 cases over 9 months and 230 cases over 5 months, respectively, automatically alerting the RT personnel.
Discussion: The DL-ODA's capabilities in providing universal AI interventions were demonstrated for On-Demand contour QA, DL segmentations, and automated model training, confirming successful integration of the system into a large academic radiotherapy department. Deploying the DL-ODA as a multi-modal, fully automated, end-to-end AI clinical implementation solution marks a significant step towards a generalizable framework that leverages AI to improve the efficiency and reliability of RT systems.

20.
Yearb Med Inform ; 31(1): 121-130, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36463869

ABSTRACT

OBJECTIVES: Disparities in cancer incidence and outcomes across race, ethnicity, gender, socioeconomic status, and geography are well-documented, but their etiologies are often poorly understood and multifactorial. Clinical informatics can provide tools to better understand and address these disparities by enabling high-throughput analysis of multiple types of data. Here, we review recent efforts in clinical informatics to study and measure disparities in cancer. METHODS: We carried out a narrative review of clinical informatics studies related to cancer disparities and bias published from 2018-2021, with a focus on domains such as real-world data (RWD) analysis, natural language processing (NLP), radiomics, genomics, proteomics, metabolomics, and metagenomics. RESULTS: Clinical informatics studies that investigated cancer disparities across race, ethnicity, gender, and age were identified. Most cancer disparities work within clinical informatics used RWD analysis, NLP, radiomics, and genomics. Emerging applications of clinical informatics to understand cancer disparities, including proteomics, metabolomics, and metagenomics, were less well represented in the literature but are promising future research avenues. Algorithmic bias was identified as an important consideration when developing and implementing cancer clinical informatics techniques, and efforts to address this bias were reviewed. CONCLUSIONS: In recent years, clinical informatics has been used to probe a range of data sources to understand cancer disparities across different populations. As informatics tools become integrated into clinical decision-making, attention will need to be paid to ensure that algorithmic bias does not amplify existing disparities. In our increasingly interconnected medical systems, clinical informatics is poised to untap the full potential of multi-platform health data to address cancer disparities.


Subject(s)
Medical Informatics , Neoplasms , Humans , Neoplasms/epidemiology , Neoplasms/therapy , Genomics , Natural Language Processing , Proteomics