1.
AMIA Jt Summits Transl Sci Proc ; 2024: 652-661, 2024.
Article in English | MEDLINE | ID: mdl-38827051

ABSTRACT

Accurate prediction of future clinical events such as discharge from hospital can not only improve hospital resource management but also provide an indicator of a patient's clinical condition. In this work, we compare deep learning-based fusion strategies against traditional single-source models for prediction of discharge from hospital by fusing information encoded in two diverse but relevant data modalities, i.e., chest X-ray images and tabular electronic health records (EHR). We evaluate multiple fusion strategies, including late, early, and joint fusion, in terms of their efficacy for target prediction compared to EHR-only and image-only predictive models. Results indicate the importance of merging information from the two modalities: fusion models tended to outperform single-modality models, and the joint fusion scheme was the most effective for target prediction. The joint fusion model merges the two modalities through a branched neural network that is jointly trained in an end-to-end fashion to extract target-relevant information from both modalities.
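
The difference between early and late fusion named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the feature vectors, and the weighted-average rule for late fusion are all assumptions chosen for illustration. Joint fusion is omitted because it requires a trainable branched network.

```python
from typing import List, Sequence


def early_fusion(image_features: Sequence[float],
                 ehr_features: Sequence[float]) -> List[float]:
    """Early fusion: concatenate per-modality features into one model input."""
    return list(image_features) + list(ehr_features)


def late_fusion(image_prob: float, ehr_prob: float,
                image_weight: float = 0.5) -> float:
    """Late fusion: combine the outputs of two separately trained models.

    A weighted average is one common choice (an assumption here, not
    necessarily the rule used in the paper).
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must lie in [0, 1]")
    return image_weight * image_prob + (1.0 - image_weight) * ehr_prob


# Hypothetical toy inputs: two image features, one EHR feature.
fused_input = early_fusion([0.1, 0.2], [3.0])
# Hypothetical per-model discharge probabilities.
fused_prob = late_fusion(0.8, 0.4)
```

In this sketch early fusion changes what a single model sees, while late fusion leaves each single-modality model untouched and only merges their predictions.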

3.
medRxiv ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38562849

ABSTRACT

Training Large Language Models (LLMs) with billions of parameters on a dataset and publishing the model for public access is currently standard practice. Despite their transformative impact on natural language processing, public LLMs present notable vulnerabilities because their training data is often web-based or crowdsourced and hence can be manipulated by perpetrators. We examine the vulnerability of clinical LLMs to data poisoning attacks, focusing on BioGPT, which is trained on publicly available biomedical literature and clinical notes from MIMIC-III. Exploring susceptibility to data poisoning-based attacks on de-identified breast cancer clinical notes, our approach is the first to assess the extent of such attacks, and our findings reveal successful manipulation of LLM outputs. Through this work, we emphasize the urgency of comprehending these vulnerabilities in LLMs and encourage the mindful and responsible usage of LLMs in the clinical domain.

4.
Circ Cardiovasc Imaging ; 16(12): e014533, 2023 12.
Article in English | MEDLINE | ID: mdl-38073535

ABSTRACT

In addition to the traditional clinical risk factors, an increasing number of imaging biomarkers have shown value for cardiovascular risk prediction. Clinical and imaging data are captured from a variety of data sources during multiple patient encounters and are often analyzed independently. Initial studies showed that fusion of both clinical and imaging features results in superior prognostic performance compared with traditional scores. There are different approaches to fusion modeling, combining multiple data resources to optimize predictions, each with its own advantages and disadvantages. However, manual extraction of clinical and imaging data is time- and labor-intensive and often not feasible in clinical practice. An automated approach for clinical and imaging data extraction is highly desirable. Convolutional neural networks and natural language processing can be utilized for the extraction of electronic medical record data, imaging studies, and free-text data. This review outlines the current status of cardiovascular risk prediction and fusion modeling and gives an overview of different artificial intelligence approaches to automatically extract data from images and electronic medical records for this purpose.


Subjects
Artificial Intelligence; Neural Networks, Computer; Humans; Electronic Health Records; Natural Language Processing; Diagnostic Imaging
5.
Int J Med Inform ; 179: 105212, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37729838

ABSTRACT

BACKGROUND: Billing codes are utilized for medical reimbursement, clinical quality metric valuation, and epidemiologic purposes to report and follow disease trends and outcomes. The current paradigm of manual coding can be expensive, time-consuming, and subject to human error. Though automation of billing codes via rule-based and supervised approaches has been widely reported in the literature, existing strategies lack generalizability and robustness to the large and constantly changing ICD hierarchical structure. METHOD: We propose a weakly supervised training strategy that leverages contrastive learning, contrastive diagnosis embedding (CDE), to capture the fine semantic variations between diagnosis codes. The approach consists of a two-phase contrastive training that generates a semantic embedding space adapted to incorporate the hierarchical information of the ICD-10 vocabulary, and a weakly supervised retrieval scheme. A core strength of the proposed method is that it places no limit on the set of 70 K ICD-10 codes and can handle all rare codes when coding a diagnosis. RESULTS: Our CDE model outperformed string-based partial matching and ClinicalBERT embedding on three test cases (a retrospective test set, a prospective test set, and an external test set) and produced accurate predictions of rare and newly introduced diagnosis codes. A detailed ablation study showed the importance of each phase of the proposed multi-phase training. Each successive phase of training (ICD-10 group-sensitive training, phase 1.1; ICD-10 subgroup-sensitive training, phase 1.2; free-text diagnosis description-based training, phase 2) improved performance beyond the previous phase. The model also outperformed existing supervised models such as CAML and PLM-ICD and produced satisfactory performance on the rare codes. CONCLUSION: Compared to the existing rule-based and supervised models, the proposed weakly supervised contrastive learning overcomes limitations in generalization capability and increases the robustness of automated billing. Such a model will allow flexibility through accurate billing code automation for practice convergence and will gain efficiencies in a value-based care payment environment.
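
A retrieval scheme over an embedding space, as described in this abstract, can be sketched as a nearest-neighbour lookup by cosine similarity. This is an illustrative sketch only: the two-dimensional embeddings below are made-up values, the function names are hypothetical, and cosine similarity is an assumed scoring rule rather than the one confirmed by the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_codes(query_embedding: Sequence[float],
                   code_embeddings: Dict[str, Sequence[float]],
                   top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank every candidate code by similarity to the query embedding."""
    scored = [(code, cosine_similarity(query_embedding, emb))
              for code, emb in code_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up 2-D embeddings for three ICD-10 codes (illustration only).
toy_codes = {"I10": [1.0, 0.0], "E11.9": [0.0, 1.0], "I11.9": [0.9, 0.1]}
# Made-up embedding of a free-text diagnosis description.
ranking = retrieve_codes([1.0, 0.05], toy_codes, top_k=2)
```

Because retrieval ranks all candidate codes rather than classifying into a fixed label set, a new or rare code only needs an embedding to become retrievable, which is consistent with the abstract's claim about rare codes.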

6.
J Med Imaging (Bellingham) ; 10(3): 034004, 2023 May.
Article in English | MEDLINE | ID: mdl-37388280

ABSTRACT

Purpose: Our study investigates whether graph-based fusion of imaging data with non-imaging electronic health records (EHR) data can improve the prediction of disease trajectories for patients with coronavirus disease 2019 (COVID-19) beyond the prediction performance of only imaging or non-imaging EHR data. Approach: We present a fusion framework for fine-grained clinical outcome prediction [discharge, intensive care unit (ICU) admission, or death] that fuses imaging and non-imaging information using a similarity-based graph structure. Node features are represented by image embeddings, and edges are encoded with clinical or demographic similarity. Results: Experiments on data collected from the Emory Healthcare Network indicate that our fusion modeling scheme performs consistently better than predictive models developed using only imaging or non-imaging features, with area under the receiver operating characteristic curve of 0.76, 0.90, and 0.75 for discharge from hospital, mortality, and ICU admission, respectively. External validation was performed on data collected from the Mayo Clinic. Our scheme highlights known biases in the model prediction, such as bias against patients with alcohol abuse history and bias based on insurance status. Conclusions: Our study signifies the importance of the fusion of multiple data modalities for the accurate prediction of clinical trajectories. The proposed graph structure can model relationships between patients based on non-imaging EHR data, and graph convolutional networks can fuse this relationship information with imaging data to predict future disease trajectory more effectively than models employing only imaging or non-imaging data. Our graph-based fusion modeling frameworks can be easily extended to other prediction tasks to efficiently combine imaging data with non-imaging clinical data.
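
The graph structure described here (image embeddings as node features, EHR similarity as edges) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' model: the shared-attribute edge rule, the unweighted mean aggregation with a self-loop, and all names and values are hypothetical, and a real graph convolutional layer would add learned weights and a nonlinearity.

```python
from typing import Dict, List, Sequence, Set


def build_edges(ehr: Dict[str, Set[str]],
                min_shared: int = 1) -> Dict[str, Set[str]]:
    """Connect two patients when they share at least min_shared EHR attributes."""
    neighbors: Dict[str, Set[str]] = {patient: set() for patient in ehr}
    ids = sorted(ehr)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if len(ehr[a] & ehr[b]) >= min_shared:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


def aggregate(embeddings: Dict[str, Sequence[float]],
              neighbors: Dict[str, Set[str]]) -> Dict[str, List[float]]:
    """One message-passing step: average a node's embedding with its neighbours'."""
    updated: Dict[str, List[float]] = {}
    for node, own in embeddings.items():
        group = [own] + [embeddings[n] for n in sorted(neighbors[node])]
        updated[node] = [sum(values) / len(group) for values in zip(*group)]
    return updated


# Hypothetical EHR attributes and image embeddings for three patients.
toy_ehr = {"p1": {"diabetes", "age60+"}, "p2": {"diabetes"}, "p3": {"asthma"}}
toy_embeddings = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [4.0, 4.0]}
toy_edges = build_edges(toy_ehr)
fused = aggregate(toy_embeddings, toy_edges)
```

After one step, patients p1 and p2 carry a blend of each other's imaging features because their records overlap, while the isolated patient p3 keeps its own embedding.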

7.
J Am Med Inform Assoc ; 30(6): 1056-1067, 2023 05 19.
Article in English | MEDLINE | ID: mdl-37027831

ABSTRACT

OBJECTIVE: Hospital-acquired infections (HAIs) are one of the top 10 leading causes of death within the United States. While the current standard of HAI risk prediction utilizes only a narrow set of predefined clinical variables, we propose a graph convolutional neural network (GNN)-based model that incorporates a wide variety of clinical features. MATERIALS AND METHODS: Our GNN-based model defines patients' similarity based on comprehensive clinical history and demographics and predicts all types of HAI rather than focusing on a single subtype. An HAI model was trained on 38 327 unique hospitalizations, while a distinct model for surgical site infection (SSI) prediction was trained on 18 609 hospitalizations. Both models were tested internally and externally on a geographically disparate site with varying infection rates. RESULTS: The proposed approach outperformed all baselines (single-modality models and length of stay [LoS]), achieving areas under the receiver operating characteristic curve of 0.86 [0.84-0.88] and 0.79 [0.75-0.83] (HAI), and 0.79 [0.75-0.83] and 0.76 [0.71-0.76] (SSI) for internal and external testing. Cost-effectiveness analysis shows that GNN modeling dominated the standard LoS model strategy on the basis of lower mean costs ($1651 vs $1915). DISCUSSION: The proposed HAI risk prediction model can estimate an individualized risk of infection for a patient by taking into account not only the patient's clinical features but also the clinical features of similar patients, as indicated by the edges of the patients' graph. CONCLUSIONS: The proposed model could allow prevention or earlier detection of HAI, which in turn could decrease hospital LoS and associated mortality and ultimately reduce healthcare costs.


Subjects
Cross Infection; Humans; United States; Cross Infection/prevention & control; Hospitalization; Length of Stay; Health Care Costs; Surgical Wound Infection; Hospitals
8.
Article in English | MEDLINE | ID: mdl-37018684

ABSTRACT

Reduction in the 30-day readmission rate is an important quality factor for hospitals, as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, prior models for hospital readmission prediction have several limitations: (a) only patients with certain conditions are considered, (b) data temporality is not leveraged, (c) individual admissions are assumed independent of each other, which ignores patient similarity, and (d) models are limited to single-modality or single-center data. In this study, we propose a multimodal, spatiotemporal graph neural network (MM-STGNN) for prediction of 30-day all-cause hospital readmission, which fuses in-patient multimodal, longitudinal data and models patient similarity using a graph. Using longitudinal chest radiographs and electronic health records from two independent centers, we show that MM-STGNN achieved an area under the receiver operating characteristic curve (AUROC) of 0.79 on both datasets. Furthermore, MM-STGNN significantly outperformed the current clinical reference standard, LACE+ (AUROC=0.61), on the internal dataset. For subset populations of patients with heart disease, our model significantly outperformed baselines such as gradient-boosting and Long Short-Term Memory models (e.g., AUROC improved by 3.7 points in patients with heart disease). Qualitative interpretability analysis indicated that while patients' primary diagnoses were not explicitly used to train the model, features crucial for model prediction may reflect patients' diagnoses. Our model could be utilized as an additional clinical decision aid during discharge disposition and for triaging high-risk patients for closer post-discharge follow-up and potential preventive measures.
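
Several abstracts in this listing report AUROC. As a reminder of what the number measures, the sketch below computes it directly from its probabilistic definition: the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy labels and scores are assumptions for illustration; they are not data from any of the listed studies.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical readmission labels (1 = readmitted) and model risk scores.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
```

The pairwise loop is quadratic and only suitable for small examples; it is written this way to mirror the definition rather than to be efficient.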

9.
Med Phys ; 50(7): 4296-4307, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36748265

ABSTRACT

BACKGROUND: While low bone density is a major burden on the US health system, current osteoporosis screening guidelines by the US Preventive Services Task Force are limited to women aged ≥65 and all postmenopausal women with certain risk factors. Even within recommended screening groups, actual screening rates are low (<26%) and vary across socioeconomic groups. The proposed model can use abdominal CT studies to opportunistically screen for low bone density in patients who may otherwise go undiagnosed. PURPOSE: To develop an artificial intelligence (AI) model for opportunistic screening of low bone density using both contrast and non-contrast abdominopelvic computed tomography (CT) exams, for the purpose of referral to traditional bone health management, which typically begins with dual-energy X-ray absorptiometry (DXA). METHODS: We collected 6083 contrast-enhanced CT imaging exams paired with DXA exams within ±6 months, documented between May 2015 and August 2021 in a single institution with four major healthcare practice regions. Our fusion AI pipeline receives the coronal and axial plane images of a contrast-enhanced abdominopelvic CT exam and basic patient demographics (age, gender, body cross-section lengths) to predict the risk of low bone mass. The models were trained on lumbar spine T-scores from DXA exams and tested on multi-site imaging exams. The model was again tested in a prospective group (N = 344) of contrast-enhanced and non-contrast-enhanced studies. RESULTS: The models were evaluated on the same test set (1208 exams): (1) baseline model using demographic factors from electronic medical records (EMR), 0.7 area under the receiver operating characteristic curve (AUROC); imaging-based models: (2) axial view, 0.83 AUROC; (3) coronal view, 0.83 AUROC; (4) fusion model (imaging + demographic factors), 0.86 AUROC. The prospective test yielded one missed positive DXA case with a hip prosthesis among 23 positive contrast-enhanced CT exams and a 0% false positive rate for non-contrast studies. Both positive cases among non-contrast-enhanced CT exams were successfully detected. While only about 8% of patients from the prospective study received a DXA exam within 2 years, about 30% were detected with low bone mass by the fusion model, highlighting the need for opportunistic screening. CONCLUSIONS: The fusion model, which combines two planes of CT images and EMR data, outperformed individual models and provided high, robust diagnostic performance for opportunistic screening of low bone density using contrast and non-contrast CT exams. This model could potentially improve bone health risk assessment at no additional cost. The model's handling of metal implants is an ongoing effort.


Subjects
Bone Diseases, Metabolic; Osteoporosis; Humans; Female; Osteoporosis/diagnostic imaging; Bone Density; Artificial Intelligence; Prospective Studies; Absorptiometry, Photon; Tomography, X-Ray Computed/methods; Lumbar Vertebrae; Retrospective Studies
10.
AMIA Annu Symp Proc ; 2023: 679-688, 2023.
Article in English | MEDLINE | ID: mdl-38222398

ABSTRACT

Intelligent prediction of the risk of blood transfusion among hospitalized patients can identify at-risk patients and provide timely information to the hospital to plan and reserve resources to meet the demand for blood transfusion. While previously proposed solutions focus on sub-populations, such as patients admitted to the ICU after gastrointestinal bleeding or postpartum patients with hemorrhage, we design a predictive model applicable to the complete in-patient population. Our model relies on a patient similarity graph based on temporal patterns in the patients' clinical histories. These graphs are processed through a graph convolutional neural network (GCNN) to estimate node-level, i.e., patient-level, risk of blood transfusion. Thus, our model learns not only from the patient's own clinical history but also from other patients with similar clinical histories. The model is also capable of fusing diverse data elements from electronic health records (EHR), such as demographic information, billing codes, and recorded vital signs. Our model was validated on both internal and external sets and outperformed all comparative baseline models.


Subjects
Blood Transfusion; Neural Networks, Computer; Female; Humans; Vital Signs
11.
medRxiv ; 2022 Oct 28.
Article in English | MEDLINE | ID: mdl-36324799

ABSTRACT

We propose a relational graph to incorporate clinical similarity between patients while building personalized clinical event predictors, with a focus on hospitalized COVID-19 patients. Our graph formation process fuses heterogeneous data, i.e., chest X-rays as node features and non-imaging EHR for edge formation. While a node represents a snapshot in time for a single patient, the weighted edge structure encodes complex clinical patterns among patients. While age and gender have been used in the past for patient graph formation, our method incorporates complex clinical history while avoiding manual feature selection. The model learns from the patient's own data as well as patterns among clinically similar patients. Our visualization study investigates the effects of the 'neighborhood' of a node on its predictiveness and showcases the model's tendency to focus on edge-connected patients with highly suggestive clinical features in common with the node. The proposed model generalizes well by allowing the edge formation process to adapt to an external cohort.

12.
J Pathol Inform ; 13: 100003, 2022.
Article in English | MEDLINE | ID: mdl-35242443

ABSTRACT

Pathology reports primarily consist of unstructured free text, and thus the clinical information contained in the reports is not trivial to access or query. Multiple natural language processing (NLP) techniques have been proposed to automate the coding of pathology reports via text classification. In this systematic review, we follow the guidelines proposed by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al., 2020: BMJ.) to identify NLP systems for classifying pathology reports published between 2010 and 2021. Based on our search criteria, a total of 3445 records were retrieved, and 25 articles met the final review criteria. We benchmarked the systems based on methodology, complexity of the prediction task, and core types of NLP models: i) rule-based and intelligent systems, ii) statistical machine learning, and iii) deep learning. While certain tasks are well addressed by these models, many others have limitations and remain open challenges, such as the extraction of many cancer characteristics (size, shape, type of cancer, others) from pathology reports. We investigated the final set of papers (25) and addressed their potential as well as their limitations. We hope that this systematic review helps researchers prioritize the development of innovative approaches to tackle the current limitations and helps the advancement of cancer research.

13.
J Biomed Semantics ; 13(1): 8, 2022 02 23.
Article in English | MEDLINE | ID: mdl-35197110

ABSTRACT

BACKGROUND: Transfer learning is a common practice in image classification with deep learning, where the available data is often too limited for training a complex model with millions of parameters. However, transferring language models requires special attention, since cross-domain vocabularies (e.g., between two different modalities, MR and US) do not always overlap, whereas pixel intensity ranges mostly do overlap for images. METHOD: We present a concept of similar domain adaptation in which we transfer inter-institutional language models (context-dependent and context-independent) between two different modalities (ultrasound and MRI) to capture liver abnormalities. RESULTS: We use MR and US screening exam reports for hepatocellular carcinoma as the use case and apply the transfer language space strategy to automatically label imaging exams with and without a structured template, with > 0.9 average F1-score. CONCLUSION: We conclude that transfer learning along with fine-tuning of the discriminative model is often more effective for performing shared targeted tasks than training a language space from scratch.


Subjects
Carcinoma, Hepatocellular; Liver Neoplasms; Carcinoma, Hepatocellular/diagnostic imaging; Humans; Language; Liver Neoplasms/diagnostic imaging; Natural Language Processing
14.
J Digit Imaging ; 35(2): 137-152, 2022 04.
Article in English | MEDLINE | ID: mdl-35022924

ABSTRACT

In recent years, generative adversarial networks (GANs) have gained tremendous popularity for various imaging-related tasks, such as artificial image generation to support AI training. GANs are especially useful for medical imaging-related tasks, where training datasets are usually limited in size and heavily imbalanced against the diseased class. We present a systematic review, following the PRISMA guidelines, of recent GAN architectures used for medical image analysis to help readers make an informed decision before employing GANs in developing medical image classification and segmentation models. We extracted 54 papers, published between January 2015 and August 2020 and meeting the inclusion criteria for meta-analysis, that highlight the capabilities and applications of GANs in medical imaging. Our results show four main GAN architectures that are used for segmentation or classification in medical imaging. We provide a comprehensive overview of recent trends in the application of GANs in clinical diagnosis through medical image segmentation and classification and share experiences for task-based GAN implementations.


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Humans; Image Processing, Computer-Assisted/methods
15.
AMIA Annu Symp Proc ; 2022: 962-971, 2022.
Article in English | MEDLINE | ID: mdl-37128387

ABSTRACT

Pathology text mining is a challenging task given the reporting variability and constant new findings in cancer sub-type definitions. However, successful text mining of a large pathology database can play a critical role in advancing 'big data' cancer research such as similarity-based treatment selection, case identification, prognostication, surveillance, clinical trial screening, risk stratification, and many others. While there is a growing interest in developing language models for more specific clinical domains, no pathology-specific language space exists to support rapid data-mining development in the pathology space. In the literature, a few approaches fine-tuned general transformer models on specialized corpora while maintaining the original tokenizer, but in fields requiring specialized terminology, these models often fail to perform adequately. We propose PathologyBERT, a pre-trained masked language model that was trained on 347,173 histopathology specimen reports and publicly released in the Hugging Face repository. Our comprehensive experiments demonstrate that pre-training a transformer model on pathology corpora yields performance improvements on Natural Language Understanding (NLU) and breast cancer diagnosis classification when compared to nonspecific language models.


Subjects
Breast Neoplasms; Natural Language Processing; Humans; Female; Language; Data Mining; Big Data
16.
AMIA Annu Symp Proc ; 2022: 1052-1061, 2022.
Article in English | MEDLINE | ID: mdl-37128395

ABSTRACT

We propose a relational graph to incorporate clinical similarity between patients while building personalized clinical event predictors, with a focus on hospitalized COVID-19 patients. Our graph formation process fuses heterogeneous data, i.e., chest X-rays as node features and non-imaging EHR for edge formation. While a node represents a snapshot in time for a single patient, the weighted edge structure encodes complex clinical patterns among patients. While age and gender have been used in the past for patient graph formation, our method incorporates complex clinical history while avoiding manual feature selection. The model learns from the patient's own data as well as patterns among clinically similar patients. Our visualization study investigates the effects of the 'neighborhood' of a node on its predictiveness and showcases the model's tendency to focus on edge-connected patients with highly suggestive clinical features in common with the node. The proposed model generalizes well by allowing the edge formation process to adapt to an external cohort.


Subjects
COVID-19; Humans; Learning
17.
PLOS Glob Public Health ; 2(8): e0000918, 2022.
Article in English | MEDLINE | ID: mdl-36962801

ABSTRACT

Despite successes on the Sustainable Development Goals for access to improved water sources and sanitation, many low- and middle-income countries (LMICs) continue to struggle with high rates of diarrheal disease. In Guatemala, 98% of water sources are estimated to have E. coli contamination. This project moves toward a novel low-cost approach to bridge the gap between the microbiologic identification of E. coli and the vast impact that this pathogen has on human health within marginalized communities, using co-designed community-based tools, low-cost technology, and AI. An agile co-design process was followed with water quality stakeholders, community staff, and local graphic design artists to develop a community water quality education mobile app. A series of alpha- and beta-testers completed interactive demonstration, feedback, and in-depth interview sessions. A microbiology lab in Guatemala developed and piloted field protocols with lay community workers to collect and process water samples. A preliminary artificial intelligence (AI) algorithm was developed to detect the presence of E. coli in images generated from community-derived water samples. The mobile app emerged as a pictorial and audio-driven community-facing tool. The field protocol for water sampling and testing was successfully implemented by lay community workers. Feedback from the community workers indicated both desire and ability to conduct the water sampling and testing protocol under field conditions. However, images derived from the low-cost $2 microscope in field conditions were not of suitable quality for AI object detection of E. coli, and additional low-cost technologies are being considered. The preliminary AI object detection algorithm performed at 94% accuracy on lab-derived images in identifying E. coli in comparison to the Chromocult gold standard.

18.
J Biomed Inform ; 123: 103918, 2021 11.
Article in English | MEDLINE | ID: mdl-34560275

ABSTRACT

OBJECTIVE: With increasingly complex patients whose data are stored in fragmented health information systems, automated and time-efficient ways of gathering important information from patients' medical histories are needed for effective clinical decision making. Using COVID-19 as a case study, we developed a query-bot information retrieval system with user feedback that allows clinicians to ask natural questions to retrieve data from patient notes. MATERIALS AND METHODS: We applied clinicalBERT, a pre-trained contextual language model, to our dataset of patient notes to obtain sentence embeddings, using K-Means to reduce computation time for real-time interaction. The Rocchio algorithm was then employed to incorporate user feedback and improve retrieval performance. RESULTS: In an iterative feedback loop experiment, mean average precision (MAP) for the final iteration was 0.93/0.94 compared to an initial MAP of 0.66/0.52 for generic queries, and 1.0/1.0 compared to 0.79/0.83 for COVID-19 specific queries, confirming that the contextual model handles the ambiguity in natural language queries and that feedback helps to improve retrieval performance. The user-in-loop experiment also outperformed the automated pseudo relevance feedback method. Moreover, the null hypothesis, which assumes identical precision between initial retrieval and relevance feedback, was rejected with high statistical significance (p ≪ 0.05). Compared to Word2Vec, TF-IDF, and bioBERT models, clinicalBERT works optimally considering the balance between response precision and user feedback. DISCUSSION: Our model works well for generic as well as COVID-19 specific queries. However, some generic queries are not answered as well as others because clustering reduces query performance and vague relations between queries and sentences are considered non-relevant. We also tested our model on queries with the same meaning but different expressions and demonstrated that these query variations yielded similar performance after incorporation of user feedback. CONCLUSION: We developed an NLP-based query-bot that handles synonyms and natural language ambiguity in order to retrieve relevant information from the patient chart. User feedback is critical to improving model performance.


Subjects
COVID-19; Algorithms; Feedback; Humans; Information Storage and Retrieval; SARS-CoV-2
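
The Rocchio relevance-feedback step named in the abstract above can be sketched as follows. This uses the textbook form of the update, q' = αq + β·mean(relevant) − γ·mean(non-relevant); the default weights are commonly cited textbook values and are an assumption here, as are the function names and toy vectors. The abstract does not state which weights or embedding dimensions the authors used.

```python
from typing import List, Sequence


def centroid(vectors: Sequence[Sequence[float]], dim: int) -> List[float]:
    """Mean of a set of vectors; the zero vector if the set is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rocchio(query: Sequence[float],
            relevant: Sequence[Sequence[float]],
            non_relevant: Sequence[Sequence[float]],
            alpha: float = 1.0, beta: float = 0.75,
            gamma: float = 0.15) -> List[float]:
    """Move the query toward relevant results and away from non-relevant ones."""
    dim = len(query)
    rel_centroid = centroid(relevant, dim)
    non_centroid = centroid(non_relevant, dim)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_centroid, non_centroid)]


# Hypothetical 2-D sentence embeddings marked by a user as relevant or not.
updated_query = rocchio(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 3.0]],
    non_relevant=[[2.0, 0.0]],
    alpha=1.0, beta=0.5, gamma=0.25,
)
```

The updated query vector would then be used to re-rank sentences in the next feedback iteration, which matches the iterative loop described in the RESULTS section.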
19.
NPJ Digit Med ; 4(1): 94, 2021 Jun 03.
Article in English | MEDLINE | ID: mdl-34083734

ABSTRACT

The strain on healthcare resources brought forth by the recent COVID-19 pandemic has highlighted the need for efficient resource planning and allocation through the prediction of future consumption. Machine learning can predict resource utilization, such as the need for hospitalization, based on past medical data stored in electronic medical records (EMR). We conducted this study on 3194 patients (46% male with mean age 56.7 (±16.8), 56% African American, 7% Hispanic) flagged as COVID-19 positive cases in 12 centers under the Emory Healthcare network from February 2020 to September 2020, to assess whether a COVID-19 positive patient's need for hospitalization can be predicted at the time of the RT-PCR test using the EMR data prior to the test. Five main modalities of EMR, i.e., demographics, medication, past medical procedures, comorbidities, and laboratory results, were used as features for predictive modeling, both individually and fused together using late, middle, and early fusion. Models were evaluated in terms of precision, recall, and F1-score (with 95% confidence intervals). The early fusion model is the most effective predictor, with 84% overall F1-score [CI 82.1-86.1]. The predictive performance of the model drops by 6% when using recent clinical data while omitting the long-term medical history. Feature importance analysis indicates that history of cardiovascular disease, emergency room visits in the year prior to testing, and demographic factors are predictive of the disease trajectory. We conclude that fusion modeling using medical history and current treatment data can forecast the need for hospitalization for patients infected with COVID-19 at the time of the RT-PCR test.

20.
Sci Rep ; 11(1): 9461, 2021 05 04.
Article in English | MEDLINE | ID: mdl-33947927

ABSTRACT

Efficient prediction of cancer recurrence in advance may help to recruit high-risk breast cancer patients for clinical trials on time and can guide a proper treatment plan. Several machine learning approaches have been developed for recurrence prediction in previous studies, but most of them use only structured electronic health records and only a small training dataset, with limited success in clinical application. While free-text clinic notes may offer the greatest nuance and detail about a patient's clinical status, they have largely been excluded from previous predictive models due to the increase in processing complexity and the need for a complex modeling framework. In this study, we developed a weak-supervision framework for breast cancer recurrence prediction in which we trained a deep learning model on a large sample of free-text clinic notes by utilizing a combination of manually curated labels and NLP-generated imperfect recurrence labels. The model was trained jointly on manually curated data from 670 patients and NLP-curated data from 8062 patients. It was validated on manually annotated data from 224 patients with recurrence and achieved 0.94 AUROC. This weak supervision approach allowed us to learn from a larger dataset using imperfect labels and ultimately provided greater accuracy compared to a smaller hand-curated dataset, with less manual effort invested in curation.


Subjects
Breast Neoplasms/pathology; Neoplasm Recurrence, Local/pathology; Chronic Disease; Electronic Health Records; Female; Humans; Machine Learning; Middle Aged