1.
J Biomed Inform ; 157: 104685, 2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39004109

ABSTRACT

BACKGROUND: Risk prediction plays a crucial role in planning for prevention, monitoring, and treatment. Electronic Health Records (EHRs) offer an expansive repository of temporal medical data encompassing both risk factors and outcome indicators essential for effective risk prediction. However, challenges emerge due to the lack of readily available gold-standard outcomes and the complex effects of various risk factors. Compounding these challenges are the false positives in diagnosis codes and the formidable task of pinpointing the onset timing in annotations. OBJECTIVE: We develop a Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) algorithm based on extensive unlabeled longitudinal EHR data augmented by a limited set of gold-standard labels on the binary status information indicating whether the clinical event of interest occurred during the follow-up period. METHODS: The SeDDLeR algorithm calculates an individualized risk of developing future clinical events over time using each patient's baseline EHR features via the following steps: (1) construction of an initial EHR-derived surrogate as a proxy for the onset status; (2) deep learning calibration of the surrogate against the gold-standard onset status; and (3) semi-supervised deep learning for risk prediction combining calibrated surrogates and gold-standard onset status. To account for missing onset time and heterogeneous follow-up, we introduce temporal kernel weighting. We devise a Gated Recurrent Units (GRUs) module to capture temporal characteristics. We subsequently assess our proposed SeDDLeR method in simulation studies and apply the method to the Massachusetts General Brigham (MGB) Biobank to predict type 2 diabetes (T2D) risk. RESULTS: SeDDLeR outperforms benchmark risk prediction methods, including the Semi-parametric Transformation Model (STM) and DeepHit, with consistently the best accuracy across experiments. SeDDLeR achieved the best C-statistic (0.815, SE 0.023; vs STM +0.084, SE 0.030, P-value 0.004; vs DeepHit +0.055, SE 0.027, P-value 0.024) and the best average time-specific AUC (0.778, SE 0.022; vs STM +0.059, SE 0.039, P-value 0.067; vs DeepHit +0.168, SE 0.032, P-value <0.001) in the MGB T2D study. CONCLUSION: SeDDLeR can train robust risk prediction models in both real-world EHR and synthetic datasets with minimal requirements for labeling event times. It holds the potential to be incorporated into future clinical trial recruitment or clinical decision-making.
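
The C-statistic reported in this abstract measures how well predicted risks rank patients by time to event. As an illustration only, the sketch below computes Harrell's concordance index for right-censored data. This is one common estimator, and the abstract does not say which variant the authors used. The function name `c_statistic`, the choice to skip pairs with tied follow-up times, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def c_statistic(times, events, risks):
    """Harrell's concordance index for right-censored data (illustrative).

    times  -- follow-up time for each patient
    events -- 1 if the event was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Assumption: pairs with tied times are skipped, and a pair is
        # comparable only when the earlier patient had an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.2, 0.4, 0.7]
print(c_statistic(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the 0.815 above should be read.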

2.
Sci Rep ; 14(1): 8021, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38580710

ABSTRACT

The Phenome-Wide Association Study (PheWAS) is increasingly used to broadly screen for potential treatment effects, e.g., IL6R variant as a proxy for IL6R antagonists. This approach offers an opportunity to address the limited power in clinical trials to study differential treatment effects across patient subgroups. However, limited methods exist to efficiently test for differences across subgroups in the thousands of multiple comparisons generated as part of a PheWAS. In this study, we developed an approach that maximizes the power to test for heterogeneous genotype-phenotype associations and applied this approach to an IL6R PheWAS among individuals of African (AFR) and European (EUR) ancestries. We identified 29 traits with differences in IL6R variant-phenotype associations, including a lower risk of type 2 diabetes in AFR (OR 0.96) vs EUR (OR 1.0, p-value for heterogeneity = 8.5 × 10⁻³), and higher white blood cell count (p-value for heterogeneity = 8.5 × 10⁻¹³¹). These data suggest a more salutary effect of IL6R blockade for T2D among individuals of AFR vs EUR ancestry and provide data to inform ongoing clinical trials targeting IL6 for an expanding number of conditions. Moreover, the method to test for heterogeneity of associations can be applied broadly to other large-scale genotype-phenotype screens in diverse populations.
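
For orientation, a heterogeneity p-value of the kind quoted in this abstract can be illustrated with the textbook two-group test: a z-test on the difference of log odds ratios, with a Bonferroni threshold across phenotypes. This is not the power-maximizing method the authors developed, which the abstract does not describe. The odds ratios come from the abstract, while the standard errors, the number of phenotypes, and the function names are hypothetical.

```python
import math


def heterogeneity_z_test(or_a, se_a, or_b, se_b):
    """Two-sided z-test for a difference between two log odds ratios.

    se_a and se_b are standard errors of the log odds ratios, assumed
    to come from independent samples (for example, two ancestry groups).
    """
    diff = math.log(or_a) - math.log(or_b)
    z = diff / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# ORs from the abstract; the standard errors (0.01) and the number of
# phenotypes tested (1000) are hypothetical values for illustration.
z, p = heterogeneity_z_test(0.96, 0.01, 1.00, 0.01)
threshold = bonferroni_threshold(0.05, 1000)
print(f"z = {z:.2f}, p = {p:.4f}, passes Bonferroni: {p < threshold}")
```

A Bonferroni correction over thousands of phenotypes is conservative, which is the loss of power that motivates the dedicated approach described in the abstract.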


Subjects
Diabetes Mellitus, Type 2; Humans; Diabetes Mellitus, Type 2/drug therapy; Diabetes Mellitus, Type 2/genetics; Genetic Association Studies; Phenotype; Polymorphism, Single Nucleotide; Receptors, Interleukin-6/genetics
3.
J Biomed Inform ; 134: 104175, 2022 10.
Article in English | MEDLINE | ID: mdl-36064111

ABSTRACT

OBJECTIVE: Electronic Health Record (EHR) based phenotyping is a crucial yet challenging problem in the biomedical field. Though clinicians typically determine patient-level diagnoses via manual chart review, the sheer volume and heterogeneity of EHR data render such tasks challenging, time-consuming, and prohibitively expensive, thus leading to a scarcity of clinical annotations in EHRs. Weakly supervised learning algorithms have been successfully applied to various EHR phenotyping problems, due to their ability to leverage information from large quantities of unlabeled samples to better inform predictions based on a far smaller number of patients. However, most weakly supervised methods face the challenge of choosing the right cutoff value to generate an optimal classifier. Furthermore, since they only utilize the most informative features (i.e., main ICD and NLP counts), they may fail for episodic phenotypes that cannot be consistently detected via ICD and NLP data. In this paper, we propose a label-efficient, weakly semi-supervised deep learning algorithm for EHR phenotyping (WSS-DL), which overcomes the limitations above. MATERIALS AND METHODS: WSS-DL classifies patient-level disease status through a series of learning stages: 1) generating silver standard labels, 2) deriving enhanced-silver-standard labels by fitting a weakly supervised deep learning model to data with silver standard labels as outcomes and high dimensional EHR features as input, and 3) obtaining the final prediction score and classifier by fitting a supervised learning model to data with a minimal number of gold standard labels as the outcome, and the enhanced-silver-standard labels and a minimal set of most informative EHR features as input. To assess the generalizability of WSS-DL across different phenotypes and medical institutions, we apply WSS-DL to classify a total of 17 diseases, including both acute and chronic conditions, using EHR data from three healthcare systems. Additionally, we determine the minimum quantity of training labels required by WSS-DL to outperform existing supervised and semi-supervised phenotyping methods. RESULTS: The proposed method, in combining the strengths of deep learning and weakly semi-supervised learning, successfully leverages the crucial phenotyping information contained in EHR features from unlabeled samples. Indeed, the deep learning model's ability to handle high-dimensional EHR features allows it to generate strong phenotype status predictions from silver standard labels. These predictions, in turn, provide highly effective features in the final logistic regression stage, leading to high phenotyping accuracy in notably small subsets of labeled data (e.g., n = 40 labeled samples). CONCLUSION: Our method's high performance in EHR datasets with very small numbers of labels indicates its potential value in aiding doctors to diagnose rare diseases as well as conditions susceptible to misdiagnosis.
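
The three learning stages listed in this abstract can be sketched in simplified form on synthetic data. This is a toy under stated assumptions, not the authors' implementation: a plain logistic regression stands in for the deep learning model in stage 2, the silver label is a mean-cutoff rule on log ICD plus log NLP counts, and the cohort, feature layout, and all function names are hypothetical. Only the stage structure and the n = 40 gold labels follow the abstract.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, X):
    # w[0] is the intercept; w[1:] are the feature weights.
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) for x in X]


def fit_logistic(X, y, epochs=300, lr=0.3):
    # Batch gradient descent on the logistic loss (stand-in learner).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, p, t in zip(X, predict(w, X), y):
            grad[0] += p - t
            for j, xj in enumerate(x):
                grad[j + 1] += (p - t) * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def simulate(n, rng):
    # Hypothetical cohort: log ICD count, log NLP count, five weaker features.
    X, truth = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        icd = math.log1p(rng.randint(2, 10) if y else rng.randint(0, 3))
        nlp = math.log1p(rng.randint(1, 8) if y else rng.randint(0, 2))
        X.append([icd, nlp] + [rng.gauss(0.8 * y, 1.0) for _ in range(5)])
        truth.append(y)
    return X, truth


def run_sketch(n=300, n_gold=40, seed=0):
    rng = random.Random(seed)
    X, truth = simulate(n, rng)
    # Stage 1: silver labels from the main ICD and NLP counts (assumed rule).
    main = [x[0] + x[1] for x in X]
    cutoff = sum(main) / n
    silver = [1.0 if m > cutoff else 0.0 for m in main]
    # Stage 2: enhanced silver scores from all features, trained on silver
    # labels (logistic regression here, a deep model in the abstract).
    enhanced = predict(fit_logistic(X, silver), X)
    # Stage 3: supervised fit on a small gold-labeled subset, using the
    # enhanced score plus the two main features as inputs.
    Z = [[e, x[0], x[1]] for e, x in zip(enhanced, X)]
    gold = rng.sample(range(n), n_gold)
    w = fit_logistic([Z[i] for i in gold], [truth[i] for i in gold])
    scores = predict(w, Z)
    # Accuracy over the whole cohort, gold-labeled patients included.
    accuracy = sum((s > 0.5) == bool(t) for s, t in zip(scores, truth)) / n
    return scores, truth, accuracy


scores, truth, accuracy = run_sketch()
print(f"accuracy with 40 gold labels: {accuracy:.2f}")
```

The point of the structure is that stages 1 and 2 need no chart review at all, so the scarce gold labels are spent only on the low-dimensional final fit.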


Subjects
Electronic Health Records; Supervised Machine Learning; Algorithms; Logistic Models; Phenotype