Results 1 - 20 of 88
1.
JAMIA Open ; 7(1): ooae015, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38414534

ABSTRACT

Objectives: In the United States, end-stage kidney disease (ESKD) is responsible for high mortality and significant healthcare costs, with the number of cases sharply increasing in the past 2 decades. In this study, we aimed to reduce these impacts by developing an ESKD model for predicting its occurrence in a 2-year period. Materials and Methods: We developed a machine learning (ML) pipeline to test different models for the prediction of ESKD. The electronic health record was used to capture several kidney disease-related variables. Various imputation methods, feature selection, and sampling approaches were tested. We compared the performance of multiple ML models using area under the ROC curve (AUCROC), area under the Precision-Recall curve (PR-AUC), and Brier scores for discrimination, precision, and calibration, respectively. Explainability methods were applied to the final model. Results: Our best model was a gradient-boosting machine with feature selection and imputation methods as additional components. The model exhibited an AUCROC of 0.97, a PR-AUC of 0.33, and a Brier score of 0.002 on a holdout test set. A chart review analysis by expert physicians indicated clinical utility. Discussion and Conclusion: An ESKD prediction model can identify individuals at risk for ESKD and has been successfully deployed within our health system.
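The three metrics named in this abstract can be sketched in plain Python. This is a minimal illustration on made-up labels and probabilities, not the study's pipeline or data; the function names and the toy arrays are assumptions, and PR-AUC is approximated here by step-wise average precision.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Calibration: mean squared gap between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Discrimination: chance that a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / sum(y_true)


# Hypothetical, imbalanced toy data (2 events among 10 patients).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
y_prob = [0.01, 0.02, 0.05, 0.03, 0.10, 0.02, 0.40, 0.35, 0.04, 0.80]

print("AUROC:", auroc(y_true, y_prob))
print("PR-AUC (average precision):", average_precision(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
```

When the outcome is rare, AUROC can stay high while PR-AUC is much lower, because precision is sensitive to the small number of true events. That is one plausible reading of the gap between the 0.97 and 0.33 reported above.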

2.
Open Forum Infect Dis ; 11(2): ofae030, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38379573

ABSTRACT

Introduction: Initiation of medications for opioid use disorder (MOUD) within the hospital setting may improve outcomes for people who inject drugs (PWID) hospitalized because of an infection. Many studies used International Classification of Diseases (ICD) codes to identify PWID, although these codes may misclassify patients and thus be inaccurate. We hypothesized that bias from misclassification of PWID using ICD codes may impact analyses of MOUD outcomes. Methods: We analyzed a cohort of 36 868 cases of patients diagnosed with Staphylococcus aureus bacteremia at 124 US Veterans Health Administration hospitals between 2003 and 2014. To identify PWID, we implemented an ICD code-based algorithm and a natural language processing (NLP) algorithm for classification of admission notes. We analyzed outcomes of prescribing MOUD as an inpatient using both approaches. Our primary outcome was 365-day all-cause mortality. We fit mixed-effects Cox regression models with receipt of MOUD during the index hospitalization (yes or no) as the primary predictor and 365-day mortality as the outcome. Results: NLP identified 2389 cases as PWID, whereas ICD codes identified 6804 cases as PWID. In the cohort identified by NLP, receipt of inpatient MOUD was associated with a protective effect on 365-day survival (adjusted hazard ratio, 0.48; 95% confidence interval, .29-.81; P < .01) compared with those not receiving MOUD. There was no significant effect of MOUD receipt in the cohort identified by ICD codes (adjusted hazard ratio, 1.00; 95% confidence interval, .77-1.30; P = .99). Conclusions: MOUD was protective against all-cause mortality when NLP was used to identify PWID, but its effect was not significant when ICD codes were used to identify the analytic subjects.

3.
J Biomed Inform ; 149: 104551, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38000765

ABSTRACT

The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lacks standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics, etc.) and ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that autocompletes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.


Subject(s)
Biomedical Research , Reproducibility of Results , Algorithms , Records , Metadata
4.
Res Sq ; 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-38014280

ABSTRACT

Continuous renal replacement therapy (CRRT) is a form of dialysis prescribed to severely ill patients who cannot tolerate regular hemodialysis. However, as the patients are typically very ill to begin with, there is always uncertainty as to whether they will survive during or after CRRT treatment. Because of outcome uncertainty, a large percentage of patients treated with CRRT do not survive, utilizing scarce resources and raising false hope in patients and their families. To address these issues, we present a machine-learning-based algorithm to predict if patients will survive after being treated with CRRT. We use information extracted from electronic health records from patients who were placed on CRRT at multiple institutions to train a model that predicts CRRT survival outcome; on a held-out test set, the model achieved an area under the receiver operating characteristic curve of 0.929 (CI=0.917-0.942). Feature importance, error, and subgroup analyses consistently identified mean corpuscular volume as a driving feature for model predictions. Overall, we demonstrate the potential for predictive machine-learning models to assist clinicians in alleviating the uncertainty of CRRT patient survival outcomes, with opportunities for future improvement through further data collection and advanced modeling.

5.
J Biomed Inform ; 135: 104214, 2022 11.
Article in English | MEDLINE | ID: mdl-36220544

ABSTRACT

To better understand the challenges of generally implementing and adapting computational phenotyping approaches, the performance of a Phenotype KnowledgeBase (PheKB) algorithm for rheumatoid arthritis (RA) was evaluated on a University of California, Los Angeles (UCLA) patient population, focusing on examining its performance on ambiguous cases. The algorithm was evaluated on a cohort of 4,766 patients, along with a chart review of 300 patients by rheumatologists against accepted diagnostic guidelines. The performance revealed low sensitivity towards specific subtypes of positive RA cases, which suggests revisions in features used for phenotyping. A close examination of select cases also indicated a significant portion of patients with missing data, drawing attention to the need to consider data integrity as an integral part of phenotyping pipelines, as well as issues around the usability of various codes for distinguishing cases. We use patterns in the PheKB algorithm's errors to further demonstrate important considerations when designing a phenotyping algorithm.


Subject(s)
Arthritis, Rheumatoid , Electronic Health Records , Humans , Algorithms , Knowledge Bases , Phenotype , Arthritis, Rheumatoid/diagnosis , Arthritis, Rheumatoid/epidemiology
6.
Open Forum Infect Dis ; 9(9): ofac471, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36168546

ABSTRACT

Background: Improving the identification of people who inject drugs (PWID) in electronic medical records can improve clinical decision making, risk assessment and mitigation, and health service research. Identification of PWID currently consists of heterogeneous, nonspecific International Classification of Diseases (ICD) codes as proxies. Natural language processing (NLP) and machine learning (ML) methods may have better diagnostic metrics than nonspecific ICD codes for identifying PWID. Methods: We manually reviewed 1000 records of patients diagnosed with Staphylococcus aureus bacteremia admitted to Veterans Health Administration hospitals from 2003 through 2014. The manual review was the reference standard. We developed and trained NLP/ML algorithms with and without regular expression filters for negation (NegEx) and compared these with 11 proxy combinations of ICD codes to identify PWID. Data were split 70% for training and 30% for testing. We calculated diagnostic metrics and estimated 95% confidence intervals (CIs) by bootstrapping the hold-out test set. Best models were determined by best F-score, a summary of sensitivity and positive predictive value. Results: Random forest with and without NegEx were the best-performing NLP/ML algorithms in the training set. Random forest with NegEx outperformed all ICD-based algorithms. F-score for the best NLP/ML algorithm was 0.905 (95% CI, .786-.967) and 0.592 (95% CI, .550-.632) for the best ICD-based algorithm. The NLP/ML algorithm had a sensitivity of 92.6% and specificity of 95.4%. Conclusions: NLP/ML outperformed ICD-based coding algorithms at identifying PWID in electronic health records. NLP/ML models should be considered in identifying cohorts of PWID to improve clinical decision making, health services research, and administrative surveillance.
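The evaluation step described here (an F-score with a bootstrapped 95% CI on the hold-out set) might look roughly like the sketch below. The abstract does not say which bootstrap variant was used, so the percentile method, the number of resamples, the seed, and the toy labels are all assumptions for illustration.

```python
import random
from typing import List, Sequence, Tuple


def f_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of sensitivity and positive predictive value."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    ppv, sens = tp / (tp + fp), tp / (tp + fn)
    return 2 * ppv * sens / (ppv + sens)


def bootstrap_ci(y_true: Sequence[int], y_pred: Sequence[int],
                 n_boot: int = 2000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI: resample test records with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(f_score([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical hold-out set: 20 PWID and 80 non-PWID records.
y_true = [1] * 20 + [0] * 80
y_pred = [1] * 18 + [0] * 2 + [1] * 3 + [0] * 77  # 18 TP, 2 FN, 3 FP, 77 TN

point = f_score(y_true, y_pred)
lower, upper = bootstrap_ci(y_true, y_pred)
print(f"F-score {point:.3f} (95% CI, {lower:.3f}-{upper:.3f})")
```

Resampling only the hold-out records keeps the trained model fixed, so the interval reflects sampling variability in the test set rather than variability from retraining.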

7.
J Biomed Inform ; 134: 104168, 2022 10.
Article in English | MEDLINE | ID: mdl-35987449

ABSTRACT

Early detection of heart failure (HF) can provide patients with the opportunity for more timely intervention and better disease management, as well as efficient use of healthcare resources. Recent machine learning (ML) methods have shown promising performance on diagnostic prediction using temporal sequences from electronic health records (EHRs). In practice, however, these models may not generalize to other populations due to dataset shift. Shifts in datasets can be attributed to a range of factors such as variations in demographics, data management methods, and healthcare delivery patterns. In this paper, we use unsupervised adversarial domain adaptation methods to adaptively reduce the impact of dataset shift on cross-institutional transfer performance. The proposed framework is validated on a next-visit HF onset prediction task using a BERT-style Transformer-based language model pre-trained with a masked language modeling (MLM) task. Our model empirically demonstrates superior prediction performance relative to non-adversarial baselines in both transfer directions on two different clinical event sequence data sources.


Subject(s)
Heart Failure , Neural Networks, Computer , Electronic Health Records , Heart Failure/diagnosis , Humans , Information Storage and Retrieval , Language , Machine Learning
8.
JAMA Netw Open ; 5(8): e2225593, 2022 08 01.
Article in English | MEDLINE | ID: mdl-35939303

ABSTRACT

Importance: Overdose is one of the leading causes of death in the US; however, surveillance data lag considerably from medical examiner determination of the death to reporting in national surveillance reports. Objective: To automate the classification of deaths related to substances in medical examiner data using natural language processing (NLP) and machine learning (ML). Design, Setting, and Participants: Diagnostic study comparing different natural language processing and machine learning algorithms to identify substances related to overdose in 10 health jurisdictions in the US from January 1, 2020, to December 31, 2020. Unstructured text from 35 433 medical examiner and coroners' death records was examined. Exposures: Text from each case was manually classified to a substance that was related to the death. Three feature representation methods were used and compared: term frequency-inverse document frequency (TF-IDF), global vectors for word representations (GloVe), and concept unique identifier (CUI) embeddings. Several ML algorithms were trained and best models were selected based on F-scores. The best models were tested on a hold-out test set and results were reported with 95% CIs. Main Outcomes and Measures: Text data from death certificates were classified as any opioid, fentanyl, alcohol, cocaine, methamphetamine, heroin, prescription opioid, and an aggregate of other substances. Diagnostic metrics and 95% CIs were calculated for each combination of feature extraction method and machine learning classifier. Results: Of 35 433 death records analyzed (decedent median age, 58 years [IQR, 41-72 years]; 24 449 [69%] were male), the most common substances related to deaths included any opioid (5739 [16%]), fentanyl (4758 [13%]), alcohol (2866 [8%]), cocaine (2247 [6%]), methamphetamine (1876 [5%]), heroin (1613 [5%]), prescription opioids (1197 [3%]), and any benzodiazepine (1076 [3%]). The CUI embeddings had similar or better diagnostic metrics compared with word embeddings and TF-IDF for all substances except alcohol. ML classifiers had perfect or near perfect performance in classifying deaths related to any opioids, heroin, fentanyl, prescription opioids, methamphetamine, cocaine, and alcohol. Classification of benzodiazepines was suboptimal using all 3 feature extraction methods. Conclusions and Relevance: In this diagnostic study, NLP/ML algorithms demonstrated excellent diagnostic performance at classifying substances related to overdoses. These algorithms should be integrated into workflows to decrease the lag time in reporting overdose surveillance data.
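Of the three feature representations compared, TF-IDF is the simplest to show. The sketch below uses the plain, unsmoothed definition (term frequency times the log of inverse document frequency) on invented cause-of-death strings. The study's actual tokenization and weighting are not described in the abstract, and common libraries often apply smoothed variants, so treat this as an assumption-laden illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per document (unsmoothed TF-IDF)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical free-text entries, not taken from the study data.
records = [
    "acute fentanyl toxicity",
    "acute cocaine toxicity",
    "acute fentanyl and cocaine toxicity",
]

for text, vec in zip(records, tf_idf(records)):
    print(text, "->", {t: round(w, 3) for t, w in vec.items() if w > 0})
```

Terms that occur in every record ("acute", "toxicity") receive a weight of zero, so the substance names carry the signal a downstream classifier would use.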


Subject(s)
Cocaine , Drug Overdose , Methamphetamine , Analgesics, Opioid , Benzodiazepines , Drug Overdose/epidemiology , Female , Fentanyl , Heroin , Humans , Male , Middle Aged , Natural Language Processing
9.
Am J Perinatol ; 2022 Dec 29.
Article in English | MEDLINE | ID: mdl-35752169

ABSTRACT

OBJECTIVE: This study aimed to develop and validate a machine learning (ML) model to predict the probability of a vaginal delivery (Partometer) using data iteratively obtained during labor from the electronic health record. STUDY DESIGN: A retrospective cohort study was conducted of deliveries at an academic, tertiary care hospital from 2013 to 2019 among patients who had at least two cervical examinations. The population was divided into those delivered by physicians with nulliparous term singleton vertex (NTSV) cesarean delivery rates <23.9% (Partometer cohort) and the remainder (control cohort). The cesarean rate among this population of lower risk patients is a standard metric by which to compare provider rates; <23.9% was the Healthy People 2020 goal. A supervised automated ML approach was applied to generate a model for each population. The primary outcome was accuracy of the model developed on the Partometer cohort at 4 hours from admission to labor and delivery. Secondary outcomes included discrimination ability (receiver operating characteristics-area under the curve [ROC-AUC]), precision-recall AUC, and calibration of the Partometer. To assess generalizability, we compared the performance and clinical predictors identified by the Partometer to the control model. RESULTS: There were 37,932 deliveries during the study period; after exclusions, 9,385 deliveries were included in the Partometer cohort and 19,683 in the control cohort. Accuracy of predicting vaginal delivery at 4 hours was 87.1% for the Partometer (ROC-AUC: 0.82). Clinical predictors of greatest importance in the stacked Intrapartum Partometer Model included the Admission Model prediction and ongoing measures of dilatation and station which mirrored those found in the control population. CONCLUSION: Using automated ML and intrapartum factors improved the accuracy of prediction of probability of a vaginal delivery over both previously published models based on logistic regression. Harnessing real-time data and ML could represent the bridge to generating a truly prescriptive tool to augment clinical decision-making, predict labor outcomes, and reduce maternal and neonatal morbidity. KEY POINTS: · Our ML-based model yielded accurate predictions of mode of delivery early in labor. · Predictors for models created on populations with high and low cesarean rates were the same. · An ML-based model may provide meaningful guidance to clinicians managing labor.

10.
Article in English | MEDLINE | ID: mdl-35329265

ABSTRACT

Background: Exposure to air pollution is associated with acute pediatric asthma exacerbations, including reduced lung function, rescue medication usage, and increased symptoms; however, most studies are limited in investigating longitudinal changes in these acute effects. This study aims to investigate the effects of daily air pollution exposure on acute pediatric asthma exacerbation risk using a repeated-measures design. Methods: We conducted a panel study of 40 children aged 8−16 years with moderate-to-severe asthma. We deployed the Biomedical REAl-Time Health Evaluation (BREATHE) Kit developed in the Los Angeles PRISMS Center to continuously monitor personal exposure to particulate matter of aerodynamic diameter < 2.5 µm (PM2.5), relative humidity and temperature, geolocation (GPS), and asthma outcomes including lung function, medication use, and symptoms for 14 days. Hourly ambient (PM2.5, nitrogen dioxide (NO2), ozone (O3)) and traffic-related (nitrogen oxides (NOx) and PM2.5) air pollution exposures were modeled based on location. We used mixed-effects models to examine the association of same day and lagged (up to 2 days) exposures with daily changes in % predicted forced expiratory volume in 1 s (FEV1) and % predicted peak expiratory flow (PEF), count of rescue inhaler puffs, and symptoms. Results: Participants were on average 12.0 years old (range: 8.4−16.8) with mean (SD) morning %predicted FEV1 of 67.9% (17.3%) and PEF of 69.1% (18.4%) and 1.4 (3.5) puffs per day of rescue inhaler use. Participants reported chest tightness, wheeze, trouble breathing, and cough symptoms on 36.4%, 17.5%, 32.3%, and 42.9% of person-days, respectively (n = 217 person-days). One SD increase in previous day O3 exposure was associated with reduced morning (beta [95% CI]: −4.11 [−6.86, −1.36]), evening (−2.65 [−5.19, −0.10]) and daily average %predicted FEV1 (−3.45 [−6.42, −0.47]). Daily (lag 0) exposure to traffic-related PM2.5 was associated with reduced morning %predicted PEF (−3.97 [−7.69, −0.26]) and greater odds of "feeling scared of trouble breathing" symptom (odds ratio [95% CI]: 1.83 [1.03, 3.24]). Exposure to ambient O3, NOx, and NO was significantly associated with increased rescue inhaler use (rate ratio [95% CI]: O3 1.52 [1.02, 2.27], NOx 1.61 [1.23, 2.11], NO 1.80 [1.37, 2.35]). Conclusions: We found significant associations of air pollution exposure with lung function, rescue inhaler use, and "feeling scared of trouble breathing." Our study demonstrates the potential of informatics and wearable sensor technologies for collecting highly resolved, contextual, and personal exposure data for understanding acute pediatric asthma triggers.


Subject(s)
Air Pollutants , Air Pollution , Asthma , Ozone , Air Pollutants/adverse effects , Air Pollutants/analysis , Air Pollution/adverse effects , Air Pollution/analysis , Asthma/epidemiology , Child , Environmental Exposure/adverse effects , Environmental Exposure/analysis , Humans , Nitrogen Dioxide , Ozone/analysis , Particulate Matter/adverse effects , Particulate Matter/analysis
11.
JAMA Netw Open ; 5(3): e222037, 2022 03 01.
Article in English | MEDLINE | ID: mdl-35285922

ABSTRACT

Importance: Living alone, a key proxy of social isolation, is a risk factor for cardiovascular disease. In addition, Black race is associated with less optimal blood pressure (BP) control than in other racial or ethnic groups. However, it is not clear whether living arrangement status modifies the beneficial effects of intensive BP control on reduction in cardiovascular events among Black individuals. Objective: To examine whether the association of intensive BP control with cardiovascular events differs by living arrangement among Black individuals and non-Black individuals (eg, individuals who identified as Alaskan Native, American Indian, Asian, Native Hawaiian, Pacific Islander, White, or other) in the Systolic Blood Pressure Intervention Trial (SPRINT). Design, Setting, and Participants: This secondary analysis incorporated data from SPRINT, a multicenter study of individuals with increased risk for cardiovascular disease and free of diabetes, enrolled at 102 clinical sites in the United States between November 2010 and March 2013. Race and living arrangement (ie, living alone or living with others) were self-reported. Data were collected between November 2010 and March 2013 and analyzed from January 2021 to October 2021. Exposures: The SPRINT participants were randomized to a systolic BP target of either less than 120 mm Hg (intensive treatment group) or less than 140 mm Hg (standard treatment group). Antihypertensive medications were adjusted to achieve the targets in each group. Main Outcomes and Measures: A Cox proportional hazards model was used to investigate the association of intensive treatment with the incident composite cardiovascular outcome (by August 20, 2015) according to living arrangement among Black individuals and other individuals. A transportability formula was applied to generalize the SPRINT findings to hypothetical external populations by varying the proportion of Black race and living arrangement status. Results: Among the 9342 total participants, the mean (SD) age was 67.9 (9.4) years; 2793 participants (30%) were Black, 2714 (29%) lived alone, and 3320 participants (35.5%) were female. Over a median (IQR) follow-up of 3.22 (2.74-3.76) years, the primary composite cardiovascular outcome was observed in 67 of 1001 Black individuals living alone (6.7%), 76 of 1792 Black individuals living with others (4.2%), 108 of 1713 non-Black individuals living alone (6.3%), and 311 of 4836 non-Black individuals living with others (6.4%). The intensive treatment group showed a significantly lower rate of the composite cardiovascular outcome than the standard treatment group among Black individuals living with others (hazard ratio [HR], 0.53 [95% CI, 0.33-0.85]) but not among those living alone (HR, 1.07 [95% CI, 0.66-1.73]; P for interaction = .04). The association was observed among individuals who were not Black regardless of living arrangement status. Using transportability, we found a smaller or null association between intensive control and cardiovascular outcomes among hypothetical populations in which 60% or more of individuals were Black and 60% or more lived alone. Conclusions and Relevance: Intensive BP control was associated with a lower rate of cardiovascular events among Black individuals living with others and individuals who were not Black but not among Black individuals living alone. Trial Registration: ClinicalTrials.gov Identifier: NCT01206062.
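As a quick arithmetic check, the subgroup counts reported in this abstract can be recombined to confirm that they reproduce the stated percentages and totals. The snippet uses only numbers quoted above; the variable names are illustrative.

```python
# (events, group size, percentage as reported in the abstract)
subgroups = {
    "Black, living alone": (67, 1001, 6.7),
    "Black, living with others": (76, 1792, 4.2),
    "Non-Black, living alone": (108, 1713, 6.3),
    "Non-Black, living with others": (311, 4836, 6.4),
}

# Recompute each event proportion and compare with the reported value.
for name, (events, n, reported) in subgroups.items():
    pct = round(100 * events / n, 1)
    print(f"{name}: {events}/{n} = {pct}% (reported {reported}%)")

# The four subgroups should add up to the reported totals.
total = sum(n for _, n, _ in subgroups.values())
black = sum(n for name, (_, n, _) in subgroups.items() if name.startswith("Black"))
alone = sum(n for name, (_, n, _) in subgroups.items() if "alone" in name)
print("Total participants:", total)
print("Black participants:", black)
print("Living alone:", alone)
```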


Subject(s)
Cardiovascular Diseases , Hypertension , Aged , Antihypertensive Agents/pharmacology , Blood Pressure/physiology , Blood Pressure Determination , Cardiovascular Diseases/chemically induced , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/prevention & control , Female , Humans , Hypertension/diagnosis , Male
12.
AMIA Annu Symp Proc ; 2022: 709-718, 2022.
Article in English | MEDLINE | ID: mdl-37128415

ABSTRACT

Determining factors influencing patient participation in and adherence to cancer screening recommendations is key to successful cancer screening programs. However, the collection of variables necessary to anticipate patient behavior in cancer screening has not been systematically examined. Using lung cancer screening as a representative example, we conducted an exploratory analysis to characterize the current representations of 18 demographic, health-related, and psychosocial variables collected as part of a conceptual model to understand factors for lung cancer screening participation and adherence. Our analysis revealed a lack of standardization in controlled terminologies and common data elements for these variables. For example, only eight (44%) demographic and health-related variables were recorded consistently in the electronic health record. Multiple survey instruments could collect the remaining variables but were highly inconsistent in how variables were represented. This analysis suggests opportunities to establish standardized data formats for psychological, cognitive, social, and environmental variables to improve data collection.


Subject(s)
Early Detection of Cancer , Lung Neoplasms , Humans , Data Collection , Patient Participation , Demography
13.
J Asthma ; 59(7): 1305-1318, 2022 07.
Article in English | MEDLINE | ID: mdl-33926348

ABSTRACT

OBJECTIVE: The heterogeneity of asthma has inspired widespread application of statistical clustering algorithms to a variety of datasets for identification of potentially clinically meaningful phenotypes. There has not been a standardized data analysis approach for asthma clustering, which can affect reproducibility and clinical translation of results. Our objective was to identify common and effective data analysis practices in the asthma clustering literature and apply them to data from a Southern California population-based cohort of schoolchildren with asthma. METHODS: As of January 1, 2020, we reviewed key statistical elements of 77 asthma clustering studies. Guided by the literature, we used 12 input variables and three clustering methods (hierarchical clustering, k-medoids, and latent class analysis) to identify clusters in 598 schoolchildren with asthma from the Southern California Children's Health Study (CHS). RESULTS: Clusters of children identified by latent class analysis were characterized by exhaled nitric oxide, FEV1/FVC, FEV1 percent predicted, asthma control, and allergy score, and were predictive of control at 2-year follow-up. Clusters from the other two methods were less clinically remarkable, primarily differentiated by sex and race/ethnicity and less predictive of asthma control over time. CONCLUSION: Upon review of the asthma phenotyping literature, common approaches of data clustering emerged. When applying these elements to the Children's Health Study data, latent class analysis clusters, represented by exhaled nitric oxide and spirometry measures, had clinical relevance over time.


Subject(s)
Asthma , Asthma/epidemiology , Asthma/genetics , Child , Child Health , Cluster Analysis , Humans , Nitric Oxide , Reproducibility of Results
14.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 2303-2309, 2021 11.
Article in English | MEDLINE | ID: mdl-34891747

ABSTRACT

The adoption of electronic health records (EHRs) has made patient data increasingly accessible, precipitating the development of various clinical decision support systems and data-driven models to help physicians. However, missing data are common in EHR-derived datasets, which can introduce significant uncertainty, if not invalidate the use of a predictive model. Machine learning (ML)-based imputation methods have shown promise in various domains for the task of estimating values and reducing uncertainty to the point that a predictive model can be employed. We introduce Autopopulus, a novel framework that enables the design and evaluation of various autoencoder architectures for efficient imputation on large datasets. Autopopulus implements existing autoencoder methods as well as a new technique that outputs a range of estimated values (rather than point estimates), and demonstrates a workflow that helps users make an informed decision on an appropriate imputation method. To further illustrate Autopopulus' utility, we use it to identify not only which imputation methods impute most accurately on a large clinical dataset, but also which imputation methods enable downstream predictive models to achieve the best performance for prediction of chronic kidney disease (CKD) progression.


Subject(s)
Electronic Health Records , Research Design , Datasets as Topic , Disease Progression , Humans , Renal Insufficiency, Chronic/diagnosis , Software , Uncertainty
15.
Front Big Data ; 4: 693869, 2021.
Article in English | MEDLINE | ID: mdl-34604740

ABSTRACT

We present a novel approach for imputing missing data that incorporates temporal information into bipartite graphs through an extension of graph representation learning. Missing data is abundant in several domains, particularly when observations are made over time. Most imputation methods make strong assumptions about the distribution of the data. While novel methods may relax some assumptions, they may not consider temporality. Moreover, when such methods are extended to handle time, they may not generalize without retraining. We propose using a joint bipartite graph approach to incorporate temporal sequence information. Specifically, the observation nodes and edges with temporal information are used in message passing to learn node and edge embeddings and to inform the imputation task. Our proposed method, temporal setting imputation using graph neural networks (TSI-GNN), captures sequence information that can then be used within an aggregation function of a graph neural network. To the best of our knowledge, this is the first effort to use a joint bipartite graph approach that captures sequence information to handle missing data. We use several benchmark datasets to test the performance of our method against a variety of conditions, comparing to both classic and contemporary methods. We further provide insight to manage the size of the generated TSI-GNN model. Through our analysis we show that incorporating temporal information into a bipartite graph improves the representation at the 30% and 60% missing rate, specifically when using a nonlinear model for downstream prediction tasks in regularly sampled datasets and is competitive with existing temporal methods under different scenarios.

16.
Cancer Epidemiol Biomarkers Prev ; 30(12): 2227-2234, 2021 12.
Article in English | MEDLINE | ID: mdl-34548326

ABSTRACT

BACKGROUND: Randomized controlled trials (RCT) play a central role in evidence-based healthcare. However, the clinical and policy implications of implementing RCTs in clinical practice are difficult to predict as the studied population is often different from the target population where results are being applied. This study illustrates the concepts of generalizability and transportability, demonstrating their utility in interpreting results from the National Lung Screening Trial (NLST). METHODS: Using inverse-odds weighting, we demonstrate how generalizability and transportability techniques can be used to extrapolate treatment effect from (i) a subset of NLST to the entire NLST population and from (ii) the entire NLST to different target populations. RESULTS: Our generalizability analysis revealed that lung cancer mortality reduction by LDCT screening across the entire NLST [16% (95% confidence interval [CI]: 4-24)] could have been estimated using a smaller subset of NLST participants. Using transportability analysis, we showed that populations with a higher prevalence of females and current smokers had a greater reduction in lung cancer mortality with LDCT screening [e.g., 27% (95% CI, 11-37) for the population with 80% females and 80% current smokers] than those with lower prevalence of females and current smokers. CONCLUSIONS: This article illustrates how generalizability and transportability methods extend estimation of RCTs' utility beyond trial participants, to external populations of interest, including those that more closely mirror real-world populations. IMPACT: Generalizability and transportability approaches can be used to quantify treatment effects for populations of interest, which may be used to design future trials or adjust lung cancer screening eligibility criteria.
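The inverse-odds weighting idea can be illustrated with a single categorical covariate. In the sketch below, each trial participant is weighted by the odds of belonging to the target population rather than the trial, given their covariate value, and the weighted risk difference estimates the effect in the target. All counts are invented for illustration and are not NLST data. Estimating the odds from stratum counts is a simplification, since in practice the participation odds would usually come from a fitted model (for example, logistic regression).

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple

Record = Tuple[str, int, int]  # (covariate stratum, arm: 1 = screened, outcome: 1 = event)


def inverse_odds_weights(trial_x: Sequence[str], target_x: Sequence[str]) -> List[float]:
    """Odds of target vs trial membership within each covariate stratum."""
    n_trial, n_target = Counter(trial_x), Counter(target_x)
    return [n_target[x] / n_trial[x] for x in trial_x]


def risk_difference(trial: Sequence[Record],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Weighted event risk in the treated arm minus that in the control arm."""
    if weights is None:
        weights = [1.0] * len(trial)  # unweighted: the effect within the trial itself

    def arm_risk(arm: int) -> float:
        rows = [(y, w) for (_, a, y), w in zip(trial, weights) if a == arm]
        return sum(y * w for y, w in rows) / sum(w for _, w in rows)

    return arm_risk(1) - arm_risk(0)


def make_group(x: str, arm: int, n: int, events: int) -> List[Record]:
    return [(x, arm, 1)] * events + [(x, arm, 0)] * (n - events)


# Hypothetical trial: 20% women, with a larger benefit among women.
trial = (make_group("F", 1, 10, 2) + make_group("F", 0, 10, 4)
         + make_group("M", 1, 40, 8) + make_group("M", 0, 40, 10))
# Hypothetical target population: 80% women.
target_x = ["F"] * 80 + ["M"] * 20

weights = inverse_odds_weights([x for x, _, _ in trial], target_x)
print("Risk difference in trial:", risk_difference(trial))
print("Risk difference transported to target:", risk_difference(trial, weights))
```

Because the hypothetical benefit is larger among women, reweighting toward a population with more women yields a larger estimated risk reduction. The abstract reports the same direction of effect for populations with a higher prevalence of females and current smokers.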


Subject(s)
Early Detection of Cancer/methods , Lung Neoplasms/diagnosis , Mass Screening/organization & administration , Randomized Controlled Trials as Topic/standards , Aged , Female , Humans , Male , Middle Aged , Multicenter Studies as Topic/standards
17.
JAMA Netw Open ; 4(8): e2119629, 2021 08 02.
Article in English | MEDLINE | ID: mdl-34427681

ABSTRACT

Importance: The potential to achieve greater reductions in lung cancer mortality than originally estimated by the National Lung Screening Trial with the inclusion of more Black participants stresses the importance of improving access to lung cancer screening for Black current and former smokers, a population presently with the highest lung cancer morbidity and mortality. Objective: To estimate lung cancer and all-cause mortality reductions achievable with lung cancer screening via low-dose computed tomography (LDCT) of the chest in populations with greater proportions of Black screening participants than seen in the original NLST cohort. Design, Setting, and Participants: This cohort study was conducted as a secondary analysis of existing data from the National Lung Screening Trial, a large national randomized clinical trial conducted from 2002 through 2009. NLST participants were current or former smokers, aged between 55 and 74 years, with at least 30 pack-years of smoking history and less than 15 years since quitting. Cox proportional hazard models were used to estimate the hazard ratios (HRs) and 95% CIs of lung cancer mortality and all-cause mortality according to LDCT screening compared with chest radiograph screening. Using a transportability formula, we estimated outcomes for LDCT screening among hypothetical populations by varying the distributions of Black individuals, women, and current smokers. Data were analyzed between September 2020 and March 2021. Exposures: Lung screening with LDCT of the chest compared with chest radiography. Main Outcomes and Measures: Lung cancer mortality and all-cause mortality. Results: This study included a total of 53 452 participants enrolled in the NLST. Of 2376 Black individuals and 51 076 non-Black individuals, 21 922 (41.0%) were women and the mean (SD) age was 61.4 (5.0) years. 
Over a median (interquartile range) follow-up of 6.7 (6.2-7.0) years, LDCT screening among the synthesized population with a higher proportion of Black individuals (13.4%, mirroring US Census data) was associated with a greater relative reduction of lung cancer mortality (eg, Black individuals: HR, 0.82; 95% CI, 0.72-0.92; vs entire NLST cohort: HR, 0.84; 95% CI, 0.76-0.96). Further reductions in lung cancer mortality by LDCT screening were found among a hypothetical population with a higher proportion of men or current smokers, along with a higher proportion of Black individuals (ie, 60% Black participants; 20% to 40% women) (HR, 0.68; 95% CI, 0.48-0.97). Conclusions and Relevance: The potential to achieve greater reductions in lung cancer mortality than originally estimated by the NLST with the inclusion of more Black participants stresses the critical importance of improving access to lung cancer screening for Black current and former smokers.


Subject(s)
Black or African American/statistics & numerical data , Early Detection of Cancer/mortality , Early Detection of Cancer/statistics & numerical data , Lung Neoplasms/diagnosis , Lung Neoplasms/mortality , Mass Screening/mortality , Mass Screening/statistics & numerical data , Aged , Cohort Studies , Female , Humans , Male , Middle Aged , United States
18.
JAMA Psychiatry ; 78(8): 886-895, 2021 08 01.
Article in English | MEDLINE | ID: mdl-34037672

ABSTRACT

Importance: Provisional records from the US Centers for Disease Control and Prevention (CDC) through July 2020 indicate that overdose deaths spiked during the early months of the COVID-19 pandemic, yet more recent trends are not available, and the data are not disaggregated by month of occurrence, race/ethnicity, or other social categories. In contrast, data from emergency medical services (EMS) provide a source of information nearly in real time that may be useful for rapid and more granular surveillance of overdose mortality. Objective: To describe racial/ethnic, social, and geographic trends in EMS-observed overdose-associated cardiac arrests during the COVID-19 pandemic through December 2020 and assess the concordance with CDC-reported provisional total overdose mortality through May 2020. Design, Setting, and Participants: This cohort study included more than 11 000 EMS agencies in 49 US states that participate in the National EMS Information System and 83.7 million EMS activations in which patient contact was made. Exposures: Year and month of occurrence of overdose-associated cardiac arrest; patient race/ethnicity; census region and division; county-level urbanicity; and zip code-level racial/ethnic composition, poverty, and educational attainment. Main Outcomes and Measures: Overdose-associated cardiac arrests per 100 000 EMS activations with patient contact in 2020 were compared with a baseline of values from 2018 and 2019. Aggregate numbers of overdose-associated cardiac arrests and percentage increases were compared with provisional total mortality in CDC records from rolling 12-month windows with end months spanning January 2018 through July 2020. Results: Among 33.4 million EMS activations in 2020, 16.8 million (50.2%) involved female patients and 16.3 million (48.8%) involved non-Hispanic White individuals. 
Overdose-associated cardiac arrests were elevated by 42.1% nationally in 2020 (42.3 per 100 000 EMS activations at baseline vs 60.1 per 100 000 EMS activations in 2020). The highest percentage increases were seen among Latinx individuals (49.7%; 38.8 per 100 000 activations at baseline vs 58.1 per 100 000 activations in 2020) and Black or African American individuals (50.3%; 21.5 per 100 000 activations at baseline vs 32.3 per 100 000 activations in 2020), people living in more impoverished neighborhoods (46.4%; 42.0 per 100 000 activations at baseline vs 61.5 per 100 000 activations in 2020), and the Pacific states (63.8%; 33.1 per 100 000 activations at baseline vs 54.2 per 100 000 activations in 2020), despite lower rates at baseline for these groups. The EMS records were available 6 to 12 months ahead of CDC mortality figures and showed a high concordance (r = 0.98) for months in which both data sets were available. If the historical association between EMS-observed and total overdose mortality holds true, an expected total of approximately 90 632 (95% CI, 85 737-95 525) overdose deaths may eventually be reported by the CDC for 2020. Conclusions and Relevance: In this cohort study, records from EMS agencies provided an effective manner to rapidly surveil shifts in US overdose mortality. Unprecedented overdose deaths during the pandemic necessitate investments in overdose prevention as an essential aspect of the COVID-19 response and postpandemic recovery. This is particularly urgent for more socioeconomically disadvantaged and racial/ethnic minority communities subjected to the compounded burden of disproportionate COVID-19 mortality and rising overdose deaths.
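
The percentage increases quoted above follow from the baseline and 2020 rates. A minimal sketch of that arithmetic, using the rounded rates printed in the abstract (the function name is illustrative):

```python
def percent_increase(baseline, current):
    """Relative change from baseline to current, in percent."""
    return (current - baseline) / baseline * 100

# Rates per 100 000 EMS activations, as printed in the abstract.
national = percent_increase(42.3, 60.1)
latinx = percent_increase(38.8, 58.1)
impoverished = percent_increase(42.0, 61.5)
```

Because the printed rates are themselves rounded, recomputing from them can differ from the published percentage by about 0.1 point for some groups.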


Subject(s)
COVID-19/epidemiology , Drug Overdose/epidemiology , Emergency Medical Services/statistics & numerical data , Heart Arrest/epidemiology , Black or African American/statistics & numerical data , Cohort Studies , Drug Overdose/ethnology , Female , Heart Arrest/ethnology , Hispanic or Latino/statistics & numerical data , Humans , Male , Pandemics , Poverty/statistics & numerical data , SARS-CoV-2 , United States/epidemiology , White People/statistics & numerical data
19.
Sci Rep ; 11(1): 8764, 2021 04 22.
Article in English | MEDLINE | ID: mdl-33888839

ABSTRACT

Individuals diagnosed with colorectal adenomas with high-risk features during screening colonoscopy have increased risk for the development of subsequent adenomas and colorectal cancer. While US guidelines recommend surveillance colonoscopy at 3 years in this high-risk population, surveillance uptake is suboptimal. To inform future interventions to improve surveillance uptake, we sought to assess surveillance rates and identify facilitators of uptake in a large integrated health system. We utilized a cohort of patients with a diagnosis of ≥ 1 tubular adenoma (TA) with high-risk features (TA ≥ 1 cm, TA with villous features, TA with high-grade dysplasia, or ≥ 3 TA of any size) on colonoscopy between 2013 and 2016. Surveillance colonoscopy completion within 3.5 years of diagnosis of an adenoma with high-risk features was our primary outcome. We evaluated surveillance uptake over time and utilized logistic regression to detect factors associated with completion of surveillance colonoscopy. The final cohort comprised 405 patients, of whom 172 (42.5%) completed surveillance colonoscopy by 3.5 years. Use of a patient reminder (telephone, electronic message, or letter) for due surveillance (adjusted odds ratio = 1.9; 95% CI = 1.2-2.8) and having ≥ 1 gastroenterology (GI) visit after diagnosis of an adenoma with high-risk features (adjusted odds ratio = 2.6; 95% CI = 1.6-4.2) significantly predicted surveillance colonoscopy completion at 3.5 years. For patients diagnosed with adenomas with high-risk features, surveillance colonoscopy uptake is suboptimal and frequently occurs after the 3-year surveillance recommendation. Patient reminders and visitation with GI after index colonoscopy are associated with timely surveillance completion. Our findings highlight potential health system interventions to increase timely surveillance uptake for patients diagnosed with adenomas with high-risk features.
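
Logistic regression reports effects as coefficients on the log-odds scale; exponentiating a coefficient and its Wald interval bounds gives an odds ratio with a confidence interval. A minimal sketch follows. The coefficient and standard error are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error into an
    odds ratio with a Wald confidence interval (z=1.96 for about 95%)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
odds_ratio, ci_low, ci_high = odds_ratio_with_ci(beta=0.64, se=0.22)
```

An interval that excludes 1 corresponds to a coefficient that differs from 0 at the matching significance level.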


Subject(s)
Adenoma/pathology , Colorectal Neoplasms/pathology , Aged , Colonoscopy , Female , Humans , Likelihood Functions , Male , Middle Aged , Risk Factors
20.
JAMIA Open ; 4(4): ooab113, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34988383

ABSTRACT

COVID-19 mortality forecasting models provide critical information about the trajectory of the pandemic, which is used by policymakers and public health officials to guide decision-making. However, thousands of published COVID-19 mortality forecasts now exist, many with their own unique methods, assumptions, format, and visualization. As a result, it is difficult to compare models and understand under which circumstances a model performs best. Here, we describe the construction and usability of covidcompare.io, a web tool built to compare numerous forecasts and offer insight into how each has performed over the course of the pandemic. Between its launch in December 2020 and June 2021, the site received 4600 unique visitors from 85 countries. A study conducted with public health professionals showed high usability overall, as formally assessed using the Post-Study System Usability Questionnaire. We find that covidcompare.io is an impactful tool for the comparison of international COVID-19 mortality forecasting models.
