1.
Patterns (N Y) ; 5(6): 101010, 2024 Jun 14.
Article in English | MEDLINE | ID: mdl-39005486

ABSTRACT

The authors emphasize diversity, equity, and inclusion in STEM education and artificial intelligence (AI) research, focusing on LGBTQ+ representation. They discuss the challenges faced by queer scientists, educational resources, the implementation of National AI Campus, and the notion of intersectionality. The authors hope to ensure supportive and respectful engagement across all communities.

2.
medRxiv ; 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38562836

ABSTRACT

Objectives: To synthesize discussions among sexual minority men and gender diverse (SMMGD) individuals on mpox, given limited representation of SMMGD voices in existing mpox literature. Methods: BERTopic (a topic modeling technique) was employed with human validations to analyze mpox-related tweets (n = 8,688; October 2020-September 2022) from 2,326 self-identified SMMGD individuals in the U.S., followed by content analysis and geographic analysis. Results: BERTopic identified 11 topics: health activism (29.81%); mpox vaccination (25.81%) and adverse events (0.98%); sarcasm, jokes, emotional expressions (14.04%); COVID-19 and mpox (7.32%); government/public health response (6.12%); mpox symptoms (2.74%); case reports (2.21%); puns on the virus' naming (i.e., monkeypox; 0.86%); media publicity (0.68%); mpox in children (0.67%). Mpox health activism negatively correlated with LGB social climate index at U.S. state level, ρ = -0.322, p = 0.031. Conclusions: SMMGD discussions on mpox encompassed utilitarian themes (e.g., vaccine access, case reports, mpox symptoms) and emotionally charged themes (advocating against homophobia, misinformation, and stigma). Mpox health activism was more prevalent in states with lower LGB social acceptance. Public Health Implications: Findings illuminate SMMGD engagement with mpox discourse, underscoring the need for more inclusive health communication strategies in infectious disease outbreaks to control associated stigma.
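The state-level association reported above is a rank correlation (ρ). As a rough illustration of how such a coefficient is computed, the sketch below implements Spearman's ρ in plain Python. The function names and the five data points are invented for illustration; they are not the study's data or code.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical state-level values, invented for illustration only.
climate_index = [1.0, 2.0, 3.0, 4.0, 5.0]
activism_share = [0.40, 0.35, 0.30, 0.32, 0.20]
rho = spearman_rho(climate_index, activism_share)
print(round(rho, 3))
```

A negative ρ, as in the abstract, means that higher values of one variable tend to go with lower ranks of the other; the sketch does not compute a p-value.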

3.
Stud Health Technol Inform ; 310: 619-623, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38269883

ABSTRACT

According to the World Stroke Organization, 12.2 million people worldwide will have their first stroke this year, almost half of whom will die as a result. Natural Language Processing (NLP) may improve stroke phenotyping; however, existing rule-based classifiers are rigid, resulting in inadequate performance. We report findings from a pilot study using NLP to improve relation detection for stroke assertion detection to support research studies and healthcare operations.


Subject(s)
Natural Language Processing , Stroke , Humans , Pilot Projects , Stroke/diagnosis
4.
Res Sq ; 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38260372

ABSTRACT

Interrogating plasma cell-free DNA (cfDNA) to detect cancer offers promise; however, no current tests scan structural variants (SVs) throughout the genome. Here, we report a simple molecular workflow to enrich a tumorigenic SV (DNA palindromes/fold-back inversions) that often demarcates genomic amplification and its feasibility for cancer detection by combining low-throughput next-generation sequencing with automated machine learning (Genome-wide Analysis of Palindrome Formation, GAPF-seq). Tumor DNA signal manifested as skewed chromosomal distributions of high-coverage 1-kb bins (HCBs), differentiating 39 matched breast tumor DNA from normal DNA with an average AUC of 0.9819. In a proof-of-concept liquid biopsy study, cfDNA from 0.5 mL plasma from prostate cancer patients was sufficient for binary classification against matched buffy coat DNA with an average AUC of 0.965. HCBs on the X chromosome emerged as a determinant feature and were associated with AR amplification. GAPF-seq could generate unique cancer-specific SV profiles in an agnostic liquid biopsy setting.

5.
bioRxiv ; 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-37961589

ABSTRACT

Plasma cell-free DNA (cfDNA) is a promising source of gene mutations for cancer detection by liquid biopsy. However, no current tests interrogate chromosomal structural variants (SVs) genome-wide. Here, we report a simple molecular and sequencing workflow called Genome-wide Analysis of Palindrome Formation (GAPF-seq) to probe DNA palindromes, a type of SV that often demarcates gene amplification. With low-throughput next-generation sequencing and automated machine learning, tumor DNA showed skewed chromosomal distributions of high-coverage 1-kb bins (HCBs), which differentiated 39 breast tumors from matched normal DNA with an average Area Under the Curve (AUC) of 0.9819. A proof-of-concept liquid biopsy study using cfDNA from prostate cancer patients and healthy individuals yielded an average AUC of 0.965. HCBs on the X chromosome emerged as a determinant feature and were associated with androgen receptor gene amplification. As a novel agnostic liquid biopsy approach, GAPF-seq could fill the technological gap offering unique cancer-specific SV profiles.
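Both versions of this abstract summarize classifier performance as an AUC. As a minimal sketch of what that number measures, the code below computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The labels and scores are invented; how GAPF-seq derives per-sample scores is not described here, so the scores stand in for any classifier output.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = tumor, 0 = normal) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

An AUC of 1.0 means every positive outranks every negative, and 0.5 corresponds to chance-level ranking.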

6.
J Med Syst ; 47(1): 83, 2023 Aug 05.
Article in English | MEDLINE | ID: mdl-37542590

ABSTRACT

Supply-demand mismatch of ward resources ("ward capacity strain") alters care and outcomes. Narrow strain definitions and heterogeneous populations limit the strain literature. We evaluated the predictive utility of a large set of candidate strain variables for in-hospital mortality and discharge destination among acute respiratory failure (ARF) survivors. In a retrospective cohort of ARF survivors transferred from intensive care units (ICUs) to wards in five hospitals from 4/2017-12/2019, we applied 11 machine learning (ML) models to identify ward strain measures during the first 24 hours after transfer most predictive of outcomes. Measures spanned patient volume (census, admissions, discharges), staff workload (medications administered, off-ward transports, transfusions, isolation precautions, patients per respiratory therapist and nurse), and average patient acuity (Laboratory Acute Physiology Score version 2, ICU transfers) domains. The cohort included 5,052 visits in 43 wards. Median age was 65 years (IQR 56-73); 2,865 (57%) were male; and 2,865 (57%) were white. 770 (15%) patients died in the hospital or had hospice discharges; 2,628 (61%) were discharged home and 964 (23%) to skilled nursing facilities (SNFs). Ward admissions, isolation precautions, and hospital admissions most consistently predicted in-hospital mortality across ML models. Patients per nurse most consistently predicted discharge to home and SNF, and medications administered predicted SNF discharge. In this hypothesis-generating analysis of candidate ward strain variables' prediction of outcomes among ARF survivors, several variables emerged as consistently predictive of key outcomes across ML models. These findings suggest targets for future inferential studies to elucidate mechanisms of ward strain's adverse effects.


Subject(s)
Benchmarking , Respiratory Insufficiency , Humans , Male , Aged , Female , Retrospective Studies , Hospitalization , Intensive Care Units , Patient Discharge , Hospitals , Respiratory Insufficiency/therapy
7.
BioData Min ; 16(1): 20, 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37443040

ABSTRACT

The introduction of large language models (LLMs) that allow iterative "chat" in late 2022 is a paradigm shift that enables generation of text often indistinguishable from that written by humans. LLM-based chatbots have immense potential to improve academic work efficiency, but the ethical implications of their fair use and inherent bias must be considered. In this editorial, we discuss this technology from the academic's perspective with regard to its limitations and utility for academic writing, education, and programming. We end with our stance with regard to using LLMs and chatbots in academia, which is summarized as (1) we must find ways to effectively use them, (2) their use does not constitute plagiarism (although they may produce plagiarized text), (3) we must quantify their bias, (4) users must be cautious of their poor accuracy, and (5) the future is bright for their application to research and as an academic tool.

8.
AMIA Jt Summits Transl Sci Proc ; 2023: 525-533, 2023.
Article in English | MEDLINE | ID: mdl-37350880

ABSTRACT

Amyloid imaging has been widely used in Alzheimer's disease (AD) diagnosis and biomarker discovery through detecting the regional amyloid plaque density. Amyloid measures must be normalized by a reference region to reduce noise and artifacts. To explore an optimal normalization strategy, we employ an automated machine learning (AutoML) pipeline, STREAMLINE, to conduct the AD diagnosis binary classification and perform permutation-based feature importance analysis with thirteen machine learning models. In this work, we perform a comparative study to evaluate the prediction performance and biomarker discovery capability of three amyloid imaging measures, including one original measure and two normalized measures using two reference regions (i.e., the whole cerebellum and the composite reference region). Our AutoML results indicate that the composite reference region normalization dataset yields a higher balanced accuracy, and identifies more AD-related regions based on the fractioned feature importance ranking.
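The abstract above relies on two generic ideas, balanced accuracy and permutation-based feature importance. The sketch below shows both in plain Python under simple assumptions: a fixed toy decision rule stands in for a trained model, and the six two-feature samples are invented. It is not STREAMLINE's implementation and does not use its API.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average drop in balanced accuracy when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = balanced_accuracy(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the data with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            score = balanced_accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

def toy_model(row):
    """Hypothetical stand-in for a trained classifier: uses only feature 0."""
    return 1 if row[0] > 1.2 else 0

# Invented data: feature 0 separates the classes, feature 1 is noise.
X = [[1.5, 0.3], [1.4, 0.9], [1.6, 0.1], [1.0, 0.8], [0.9, 0.2], [1.1, 0.5]]
y = [1, 1, 1, 0, 0, 0]
baseline, importances = permutation_importance(toy_model, X, y)
print(baseline, importances)
```

A feature the model ignores gets an importance of zero, because shuffling it cannot change any prediction; a feature the model depends on shows a positive drop.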

9.
AMIA Jt Summits Transl Sci Proc ; 2023: 544-553, 2023.
Article in English | MEDLINE | ID: mdl-37350896

ABSTRACT

STREAMLINE is a simple, transparent, end-to-end automated machine learning (AutoML) pipeline for easily conducting rigorous machine learning (ML) modeling and analysis. The initial version is limited to binary classification. In this work, we extend STREAMLINE through implementing multiple regression-based ML models, including linear regression, elastic net, group lasso, and L21 norm. We demonstrate the effectiveness of the regression version of STREAMLINE by applying it to the prediction of Alzheimer's disease (AD) cognitive outcomes using multimodal brain imaging data. Our empirical results demonstrate the feasibility and effectiveness of the newly expanded STREAMLINE as an AutoML pipeline for evaluating AD regression models, and for discovering multimodal imaging biomarkers.

10.
Eur Respir J ; 62(1), 2023 07.
Article in English | MEDLINE | ID: mdl-37169384

ABSTRACT

BACKGROUND: It is currently unknown if disease severity modifies response to therapy in pulmonary arterial hypertension (PAH). We aimed to explore if disease severity, as defined by established risk-prediction algorithms, modified response to therapy in randomised clinical trials in PAH. METHODS: We performed a meta-analysis using individual participant data from 18 randomised clinical trials of therapy for PAH submitted to the United States Food and Drug Administration to determine if predicted risk of 1-year mortality at randomisation modified the treatment effect on three outcomes: change in 6-min walk distance (6MWD), clinical worsening at 12 weeks and time to clinical worsening. RESULTS: Of 6561 patients with a baseline US Registry to Evaluate Early and Long-Term PAH Disease Management (REVEAL 2.0) score, we found that individuals with higher baseline risk had higher probabilities of clinical worsening but no difference in change in 6MWD. We detected a significant interaction of REVEAL 2.0 risk and treatment assignment on change in 6MWD. For every 3-point increase in REVEAL 2.0 score, there was a 12.49 m (95% CI 5.86-19.12 m; p=0.001) greater treatment effect in change in 6MWD. We did not detect a significant risk by treatment interaction on clinical worsening with most of the risk-prediction algorithms. CONCLUSIONS: We found that predicted risk of 1-year mortality in PAH modified treatment effect as measured by 6MWD, but not clinical worsening. Our findings highlight the importance of identifying sources of treatment heterogeneity by predicted risk to tailor studies to patients most likely to have the greatest treatment response.


Subject(s)
Hypertension, Pulmonary , Pulmonary Arterial Hypertension , Humans , Pulmonary Arterial Hypertension/drug therapy , Familial Primary Pulmonary Hypertension/drug therapy , Treatment Outcome , Antihypertensive Agents/therapeutic use
11.
Lancet Respir Med ; 11(10): 873-882, 2023 10.
Article in English | MEDLINE | ID: mdl-37230098

ABSTRACT

BACKGROUND: Targeting short-term improvements in multicomponent risk scores for mortality in patients with pulmonary arterial hypertension (PAH) could result in improved long-term outcomes. We aimed to determine whether PAH risk scores were adequate surrogates for clinical worsening or mortality outcomes in PAH randomised clinical trials (RCTs). METHODS: We performed an individual participant data meta-analysis of RCTs selected from PAH trials provided by the US Food and Drug Administration (FDA). We calculated predicted risk using the COMPERA, COMPERA 2.0, non-invasive FPHR, REVEAL 2.0, and REVEAL Lite 2 risk scores. The primary outcome of interest was time to clinical worsening, a composite endpoint composed of any of the following events: all-cause death, hospitalisation for worsening PAH, lung transplantation, atrial septostomy, discontinuation of study treatment (or study withdrawal) for worsening PAH, initiation of parenteral prostacyclin analogue therapy, or decrease of at least 15% in 6-min walk distance from baseline, combined with either worsening of WHO functional class from baseline or the addition of an approved PAH treatment. The secondary outcome of interest was time to all-cause mortality. We assessed the surrogacy of these risk scores, parameterised as attainment of low-risk status by 16 weeks, for improvement in long-term clinical worsening and survival using mediation and meta-analysis frameworks. FINDINGS: Of 28 trials received from the FDA, three RCTs (AMBITION, GRIPHON, and SERAPHIN; n=2508) had the data necessary to assess long-term surrogacy. The mean age was 49 years (SD 16), 1956 (78%) participants were women, 1704 (68%) were classified as White, and 280 (11%) were Hispanic or Latino. 1388 (55%) of 2503 participants with available data had idiopathic PAH and 776 (31%) of 2503 had PAH associated with connective tissue disease. 
In a mediation analysis, the proportions of treatment effects explained by attainment of low-risk status ranged only from 7% to 13%. In a meta-analysis of trial-regions, the treatment effects on low-risk status were not predictive of the treatment effects on time to clinical worsening (R2 values 0·01-0·19) nor the treatment effects on time to all-cause mortality (R2 values 0-0·2). A leave-one-out analysis suggested that the use of these risk scores as surrogates might lead to biased inferences regarding the effect of therapies on clinical outcomes in PAH RCTs. Results were similar when using absolute risk scores at 16 weeks as the potential surrogates. INTERPRETATION: Multicomponent risk scores have utility for the prediction of outcomes in patients with PAH. Clinical surrogacy for long-term outcomes cannot be inferred from observational studies of outcomes. Our analyses of three PAH trials with long-term follow-up suggest that further study is necessary before using these or other scores as surrogate outcomes in PAH RCTs or clinical care. FUNDING: Cardiovascular Medical Research and Education Fund, US National Institutes of Health.


Subject(s)
Pulmonary Arterial Hypertension , Female , Humans , Middle Aged , Male , Pulmonary Arterial Hypertension/drug therapy , Familial Primary Pulmonary Hypertension , Epoprostenol , Risk Factors , Randomized Controlled Trials as Topic
12.
J Biomed Inform ; 142: 104374, 2023 06.
Article in English | MEDLINE | ID: mdl-37120046

ABSTRACT

OBJECTIVE: While associations between HLA antigen-level mismatches (Ag-MM) and kidney allograft failure are well established, HLA amino acid-level mismatches (AA-MM) have been less explored. Ag-MM fails to consider the substantial variability in the number of MMs at polymorphic amino acid (AA) sites within any given Ag-MM category, which may conceal variable impact on allorecognition. In this study we aim to develop a novel Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) and apply it to automatically discover bins of HLA amino acid mismatches that stratify donor-recipient pairs into low versus high graft survival risk groups. METHODS: Using data from the Scientific Registry of Transplant Recipients, we applied FIBERS on a multiethnic population of 166,574 kidney transplants between 2000 and 2017. FIBERS was applied (1) across all HLA-A, B, C, DRB1, and DQB1 locus AA-MMs with comparison to 0-ABDR Ag-MM risk stratification, (2) on AA-MMs within each HLA locus individually, and (3) using cross validation to evaluate FIBERS generalizability. The predictive power of graft failure risk stratification was evaluated while adjusting for donor/recipient characteristics and HLA-A, B, C, DRB1, and DQB1 Ag-MMs as covariates. RESULTS: FIBERS's best-performing bin (on AA-MMs across all loci) added significant predictive power (hazard ratio = 1.10, Bonferroni adj. p < 0.001) in stratifying graft failure risk (where low-risk is defined as zero AA-MMs and high-risk is one or more AA-MMs) even after adjusting for Ag-MMs and donor/recipient covariates. The best bin also categorized more than twice as many patients to the low-risk category, compared to traditional 0-ABDR Ag mismatching (∼24.4% vs ∼9.1%). When HLA loci were binned individually, the bin for DRB1 exhibited the strongest risk stratification; relative to zero AA-MM, one or more MMs in the bin yielded HR = 1.11, p < 0.005 in a fully adjusted Cox model.
AA-MMs at HLA-DRB1 peptide contact sites contributed most to incremental risk of graft failure. Additionally, FIBERS points to possible risk associated with HLA-DQB1 AA-MMs at positions that determine specificity of peptide anchor residues and HLA-DQ heterodimer stability. CONCLUSION: FIBERS's performance suggests potential for discovery of HLA immunogenetics-based risk stratification of kidney graft failure that outperforms traditional assessment.


Subject(s)
Amino Acids , HLA-A Antigens , Humans , Histocompatibility Testing , Allografts , Risk Assessment , Kidney
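The abstract above defines a bin's risk groups simply: a donor-recipient pair is low-risk when it has zero amino acid mismatches across the sites in the bin, and high-risk when it has one or more. The sketch below illustrates only that stratification rule. The site names and mismatch counts are invented placeholders, and the evolutionary search that FIBERS uses to discover bins, as well as the survival modeling, is not shown.

```python
def bin_risk_group(pair_mismatches, bin_sites):
    """Label a pair 'low' if it has no mismatches at any site in the bin."""
    total = sum(pair_mismatches.get(site, 0) for site in bin_sites)
    return "low" if total == 0 else "high"

# Hypothetical mismatch counts per amino acid site for four pairs.
pairs = [
    {"site_a": 0, "site_b": 0, "site_c": 1},
    {"site_a": 1, "site_b": 0, "site_c": 0},
    {"site_a": 0, "site_b": 2, "site_c": 0},
    {"site_a": 0, "site_b": 0, "site_c": 0},
]

# Hypothetical candidate bin: only these sites count toward the bin total.
candidate_bin = ["site_a", "site_b"]
groups = [bin_risk_group(p, candidate_bin) for p in pairs]
low_fraction = groups.count("low") / len(groups)
print(groups, low_fraction)
```

Because sites outside the bin are ignored, a smaller bin places more pairs in the low-risk group, which is consistent with the abstract's comparison of low-risk proportions between the best bin and 0-ABDR antigen matching.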
13.
JCO Clin Cancer Inform ; 7: e2200097, 2023 02.
Article in English | MEDLINE | ID: mdl-36809006

ABSTRACT

PURPOSE: Predicting 30-day readmission risk is paramount to improving the quality of patient care. In this study, we compare sets of patient-, provider-, and community-level variables that are available at two different points of a patient's inpatient encounter (first 48 hours and the full encounter) to train readmission prediction models and identify possible targets for appropriate interventions that can potentially reduce avoidable readmissions. METHODS: Using electronic health record data from a retrospective cohort of 2,460 oncology patients and a comprehensive machine learning analysis pipeline, we trained and tested models predicting 30-day readmission on the basis of data available within the first 48 hours of admission and from the entire hospital encounter. RESULTS: Leveraging all features, the light gradient boosting model produced higher, but comparable performance (area under receiver operating characteristic curve [AUROC]: 0.711) with the Epic model (AUROC: 0.697). Given features in the first 48 hours, the random forest model produces higher AUROC (0.684) than the Epic model (AUROC: 0.676). Both models flagged patients with a similar distribution of race and sex; however, our light gradient boosting and random forest models were more inclusive, flagging more patients among younger age groups. The Epic models were more sensitive to identifying patients with an average lower zip income. Our 48-hour models were powered by novel features at various levels: patient (weight change over 365 days, depression symptoms, laboratory values, and cancer type), hospital (winter discharge and hospital admission type), and community (zip income and marital status of partner). CONCLUSION: We developed and validated models comparable with the existing Epic 30-day readmission models with several novel actionable insights that could create service interventions deployed by the case management or discharge planning teams that may decrease readmission rates over time.


Subject(s)
Neoplasms , Patient Readmission , Humans , Retrospective Studies , Hospitalization , Risk Factors
14.
Ann Am Thorac Soc ; 20(1): 58-66, 2023 01.
Article in English | MEDLINE | ID: mdl-36053665

ABSTRACT

Rationale: Sex-based differences in pulmonary arterial hypertension (PAH) are known, but the contribution to disease measures is understudied. Objectives: We examined whether sex was associated with baseline 6-minute-walk distance (6MWD), hemodynamics, and functional class. Methods: We conducted a secondary analysis of participant-level data from randomized clinical trials of investigational PAH therapies conducted between 1998 and 2014 and provided by the U.S. Food and Drug Administration. Outcomes were modeled as a function of an interaction between sex and age or sex and body mass index (BMI), respectively, with generalized mixed modeling. Results: We included a total of 6,633 participants from 18 randomized clinical trials. A total of 5,197 (78%) were female, with a mean age of 49.1 years and a mean BMI of 27.0 kg/m2. Among 1,436 males, the mean age was 49.7 years, and the mean BMI was 26.4 kg/m2. The most common etiology of PAH was idiopathic. Females had shorter 6MWD. For every 1 kg/m2 increase in BMI for females, 6MWD decreased 2.3 (1.6-3.0) meters (P < 0.001), whereas 6MWD did not significantly change with BMI in males (0.31 m [-0.30 to 0.92]; P = 0.32). Females had lower right atrial pressure (RAP) and mean pulmonary artery pressure, and higher cardiac index than males (all P < 0.03). Age significantly modified the sex by RAP and mean pulmonary artery pressure relationships. For every 10-year increase in age, RAP was lower in males (0.5 mm Hg [0.3-0.7]; P < 0.001), but not in females (0.13 [-0.03 to 0.28]; P = 0.10). There was a significant decrease in pulmonary vascular resistance (PVR) with increasing age regardless of sex (P < 0.001). For every 1 kg/m2 increase in BMI, there was a 3% decrease in PVR for males (P < 0.001), compared with a 2% decrease in PVR in females (P < 0.001). Conclusions: Sexual dimorphism in subjects enrolled in clinical trials extends to 6MWD and hemodynamics; these relationships are modified by age and BMI. 
Sex, age, and body size should be considered in the evaluation and interpretation of surrogate outcomes in PAH.


Subject(s)
Hypertension, Pulmonary , Pulmonary Arterial Hypertension , Humans , Female , Male , Middle Aged , Sex Characteristics , Randomized Controlled Trials as Topic , Familial Primary Pulmonary Hypertension , Hemodynamics
15.
Genet Epidemiol ; 46(8): 555-571, 2022 12.
Article in English | MEDLINE | ID: mdl-35924480

ABSTRACT

Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.


Subject(s)
Genetic Heterogeneity , Precision Medicine , Humans , Precision Medicine/methods , Machine Learning , Phenotype
16.
Chest ; 162(2): 436-447, 2022 08.
Article in English | MEDLINE | ID: mdl-35247393

ABSTRACT

BACKGROUND: Obesity is increasingly prevalent in pulmonary arterial hypertension (PAH) but is associated with improved survival, creating an "obesity paradox" in PAH. It is unknown if the improved outcomes could be attributable to obese patients deriving a greater benefit from PAH therapies. RESEARCH QUESTION: Does BMI modify treatment effectiveness in PAH? STUDY DESIGN AND METHODS: Using individual participant data, a meta-analysis was conducted of phase III, randomized, placebo-controlled trials of treatments for PAH submitted for approval to the U.S. Food and Drug Administration from 2000 to 2015. Primary outcomes were change in 6-min walk distance (6MWD) and World Health Organization (WHO) functional class. RESULTS: A total of 5,440 participants from 17 trials were included. Patients with overweight and obesity had lower baseline 6MWD and were more likely to be WHO functional class III or IV. Treatment was associated with a 27.01-m increase in 6MWD (95% CI, 21.58-32.45; P < .001) and lower odds of worse WHO functional class (OR, 0.58; 95% CI, 0.48-0.70; P < .001). For every 1 kg/m2 increase in BMI, 6MWD was reduced by 0.66 m (P = .07); there was no significant effect modification of treatment response in 6MWD according to BMI (P for interaction = .34). Higher BMI was not associated with odds of WHO functional class at end of follow-up; however, higher BMI attenuated the treatment response such that every 1 kg/m2 increase in BMI increased odds of worse WHO functional class by 3% (OR, 1.03; P for interaction = .06). INTERPRETATION: Patients with overweight and obesity had lower baseline 6MWD and worse WHO functional class than patients with normal weight with PAH. Higher BMI did not modify the treatment response for change in 6MWD, but it attenuated the treatment response for WHO functional class. PAH trials should include participants representative of all weight groups to allow for assessment of treatment heterogeneity and mechanisms.


Subject(s)
Hypertension, Pulmonary , Pulmonary Arterial Hypertension , Antihypertensive Agents/therapeutic use , Clinical Trials, Phase III as Topic , Familial Primary Pulmonary Hypertension , Humans , Obesity/complications , Obesity/epidemiology , Overweight , Randomized Controlled Trials as Topic , Treatment Outcome
17.
BioData Min ; 15(1): 4, 2022 Feb 12.
Article in English | MEDLINE | ID: mdl-35151364

ABSTRACT

BACKGROUND: Gene set enrichment analysis (GSEA) uses gene-level univariate associations to identify gene set-phenotype associations for hypothesis generation and interpretation. We propose that GSEA can be adapted to incorporate SNP and gene-level interactions. To this end, gene scores are derived by Relief-based feature importance algorithms that efficiently detect both univariate and interaction effects (MultiSURF) or exclusively interaction effects (MultiSURF*). We compare these interaction-sensitive GSEA approaches to traditional χ2 rankings in simulated genome-wide array data, and in a target and replication cohort of congenital heart disease patients with conotruncal defects (CTDs). RESULTS: In the simulation study and for both CTD datasets, both Relief-based approaches to GSEA captured more relevant and significant gene ontology terms compared to the univariate GSEA. Key terms and themes of interest include cell adhesion, migration, and signaling. A leading edge analysis highlighted semaphorins and their receptors, the Slit-Robo pathway, and other genes with roles in the secondary heart field and outflow tract development. CONCLUSIONS: Our results indicate that interaction-sensitive approaches to enrichment analysis can improve upon traditional univariate GSEA. This approach replicated univariate findings and identified additional and more robust support for the role of the secondary heart field and cardiac neural crest cell migration in the development of CTDs.

18.
AMIA Annu Symp Proc ; 2022: 606-615, 2022.
Article in English | MEDLINE | ID: mdl-37128417

ABSTRACT

Our objective was to detect common barriers to post-acute care (B2PAC) among hospitalized older adults using natural language processing (NLP) of clinical notes from patients discharged home when a clinical decision support system recommended post-acute care. We annotated B2PAC sentences from discharge planning notes and developed an NLP classifier to identify the highest-value B2PAC class (negative patient preferences). Thirteen machine learning models were compared with Amazon's AutoGluon deep learning model. The study included 594 acute care notes from 100 patient encounters (1156 sentences contained 11 B2PAC) in a large academic health system. The most frequent and modifiable B2PAC class was negative patient preferences (18.3%). The best supervised model was Extreme Gradient Boosting (F1: 0.859), but the deep learning model performed better (F1: 0.916). Alerting clinicians of negative patient preferences early in the hospitalization can prompt interventions such as patient education to ensure patients receive the right level of care and avoid negative outcomes.


Subject(s)
Natural Language Processing , Patient Preference , Humans , Aged , Subacute Care , Machine Learning , Referral and Consultation , Electronic Health Records
19.
Methods Inf Med ; 61(1-02): 3-10, 2022 05.
Article in English | MEDLINE | ID: mdl-34820791

ABSTRACT

OBJECTIVE: Data harmonization is essential to integrate individual participant data from multiple sites, time periods, and trials for meta-analysis. The process of mapping terms and phrases to an ontology is complicated by typographic errors, abbreviations, truncation, and plurality. We sought to harmonize medical history (MH) and adverse events (AE) term records across 21 randomized clinical trials in pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension. METHODS: We developed and applied a semi-automated harmonization pipeline for use with domain-expert annotators to resolve ambiguous term mappings using exact and fuzzy matching. We summarized MH and AE term mapping success, including map quality measures, and imputation of a generalizing term hierarchy as defined by the applied Medical Dictionary for Regulatory Activities (MedDRA) ontology standard. RESULTS: Over 99.6% of both MH (N = 37,105) and AE (N = 58,170) records were successfully mapped to MedDRA low-level terms. Automated exact matching accounted for 74.9% of MH and 85.5% of AE mappings. Term recommendations from fuzzy matching in the pipeline facilitated annotator mapping of the remaining 24.9% of MH and 13.8% of AE records. Imputation of the generalized MedDRA term hierarchy was unambiguous in 85.2% of high-level terms, 99.4% of high-level group terms, and 99.5% of system organ class in MH, and 75% of high-level terms, 98.3% of high-level group terms, and 98.4% of system organ class in AE. CONCLUSION: This pipeline dramatically reduced the burden of manual annotation for MH and AE term harmonization and could be adapted to other data integration efforts.


Subject(s)
Adverse Drug Reaction Reporting Systems , Pulmonary Arterial Hypertension , Humans , Pulmonary Arterial Hypertension/drug therapy , Randomized Controlled Trials as Topic
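The harmonization pipeline in the abstract above combines automated exact matching with fuzzy-match recommendations that a human annotator reviews. A minimal sketch of that two-stage idea follows, using Python's standard `difflib` module. The `map_term` function, the similarity cutoff, and the four dictionary terms are illustrative assumptions, not the authors' pipeline or an actual MedDRA extract.

```python
from difflib import get_close_matches

def normalize(term):
    """Lowercase and collapse whitespace so trivial variants match exactly."""
    return " ".join(term.lower().split())

def map_term(raw, dictionary, cutoff=0.8, max_suggestions=3):
    """Try an exact match first; otherwise return fuzzy suggestions for review."""
    lookup = {normalize(t): t for t in dictionary}
    key = normalize(raw)
    if key in lookup:
        return {"status": "exact", "match": lookup[key], "suggestions": []}
    close = get_close_matches(key, list(lookup), n=max_suggestions, cutoff=cutoff)
    return {
        "status": "needs_review" if close else "unmapped",
        "match": None,
        "suggestions": [lookup[c] for c in close],
    }

# Illustrative stand-in for a controlled vocabulary.
dictionary = ["Headache", "Nausea", "Dyspnoea", "Peripheral oedema"]

exact = map_term("  HEADACHE ", dictionary)
fuzzy = map_term("headach", dictionary)      # truncated record
unmapped = map_term("xyz", dictionary)
print(exact["status"], fuzzy["suggestions"], unmapped["status"])
```

Fuzzy hits are returned as suggestions rather than accepted automatically, mirroring the abstract's description of annotators resolving ambiguous mappings.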
20.
Ann Am Thorac Soc ; 19(6): 952-961, 2022 06.
Article in English | MEDLINE | ID: mdl-34936541

ABSTRACT

Rationale: The population of patients with pulmonary arterial hypertension (PAH) has evolved over time from predominantly young White women to an older, more racially diverse and obese population. Whether these changes are reflected in clinical trials is not known. Objectives: To determine secular and regional trends among PAH trial participants. Methods: We performed a pooled cohort analysis using harmonized data from phase III clinical trials of PAH therapies submitted to the U.S. Food and Drug Administration. We used mixed-effects linear and logistic regression to assess regional differences in participant age, sex, body habitus, and hemodynamics over time. Results: A total of 6,599 participants were enrolled in 18 trials between 1998 and 2013; 78% were female. The mean age of participants in North America, Europe, and Latin America at the time of study start increased by 2.09 (95% confidence interval [CI], 0.67-3.51), 1.62 (95% CI, 0.24-3.00), and 4.75 (95% CI, 2.29-7.21) years per 5 years, respectively (P = 0.01). Body mass index at the time of study start increased by 0.72 kg/m2 per 5 years (95% CI, 0.44-0.99; P < 0.001) across all regions. Eighty-five percent of participants in early studies were non-Hispanic White, but this decreased over time to 70%. Ninety-seven percent of Asians and 74% of Hispanics in the sample were recruited from Asia and Latin America. Conclusions: Patients enrolled in more recent PAH therapy trials are older and more obese, mirroring the changing epidemiology of observational cohorts. However, these trends varied by geographic region. PAH cohorts remain predominantly female, presenting challenges for generalizability to male patients. Although the proportion of non-White participants increased over time, this was primarily through recruitment in Asia and Latin America.


Subject(s)
Pulmonary Arterial Hypertension , Cohort Studies , Europe/epidemiology , Familial Primary Pulmonary Hypertension , Female , Humans , Male , Obesity , Pulmonary Arterial Hypertension/drug therapy , Pulmonary Arterial Hypertension/epidemiology , United States/epidemiology