Results 1 - 5 of 5
1.
Nat Aging ; 4(3): 379-395, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38383858

ABSTRACT

Identification of Alzheimer's disease (AD) onset risk can facilitate interventions before irreversible disease progression. We demonstrate that electronic health records from the University of California, San Francisco, combined with knowledge networks (for example, SPOKE) allow for (1) prediction of AD onset, (2) prioritization of biological hypotheses, and (3) contextualization of sex dimorphism. We trained random forest models to predict AD onset in a cohort of 749 individuals with AD and 250,545 controls, with a mean area under the receiver operating characteristic curve of 0.72 (7 years prior to onset) to 0.81 (1 day prior). We further harnessed matched-cohort models to identify conditions with predictive power before AD onset. Knowledge networks highlight shared genes between multiple top predictors and AD (for example, APOE, ACTB, IL6 and INS). Genetic colocalization analysis supports an association of AD with hyperlipidemia at the APOE locus, as well as a stronger female AD association with osteoporosis at a locus near MS4A6A. We therefore show how clinical data can be utilized for early AD prediction and identification of personalized biological hypotheses.
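As a rough illustration of the modeling step described in this abstract, the following Python sketch trains a random forest on a placeholder patient-by-feature matrix and reports AUROC. The feature encoding, cohort construction, and hyperparameters are assumptions and do not reproduce the authors' UCSF pipeline.

```python
# Hypothetical sketch: a random forest on EHR-derived features to predict AD onset.
# The binary feature encoding and synthetic data are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder feature matrix: rows are patients, columns are binary indicators
# for diagnoses/medications recorded before the prediction date.
n_patients, n_features = 5000, 200
X = rng.integers(0, 2, size=(n_patients, n_features)).astype(float)
y = rng.integers(0, 2, size=n_patients)  # 1 = later AD diagnosis, 0 = control

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=500, class_weight="balanced", random_state=0)
model.fit(X_train, y_train)

auroc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUROC: {auroc:.2f}")
```

In practice, one such model would be trained per prediction horizon (for example, 7 years versus 1 day before onset), which is how the range of AUROC values in the abstract arises.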


Subject(s)
Alzheimer Disease , Male , Humans , Female , Alzheimer Disease/diagnosis , Electronic Health Records , Apolipoproteins E/genetics , San Francisco
2.
Chest ; 165(6): 1481-1490, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38199323

ABSTRACT

BACKGROUND: Language in nonmedical data sets is known to transmit human-like biases when used in natural language processing (NLP) algorithms, which can reinforce disparities. It is unclear whether NLP algorithms trained on medical notes could transmit biases in a similar way.

RESEARCH QUESTION: Can we identify implicit bias in clinical notes, and are biases stable across time and geography?

STUDY DESIGN AND METHODS: To determine whether different racial and ethnic descriptors are contextually similar to stigmatizing language in ICU notes and whether these relationships are stable across time and geography, we identified notes on critically ill adults admitted to the University of California, San Francisco (UCSF), from 2012 through 2022 and to Beth Israel Deaconess Medical Center (BIDMC) from 2001 through 2012. Because word meaning is derived largely from context, we trained unsupervised word-embedding algorithms to quantitatively measure the similarity (cosine similarity) of the context between a racial or ethnic descriptor (eg, African-American) and a stigmatizing target word (eg, noncooperative) or group of words (violence, passivity, noncompliance, nonadherence).

RESULTS: In UCSF notes, Black descriptors were less likely to be contextually similar to violent words compared with White descriptors. In contrast, in BIDMC notes, Black descriptors were more likely to be contextually similar to violent words compared with White descriptors. The UCSF data set also showed that Black descriptors were more contextually similar to passivity and noncompliance words compared with Latinx descriptors.

INTERPRETATION: Implicit bias is identifiable in ICU notes. Racial and ethnic group descriptors carry different contextual relationships to stigmatizing words, depending on when and where notes were written. Because NLP models seem able to transmit implicit bias from training data, use of NLP algorithms in clinical prediction could reinforce disparities. Active debiasing strategies may be necessary to achieve algorithmic fairness when using language models in clinical research.
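A minimal sketch of the word-embedding comparison described above, assuming a toy tokenized corpus with placeholder descriptor tokens rather than real clinical notes; it shows how cosine similarity between a descriptor and one or more stigmatizing target words can be computed with gensim.

```python
# Illustrative sketch only: the corpus, descriptor tokens, and target words are
# placeholders, not the study's actual notes or vocabulary.
from gensim.models import Word2Vec

# Toy "notes" already tokenized into lowercase word lists.
notes = [
    ["descriptor_a", "patient", "agitated", "noncooperative", "with", "staff"],
    ["descriptor_b", "patient", "calm", "cooperative", "adherent", "to", "plan"],
    ["descriptor_a", "patient", "resting", "comfortably", "family", "at", "bedside"],
] * 100  # repeat so the toy model has enough co-occurrence counts

model = Word2Vec(notes, vector_size=50, window=3, min_count=1, seed=0, workers=1)

# Cosine similarity between a descriptor and a single stigmatizing target word...
print(model.wv.similarity("descriptor_a", "noncooperative"))

# ...or between a descriptor and a group of target words.
print(model.wv.n_similarity(["descriptor_a"], ["agitated", "noncooperative"]))
```

Comparing these similarity values across descriptors, hospitals, and time periods is the kind of contrast the study reports.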


Subject(s)
Intensive Care Units , Natural Language Processing , Neural Networks, Computer , Humans , Algorithms , Critical Illness/psychology , Bias , Electronic Health Records , Male , Female
4.
J Card Fail ; 29(7): 1017-1028, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36706977

ABSTRACT

BACKGROUND: Pulmonary hypertension (PH) is life-threatening and often diagnosed late in its course. We aimed to evaluate whether an automated deep learning approach to electrocardiogram (ECG) interpretation can detect PH and its clinically important subtypes.

METHODS AND RESULTS: Adults with a right heart catheterization or an echocardiogram within 90 days of an ECG at the University of California, San Francisco (2012-2019) were retrospectively classified as having PH or not. A deep convolutional neural network was trained on patients' 12-lead ECG voltage data, with patients divided into training, development, and test sets in a ratio of 7:1:2. Overall, 5016 patients with PH and 19,454 patients without PH were included. The mean age at the time of ECG was 62.29 ± 17.58 years, and 49.88% were female. The mean interval between ECG and right heart catheterization or echocardiogram was 3.66 days for patients with PH and 2.23 days for patients without PH. In the test dataset, the model achieved an area under the receiver operating characteristic curve, sensitivity, and specificity of 0.89, 0.79, and 0.84, respectively, to detect PH; 0.91, 0.83, and 0.84 to detect precapillary PH; 0.88, 0.81, and 0.81 to detect pulmonary arterial hypertension; and 0.80, 0.73, and 0.76 to detect group 3 PH. We additionally applied the trained model to ECGs from test-set participants obtained up to 2 years before the diagnosis of PH; the area under the receiver operating characteristic curve remained 0.79 or greater.

CONCLUSIONS: A deep learning ECG algorithm can detect PH and PH subtypes around the time of diagnosis, and can detect PH on ECGs obtained up to 2 years before diagnosis by right heart catheterization or echocardiogram. This approach has the potential to decrease diagnostic delays in PH.
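The following is a hypothetical sketch of a 1D convolutional network over 12-lead ECG voltage, in the spirit of the model described above; the architecture, signal length (12 leads at 5000 samples each), and output head are assumptions, not the authors' published network.

```python
# Minimal PyTorch sketch of a 1D CNN for binary PH detection from ECG voltage.
# Layer sizes and the assumed 5000-sample signal length are illustrative only.
import torch
import torch.nn as nn

class ECGConvNet(nn.Module):
    def __init__(self, n_leads: int = 12, n_classes: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_leads, 32, kernel_size=15, stride=2, padding=7),
            nn.BatchNorm1d(32), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=15, stride=2, padding=7),
            nn.BatchNorm1d(64), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),        # global average over time
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):                   # x: (batch, leads, samples)
        z = self.features(x).squeeze(-1)    # (batch, 64)
        return self.classifier(z)           # logits; apply sigmoid for probability

# One batch of 8 synthetic ECGs: 12 leads, 5000 voltage samples each.
model = ECGConvNet()
logits = model(torch.randn(8, 12, 5000))
probs = torch.sigmoid(logits)               # predicted probability of PH
print(probs.shape)                          # torch.Size([8, 1])
```

Separate output heads (or separately trained models) of this kind would be needed to report per-subtype metrics such as those for precapillary PH or group 3 PH.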


Subject(s)
Deep Learning , Heart Failure , Hypertension, Pulmonary , Adult , Humans , Female , Male , Hypertension, Pulmonary/diagnosis , Retrospective Studies , Electrocardiography/methods
5.
Crit Care Explor ; 3(6): e0450, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34136824

ABSTRACT

OBJECTIVES: To evaluate whether different approaches to note text preparation (preprocessing) can affect machine learning model performance for ICU mortality prediction.

DESIGN: Clinical note text was used to build machine learning models for adults admitted to the ICU. The preprocessing strategies studied were none (raw text), cleaning text, stemming, term frequency-inverse document frequency (TF-IDF) vectorization, and creation of n-grams. Model performance was assessed by the area under the receiver operating characteristic curve (AUROC). Models were trained and internally validated on University of California San Francisco data using 10-fold cross-validation and then externally validated on Beth Israel Deaconess Medical Center data.

SETTING: ICUs at University of California San Francisco and Beth Israel Deaconess Medical Center.

SUBJECTS: Ten thousand patients in the University of California San Francisco training and internal testing dataset and 27,058 patients in the external validation dataset from Beth Israel Deaconess Medical Center.

INTERVENTIONS: None.

MEASUREMENTS AND MAIN RESULTS: Mortality rates at Beth Israel Deaconess Medical Center and University of California San Francisco were 10.9% and 7.4%, respectively. Data are presented as AUROC (95% CI) for models validated at University of California San Francisco and AUROC for models validated at Beth Israel Deaconess Medical Center. Models built and trained on University of California San Francisco data for the prediction of in-hospital mortality improved from the raw note text model (AUROC, 0.84; CI, 0.80-0.89) to the TF-IDF model (AUROC, 0.89; CI, 0.85-0.94). When the models developed at University of California San Francisco were applied to Beth Israel Deaconess Medical Center data, there was a similar increase in performance from raw note text (AUROC, 0.72) to the TF-IDF model (AUROC, 0.83).

CONCLUSIONS: Differences in preprocessing strategies for note text affected model discrimination. A preprocessing pathway of cleaning, stemming, and TF-IDF vectorization yielded the greatest improvement in model performance. Further study is needed, with particular emphasis on how to manage implicit author bias present in note text, before natural language processing algorithms are implemented in the clinical setting.
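A minimal sketch of the preprocessing comparison described above, assuming toy notes and labels: a raw bag-of-words baseline versus a cleaned/stemmed/TF-IDF pipeline, each scored with 10-fold cross-validated AUROC. The cleaning and stemming choices and the logistic regression classifier are illustrative assumptions, not the study's exact steps.

```python
# Illustrative comparison of note-text preprocessing strategies for mortality
# prediction. Notes and labels are placeholders.
import re
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()

def clean_and_stem(text: str) -> str:
    # Lowercase, strip non-letters, and stem each token.
    tokens = re.findall(r"[a-z]+", text.lower())
    return " ".join(stemmer.stem(t) for t in tokens)

# Placeholder ICU notes and in-hospital mortality labels.
notes = ["Patient intubated, on pressors, worsening shock.",
         "Stable overnight, weaning oxygen, tolerating diet.",
         "Family meeting held; transitioned to comfort care.",
         "Extubated this morning, transfer to floor planned."] * 50
labels = [1, 0, 1, 0] * 50

raw_model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
tfidf_model = make_pipeline(
    TfidfVectorizer(preprocessor=clean_and_stem, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)

for name, model in [("raw text", raw_model), ("clean+stem+TF-IDF", tfidf_model)]:
    auroc = cross_val_score(model, notes, labels, cv=10, scoring="roc_auc").mean()
    print(f"{name}: mean AUROC = {auroc:.2f}")
```

External validation, as in the study, would mean fitting each pipeline on one hospital's notes and scoring it unchanged on the other hospital's notes rather than cross-validating within a single dataset.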
