Results 1 - 20 of 27
1.
Mol Psychiatry ; 2024 May 23.
Article in English | MEDLINE | ID: mdl-38783055

ABSTRACT

Pharmacogenomic testing has emerged as an aid in clinical decision making for psychiatric providers, but more data is needed regarding its utility in clinical practice and potential impact on patient care. In this cross-sectional study, we determined the real-world prevalence of pharmacogenomic actionability in patients receiving psychiatric care. Potential actionability was based on the prevalence of CYP2C19 and CYP2D6 phenotypes, including CYP2D6 allele-specific copy number variations (CNVs). Combined actionability additionally incorporated CYP2D6 phenoconversion and the novel CYP2C-TG haplotype in patients with available medication data. Across 15,000 patients receiving clinical pharmacogenomic testing, 65% had potentially actionable CYP2D6 and CYP2C19 phenotypes, and phenotype assignment was impacted by CYP2D6 allele-specific CNVs in 2% of all patients. Of 4114 patients with medication data, 42% had CYP2D6 phenoconversion from drug interactions and 20% carried a novel CYP2C haplotype potentially altering actionability. A total of 87% had some form of potential actionability from genetic findings and/or phenoconversion. Genetic variation detected via next-generation sequencing led to phenotype reassignment in 22% of individuals overall (2% in CYP2D6 and 20% in CYP2C19). Ultimately, pharmacogenomic testing using next-generation sequencing identified potential actionability in most patients receiving psychiatric care. Early pharmacogenomic testing may provide actionable insights to aid clinicians in drug prescribing to optimize psychiatric care.

2.
Artif Intell Med ; 123: 102227, 2022 01.
Article in English | MEDLINE | ID: mdl-34998516

ABSTRACT

PURPOSE: Anesthesiologists simultaneously manage several aspects of patient care during general anesthesia. Automating administration of hypnotic agents could enable more precise control of a patient's level of unconsciousness and enable anesthesiologists to focus on the most critical aspects of patient care. Reinforcement learning (RL) algorithms can be used to fit a mapping from patient state to a medication regimen. These algorithms can learn complex control policies that, when paired with modern techniques for promoting model interpretability, offer a promising approach for developing a clinically viable system for automated anesthetic drug delivery. METHODS: We expand on our prior work applying deep RL to automated anesthetic dosing by now using a continuous-action model based on the actor-critic RL paradigm. The proposed RL agent is composed of a policy network that maps observed anesthetic states to a continuous probability density over propofol-infusion rates and a value network that estimates the favorability of observed states. We train and test three versions of the RL agent using varied reward functions. The agent is trained using simulated pharmacokinetic/pharmacodynamic models with randomized parameters to ensure robustness to patient variability. The model is tested on simulations and retrospectively on nine general anesthesia cases collected in the operating room. We utilize Shapley additive explanations to gain an understanding of the factors with the greatest influence over the agent's decision-making. RESULTS: The deep RL agent significantly outperformed a proportional-integral-derivative controller (median episode median absolute performance error 1.9% ± 1.8 and 3.1% ± 1.1, respectively). The model that was rewarded for minimizing total doses performed the best across simulated patient demographics (median episode median performance error 1.1% ± 0.5).
When run on real-world clinical datasets, the agent recommended doses that were consistent with those administered by the anesthesiologist. CONCLUSIONS: The proposed approach marks the first fully continuous deep RL algorithm for automating anesthetic drug dosing. The reward function used by the RL training algorithm can be flexibly designed to encourage desirable practices (e.g., using less anesthetic) and to bolster performance. Through careful analysis of the learned policies, techniques for interpreting dosing decisions, and testing on clinical data, we confirm that the agent's anesthetic dosing is consistent with our understanding of best practices in anesthesia care.


Subject(s)
Propofol , Algorithms , Anesthesia, General , Humans , Reinforcement, Psychology , Retrospective Studies
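
The error figures quoted in the abstract above are median performance errors. As a minimal sketch, assuming the conventional definition of performance error as the percentage deviation of the measured value from its target (the abstract does not spell out the formula), the two summary statistics can be computed as follows. The function names and the sample episode are hypothetical and are not taken from the paper.

```python
from statistics import median


def performance_errors(measured, target):
    """Percentage performance error per sample: 100 * (measured - target) / target."""
    return [100.0 * (m - t) / t for m, t in zip(measured, target)]


def mdpe(measured, target):
    """Median performance error (signed), a measure of bias."""
    return median(performance_errors(measured, target))


def mdape(measured, target):
    """Median absolute performance error, a measure of inaccuracy."""
    return median(abs(pe) for pe in performance_errors(measured, target))


# Hypothetical episode: five samples of a controlled variable with a fixed target of 50.
target = [50.0] * 5
measured = [48.0, 50.0, 51.0, 53.0, 49.0]

print(mdpe(measured, target))   # signed errors are -4, 0, 2, 6, -2
print(mdape(measured, target))  # absolute errors are 4, 0, 2, 6, 2
```

A "median episode" figure would then be the median of these per-episode values across all simulated episodes.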
3.
JACC Cardiovasc Imaging ; 15(3): 395-410, 2022 03.
Article in English | MEDLINE | ID: mdl-34656465

ABSTRACT

OBJECTIVES: This study sought to develop DL models capable of comprehensively quantifying left and right ventricular dysfunction from ECG data in a large, diverse population. BACKGROUND: Rapid evaluation of left and right ventricular function using deep learning (DL) on electrocardiograms (ECGs) can assist diagnostic workflow. However, DL tools to estimate right ventricular (RV) function do not exist, whereas those to estimate left ventricular (LV) function are restricted to quantification of very low LV function only. METHODS: A multicenter study was conducted with data from 5 New York City hospitals: 4 for internal testing and 1 serving as external validation. We created novel DL models to classify left ventricular ejection fraction (LVEF) into categories derived from the latest universal definition of heart failure, estimate LVEF through regression, and predict a composite outcome of either RV systolic dysfunction or RV dilation. RESULTS: We obtained echocardiogram LVEF estimates for 147,636 patients paired to 715,890 ECGs. We used natural language processing (NLP) to extract RV size and systolic function information from 404,502 echocardiogram reports paired to 761,510 ECGs for 148,227 patients. For LVEF classification in internal testing, area under curve (AUC) at detection of LVEF ≤40%, 40% < LVEF ≤50%, and LVEF >50% was 0.94 (95% CI: 0.94-0.94), 0.82 (95% CI: 0.81-0.83), and 0.89 (95% CI: 0.89-0.89), respectively. For external validation, these results were 0.94 (95% CI: 0.94-0.95), 0.73 (95% CI: 0.72-0.74), and 0.87 (95% CI: 0.87-0.88). For regression, the mean absolute error was 5.84% (95% CI: 5.82%-5.85%) for internal testing and 6.14% (95% CI: 6.13%-6.16%) in external validation. For prediction of the composite RV outcome, AUC was 0.84 (95% CI: 0.84-0.84) in both internal testing and external validation. CONCLUSIONS: DL on ECG data can be used to create inexpensive screening, diagnostic, and predictive tools for both LV and RV dysfunction. 
Such tools may bridge the applicability of ECGs and echocardiography and enable prioritization of patients for further interventions for either sided failure progressing to biventricular disease.


Subject(s)
Deep Learning , Ventricular Dysfunction, Left , Ventricular Dysfunction, Right , Electrocardiography , Humans , Predictive Value of Tests , Stroke Volume , Ventricular Dysfunction, Left/diagnostic imaging , Ventricular Dysfunction, Right/diagnostic imaging , Ventricular Function, Left , Ventricular Function, Right
4.
AMIA Jt Summits Transl Sci Proc ; 2021: 345-354, 2021.
Article in English | MEDLINE | ID: mdl-34457149

ABSTRACT

Deep learning models in healthcare may fail to generalize on data from unseen corpora. Additionally, no quantitative metric exists to tell how existing models will perform on new data. Previous studies demonstrated that NLP models of medical notes generalize variably between institutions, but ignored other levels of healthcare organization. We measured SciBERT diagnosis sentiment classifier generalizability between medical specialties using EHR sentences from MIMIC-III. Models trained on one specialty performed better on internal test sets than mixed or external test sets (mean AUCs 0.92, 0.87, and 0.83, respectively; p = 0.016). When models are trained on more specialties, they have better test performances (p < 1e-4). Model performance on new corpora is directly correlated to the similarity between train and test sentence content (p < 1e-4). Future studies should assess additional axes of generalization to ensure deep learning models fulfil their intended purpose across institutions, specialties, and practices.


Subject(s)
Deep Learning , Medicine , Humans , Language , Semantics
5.
Cancer Med ; 10(14): 4805-4813, 2021 07.
Article in English | MEDLINE | ID: mdl-34114376

ABSTRACT

BACKGROUND: In recent years, the fibroblast growth factor receptor (FGFR) pathway has been proven to be an important therapeutic target in bladder cancer. FGFR-targeted therapies are effective for patients with FGFR mutation, which can be discovered through genetic sequencing. However, genetic sequencing is not commonly performed at diagnosis, whereas a histologic assessment of the tumor is. We aim to computationally extract imaging biomarkers from existing tumor diagnostic slides in order to predict FGFR alterations in bladder cancer. METHODS: This study analyzed genomic profiles and H&E-stained tumor diagnostic slides of bladder cancer cases from The Cancer Genome Atlas (n = 418 cases). A convolutional neural network (CNN) identified tumor-infiltrating lymphocytes (TIL). The percentage of the tissue containing TIL ("TIL percentage") was then used to predict FGFR activation status with a logistic regression model. RESULTS: This predictive model could proficiently identify patients with any type of FGFR gene aberration using the CNN-based TIL percentage (sensitivity = 0.89, specificity = 0.42, AUROC = 0.76). A similar model which focused on predicting patients with only FGFR2/FGFR3 mutation was also found to be highly sensitive, but also specific (sensitivity = 0.82, specificity = 0.85, AUROC = 0.86). CONCLUSION: TIL percentage is a computationally derived image biomarker from routine tumor histology that can predict whether a tumor has FGFR mutations. CNNs and other digital pathology methods may complement genome sequencing and provide earlier screening options for candidates of targeted therapies.


Subject(s)
Deep Learning , Mutation , Receptors, Fibroblast Growth Factor/genetics , Urinary Bladder Neoplasms/genetics , Databases, Factual , Female , Gene Expression , Humans , Logistic Models , Lymphocytes, Tumor-Infiltrating , Male , Molecular Targeted Therapy/methods , Neural Networks, Computer , Receptor, Fibroblast Growth Factor, Type 2/genetics , Receptor, Fibroblast Growth Factor, Type 3/genetics , Sensitivity and Specificity , Urinary Bladder Neoplasms/pathology
6.
PLoS One ; 16(5): e0246165, 2021.
Article in English | MEDLINE | ID: mdl-33956800

ABSTRACT

In current anesthesiology practice, anesthesiologists infer the state of unconsciousness without directly monitoring the brain. Drug- and patient-specific electroencephalographic (EEG) signatures of anesthesia-induced unconsciousness have been identified previously. We applied machine learning approaches to construct classification models for real-time tracking of unconscious state during anesthesia-induced unconsciousness. We used cross-validation to select and train the best performing models using 33,159 2s segments of EEG data recorded from 7 healthy volunteers who received increasing infusions of propofol while responding to stimuli to directly assess unconsciousness. Cross-validated models of unconsciousness performed very well when tested on 13,929 2s EEG segments from 3 left-out volunteers collected under the same conditions (median volunteer AUCs 0.99-0.99). Models showed strong generalization when tested on a cohort of 27 surgical patients receiving solely propofol collected in a separate clinical dataset under different circumstances and using different hardware (median patient AUCs 0.95-0.98), with model predictions corresponding with actions taken by the anesthesiologist during the cases. Performance was also strong for 17 patients receiving sevoflurane (alone or in addition to propofol) (median AUCs 0.88-0.92). These results indicate that EEG spectral features can predict unconsciousness, even when tested on a different anesthetic that acts with a similar neural mechanism. With high performance predictions of unconsciousness, we can accurately monitor anesthetic state, and this approach may be used to engineer infusion pumps to intelligibly respond to patients' neural activity.


Subject(s)
Electroencephalography , Machine Learning , Signal Processing, Computer-Assisted , Unconsciousness/physiopathology , Anesthetics, Intravenous/pharmacology , Brain/drug effects , Brain/physiopathology , Electroencephalography/drug effects , Humans , Male , Sevoflurane/adverse effects , Unconsciousness/chemically induced
7.
ACS Omega ; 5(36): 23289-23298, 2020 Sep 15.
Article in English | MEDLINE | ID: mdl-32954180

ABSTRACT

Here, we report a nanoparticle-based probe that affords facile cell labeling with cholesterol in cholesterol efflux (CE) assays. This probe, called ezFlux, was optimized through a screening of multiple nanoformulations engineered with a Förster resonance energy transfer (FRET) reporter. The physicochemical- and bio-similarity of ezFlux to standard semi-synthetic acetylated low-density lipoprotein (acLDL) was confirmed by testing uptake in macrophages, the intracellular route of degradation, and performance in CE assays. A single-step fast self-assembly fabrication makes ezFlux an attractive alternative to acLDL. We also show that CE testing using ezFlux is significantly cheaper than that performed using commercial kits or acLDL. Additionally, we analyze clinical trials that measure CE and show that ezFlux has a place in many research and clinical laboratories worldwide that use CE to assess cellular and lipoprotein function.

8.
Proc IFAC World Congress ; 53(2): 15870-15876, 2020.
Article in English | MEDLINE | ID: mdl-34184002

ABSTRACT

Significant effort toward the automation of general anesthesia has been made in the past decade. One open challenge is in the development of control-ready patient models for closed-loop anesthesia delivery. Standard depth-of-anesthesia tracking does not readily capture inter-individual differences in response to anesthetics, especially those due to age, and does not aim to predict a relationship between a control input (infused anesthetic dose) and system state (commonly, a function of electroencephalography (EEG) signal). In this work, we developed a control-ready patient model for closed-loop propofol-induced anesthesia using data recorded during a clinical study of EEG during general anesthesia in ten healthy volunteers. We used principal component analysis to identify the low-dimensional state-space in which EEG signal evolves during anesthesia delivery. We parameterized the response of the EEG signal to changes in propofol target-site concentration using logistic models. We note that inter-individual differences in anesthetic sensitivity may be captured by varying a constant cofactor of the predicted effect-site concentration. We linked the EEG dose-response with the control input using a pharmacokinetic model. Finally, we present a simple nonlinear model predictive control in silico demonstration of how such a closed-loop system would work.
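
The logistic dose-response idea described in the abstract above can be illustrated with a short sketch. It assumes a standard sigmoid relating effect-site concentration to a normalized EEG state, with inter-individual sensitivity modeled as a constant factor multiplying the predicted concentration. Every parameter name and default value here (`sensitivity`, `c50`, `slope`, `y_min`, `y_max`) is a hypothetical choice for illustration, not a value reported by the authors.

```python
import math


def eeg_response(ce, sensitivity=1.0, c50=3.0, slope=2.0, y_min=0.0, y_max=1.0):
    """Logistic map from effect-site concentration `ce` to a normalized EEG state.

    `sensitivity` scales the predicted concentration, so a value above 1 shifts
    the curve toward lower doses. All defaults are illustrative assumptions.
    """
    x = slope * (sensitivity * ce - c50)
    return y_min + (y_max - y_min) / (1.0 + math.exp(-x))


# A nominal patient reaches the half-maximal response at ce == c50.
print(eeg_response(3.0))

# A more sensitive patient (factor 1.5) reaches the same response at a lower concentration.
print(eeg_response(2.0, sensitivity=1.5))
```

In a closed-loop setting, `ce` would come from a pharmacokinetic model driven by the infusion rate, which is how the abstract links the control input to the EEG response.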

10.
NPJ Digit Med ; 2: 31, 2019.
Article in English | MEDLINE | ID: mdl-31304378

ABSTRACT

Hip fractures are a leading cause of death and disability among older adults. Hip fractures are also the most commonly missed diagnosis on pelvic radiographs, and delayed diagnosis leads to higher cost and worse outcomes. Computer-aided diagnosis (CAD) algorithms have shown promise for helping radiologists detect fractures, but the image features underpinning their predictions are notoriously difficult to understand. In this study, we trained deep-learning models on 17,587 radiographs to classify fracture, 5 patient traits, and 14 hospital process variables. All 20 variables could be individually predicted from a radiograph, with the best performances on scanner model (AUC = 1.00), scanner brand (AUC = 0.98), and whether the order was marked "priority" (AUC = 0.79). Fracture was predicted moderately well from the image (AUC = 0.78) and better when combining image features with patient data (AUC = 0.86, DeLong paired AUC comparison, p = 2e-9) or patient data plus hospital process features (AUC = 0.91, p = 1e-21). Fracture prediction on a test set that balanced fracture risk across patient variables was significantly lower than a random test set (AUC = 0.67, DeLong unpaired AUC comparison, p = 0.003); and on a test set with fracture risk balanced across patient and hospital process variables, the model performed randomly (AUC = 0.52, 95% CI 0.46-0.58), indicating that these variables were the main source of the model's fracture predictions. A single model that directly combines image features, patient, and hospital process data outperforms a Naive Bayes ensemble of an image-only model prediction, patient, and hospital process data. If CAD algorithms are inexplicably leveraging patient and process variables in their predictions, it is unclear how radiologists should interpret their predictions in the context of other known patient data. Further research is needed to illuminate deep-learning decision processes so that computers and clinicians can effectively cooperate.

11.
BMC Med Genomics ; 12(Suppl 6): 108, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31345219

ABSTRACT

BACKGROUND: Genetic loss-of-function variants (LoFs) associated with disease traits are increasingly recognized as critical evidence for the selection of therapeutic targets. We integrated the analysis of genetic and clinical data from 10,511 individuals in the Mount Sinai BioMe Biobank to identify genes with LoFs significantly associated with cardiovascular disease (CVD) traits, and used RNA-sequence data of seven metabolic and vascular tissues isolated from 600 CVD patients in the Stockholm-Tartu Atherosclerosis Reverse Network Engineering Task (STARNET) study for validation. We also carried out in vitro functional studies of several candidate genes, and in vivo studies of one gene. RESULTS: We identified LoFs in 433 genes significantly associated with at least one of 10 major CVD traits. Next, we used RNA-sequence data from the STARNET study to validate 115 of the 433 LoF-harboring genes in that their expression levels were concordantly associated with corresponding CVD traits. Together with the documented hepatic lipid-lowering gene, APOC3, the expression levels of six additional liver LoF-genes were positively associated with levels of plasma lipids in STARNET. Candidate LoF-genes were subjected to gene silencing in HepG2 cells with marked overall effects on cellular LDLR, levels of triglycerides and on secreted APOB100 and PCSK9. In addition, we identified novel LoFs in DGAT2 associated with lower plasma cholesterol and glucose levels in BioMe that were also confirmed in STARNET, and showed that a selective DGAT2-inhibitor in C57BL/6 mice not only significantly lowered fasting glucose levels but also affected body weight. CONCLUSION: In sum, by integrating genetic and electronic medical record data, and leveraging one of the world's largest human RNA-sequence datasets (STARNET), we identified known and novel CVD-trait related genes that may serve as targets for CVD therapeutics and as such merit further investigation.


Subject(s)
Cardiovascular Diseases/genetics , Genomics , Mutation , Cardiovascular Diseases/blood , Cholesterol/blood , Genotype , Humans , Triglycerides/blood
12.
Bioinformatics ; 35(21): 4515-4518, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31214700

ABSTRACT

MOTIVATION: Electronic health records (EHRs) are quickly becoming omnipresent in healthcare, but interoperability issues and technical demands limit their use for biomedical and clinical research. Interactive and flexible software that interfaces directly with EHR data structured around a common data model (CDM) could accelerate more EHR-based research by making the data more accessible to researchers who lack computational expertise and/or domain knowledge. RESULTS: We present PatientExploreR, an extensible application built on the R/Shiny framework that interfaces with a relational database of EHR data in the Observational Medical Outcomes Partnership CDM format. PatientExploreR produces patient-level interactive and dynamic reports and facilitates visualization of clinical data without any programming required. It allows researchers to easily construct and export patient cohorts from the EHR for analysis with other software. This application could enable easier exploration of patient-level data for physicians and researchers. PatientExploreR can incorporate EHR data from any institution that employs the CDM for users with approved access. The software code is free and open source under the MIT license, enabling institutions to install and users to expand and modify the application for their own purposes. AVAILABILITY AND IMPLEMENTATION: PatientExploreR can be freely obtained from GitHub: https://github.com/BenGlicksberg/PatientExploreR. We provide instructions for how researchers with approved access to their institutional EHR can use this package. We also release an open sandbox server of synthesized patient data for users without EHR access to explore: http://patientexplorer.ucsf.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Electronic Health Records , Software , Computers , Databases, Factual , Humans , Observational Studies as Topic
13.
Bioinformatics ; 35(9): 1610-1612, 2019 05 01.
Article in English | MEDLINE | ID: mdl-30304439

ABSTRACT

MOTIVATION: Radiologists have used algorithms for Computer-Aided Diagnosis (CAD) for decades. These algorithms use machine learning with engineered features, and there have been mixed findings on whether they improve radiologists' interpretations. Deep learning offers superior performance but requires more training data and has not been evaluated in joint algorithm-radiologist decision systems. RESULTS: We developed the Computer-Aided Note and Diagnosis Interface (CANDI) for collaboratively annotating radiographs and evaluating how algorithms alter human interpretation. The annotation app collects classification, segmentation, and image captioning training data, and the evaluation app randomizes the availability of CAD tools to facilitate clinical trials on radiologist enhancement. AVAILABILITY AND IMPLEMENTATION: Demonstrations and source code are hosted at (https://candi.nextgenhealthcare.org), and (https://github.com/mbadge/candi), respectively, under GPL-3 license. SUPPLEMENTARY INFORMATION: Supplementary material is available at Bioinformatics online.


Subject(s)
Algorithms , Software , Deep Learning , Humans , Machine Learning , Neural Networks, Computer
14.
PLoS Med ; 15(11): e1002683, 2018 11.
Article in English | MEDLINE | ID: mdl-30399157

ABSTRACT

BACKGROUND: There is interest in using convolutional neural networks (CNNs) to analyze medical imaging to provide computer-aided diagnosis (CAD). Recent work has suggested that image classification CNNs may not generalize to new data as well as previously believed. We assessed how well CNNs generalized across three hospital systems for a simulated pneumonia screening task. METHODS AND FINDINGS: A cross-sectional design with multiple model training cohorts was used to evaluate model generalizability to external sites using split-sample validation. A total of 158,323 chest radiographs were drawn from three institutions: National Institutes of Health Clinical Center (NIH; 112,120 from 30,805 patients), Mount Sinai Hospital (MSH; 42,396 from 12,904 patients), and Indiana University Network for Patient Care (IU; 3,807 from 3,683 patients). These patient populations had an age mean (SD) of 46.9 years (16.6), 63.2 years (16.5), and 49.6 years (17) with a female percentage of 43.5%, 44.8%, and 57.3%, respectively. We assessed individual models using the area under the receiver operating characteristic curve (AUC) for radiographic findings consistent with pneumonia and compared performance on different test sets with DeLong's test. The prevalence of pneumonia was high enough at MSH (34.2%) relative to NIH and IU (1.2% and 1.0%) that merely sorting by hospital system achieved an AUC of 0.861 (95% CI 0.855-0.866) on the joint MSH-NIH dataset. Models trained on data from either NIH or MSH had equivalent performance on IU (P values 0.580 and 0.273, respectively) and inferior performance on data from each other relative to an internal test set (i.e., new data from within the hospital system used for training data; P values both <0.001). 
The highest internal performance was achieved by combining training and test data from MSH and NIH (AUC 0.931, 95% CI 0.927-0.936), but this model demonstrated significantly lower external performance at IU (AUC 0.815, 95% CI 0.745-0.885, P = 0.001). To test the effect of pooling data from sites with disparate pneumonia prevalence, we used stratified subsampling to generate MSH-NIH cohorts that only differed in disease prevalence between training data sites. When both training data sites had the same pneumonia prevalence, the model performed consistently on external IU data (P = 0.88). When a 10-fold difference in pneumonia rate was introduced between sites, internal test performance improved compared to the balanced model (10× MSH risk P < 0.001; 10× NIH P = 0.002), but this outperformance failed to generalize to IU (MSH 10× P < 0.001; NIH 10× P = 0.027). CNNs were able to directly detect hospital system of a radiograph for 99.95% NIH (22,050/22,062) and 99.98% MSH (8,386/8,388) radiographs. The primary limitation of our approach and the available public data is that we cannot fully assess what other factors might be contributing to hospital system-specific biases. CONCLUSION: Pneumonia-screening CNNs achieved better internal than external performance in 3 out of 5 natural comparisons. When models were trained on pooled data from sites with different pneumonia prevalence, they performed better on new pooled data from these sites but not on external data. CNNs robustly identified hospital system and department within a hospital, which can have large differences in disease burden and may confound predictions.


Subject(s)
Deep Learning , Diagnosis, Computer-Assisted/methods , Pneumonia/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Radiography, Thoracic/methods , Adult , Aged , Cross-Sectional Studies , Female , Humans , Male , Middle Aged , Predictive Value of Tests , Radiology Information Systems , Reproducibility of Results , Retrospective Studies , United States
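
The abstract above notes that merely sorting radiographs by hospital system yields a high AUC because pneumonia prevalence differs so much between sites. A back-of-the-envelope sketch of that effect follows. It uses the full-dataset radiograph counts and the rounded prevalences quoted in the abstract, whereas the reported 0.861 was measured on the authors' joint MSH-NIH dataset, so the result is only expected to land close to the published figure. The helper name is hypothetical.

```python
def auc_of_binary_score(pos_hi, pos_lo, neg_hi, neg_lo):
    """AUC of a score that only takes two values ("hi" or "lo").

    Tied positive/negative pairs count as 0.5, which gives
    AUC = 0.5 * (1 + TPR - FPR), where "hi" is treated as the positive call.
    """
    tpr = pos_hi / (pos_hi + pos_lo)
    fpr = neg_hi / (neg_hi + neg_lo)
    return 0.5 * (1.0 + tpr - fpr)


# Counts and prevalences as quoted in the abstract (prevalences are rounded).
n_msh, prev_msh = 42_396, 0.342
n_nih, prev_nih = 112_120, 0.012

pos_msh, pos_nih = n_msh * prev_msh, n_nih * prev_nih
neg_msh, neg_nih = n_msh - pos_msh, n_nih - pos_nih

# The "model" predicts pneumonia for every MSH radiograph and none for NIH.
auc = auc_of_binary_score(pos_msh, pos_nih, neg_msh, neg_nih)
print(f"AUC from hospital label alone: {auc:.3f}")
```

The point of the calculation is that a classifier can score well on pooled data without looking at any pathology, provided it can tell the sites apart.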
15.
Nat Med ; 24(9): 1337-1341, 2018 09.
Article in English | MEDLINE | ID: mdl-30104767

ABSTRACT

Rapid diagnosis and treatment of acute neurological illnesses such as stroke, hemorrhage, and hydrocephalus are critical to achieving positive outcomes and preserving neurologic function: 'time is brain' [1-5]. Although these disorders are often recognizable by their symptoms, the critical means of their diagnosis is rapid imaging [6-10]. Computer-aided surveillance of acute neurologic events in cranial imaging has the potential to triage radiology workflow, thus decreasing time to treatment and improving outcomes. Substantial clinical work has focused on computer-assisted diagnosis (CAD), whereas technical work in volumetric image analysis has focused primarily on segmentation. 3D convolutional neural networks (3D-CNNs) have primarily been used for supervised classification on 3D modeling and light detection and ranging (LiDAR) data [11-15]. Here, we demonstrate a 3D-CNN architecture that performs weakly supervised classification to screen head CT images for acute neurologic events. Features were automatically learned from a clinical radiology dataset comprising 37,236 head CTs and were annotated with a semisupervised natural-language processing (NLP) framework [16]. We demonstrate the effectiveness of our approach to triage radiology workflow and accelerate the time to diagnosis from minutes to seconds through a randomized, double-blinded, prospective trial in a simulated clinical environment.


Subject(s)
Imaging, Three-Dimensional , Neural Networks, Computer , Skull/diagnostic imaging , Algorithms , Automation , Humans , ROC Curve , Randomized Controlled Trials as Topic , Tomography, X-Ray Computed
16.
Radiology ; 287(2): 570-580, 2018 05.
Article in English | MEDLINE | ID: mdl-29381109

ABSTRACT

Purpose To compare different methods for generating features from radiology reports and to develop a method to automatically identify findings in these reports. Materials and Methods In this study, 96 303 head computed tomography (CT) reports were obtained. The linguistic complexity of these reports was compared with that of alternative corpora. Head CT reports were preprocessed, and machine-analyzable features were constructed by using bag-of-words (BOW), word embedding, and Latent Dirichlet allocation-based approaches. Ultimately, 1004 head CT reports were manually labeled for findings of interest by physicians, and a subset of these were deemed critical findings. Lasso logistic regression was used to train models for physician-assigned labels on 602 of 1004 head CT reports (60%) using the constructed features, and the performance of these models was validated on a held-out 402 of 1004 reports (40%). Models were scored by area under the receiver operating characteristic curve (AUC), and aggregate AUC statistics were reported for (a) all labels, (b) critical labels, and (c) the presence of any critical finding in a report. Sensitivity, specificity, accuracy, and F1 score were reported for the best performing model's (a) predictions of all labels and (b) identification of reports containing critical findings. Results The best-performing model (BOW with unigrams, bigrams, and trigrams plus average word embeddings vector) had a held-out AUC of 0.966 for identifying the presence of any critical head CT finding and an average 0.957 AUC across all head CT findings. Sensitivity and specificity for identifying the presence of any critical finding were 92.59% (175 of 189) and 89.67% (191 of 213), respectively. Average sensitivity and specificity across all findings were 90.25% (1898 of 2103) and 91.72% (18 351 of 20 007), respectively. 
Simpler BOW methods achieved results competitive with those of more sophisticated approaches, with an average AUC for presence of any critical finding of 0.951 for unigram BOW versus 0.966 for the best-performing model. The Yule I of the head CT corpus was 34, markedly lower than that of the Reuters corpus (at 103) or I2B2 discharge summaries (at 271), indicating lower linguistic complexity. Conclusion Automated methods can be used to identify findings in radiology reports. The success of this approach benefits from the standardized language of these reports. With this method, a large labeled corpus can be generated for applications such as deep learning. © RSNA, 2018 Online supplemental material is available for this article.


Subject(s)
Electronic Health Records , Machine Learning , Natural Language Processing , Radiology/methods , Tomography, X-Ray Computed , Area Under Curve , Databases, Factual , Humans , Sensitivity and Specificity
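
The sensitivity and specificity percentages in the abstract above follow directly from the counts it reports. The short check below recomputes them and confirms that the counts agree with the stated 602/402 train and held-out split. Only numbers quoted in the abstract are used, and the helper name is hypothetical.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage rounded to two decimals."""
    return round(100.0 * numerator / denominator, 2)


# Presence of any critical finding (held-out reports).
sens_critical = percent(175, 189)   # true positives / reports with a critical finding
spec_critical = percent(191, 213)   # true negatives / reports without one

# Averages across all findings (counts pooled over labels).
sens_all = percent(1898, 2103)
spec_all = percent(18351, 20007)

print(sens_critical, spec_critical, sens_all, spec_all)

# Consistency checks against the stated split of the 1004 labeled reports.
print(189 + 213 == 402, 602 + 402 == 1004)
```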
17.
J Neurointerv Surg ; 10(4): 358-362, 2018 Apr.
Article in English | MEDLINE | ID: mdl-28954825

ABSTRACT

Stroke is a leading cause of long-term disability, and outcome is directly related to timely intervention. Not all patients benefit from rapid intervention, however. Thus, a significant amount of attention has been paid to using neuroimaging to assess potential benefit by identifying areas of ischemia that have not yet experienced cellular death. The perfusion-diffusion mismatch is used as a simple metric for potential benefit with timely intervention, yet penumbral patterns provide an inaccurate predictor of clinical outcome. Deep learning techniques, a form of machine learning (artificial intelligence) built on deep neural networks (DNNs), excel at working with complex inputs. The key areas where deep learning may be imminently applied to stroke management are image segmentation, automated featurization (radiomics), and multimodal prognostication. The application of convolutional neural networks, the family of DNN architectures designed to work with images, to stroke imaging data is a perfect match between a mature deep learning technique and a data type that is naturally suited to benefit from deep learning's strengths. These powerful tools have opened up exciting opportunities for data-driven stroke management for acute intervention and for guiding prognosis. Deep learning techniques are useful for the speed and power of results they can deliver and will become an increasingly standard tool in the modern stroke specialist's arsenal for delivering personalized medicine to patients with ischemic stroke.


Subject(s)
Disease Management , Machine Learning/trends , Neural Networks, Computer , Stroke/therapy , Humans , Neuroimaging/methods , Neuroimaging/trends , Stroke/diagnostic imaging
18.
Brief Bioinform ; 19(4): 656-678, 2018 07 20.
Article in English | MEDLINE | ID: mdl-28200013

ABSTRACT

Increase in global population and growing disease burden due to the emergence of infectious diseases (Zika virus), multidrug-resistant pathogens, drug-resistant cancers (cisplatin-resistant ovarian cancer) and chronic diseases (arterial hypertension) necessitate effective therapies to improve health outcomes. However, the rapid increase in drug development cost demands innovative and sustainable drug discovery approaches. Drug repositioning, the discovery of new or improved therapies by reevaluation of approved or investigational compounds, solves a significant gap in the public health setting and improves the productivity of drug development. As the number of drug repurposing investigations increases, a new opportunity has emerged to understand factors driving drug repositioning through systematic analyses of drugs, drug targets and associated disease indications. However, such analyses have so far been hampered by the lack of a centralized knowledgebase, benchmarking data sets and reporting standards. To address these knowledge and clinical needs, here, we present RepurposeDB, a collection of repurposed drugs, drug targets and diseases, which was assembled, indexed and annotated from public data. RepurposeDB combines information on 253 drugs [small molecules (74.30%) and protein drugs (25.29%)] and 1125 diseases. Using RepurposeDB data, we identified pharmacological (chemical descriptors, physicochemical features and absorption, distribution, metabolism, excretion and toxicity properties), biological (protein domains, functional process, molecular mechanisms and pathway cross talks) and epidemiological (shared genetic architectures, disease comorbidities and clinical phenotype similarities) factors mediating drug repositioning. Collectively, RepurposeDB is developed as the reference database for drug repositioning investigations. The pharmacological, biological and epidemiological principles of drug repositioning identified from the meta-analyses could augment therapeutic development.


Subject(s)
Computational Biology/methods , Databases, Factual , Disease , Drug Discovery , Drug Repositioning , Proteins/metabolism , Humans , Molecular Epidemiology , Proteins/genetics
19.
Nat Biotechnol ; 35(4): 354-362, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28288104

ABSTRACT

The feasibility of using mobile health applications to conduct observational clinical studies requires rigorous validation. Here, we report initial findings from the Asthma Mobile Health Study, a research study, including recruitment, consent, and enrollment, conducted entirely remotely by smartphone. We achieved secure bidirectional data flow between investigators and 7,593 participants from across the United States, including many with severe asthma. Our platform enabled prospective collection of longitudinal, multidimensional data (e.g., surveys, devices, geolocation, and air quality) in a subset of users over the 6-month study period. Consistent trending and correlation of interrelated variables support the quality of data obtained via this method. We detected increased reporting of asthma symptoms in regions affected by heat, pollen, and wildfires. Potential challenges with this technology include selection bias, low retention rates, reporting bias, and data security. These issues require attention to realize the full potential of mobile platforms in research and patient care.


Subject(s)
Asthma/epidemiology , Health Services Research/organization & administration , Health Surveys/statistics & numerical data , Population Surveillance/methods , Research Design , Telemedicine/statistics & numerical data , Adolescent , Adult , Aged , Asthma/diagnosis , Female , Health Surveys/methods , Humans , Male , Middle Aged , New York/epidemiology , Observational Studies as Topic/methods , Patient Selection , Prevalence , Risk Factors , Young Adult
20.
Brief Bioinform ; 18(1): 105-124, 2017 01.
Article in English | MEDLINE | ID: mdl-26876889

ABSTRACT

Monitoring and modeling biomedical, health care and wellness data from individuals and converging data on a population scale have tremendous potential to improve understanding of the transition from the healthy state of human physiology to the disease setting. Wellness monitoring devices and companion software applications capable of generating alerts and sharing data with health care providers or social networks are now available. The accessibility and clinical utility of such data for disease or wellness research are currently limited. Designing methods for streaming data capture, real-time data aggregation, machine learning, predictive analytics and visualization solutions to integrate wellness or health monitoring data elements with the electronic medical records (EMRs) maintained by health care providers permits better utilization. Integration of population-scale biomedical, health care and wellness data would help to stratify patients for active health management and to understand clinically asymptomatic patients and underlying illness trajectories. In this article, we discuss various health-monitoring devices, their ability to capture the unique state of health represented in a patient and their application in individualized diagnostics, prognosis, clinical or wellness intervention. We also discuss examples of translational bioinformatics approaches to integrating patient-generated data with existing EMRs, personal health records, patient portals and clinical data repositories. Briefly, translational bioinformatics methods, tools and resources are at the center of these advances in implementing real-time biomedical and health care analytics in the clinical setting. Furthermore, these advances are poised to play a significant role in clinical decision-making and implementation of data-driven medicine and wellness care.


Subject(s)
Computational Biology , Data Collection , Humans , Software