1.
Sci Rep ; 13(1): 294, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36609415

ABSTRACT

Left ventricular ejection fraction (EF) is a key measure in the diagnosis and treatment of heart failure (HF), and many patients experience changes in EF over time. Large-scale analysis of longitudinal changes in EF using electronic health records (EHRs) is limited. In a multi-site retrospective study using EHR data from three academic medical centers, we investigated longitudinal changes in EF measurements in patients diagnosed with HF. Compared with a previous study based on HF registry data, we observed significant variations in the baseline characteristics and longitudinal EF change behavior of the HF cohorts. Data gathered from this longitudinal study were used to develop multiple machine learning models to predict changes in EF measurements in HF patients. Across all three sites, we observed higher performance in predicting EF increase over a 1-year duration, with similarly higher performance in predicting an EF increase of 30% from baseline compared to lower percentage increases. In predicting EF decrease, we found moderate to high performance with low confidence for various models. Among the machine learning models, XGBoost performed best in predicting EF changes. Across the three sites, the XGBoost model had F1-scores of 87.2, 89.9, and 88.6 and AUCs of 0.83, 0.87, and 0.90 in predicting a 30% increase in EF, and F1-scores of 95.0, 90.6, and 90.1 and AUCs of 0.54, 0.56, and 0.68 in predicting a 30% decrease in EF. Among features that contribute to predicting EF changes, baseline EF measurement, age, gender, and heart diseases were found to be statistically significant.
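The abstract above reports a high F1-score alongside a near-chance AUC for the EF-decrease task. As a minimal sketch of how the two metrics are computed and how they can diverge, the Python below evaluates a toy classifier that assigns the same score to every patient. The labels and scores are synthetic, the function names are illustrative, and F1 is on a 0-1 scale here; this is not the authors' evaluation code, and the abstract does not state whether class imbalance explains the study's numbers.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic, imbalanced toy data: 9 positives, 1 negative.
y_true = [1] * 9 + [0]
# An uninformative model: every patient gets the same score.
scores = [0.5] * 10
# Thresholding at 0.5 therefore labels everyone positive.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy_f1 = f1_score(y_true, y_pred)
toy_auc = roc_auc(y_true, scores)
print(f"F1 = {toy_f1:.3f}, AUC = {toy_auc:.2f}")
```

In this toy case F1 is high because the positive class dominates, while AUC stays at chance level because the scores carry no ranking information.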


Subject(s)
Heart Failure , Ventricular Function, Left , Humans , Electronic Health Records , Longitudinal Studies , Machine Learning , Prognosis , Retrospective Studies , Stroke Volume
2.
J Biomed Inform ; 118: 103789, 2021 06.
Article in English | MEDLINE | ID: mdl-33862230

ABSTRACT

Patients treated in an intensive care unit (ICU) are critically ill and require life-sustaining organ failure support. Existing critical care data resources are limited to a select number of institutions, contain only ICU data, and do not enable the study of local changes in care patterns. To address these limitations, we developed the Critical carE Database for Advanced Research (CEDAR), a method for automating extraction and transformation of data from an electronic health record (EHR) system. Compared to an existing gold standard of manually collected data at our institution, CEDAR was statistically similar in most measures, including patient demographics and sepsis-related organ failure assessment (SOFA) scores. Additionally, CEDAR automated data extraction obviated the need for manual collection of 550 variables. Critically, during the spring 2020 COVID-19 surge in New York City, a modified version of CEDAR supported pandemic response efforts, including clinical operations and research. Other academic medical centers may find value in using the CEDAR method to automate data extraction from EHR systems to support ICU activities.


Subject(s)
COVID-19 , Databases, Factual , Electronic Health Records , Intensive Care Units , Aged , Aged, 80 and over , Critical Care , Critical Illness , Female , Humans , Male , Middle Aged , New York City
3.
Ann Am Thorac Soc ; 18(11): 1849-1860, 2021 11.
Article in English | MEDLINE | ID: mdl-33760709

ABSTRACT

Rationale: The Sequential Organ Failure Assessment (SOFA) tool is a commonly used measure of illness severity. Calculation of the respiratory subscore of SOFA is frequently limited by missing arterial oxygen pressure (PaO2) data. Although missing PaO2 data are commonly replaced with normal values, the performance of different methods of substituting PaO2 for SOFA calculation is unclear. Objectives: The study objective was to compare the performance of different substitution strategies for missing PaO2 data for SOFA score calculation. Methods: This retrospective cohort study was performed using the Weill Cornell Critical Care Database for Advanced Research from a tertiary care hospital in the United States. All adult patients admitted to an intensive care unit (ICU) from 2011 to 2019 with an available respiratory SOFA score were included. We analyzed the availability of the PaO2/fraction of inspired oxygen (FiO2) ratio on the first day of ICU admission. In those without a PaO2/FiO2 ratio available, the ratio of oxygen saturation as measured by pulse oximetry to FiO2 was used to calculate a respiratory SOFA subscore according to four methods (linear substitution [Rice], nonlinear substitution [Severinghaus], modified respiratory SOFA, and multiple imputation by chained equations [MICE]) as well as the missing-as-normal technique. We then compared how well the different total SOFA scores discriminated in-hospital mortality. We performed several subgroup and sensitivity analyses. Results: We identified 35,260 unique visits, of which 9,172 included predominant respiratory failure. PaO2 data were available for 14,939 (47%). The area under the receiver operating characteristic curve for each substitution technique for discriminating in-hospital mortality was higher than that for the missing-as-normal technique (0.78 [0.77-0.79]) in all analyses (modified, 0.80 [0.79-0.81]; Rice, 0.80 [0.79-0.81]; Severinghaus, 0.80 [0.79-0.81]; and MICE, 0.80 [0.79-0.81]) (P < 0.01). 
Each substitution method had a higher accuracy for discriminating in-hospital mortality (MICE, 0.67; Rice, 0.67; modified, 0.66; and Severinghaus, 0.66) than the missing-as-normal technique. Model calibration for in-hospital mortality was less precise for the missing-as-normal technique than for the other substitution techniques at the lower range of SOFA and among the subgroups. Conclusions: Using physiologic and statistical substitution methods improved the total SOFA score's ability to discriminate mortality compared with the missing-as-normal technique. Treating missing data as normal may result in underreporting the severity of illness compared with using substitution. The simplicity of a direct oxygen saturation as measured by pulse oximetry/FiO2 ratio-modified SOFA technique makes it an attractive choice for electronic health record-based research. This knowledge can inform comparisons of severity of illness across studies that used different techniques.
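The contrast between the missing-as-normal technique and a linear substitution can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: it uses the published Rice relation S/F = 64 + 0.84 × (P/F), the standard respiratory SOFA cut-offs (400, 300, 200, 100, with scores of 3 and 4 requiring respiratory support), SpO2 expressed in percent, and FiO2 as a fraction. The function names and the `on_support` flag are hypothetical, and the Severinghaus, modified SOFA, and MICE methods are not shown.

```python
from typing import Optional


def sofa_resp_from_pf(pf_ratio: float, on_support: bool) -> int:
    """Map a PaO2/FiO2 ratio (mmHg) to the respiratory SOFA subscore."""
    if pf_ratio >= 400:
        return 0
    if pf_ratio >= 300:
        return 1
    # Scores 3 and 4 are assigned only with respiratory support.
    if pf_ratio >= 200 or not on_support:
        return 2
    if pf_ratio >= 100:
        return 3
    return 4


def pf_from_sf_linear(sf_ratio: float) -> float:
    """Invert the Rice linear relation S/F = 64 + 0.84 * (P/F)."""
    return (sf_ratio - 64.0) / 0.84


def respiratory_subscore(
    pao2: Optional[float],
    spo2: Optional[float],
    fio2: float,
    on_support: bool,
    strategy: str = "linear",
) -> int:
    """Respiratory SOFA subscore with a chosen strategy for missing PaO2."""
    if strategy not in ("linear", "missing_as_normal"):
        raise ValueError(f"unknown strategy: {strategy}")
    if pao2 is not None:
        return sofa_resp_from_pf(pao2 / fio2, on_support)
    if strategy == "missing_as_normal":
        return 0  # missing PaO2 is treated as normal oxygenation
    if spo2 is None:
        raise ValueError("linear substitution needs an SpO2 value")
    return sofa_resp_from_pf(pf_from_sf_linear(spo2 / fio2), on_support)


# Hypothetical patient: no blood gas, SpO2 92% on FiO2 0.5, ventilated.
linear_score = respiratory_subscore(None, 92.0, 0.5, True, "linear")
normal_score = respiratory_subscore(None, 92.0, 0.5, True, "missing_as_normal")
print(linear_score, normal_score)
```

For this hypothetical patient the estimated P/F ratio is about 143, so the linear substitution assigns a subscore of 3 while missing-as-normal assigns 0, which is the kind of severity underreporting the conclusions describe.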


Subject(s)
Organ Dysfunction Scores , Oximetry , Humans , Intensive Care Units , Oxygen , Prognosis , ROC Curve , Retrospective Studies
4.
Appl Clin Inform ; 11(5): 785-791, 2020 10.
Article in English | MEDLINE | ID: mdl-33241548

ABSTRACT

BACKGROUND: Although federal regulations mandate documentation of structured race data according to Office of Management and Budget (OMB) categories in electronic health record (EHR) systems, many institutions have reported gaps in EHR race data that hinder secondary use for population-level research focused on underserved populations. When evaluating race data available for research purposes, we found our institution's enterprise EHR contained structured race data for only 51% (1.6 million) of patients. OBJECTIVES: We seek to improve the availability and quality of structured race data available to researchers by integrating values from multiple local sources. METHODS: To address the deficiency in race data availability, we implemented a method to supplement OMB race values from four local sources: inpatient EHR, inpatient billing, natural language processing, and coded clinical observations. We evaluated this method by measuring race data availability and data quality with respect to completeness, concordance, and plausibility. RESULTS: The supplementation method improved race data availability in the enterprise EHR by up to 10% for some minority groups and 4% overall. We identified structured OMB race values for more than 142,000 patients, nearly a third of whom were from racial minority groups. Our data quality evaluation indicated that the supplemented race values improved completeness in the enterprise EHR, originated from sources in agreement with the enterprise EHR, and were unbiased relative to the enterprise EHR. CONCLUSION: Implementation of this method can successfully increase OMB race data availability, potentially enhancing accrual of patients from underserved populations to research studies.
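One simple way to picture supplementation from several sources is a priority-ordered fallback, sketched below in Python. The abstract names the four supplementary sources but does not describe how they were combined, so the priority order, the source keys, and the function names here are all assumptions made for illustration.

```python
from typing import Dict, List, Mapping, Optional, Tuple

# Hypothetical priority order: enterprise EHR first, then the four local sources.
SOURCE_PRIORITY: Tuple[str, ...] = (
    "enterprise_ehr",
    "inpatient_ehr",
    "inpatient_billing",
    "nlp",
    "coded_observations",
)


def supplement_race(
    values: Mapping[str, Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (race value, source) from the first source with a non-empty value."""
    for source in SOURCE_PRIORITY:
        value = values.get(source)
        if value:
            return value, source
    return None, None


def completeness(patients: List[Dict[str, Optional[str]]]) -> float:
    """Fraction of patients with a race value after supplementation."""
    if not patients:
        return 0.0
    filled = sum(1 for p in patients if supplement_race(p)[0] is not None)
    return filled / len(patients)


# Synthetic records for illustration only.
patients = [
    {"enterprise_ehr": "White"},
    {"enterprise_ehr": None, "nlp": "Asian"},
    {"enterprise_ehr": None},
]
print(supplement_race(patients[1]), completeness(patients))
```

Recording the source alongside the value keeps provenance available for the kind of concordance checks the abstract mentions.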


Subject(s)
Electronic Health Records , Natural Language Processing , Computer Systems , Data Accuracy , Documentation , Humans
5.
AMIA Jt Summits Transl Sci Proc ; 2020: 589-596, 2020.
Article in English | MEDLINE | ID: mdl-32477681

ABSTRACT

Developed to enable basic queries for cohort discovery, i2b2 has evolved to support complex queries. Little is known about whether query sophistication, and the informatics resources required to support it, addresses researcher needs. In three years at our institution, 609 researchers ran 6,662 queries and requested re-identification of 80 patient cohorts to support specific studies. After characterizing all queries as "basic" or "complex" with respect to use of sophisticated query features, we found that the majority of all queries, and the majority of queries resulting in a request for cohort re-identification, did not use complex i2b2 features. Data domains that required extensive effort to implement saw relatively little use compared to common domains (e.g., diagnoses). These findings suggest that efforts to ensure the performance of basic queries using common data domains may better serve the needs of the research community than efforts to integrate novel domains or introduce complex new features.

6.
J Biomed Inform ; 84: 179-183, 2018 08.
Article in English | MEDLINE | ID: mdl-30009991

ABSTRACT

Although i2b2, a popular platform for patient cohort discovery using electronic health record (EHR) data, can support multiple projects specific to individual disease areas or research interests, the standard approach for doing so duplicates data across projects, requiring additional disk space and processing time, which limits scalability. To address this deficiency, we developed a novel approach that stored data in a single i2b2 fact table and used structured query language (SQL) views to access data for specific projects. Compared to the standard approach, the view-based approach reduced required disk space by 59% and extract-transform-load (ETL) time by 46%, without substantially impacting query performance. The view-based approach has enabled scalability of multiple i2b2 projects and generalized to another data model at our institution. Other institutions may benefit from this approach, the code for which is available on GitHub (https://github.com/wcmc-research-informatics/super-i2b2).
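The idea of one shared fact table with per-project SQL views can be sketched with Python's built-in `sqlite3` module. This is a simplified illustration, not the code in the linked repository: `observation_fact` is the i2b2 fact table name, but the three-column schema, the `project_patient` mapping table, the view name, and the rows are hypothetical.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Create an in-memory database with one fact table and a project view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Single shared fact table (heavily simplified schema).
        CREATE TABLE observation_fact (
            patient_num INTEGER,
            concept_cd  TEXT,
            start_date  TEXT
        );
        -- Hypothetical mapping of patients to projects.
        CREATE TABLE project_patient (
            project_id  TEXT,
            patient_num INTEGER
        );
        INSERT INTO observation_fact VALUES
            (1, 'ICD10:I50.9', '2018-01-05'),
            (2, 'ICD10:E11.9', '2018-02-10'),
            (3, 'ICD10:I50.9', '2018-03-15');
        INSERT INTO project_patient VALUES
            ('heart_failure', 1),
            ('heart_failure', 3),
            ('diabetes', 2);
        -- The project reads facts through a view, so no rows are copied.
        CREATE VIEW heart_failure_observation_fact AS
            SELECT f.patient_num, f.concept_cd, f.start_date
            FROM observation_fact AS f
            JOIN project_patient AS p ON p.patient_num = f.patient_num
            WHERE p.project_id = 'heart_failure';
        """
    )
    return conn


conn = build_demo_connection()
project_rows = conn.execute(
    "SELECT patient_num FROM heart_failure_observation_fact ORDER BY patient_num"
).fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM observation_fact").fetchone()[0]
print(project_rows, total_rows)
```

Because the view only filters the shared table, adding a project means adding a view and its mapping rows rather than loading a second copy of the facts, which is the source of the disk and load-time savings described above.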


Subject(s)
Electronic Health Records , Medical Informatics/methods , Medical Informatics/organization & administration , Academic Medical Centers , Algorithms , Cohort Studies , Humans , Information Storage and Retrieval , Language , New York , Reproducibility of Results , Software , Translational Research, Biomedical/organization & administration
7.
AMIA Annu Symp Proc ; 2017: 1581-1588, 2017.
Article in English | MEDLINE | ID: mdl-29854228

ABSTRACT

Academic medical centers commonly approach secondary use of electronic health record (EHR) data by implementing centralized clinical data warehouses (CDWs). However, CDWs require extensive resources to model data dimensions and harmonize clinical terminology, which can hinder effective support of the specific and varied data needs of investigators. We hypothesized that an approach that aggregates raw data from source systems, ignores initial modeling typical of CDWs, and transforms raw data for specific research purposes would meet investigator needs. The approach has successfully enabled multiple tools that provide utility to the institutional research enterprise. To our knowledge, this is the first complete description of a methodology for electronic patient data acquisition and provisioning that ignores data harmonization at the time of initial storage in favor of downstream transformation to address specific research questions and applications.


Subject(s)
Data Aggregation , Data Warehousing , Electronic Health Records , Translational Research, Biomedical , Academic Medical Centers , Clinical Studies as Topic , Data Mining , Electronic Health Records/organization & administration , Humans , Information Systems/organization & administration , New York City , Systems Integration