Search | VHL Regional Portal

1.

INSPIRE, a publicly available research dataset for perioperative medicine.

Lim, Leerang; Lee, Hyeonhoon; Jung, Chul-Woo; Sim, Dayeon; Borrat, Xavier; Pollard, Tom J; Celi, Leo A; Mark, Roger G; Vistisen, Simon T; Lee, Hyung-Chul.

Sci Data ; 11(1): 655, 2024 Jun 21.

Article in English | MEDLINE | ID: mdl-38906912

ABSTRACT

We present the INSPIRE dataset, a publicly available research dataset in perioperative medicine, which includes approximately 130,000 surgical operations at an academic institution in South Korea over a ten-year period between 2011 and 2020. This comprehensive dataset includes patient characteristics such as age, sex, American Society of Anesthesiologists physical status classification, diagnosis, surgical procedure code, department, and type of anaesthesia. The dataset also includes vital signs in the operating theatre, general wards, and intensive care units (ICUs), laboratory results from six months before admission to six months after discharge, and medication during hospitalisation. Complications include total hospital and ICU length of stay and in-hospital death. We hope this dataset will inspire collaborative research and development in perioperative medicine and serve as a reproducible external validation dataset to improve surgical outcomes.

Subjects

Perioperative Medicine , Humans , Republic of Korea , Intensive Care Units

2.

Author Correction: MIMIC-IV, a freely accessible electronic health record dataset.

Johnson, Alistair E W; Bulgarelli, Lucas; Shen, Lu; Gayles, Alvin; Shammout, Ayad; Horng, Steven; Pollard, Tom J; Hao, Sicheng; Moody, Benjamin; Gow, Brian; Lehman, Li-Wei H; Celi, Leo A; Mark, Roger G.

Sci Data ; 10(1): 219, 2023 Apr 18.

Article in English | MEDLINE | ID: mdl-37072428

3.

Author Correction: MIMIC-IV, a freely accessible electronic health record dataset.

Johnson, Alistair E W; Bulgarelli, Lucas; Shen, Lu; Gayles, Alvin; Shammout, Ayad; Horng, Steven; Pollard, Tom J; Moody, Benjamin; Gow, Brian; Lehman, Li-Wei H; Celi, Leo A; Mark, Roger G.

Sci Data ; 10(1): 31, 2023 Jan 16.

Article in English | MEDLINE | ID: mdl-36646711

4.

MIMIC-IV, a freely accessible electronic health record dataset.

Johnson, Alistair E W; Bulgarelli, Lucas; Shen, Lu; Gayles, Alvin; Shammout, Ayad; Horng, Steven; Pollard, Tom J; Hao, Sicheng; Moody, Benjamin; Gow, Brian; Lehman, Li-Wei H; Celi, Leo A; Mark, Roger G.

Sci Data ; 10(1): 1, 2023 Jan 03.

Article in English | MEDLINE | ID: mdl-36596836

ABSTRACT

Digital data collection during routine clinical practice is now ubiquitous within hospitals. The data contains valuable information on the care of patients and their response to treatments, offering exciting opportunities for research. Typically, data are stored within archival systems that are not intended to support research. These systems are often inaccessible to researchers and structured for optimal storage, rather than interpretability and analysis. Here we present MIMIC-IV, a publicly available database sourced from the electronic health record of the Beth Israel Deaconess Medical Center. Information available includes patient measurements, orders, diagnoses, procedures, treatments, and deidentified free-text clinical notes. MIMIC-IV is intended to support a wide array of research studies and educational material, helping to reduce barriers to conducting clinical research.

Subjects

Electronic Health Records , Humans , Databases, Factual , Hospitals

5.

The authors reply.

Raffa, Jesse D; Johnson, Alistair E W; O'Brien, Zach; Pollard, Tom J; Mark, Roger G; Celi, Leo A; Pilcher, David; Badawi, Omar.

Crit Care Med ; 50(11): e801-e802, 2022 Nov 01.

Article in English | MEDLINE | ID: mdl-36227051

6.

The Global Open Source Severity of Illness Score (GOSSIS).

Raffa, Jesse D; Johnson, Alistair E W; O'Brien, Zach; Pollard, Tom J; Mark, Roger G; Celi, Leo A; Pilcher, David; Badawi, Omar.

Crit Care Med ; 50(7): 1040-1050, 2022 Jul 01.

Article in English | MEDLINE | ID: mdl-35354159

ABSTRACT

OBJECTIVES: To develop and demonstrate the feasibility of a Global Open Source Severity of Illness Score (GOSSIS)-1 for critical care patients, which generalizes across healthcare systems and countries. DESIGN: A merger of several critical care multicenter cohorts derived from registry and electronic health record data. Data were split into training (70%) and test (30%) sets, using each set exclusively for development and evaluation, respectively. Missing data were imputed when not available. SETTING/PATIENTS: Two large multicenter datasets from Australia and New Zealand (Australian and New Zealand Intensive Care Society Adult Patient Database [ANZICS-APD]) and the United States (eICU Collaborative Research Database [eICU-CRD]) representing 249,229 and 131,051 patients, respectively. ANZICS-APD and eICU-CRD contributed data from 162 and 204 hospitals, respectively. The cohort included all ICU admissions discharged in 2014-2015, excluding patients less than 16 years old, admissions less than 6 hours, and those with a previous ICU stay. INTERVENTIONS: Not applicable. MEASUREMENTS AND MAIN RESULTS: GOSSIS-1 uses data collected during the ICU stay's first 24 hours, including extrema values for vital signs and laboratory results, admission diagnosis, the Glasgow Coma Scale, chronic comorbidities, and admission/demographic variables. The datasets showed significant variation in admission-related variables, case-mix, and average physiologic state. Despite this heterogeneity, test set discrimination of GOSSIS-1 was high (area under the receiver operator characteristic curve [AUROC], 0.918; 95% CI, 0.915-0.921) and calibration was excellent (standardized mortality ratio [SMR], 0.986; 95% CI, 0.966-1.005; Brier score, 0.050). Performance was held within ANZICS-APD (AUROC, 0.925; SMR, 0.982; Brier score, 0.047) and eICU-CRD (AUROC, 0.904; SMR, 0.992; Brier score, 0.055). 
Compared with GOSSIS-1, Acute Physiology and Chronic Health Evaluation (APACHE)-IIIj (ANZICS-APD) and APACHE-IVa (eICU-CRD), had worse discrimination with AUROCs of 0.904 and 0.869, and poorer calibration with SMRs of 0.594 and 0.770, and Brier scores of 0.059 and 0.063, respectively. CONCLUSIONS: GOSSIS-1 is a modern, free, open-source inhospital mortality prediction algorithm for critical care patients, achieving excellent discrimination and calibration across three countries.
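The three metrics reported in this abstract (AUROC, SMR, and Brier score) can be illustrated with a minimal Python sketch. This is not the GOSSIS-1 code: the function names and the toy data are hypothetical, and the SMR is assumed here to be observed deaths divided by expected deaths, with expected deaths taken as the sum of predicted risks.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def standardized_mortality_ratio(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    return sum(y_true) / sum(y_prob)


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a randomly chosen death receives a higher predicted
    risk than a randomly chosen survivor; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from any of the cohorts): 1 = died in hospital.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
risks = [0.05, 0.10, 0.80, 0.20, 0.60, 0.05, 0.30, 0.40]

print("Brier:", brier_score(outcomes, risks))
print("SMR:  ", standardized_mortality_ratio(outcomes, risks))
print("AUROC:", auroc(outcomes, risks))
```

An SMR close to 1 indicates that predicted and observed mortality agree overall, while a lower Brier score and a higher AUROC indicate better accuracy and discrimination, respectively.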

Subjects

Critical Care , Intensive Care Units , APACHE , Adolescent , Adult , Australia , Hospital Mortality , Humans

7.

Impact of sex on use of low tidal volume ventilation in invasively ventilated ICU patients-A mediation analysis using two observational cohorts.

Swart, Pien; Deliberato, Rodrigo Octavio; Johnson, Alistair E W; Pollard, Tom J; Bulgarelli, Lucas; Pelosi, Paolo; de Abreu, Marcelo Gama; Schultz, Marcus J; Neto, Ary Serpa.

PLoS One ; 16(7): e0253933, 2021.

Article in English | MEDLINE | ID: mdl-34260619

ABSTRACT

BACKGROUND: Studies in patients receiving invasive ventilation show important differences in use of low tidal volume (VT) ventilation (LTVV) between females and males. The aims of this study were to describe temporal changes in VT and to determine what factors drive the sex difference in use of LTVV. METHODS AND FINDINGS: This is a posthoc analysis of 2 large longitudinal projects in 59 ICUs in the United States, the 'Medical information Mart for Intensive Care III' (MIMIC III) and the 'eICU Collaborative Research DataBase'. The proportion of patients under LTVV (median VT < 8 ml/kg PBW), was the primary outcome. Mediation analysis, a method to dissect total effect into direct and indirect effects, was used to understand which factors drive the sex difference. We included 3614 (44%) females and 4593 (56%) males. Median VT declined over the years, but with a persistent difference between females (from median 10.2 (9.1 to 11.4) to 8.2 (7.5 to 9.1) ml/kg PBW) vs. males (from median 9.2 [IQR 8.2 to 10.1] to 7.3 [IQR 6.6 to 8.0] ml/kg PBW) (P < .001). In females versus males, use of LTVV increased from 5 to 50% versus from 12 to 78% (difference, -27% [-29% to -25%]; P < .001). The sex difference was mainly driven by patients' body height and actual body weight (adjusted average causal mediation effect, -30% [-33% to -27%]; P < .001, and 4 [3% to 4%]; P < .001). CONCLUSIONS: While LTVV is increasingly used in females and males, females continue to receive LTVV less often than males. The sex difference is mainly driven by patients' body height and actual body weight, and not necessarily by sex. Use of LTVV in females could improve by paying more attention to a correct calculation of VT, i.e., using the correct body height.
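The dependence of the LTVV threshold on body height can be made concrete with a short Python sketch. The abstract does not state how predicted body weight (PBW) was computed; the sketch assumes the commonly used ARDS Network formula, and the function names and example values are hypothetical.

```python
def predicted_body_weight_kg(height_cm: float, sex: str) -> float:
    """PBW from height and sex (assumed ARDS Network formula, not from the abstract)."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    base = 50.0 if sex == "male" else 45.5
    return base + 0.91 * (height_cm - 152.4)


def is_low_tidal_volume(vt_ml: float, height_cm: float, sex: str,
                        threshold_ml_per_kg: float = 8.0) -> bool:
    """True when VT is below the threshold in ml/kg PBW (8 ml/kg in the abstract)."""
    return vt_ml / predicted_body_weight_kg(height_cm, sex) < threshold_ml_per_kg


# The same absolute tidal volume falls on different sides of the threshold
# for a shorter and a taller patient (hypothetical values).
print(is_low_tidal_volume(450, height_cm=160, sex="female"))
print(is_low_tidal_volume(450, height_cm=175, sex="male"))
```

Because PBW depends only on height and sex, an incorrect or missing height shifts the ml/kg value directly, which is consistent with the abstract's emphasis on using the correct body height.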

Subjects

Intensive Care Units , Mediation Analysis , Respiration, Artificial , Sex Characteristics , Body Weight , Cohort Studies , Female , Humans , Male , Multivariate Analysis , Tidal Volume

8.

VitalDB: fostering collaboration in anaesthesia research.

Vistisen, Simon T; Pollard, Tom J; Enevoldsen, Johannes; Scheeren, Thomas W L.

Br J Anaesth ; 127(2): 184-187, 2021 Aug.

Article in English | MEDLINE | ID: mdl-33888300

Subjects

Anesthesia , Anesthesiology , Data Mining , Humans

9.

Recalibration of deep learning models for abnormality detection in smartphone-captured chest radiograph.

Kuo, Po-Chih; Tsai, Cheng Che; López, Diego M; Karargyris, Alexandros; Pollard, Tom J; Johnson, Alistair E W; Celi, Leo Anthony.

NPJ Digit Med ; 4(1): 25, 2021 Feb 15.

Article in English | MEDLINE | ID: mdl-33589700

ABSTRACT

Image-based teleconsultation using smartphones has become increasingly popular. In parallel, deep learning algorithms have been developed to detect radiological findings in chest X-rays (CXRs). However, the feasibility of using smartphones to automate this process has yet to be evaluated. This study developed a recalibration method to build deep learning models to detect radiological findings on CXR photographs. Two publicly available databases (MIMIC-CXR and CheXpert) were used to build the models, and four derivative datasets containing 6453 CXR photographs were collected to evaluate model performance. After recalibration, the model achieved areas under the receiver operating characteristic curve of 0.80 (95% confidence interval: 0.78-0.82), 0.88 (0.86-0.90), 0.81 (0.79-0.84), 0.79 (0.77-0.81), 0.84 (0.80-0.88), and 0.90 (0.88-0.92), respectively, for detecting cardiomegaly, edema, consolidation, atelectasis, pneumothorax, and pleural effusion. The recalibration strategy, respectively, recovered 84.9%, 83.5%, 53.2%, 57.8%, 69.9%, and 83.0% of performance losses of the uncalibrated model. We conclude that the recalibration method can transfer models from digital CXRs to CXR photographs, which is expected to help physicians' clinical works.

10.

"Yes, but will it work for my patients?" Driving clinically relevant research with benchmark datasets.

Panch, Trishan; Pollard, Tom J; Mattie, Heather; Lindemer, Emily; Keane, Pearse A; Celi, Leo Anthony.

NPJ Digit Med ; 3: 87, 2020.

Article in English | MEDLINE | ID: mdl-32577534

ABSTRACT

Benchmark datasets have a powerful normative influence: by determining how the real world is represented in data, they define which problems will first be solved by algorithms built using the datasets and, by extension, who these algorithms will work for. It is desirable for these datasets to serve four functions: (1) enabling the creation of clinically relevant algorithms; (2) facilitating like-for-like comparison of algorithmic performance; (3) ensuring reproducibility of algorithms; (4) asserting a normative influence on the clinical domains and diversity of patients that will potentially benefit from technological advances. Without benchmark datasets that satisfy these functions, it is impossible to address two perennial concerns of clinicians experienced in computational research: "the data scientists just go where the data is rather than where the needs are," and, "yes, but will this work for my patients?" If algorithms are to be developed and applied for the care of patients, then it is prudent for the research community to create benchmark datasets proactively, across specialties. As yet, best practice in this area has not been defined. Broadly speaking, efforts will include design of the dataset; compliance and contracting issues relating to the sharing of sensitive data; enabling access and reuse; and planning for translation of algorithms to the clinical environment. If a deliberate and systematic approach is not followed, not only will the considerable benefits of clinical algorithms fail to be realized, but the potential harms may be regressively incurred across existing gradients of social inequity.

11.

Deidentification of free-text medical records using pre-trained bidirectional transformers.

Johnson, Alistair E W; Bulgarelli, Lucas; Pollard, Tom J.

Proc ACM Conf Health Inference Learn (2020) ; 2020: 214-221, 2020 Apr.

Article in English | MEDLINE | ID: mdl-34350426

ABSTRACT

The ability of caregivers and investigators to share patient data is fundamental to many areas of clinical practice and biomedical research. Prior to sharing, it is often necessary to remove identifiers such as names, contact details, and dates in order to protect patient privacy. Deidentification, the process of removing identifiers, is challenging, however. High-quality annotated data for developing models is scarce; many target identifiers are highly heterogenous (for example, there are uncountable variations of patient names); and in practice anything less than perfect sensitivity may be considered a failure. As a result, patient data is often withheld when sharing would be beneficial, and identifiable patient data is often divulged when a deidentified version would suffice. In recent years, advances in machine learning methods have led to rapid performance improvements in natural language processing tasks, in particular with the advent of large-scale pretrained language models. In this paper we develop and evaluate an approach for deidentification of clinical notes based on a bidirectional transformer model. We propose human interpretable evaluation measures and demonstrate state of the art performance against modern baseline models. Finally, we highlight current challenges in deidentification, including the absence of clear annotation guidelines, lack of portability of models, and paucity of training data. Code to develop our model is open source, allowing for broad reuse.

12.

MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports.

Johnson, Alistair E W; Pollard, Tom J; Berkowitz, Seth J; Greenbaum, Nathaniel R; Lungren, Matthew P; Deng, Chih-Ying; Mark, Roger G; Horng, Steven.

Sci Data ; 6(1): 317, 2019 Dec 12.

Article in English | MEDLINE | ID: mdl-31831740

ABSTRACT

Chest radiography is an extremely powerful imaging modality, allowing for a detailed inspection of a patient's chest, but requires specialized training for proper interpretation. With the advent of high performance general purpose computer vision algorithms, the accurate automated analysis of chest radiographs is becoming increasingly of interest to researchers. Here we describe MIMIC-CXR, a large dataset of 227,835 imaging studies for 65,379 patients presenting to the Beth Israel Deaconess Medical Center Emergency Department between 2011-2016. Each imaging study can contain one or more images, usually a frontal view and a lateral view. A total of 377,110 images are available in the dataset. Studies are made available with a semi-structured free-text radiology report that describes the radiological findings of the images, written by a practicing radiologist contemporaneously during routine clinical care. All images and reports have been de-identified to protect patient privacy. The dataset is made freely available to facilitate and encourage a wide range of research in computer vision, natural language processing, and clinical data mining.

Subjects

Databases, Factual , Radiography, Thoracic , Algorithms , Data Mining , Humans , Image Interpretation, Computer-Assisted , Natural Language Processing

13.

Normalization of mechanical power to anthropometric indices: impact on its association with mortality in critically ill patients.

Serpa Neto, Ary; Deliberato, Rodrigo Octavio; Johnson, Alistair E W; Pollard, Tom J; Celi, Leo A; Pelosi, Paolo; Gama de Abreu, Marcelo; Schultz, Marcus J.

Intensive Care Med ; 45(12): 1835-1837, 2019 Dec.

Article in English | MEDLINE | ID: mdl-31595350

Subjects

Body Mass Index , Body Surface Area , Body Weight , Critical Illness/mortality , Critical Illness/therapy , Respiration, Artificial/mortality , Respiration, Artificial/methods , Cohort Studies , Hospital Mortality , Humans , Prospective Studies

14.

The PLOS ONE collection on machine learning in health and biomedicine: Towards open code and open data.

Celi, Leo A; Citi, Luca; Ghassemi, Marzyeh; Pollard, Tom J.

PLoS One ; 14(1): e0210232, 2019.

Article in English | MEDLINE | ID: mdl-30645625

ABSTRACT

Recent years have seen a surge of studies in machine learning in health and biomedicine, driven by digitalization of healthcare environments and increasingly accessible computer systems for conducting analyses. Many of us believe that these developments will lead to significant improvements in patient care. Like many academic disciplines, however, progress is hampered by lack of code and data sharing. In bringing together this PLOS ONE collection on machine learning in health and biomedicine, we sought to focus on the importance of reproducibility, making it a requirement, as far as possible, for authors to share data and code alongside their papers.

Subjects

Biomedical Research/trends , Delivery of Health Care/trends , Machine Learning/trends , Algorithms , Humans , Information Dissemination , Peer Review, Research/trends

15.

Turning the crank for machine learning: ease, at what expense?

Pollard, Tom J; Chen, Irene; Wiens, Jenna; Horng, Steven; Wong, Danny; Ghassemi, Marzyeh; Mattie, Heather; Lindemer, Emily; Panch, Trishan.

Lancet Digit Health ; 1(5): e198-e199, 2019 Sep.

Article in English | MEDLINE | ID: mdl-33323266

Subjects

Deep Learning , Arm , Exercise Test , Feasibility Studies , Humans

16.

Mechanical power of ventilation is associated with mortality in critically ill patients: an analysis of patients in two observational cohorts.

Serpa Neto, Ary; Deliberato, Rodrigo Octavio; Johnson, Alistair E W; Bos, Lieuwe D; Amorim, Pedro; Pereira, Silvio Moreto; Cazati, Denise Carnieli; Cordioli, Ricardo L; Correa, Thiago Domingos; Pollard, Tom J; Schettino, Guilherme P P; Timenetsky, Karina T; Celi, Leo A; Pelosi, Paolo; Gama de Abreu, Marcelo; Schultz, Marcus J.

Intensive Care Med ; 44(11): 1914-1922, 2018 Nov.

Article in English | MEDLINE | ID: mdl-30291378

ABSTRACT

PURPOSE: Mechanical power (MP) may unify variables known to be related to development of ventilator-induced lung injury. The aim of this study is to examine the association between MP and mortality in critically ill patients receiving invasive ventilation for at least 48 h. METHODS: This is an analysis of data stored in the databases of the MIMIC-III and eICU. Critically ill patients receiving invasive ventilation for at least 48 h were included. The exposure of interest was MP. The primary outcome was in-hospital mortality. RESULTS: Data from 8207 patients were analyzed. Median MP during the second 24 h was 21.4 (16.2-28.1) J/min in MIMIC-III and 16.0 (11.7-22.1) J/min in eICU. MP was independently associated with in-hospital mortality [odds ratio per 5 J/min increase (OR) 1.06 (95% confidence interval (CI) 1.01-1.11); p = 0.021 in MIMIC-III, and 1.10 (1.02-1.18); p = 0.010 in eICU]. MP was also associated with ICU mortality, 30-day mortality, and with ventilator-free days, ICU and hospital length of stay. Even at low tidal volume, high MP was associated with in-hospital mortality [OR 1.70 (1.32-2.18); p < 0.001] and other secondary outcomes. Finally, there is a consistent increase in the risk of death with MP higher than 17.0 J/min. CONCLUSION: High MP of ventilation is independently associated with higher in-hospital mortality and several other outcomes in ICU patients receiving invasive ventilation for at least 48 h.

Subjects

Critical Care , Critical Illness/mortality , Respiration, Artificial , Aged , Cohort Studies , Female , Hospital Mortality , Humans , Male , Middle Aged , Retrospective Studies , Time Factors , United States

17.

The eICU Collaborative Research Database, a freely available multi-center database for critical care research.

Pollard, Tom J; Johnson, Alistair E W; Raffa, Jesse D; Celi, Leo A; Mark, Roger G; Badawi, Omar.

Sci Data ; 5: 180178, 2018 Sep 11.

Article in English | MEDLINE | ID: mdl-30204154

ABSTRACT

Critical care patients are monitored closely through the course of their illness. As a result of this monitoring, large amounts of data are routinely collected for these patients. Philips Healthcare has developed a telehealth system, the eICU Program, which leverages these data to support management of critically ill patients. Here we describe the eICU Collaborative Research Database, a multi-center intensive care unit (ICU) database with high granularity data for over 200,000 admissions to ICUs monitored by eICU Programs across the United States. The database is deidentified, and includes vital sign measurements, care plan documentation, severity of illness measures, diagnosis information, treatment information, and more. Data are publicly available after registration, including completion of a training course in research with human subjects and signing of a data use agreement mandating responsible handling of the data and adhering to the principle of collaborative research. The freely available nature of the data will support a number of applications including the development of machine learning algorithms, decision support tools, and clinical research.

Subjects

Critical Care , Critical Illness/therapy , Databases, Factual , Humans , Intensive Care Units , Telemedicine , United States

18.

A Comparative Analysis of Sepsis Identification Methods in an Electronic Database.

Johnson, Alistair E W; Aboab, Jerome; Raffa, Jesse D; Pollard, Tom J; Deliberato, Rodrigo O; Celi, Leo A; Stone, David J.

Crit Care Med ; 46(4): 494-499, 2018 Apr.

Article in English | MEDLINE | ID: mdl-29303796

ABSTRACT

OBJECTIVES: To evaluate the relative validity of criteria for the identification of sepsis in an ICU database. DESIGN: Retrospective cohort study of adult ICU admissions from 2008 to 2012. SETTING: Tertiary teaching hospital in Boston, MA. PATIENTS: Initial admission of all adult patients to noncardiac surgical ICUs. INTERVENTIONS: Comparison of five different algorithms for retrospectively identifying sepsis, including the Sepsis-3 criteria. MEASUREMENTS AND MAIN RESULTS: 11,791 of 23,620 ICU admissions (49.9%) met criteria for the study. Within this subgroup, 59.9% were suspected of infection on ICU admission, 75.2% of admissions had Sequential Organ Failure Assessment greater than or equal to 2, and 49.1% had both suspicion of infection and Sequential Organ Failure Assessment greater than or equal to 2 thereby meeting the Sepsis-3 criteria. The area under the receiver operator characteristic of Sequential Organ Failure Assessment (0.74) for hospital mortality was consistent with previous studies of the Sepsis-3 criteria. The Centers for Disease Control and Prevention, Angus, Martin, Centers for Medicare & Medicaid Services, and explicit coding methods for identifying sepsis revealed respective sepsis incidences of 31.9%, 28.6%, 14.7%, 11.0%, and 9.0%. In-hospital mortality increased with decreasing cohort size, ranging from 30.1% (explicit codes) to 14.5% (Sepsis-3 criteria). Agreement among the criteria was acceptable (Cronbach's alpha, 0.40-0.62). CONCLUSIONS: The new organ dysfunction-based Sepsis-3 criteria have been proposed as a clinical method for identifying sepsis. These criteria identified a larger, less severely ill cohort than that identified by previously used administrative definitions. The Sepsis-3 criteria have several advantages over prior methods, including less susceptibility to coding practices changes, provision of temporal context, and possession of high construct validity. 
However, the Sepsis-3 criteria also present new challenges, especially when calculated retrospectively. Future studies on sepsis should recognize the differences in outcome incidence among identification methods and contextualize their findings according to the different cohorts identified.
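As a rough illustration of the retrospective Sepsis-3 flag described in this abstract (suspicion of infection together with a Sequential Organ Failure Assessment score of 2 or more), the following Python sketch computes the flag and its incidence over a cohort. The `Admission` record, its field names, and the toy cohort are hypothetical and do not reflect the authors' code or the database schema.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Admission:
    admission_id: int
    suspected_infection: bool  # suspicion of infection on ICU admission
    sofa: int                  # Sequential Organ Failure Assessment score


def meets_sepsis3(adm: Admission) -> bool:
    """Flag as stated in the abstract: suspected infection and SOFA >= 2."""
    return adm.suspected_infection and adm.sofa >= 2


def incidence(admissions: Sequence[Admission]) -> float:
    """Fraction of admissions in the cohort that meet the flag."""
    return sum(meets_sepsis3(a) for a in admissions) / len(admissions)


# Hypothetical toy cohort, not taken from the study.
cohort = [
    Admission(1, suspected_infection=True, sofa=3),
    Admission(2, suspected_infection=True, sofa=1),
    Admission(3, suspected_infection=False, sofa=5),
    Admission(4, suspected_infection=True, sofa=2),
]
print(incidence(cohort))
```

Swapping `meets_sepsis3` for a different definition (for example, one based on administrative codes) changes both the size and the severity of the identified cohort, which is the comparison the abstract reports.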

Subjects

Databases, Factual/statistics & numerical data , Intensive Care Units/statistics & numerical data , Sepsis/diagnosis , Severity of Illness Index , Age Factors , Aged , Aged, 80 and over , Algorithms , Boston/epidemiology , Clinical Coding , Female , Hospital Mortality , Hospitals, Teaching/statistics & numerical data , Humans , Length of Stay , Male , Middle Aged , Organ Dysfunction Scores , ROC Curve , Retrospective Studies , Sepsis/mortality , Sex Factors , Socioeconomic Factors , Tertiary Care Centers/statistics & numerical data

19.

tableone: An open source Python package for producing summary statistics for research papers.

Pollard, Tom J; Johnson, Alistair E W; Raffa, Jesse D; Mark, Roger G.

JAMIA Open ; 1(1): 26-31, 2018 Jul.

Article in English | MEDLINE | ID: mdl-31984317

ABSTRACT

OBJECTIVES: In quantitative research, understanding basic parameters of the study population is key for interpretation of the results. As a result, it is typical for the first table ("Table 1") of a research paper to include summary statistics for the study data. Our objectives are 2-fold. First, we seek to provide a simple, reproducible method for providing summary statistics for research papers in the Python programming language. Second, we seek to use the package to improve the quality of summary statistics reported in research papers. MATERIALS AND METHODS: The tableone package is developed following good practice guidelines for scientific computing and all code is made available under a permissive MIT License. A testing framework runs on a continuous integration server, helping to maintain code stability. Issues are tracked openly and public contributions are encouraged. RESULTS: The tableone software package automatically compiles summary statistics into publishable formats such as CSV, HTML, and LaTeX. An executable Jupyter Notebook demonstrates application of the package to a subset of data from the MIMIC-III database. Tests such as Tukey's rule for outlier detection and Hartigan's Dip Test for modality are computed to highlight potential issues in summarizing the data. DISCUSSION AND CONCLUSION: We present open source software for researchers to facilitate carrying out reproducible studies in Python, an increasingly popular language in scientific research. The toolkit is intended to mature over time with community feedback and input. Development of a common tool for summarizing data may help to promote good practice when used as a supplement to existing guidelines and recommendations. We encourage use of tableone alongside other methods of descriptive statistics and, in particular, visualization to ensure appropriate data handling. 
We also suggest seeking guidance from a statistician when using tableone for a research study, especially prior to submitting the study for publication.
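Tukey's rule, mentioned in this abstract as an outlier check, can be sketched in a few lines of standard-library Python. This is an independent illustration, not the tableone implementation: the function name, the choice of the "inclusive" quartile method, and the sample values are assumptions.

```python
import statistics
from typing import List, Sequence


def tukey_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], in input order."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one implausible entry.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
flagged = tukey_outliers(sample)
print(flagged)
```

Flagged values are candidates for review rather than automatic exclusion, in line with the abstract's advice to combine summary statistics with visualization and statistical guidance.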

20.

The MIMIC Code Repository: enabling reproducibility in critical care research.

Johnson, Alistair E W; Stone, David J; Celi, Leo A; Pollard, Tom J.

J Am Med Inform Assoc ; 25(1): 32-39, 2018 Jan 01.

Article in English | MEDLINE | ID: mdl-29036464

ABSTRACT

Objective: Lack of reproducibility in medical studies is a barrier to the generation of a robust knowledge base to support clinical decision-making. In this paper we outline the Medical Information Mart for Intensive Care (MIMIC) Code Repository, a centralized code base for generating reproducible studies on an openly available critical care dataset. Materials and Methods: Code is provided to load the data into a relational structure, create extractions of the data, and reproduce entire analysis plans including research studies. Results: Concepts extracted include severity of illness scores, comorbid status, administrative definitions of sepsis, physiologic criteria for sepsis, organ failure scores, treatment administration, and more. Executable documents are used for tutorials and reproduce published studies end-to-end, providing a template for future researchers to replicate. The repository's issue tracker enables community discussion about the data and concepts, allowing users to collaboratively improve the resource. Discussion: The centralized repository provides a platform for users of the data to interact directly with the data generators, facilitating greater understanding of the data. It also provides a location for the community to collaborate on necessary concepts for research progress and share them with a larger audience. Consistent application of the same code for underlying concepts is a key step in ensuring that research studies on the MIMIC database are comparable and reproducible. Conclusion: By providing open source code alongside the freely accessible MIMIC-III database, we enable end-to-end reproducible analysis of electronic health records.

Subjects

Biomedical Research , Critical Care , Databases, Factual , Access to Information , Comorbidity , Data Mining , Electronic Health Records , Humans , Organ Dysfunction Scores , ROC Curve , Reproducibility of Results , Sepsis , Severity of Illness Index , Therapeutics
