Results 1 - 16 of 16
1.
Pediatr Cardiol ; 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38953953

ABSTRACT

Secundum atrial septal defect (ASD2) detection is often delayed, with the potential for late-diagnosis complications. Recent work demonstrated that artificial intelligence-enhanced ECG analysis shows promise for detecting ASD2 in adults. However, its application to pediatric populations remains underexplored. In this study, we trained a convolutional neural network (AI-pECG) on paired ECG-echocardiograms (≤ 2 days apart) to detect ASD2 in patients ≤ 18 years old without major congenital heart disease. Model performance was evaluated on the first ECG-echocardiogram pair per patient for Boston Children's Hospital internal testing and emergency department cohorts using area under the receiver operating characteristic (AUROC) and precision-recall (AUPRC) curves. The training cohort comprised 92,377 ECG-echocardiogram pairs (46,261 patients; median age 8.2 years) with an ASD2 prevalence of 6.7%. Test groups included internal testing (12,631 patients; median age 7.4 years; 6.9% prevalence) and emergency department (2,830 patients; median age 7.5 years; 4.9% prevalence) cohorts. Model performance was higher in the internal test cohort (AUROC 0.84, AUPRC 0.46) than in the emergency department cohort (AUROC 0.80, AUPRC 0.30). In both cohorts, AI-pECG outperformed the ECG finding of incomplete right bundle branch block. Model explainability analyses suggest that high-risk features include greater-amplitude P waves (suggestive of right atrial enlargement) and an RSR' pattern in V1 (suggestive of right bundle branch block). Our findings demonstrate the promise of AI-pECG for inexpensive screening and detection of ASD2 in pediatric patients. Future multicenter validation and prospective trials to inform clinical decision-making are warranted.
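As a toy illustration of the AUROC metric this abstract reports, the following dependency-free sketch computes it as the probability that a randomly chosen positive is ranked above a randomly chosen negative; the labels and scores are made up, not study data:

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive receives the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores for four patients (1 = ASD2 present).
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auroc(labels, scores))  # 0.75: three of four positive/negative pairs ranked correctly
```

This pairwise formulation is equivalent to integrating the ROC curve, which is why AUROC is insensitive to the choice of a single decision threshold.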

2.
Am J Obstet Gynecol ; 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38663662

ABSTRACT

BACKGROUND: Electronic fetal monitoring is used in most US hospital births but has significant limitations in achieving its intended goal of preventing intrapartum hypoxic-ischemic injury. Novel deep learning techniques can improve complex data processing and pattern recognition in medicine. OBJECTIVE: This study aimed to apply deep learning approaches to develop and validate a model to predict fetal acidemia from electronic fetal monitoring data. STUDY DESIGN: The database was created using intrapartum electronic fetal monitoring data from 2006 to 2020 from a large, multisite academic health system. Data were divided into training and testing sets with equal distribution of acidemic cases. Several different deep learning architectures were explored. The primary outcome was umbilical artery acidemia, which was investigated at 4 clinically meaningful pH thresholds: 7.20, 7.15, 7.10, and 7.05, along with base excess. Receiver operating characteristic curves were generated, and the area under the receiver operating characteristic curve (AUROC) was used to assess model performance. External validation was performed using a publicly available Czech database of electronic fetal monitoring data. RESULTS: A total of 124,777 electronic fetal monitoring files were available, of which 77,132 had <30% missingness in the last 60 minutes of the tracing. Of these, 21,041 were matched to a corresponding umbilical cord gas result, of which 10,182 were time-stamped within 30 minutes of the last electronic fetal monitoring reading and composed the final dataset. The prevalence rates of the outcomes were 20.9% for pH <7.20, 9.1% for pH <7.15, 3.3% for pH <7.10, and 1.3% for pH <7.05. The best-performing model achieved an AUROC of 0.85 at a pH threshold of <7.05. When predicting the joint outcome of pH <7.05 and base excess less than -10 meq/L, an AUROC of 0.89 was achieved; for pH <7.20 and base excess less than -10 meq/L, the AUROC was 0.87. At a pH threshold of <7.15 and a positive predictive value of 30%, the model achieved a sensitivity of 90% and a specificity of 48%. CONCLUSION: The application of deep learning methods to intrapartum electronic fetal monitoring analysis achieves promising performance in predicting fetal acidemia. This technology could help improve the accuracy and consistency of electronic fetal monitoring interpretation.
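The operating-point numbers quoted in the abstract (sensitivity, specificity, positive predictive value) all come from a single confusion matrix; a minimal sketch with made-up counts, not the study's data:

```python
def operating_point(labels, predictions):
    """Return (sensitivity, specificity, PPV) for binary labels/predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    return tp / (tp + fn), tn / (tn + fp), tp / (tp + fp)

# Hypothetical: 10 tracings, 4 truly acidemic, 5 flagged by the model.
labels      = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
sens, spec, ppv = operating_point(labels, predictions)
print(round(sens, 2), round(spec, 2), round(ppv, 2))  # 0.75 0.67 0.6
```

Reporting sensitivity and specificity at a fixed PPV, as the study does, amounts to sliding the decision threshold until the confusion matrix yields the target PPV.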

3.
Circulation ; 149(12): 917-931, 2024 03 19.
Article in English | MEDLINE | ID: mdl-38314583

ABSTRACT

BACKGROUND: Artificial intelligence-enhanced ECG analysis shows promise to detect ventricular dysfunction and remodeling in adult populations. However, its application to pediatric populations remains underexplored. METHODS: A convolutional neural network was trained on paired ECG-echocardiograms (≤2 days apart) from patients ≤18 years of age without major congenital heart disease to detect human expert-classified greater than mild left ventricular (LV) dysfunction, hypertrophy, and dilation (individually and as a composite outcome). Model performance was evaluated on single ECG-echocardiogram pairs per patient at Boston Children's Hospital and externally at Mount Sinai Hospital using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC). RESULTS: The training cohort comprised 92 377 ECG-echocardiogram pairs (46 261 patients; median age, 8.2 years). Test groups included internal testing (12 631 patients; median age, 8.8 years; 4.6% composite outcomes), emergency department (2830 patients; median age, 7.7 years; 10.0% composite outcomes), and external validation (5088 patients; median age, 4.3 years; 6.1% composite outcomes) cohorts. Model performance was similar on internal test and emergency department cohorts, with model predictions of LV hypertrophy outperforming the pediatric cardiologist expert benchmark. Adding age and sex to the model added no benefit to model performance. 
When using quantitative outcome cutoffs, model performance was similar between internal testing (composite outcome: AUROC, 0.88, AUPRC, 0.43; LV dysfunction: AUROC, 0.92, AUPRC, 0.23; LV hypertrophy: AUROC, 0.88, AUPRC, 0.28; LV dilation: AUROC, 0.91, AUPRC, 0.47) and external validation (composite outcome: AUROC, 0.86, AUPRC, 0.39; LV dysfunction: AUROC, 0.94, AUPRC, 0.32; LV hypertrophy: AUROC, 0.84, AUPRC, 0.25; LV dilation: AUROC, 0.87, AUPRC, 0.33), with composite outcome negative predictive values of 99.0% and 99.2%, respectively. Saliency mapping highlighted ECG components that influenced model predictions (precordial QRS complexes for all outcomes; T waves for LV dysfunction). High-risk ECG features include lateral T-wave inversion (LV dysfunction), deep S waves in V1 and V2 and tall R waves in V6 (LV hypertrophy), and tall R waves in V4 through V6 (LV dilation). CONCLUSIONS: This externally validated algorithm shows promise to inexpensively screen for LV dysfunction and remodeling in children, which may facilitate improved access to care by democratizing the expertise of pediatric cardiologists.


Subjects
Deep Learning , Ventricular Dysfunction, Left , Adult , Humans , Child , Child, Preschool , Electrocardiography , Artificial Intelligence , Ventricular Dysfunction, Left/diagnostic imaging , Hypertrophy, Left Ventricular/diagnostic imaging
4.
Health Equity ; 7(1): 803-808, 2023.
Article in English | MEDLINE | ID: mdl-38076214

ABSTRACT

Introduction: Despite their dynamic, socially constructed, and imprecise nature, both race and gender are included in common risk calculators used for clinical decision-making about statin therapy for atherosclerotic cardiovascular disease (ASCVD) prevention. Methods and Materials: We assessed the effect of manipulating six different race-gender categories on ASCVD risk scores among 90 Black transgender women. Results: Risk scores varied by operationalization of race and gender and affected the proportion for whom statins were recommended. Discussion: Race and gender are social constructs underpinning racialized and gendered health inequities. Their rote use in ASCVD risk calculators may reinforce and perpetuate existing inequities.

5.
Nat Mach Intell ; 5(5): 476-479, 2023 May.
Article in English | MEDLINE | ID: mdl-37600144

ABSTRACT

Fairness approaches in machine learning should involve more than assessment of performance metrics across groups. Shifting the focus away from model metrics, we reframe fairness through the lens of intersectionality, a Black feminist theoretical framework that contextualizes individuals in interacting systems of power and oppression.

6.
Proc Mach Learn Res ; 209: 350-378, 2023.
Article in English | MEDLINE | ID: mdl-37576024

ABSTRACT

Fair calibration is a widely desirable fairness criterion in risk prediction contexts. One way to measure and achieve fair calibration is with multicalibration, which constrains calibration error among flexibly defined subpopulations while maintaining overall calibration. However, multicalibrated models can exhibit a higher percent calibration error among groups with lower base rates than among groups with higher base rates. As a result, a decision-maker may learn to trust or distrust model predictions for specific groups. To alleviate this, we propose proportional multicalibration (PMC), a criterion that constrains the percent calibration error among groups and within prediction bins. We prove that satisfying proportional multicalibration bounds a model's multicalibration as well as its differential calibration, a fairness criterion that directly measures how closely a model approximates sufficiency. Therefore, proportionally calibrated models limit the ability of decision-makers to distinguish between model performance on different patient groups, which may make the models more trustworthy in practice. We provide an efficient algorithm for post-processing risk prediction models for proportional multicalibration and evaluate it empirically. We conduct simulation studies and investigate a real-world application of PMC post-processing to the prediction of emergency department patient admissions. We observe that proportional multicalibration is a promising criterion for controlling simultaneous measures of calibration fairness of a model over intersectional groups with virtually no cost in terms of classification performance.
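A loose, dependency-free sketch of the quantity being constrained: calibration error within a prediction bin, optionally divided by the observed positive rate to give the "percent" (proportional) version. The data, bin, and groups are hypothetical, and this simplifies the paper's formal criteria:

```python
def binned_calibration_error(probs, outcomes, lo=0.0, hi=1.0, proportional=False):
    """Mean |predicted - observed| for samples whose prediction falls in [lo, hi)."""
    in_bin = [(p, y) for p, y in zip(probs, outcomes) if lo <= p < hi]
    if not in_bin:
        return 0.0
    avg_p = sum(p for p, _ in in_bin) / len(in_bin)
    avg_y = sum(y for _, y in in_bin) / len(in_bin)
    err = abs(avg_p - avg_y)
    return err / avg_y if proportional and avg_y > 0 else err

# Two hypothetical groups with identical absolute miscalibration (0.1)...
group_a = ([0.1] * 10, [1, 1, 0, 0, 0, 0, 0, 0, 0, 0])   # base rate 0.2, predicted 0.1
group_b = ([0.5] * 10, [1, 1, 1, 1, 1, 1, 0, 0, 0, 0])   # base rate 0.6, predicted 0.5
for name, (p, y) in [("A", group_a), ("B", group_b)]:
    print(name,
          round(binned_calibration_error(p, y), 3),
          round(binned_calibration_error(p, y, proportional=True), 3))
# ...but the lower-base-rate group shows a much larger percent calibration error.
```

This is the asymmetry the abstract describes: equal absolute calibration error can still leave low-base-rate groups with proportionally less reliable predictions.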

7.
NPJ Digit Med ; 6(1): 107, 2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37277550

ABSTRACT

Machine learning (ML) models trained for triggering clinical decision support (CDS) are typically either accurate or interpretable but not both. Scaling CDS to the panoply of clinical use cases while mitigating risks to patients will require that many ML models be intuitively interpretable for clinicians. To this end, we adapted a symbolic regression method, coined the feature engineering automation tool (FEAT), to train concise and accurate models from high-dimensional electronic health record (EHR) data. We first present an in-depth application of FEAT to classify hypertension, hypertension with unexplained hypokalemia, and apparent treatment-resistant hypertension (aTRH) using EHR data for 1200 subjects receiving longitudinal care in a large healthcare system. FEAT models trained to predict phenotypes adjudicated by chart review had equivalent or higher discriminative performance (p < 0.001) and were at least three times smaller (p < 1 × 10⁻⁶) than other potentially interpretable models. For aTRH, FEAT generated a six-feature, highly discriminative (positive predictive value = 0.70, sensitivity = 0.62), and clinically intuitive model. To assess the generalizability of the approach, we tested FEAT on 25 benchmark clinical phenotyping tasks using the MIMIC-III critical care database. Under comparable dimensionality constraints, FEAT's models exhibited higher area under the receiver operating characteristic curve scores than penalized linear models across tasks (p < 6 × 10⁻⁶). In summary, FEAT can train EHR prediction models that are both intuitively interpretable and accurate, which should facilitate safe and effective scaling of ML-triggered CDS to the panoply of potential clinical use cases and healthcare practices.

8.
J Biomed Inform ; 139: 104306, 2023 03.
Article in English | MEDLINE | ID: mdl-36738870

ABSTRACT

BACKGROUND: In electronic health records, patterns of missing laboratory test results can capture patients' course of disease as well as reflect clinicians' concerns about possible conditions. These patterns are understudied and often overlooked. This study aims to identify informative patterns of missingness among laboratory data collected across 15 healthcare system sites in three countries for COVID-19 inpatients. METHODS: We collected and analyzed demographic, diagnosis, and laboratory data for 69,939 patients with positive COVID-19 PCR tests across three countries from 1 January 2020 through 30 September 2021. We analyzed missing laboratory measurements across sites; missingness stratified by demographic variables; temporal trends of missingness; correlations between labs based on missingness indicators over time; and clustering of groups of labs based on their missingness/ordering patterns. RESULTS: With these analyses, we identified mapping issues in seven of the 15 sites, as well as nuances in data collection and variable definition across sites. Temporal trend analyses may support the use of laboratory test result missingness patterns in identifying severe COVID-19 patients. Lastly, using missingness patterns, we determined relationships between various labs that reflect clinical behaviors. CONCLUSION: In this work, we use computational approaches to relate missingness patterns to hospital treatment capacity and highlight the heterogeneity of COVID-19 data over time and across sites, which may differ in pandemic phase, policy, and practice. Changes in missingness could suggest a change in a patient's condition, and patterns of missingness among laboratory measurements could help identify clinical outcomes. This allows sites to treat missing data as informative in analyses and helps researchers identify which sites are better poised to study particular questions.
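The core computation (missingness indicators and their pairwise correlation) fits in a few lines; the records and lab names below are toys invented for illustration, not data from the study:

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy inpatient records; None marks a lab that was never ordered/resulted.
records = [
    {"crp": 1.2,  "ferritin": 300,  "troponin": None},
    {"crp": None, "ferritin": None, "troponin": 0.01},
    {"crp": 5.0,  "ferritin": 800,  "troponin": None},
    {"crp": None, "ferritin": None, "troponin": 0.02},
]
# 1 = missing. Highly correlated indicators suggest labs ordered (or skipped) together.
miss = {lab: [int(r[lab] is None) for r in records] for lab in records[0]}
print(pearson(miss["crp"], miss["ferritin"]))   # 1.0: the two are always co-missing
```

Treating the indicator matrix, rather than the lab values, as the object of analysis is what makes missingness itself "informative" in the sense the abstract describes.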


Subjects
COVID-19 , Electronic Health Records , Humans , Data Collection , Records , Cluster Analysis
9.
Bioinformatics ; 38(3): 878-880, 2022 01 12.
Article in English | MEDLINE | ID: mdl-34677586

ABSTRACT

MOTIVATION: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows. RESULTS: This release of PMLB (Penn Machine Learning Benchmarks) provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community. AVAILABILITY AND IMPLEMENTATION: PMLB is available at https://github.com/EpistasisLab/pmlb. Python and R interfaces for PMLB can be installed through the Python Package Index and Comprehensive R Archive Network, respectively.


Subjects
Benchmarking , Software , Machine Learning , Models, Statistical
10.
Bioinformatics ; 37(2): 250-256, 2021 04 19.
Article in English | MEDLINE | ID: mdl-32766825

ABSTRACT

MOTIVATION: Many researchers with domain expertise are unable to easily apply machine learning (ML) to their bioinformatics data due to a lack of ML and/or coding expertise. Methods proposed thus far to automate ML mostly require programming experience as well as expert knowledge to tune and apply the algorithms correctly. Here, we study a method of automating biomedical data science using a web-based AI platform to recommend model choices and conduct experiments. We have two goals in mind: first, to make it easy to construct sophisticated models of biomedical processes; and second, to provide a fully automated AI agent that can choose and conduct promising experiments for the user, based on the user's experiments as well as prior knowledge. To validate this framework, we conduct an experiment on 165 classification problems, comparing our platform to state-of-the-art automated approaches. Finally, we use this tool to develop predictive models of septic shock in critical care patients. RESULTS: We find that matrix factorization-based recommendation systems outperform metalearning methods for automating ML. This result mirrors earlier recommender systems research in other domains. The proposed AI is competitive with state-of-the-art automated ML methods in terms of choosing optimal algorithm configurations for datasets. In our application to prediction of septic shock, the AI-driven analysis produces a competent ML model (AUROC 0.85 ± 0.02) that performs on par with state-of-the-art deep learning results for this task, with much less computational effort. AVAILABILITY AND IMPLEMENTATION: PennAI is available free of charge and open-source. It is distributed under the GNU General Public License (GPL), version 3. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
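The matrix-factorization recommender at the heart of this result can be sketched with plain gradient descent on a toy algorithm-by-dataset score matrix; the numbers are hypothetical and this is a simplified illustration, not the PennAI implementation:

```python
import random

def factorize(scores, rank=1, steps=2000, lr=0.05):
    """Factor a partially observed score matrix into low-rank row/column vectors.
    Missing entries (None) are skipped during training and predicted afterwards."""
    rng = random.Random(0)
    n_rows, n_cols = len(scores), len(scores[0])
    u = [[rng.uniform(0.1, 0.9) for _ in range(rank)] for _ in range(n_rows)]
    v = [[rng.uniform(0.1, 0.9) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(steps):
        for i in range(n_rows):
            for j in range(n_cols):
                if scores[i][j] is None:
                    continue
                pred = sum(u[i][k] * v[j][k] for k in range(rank))
                err = scores[i][j] - pred
                for k in range(rank):
                    u[i][k], v[j][k] = (u[i][k] + lr * err * v[j][k],
                                        v[j][k] + lr * err * u[i][k])
    return u, v

# Rows: ML algorithms; columns: datasets; entries: observed AUROC (None = not yet run).
scores = [
    [0.90, 0.80, None],
    [0.45, 0.40, 0.35],
]
u, v = factorize(scores)
missing_pred = sum(u[0][k] * v[2][k] for k in range(len(u[0])))
print(round(missing_pred, 2))   # the recommender's guess for algorithm 0 on dataset 2
```

The learned factors fill in unobserved algorithm-dataset cells, which is exactly the signal a recommender needs to propose the next promising experiment.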


Subjects
Algorithms , Machine Learning , Humans , Informatics
11.
Adv Neural Inf Process Syst ; 2021(DB1): 1-16, 2021 Dec.
Article in English | MEDLINE | ID: mdl-38715933

ABSTRACT

Many promising approaches to symbolic regression have been presented in recent years, yet progress in the field continues to suffer from a lack of uniform, robust, and transparent benchmarking standards. We address this shortcoming by introducing an open-source, reproducible benchmarking platform for symbolic regression. We assess 14 symbolic regression methods and 7 machine learning methods on a set of 252 diverse regression problems. Our assessment includes both real-world datasets with no known model form as well as ground-truth benchmark problems. For the real-world datasets, we benchmark the ability of each method to learn models with low error and low complexity relative to state-of-the-art machine learning methods. For the synthetic problems, we assess each method's ability to find exact solutions in the presence of varying levels of noise. Under these controlled experiments, we conclude that the best performing methods for real-world regression combine genetic algorithms with parameter estimation and/or semantic search drivers. When tasked with recovering exact equations in the presence of noise, we find that several approaches perform similarly. We provide a detailed guide to reproducing this experiment and contributing new methods, and encourage other researchers to collaborate with us on a common and living symbolic regression benchmark.
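For flavor, here is a deliberately tiny symbolic-regression baseline: random search over small expression trees, scored by mean squared error against a known ground-truth function. It is an illustrative toy, not one of the 14 benchmarked methods:

```python
import random

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_expr(rng, depth=2):
    """A random expression tree over the variable x and small integer constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.6 else rng.randint(-3, 3)
    op = rng.choice(list(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if expr == "x":
        return x
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, x), evaluate(right, x))

def mse(expr, xs, ys):
    return sum((evaluate(expr, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(42)
xs = [i / 2 for i in range(-6, 7)]
ys = [2 * x + 1 for x in xs]          # ground truth: 2x + 1, no noise
best = min((random_expr(rng) for _ in range(5000)), key=lambda e: mse(e, xs, ys))
print(mse(best, xs, ys))              # error of the best expression found
```

The methods in the benchmark replace this blind search with genetic operators, parameter estimation, and semantic guidance, but the objective (low error, low expression complexity) is the same.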

12.
Genet Program Evolvable Mach ; 21(3): 433-467, 2020 Sep.
Article in English | MEDLINE | ID: mdl-33343224

ABSTRACT

Genetic programming has found recent success as a tool for learning sets of features for regression and classification. Multidimensional genetic programming is a useful variant of genetic programming for this task because it represents candidate solutions as sets of programs. These sets of programs expose additional information that can be exploited for building block identification. In this work, we discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. We investigate methods for biasing the components of programs that are promoted in order to guide search towards useful and complementary feature spaces. We study two main approaches: 1) the introduction of new objectives and 2) the use of specialized semantic variation operators. We find that a semantic crossover operator based on stagewise regression leads to significant improvements on a set of regression problems. The inclusion of semantic crossover produces state-of-the-art results in a large benchmark study of open-source regression problems in comparison to several state-of-the-art machine learning approaches and other genetic programming frameworks. Finally, we look at the collinearity and complexity of the data representations produced by different methods, in order to assess whether relevant, concise, and independent factors of variation can be produced in application.

13.
Evol Comput ; 27(3): 377-402, 2019.
Article in English | MEDLINE | ID: mdl-29746157

ABSTRACT

Lexicase selection is a parent selection method that considers training cases individually, rather than in aggregate, when performing parent selection. Whereas previous work has demonstrated the ability of lexicase selection to solve difficult problems in program synthesis and symbolic regression, the central goal of this article is to develop the theoretical underpinnings that explain its performance. To this end, we derive an analytical formula that gives the expected probabilities of selection under lexicase selection, given a population and its behavior. In addition, we expand upon the relation of lexicase selection to many-objective optimization methods to describe the behavior of lexicase selection, which is to select individuals on the boundaries of Pareto fronts in high-dimensional space. We show analytically why lexicase selection performs more poorly for certain sizes of population and training cases, and show why it has been shown to perform more poorly in continuous error spaces. To address this last concern, we propose new variants of ε-lexicase selection, a method that modifies the pass condition in lexicase selection to allow near-elite individuals to pass cases, thereby improving selection performance with continuous errors. We show that ε-lexicase outperforms several diversity-maintenance strategies on a number of real-world and synthetic regression problems.
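The procedure analyzed here is short enough to state directly; a minimal sketch of lexicase selection over a toy population's per-case error vectors, with the ε pass-condition exposed as an optional parameter:

```python
import random

def lexicase_select(errors, rng, epsilon=0.0):
    """errors[i][j]: error of individual i on training case j. Cases are
    considered one at a time in random order; only individuals within
    epsilon of the best error on the current case survive to the next."""
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)
    for c in cases:
        best = min(errors[i][c] for i in candidates)
        candidates = [i for i in candidates if errors[i][c] <= best + epsilon]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)

# Toy population: individual 2 is elite on every case, so it survives any case ordering.
errors = [
    [3.0, 1.0, 2.0],
    [1.0, 3.0, 1.0],
    [0.5, 0.5, 0.5],
]
print(lexicase_select(errors, random.Random(0)))   # 2
```

With epsilon = 0 this is standard lexicase selection; a positive epsilon lets near-elite individuals pass each case, which is the relaxation the article proposes for continuous error spaces.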


Subjects
Computational Biology/methods , Linguistics/statistics & numerical data , Models, Statistical , Algorithms , Humans , Regression Analysis , Search Engine/statistics & numerical data , Semantics
14.
Genet Evol Comput Conf ; 2019: 1056-1064, 2019 Jul.
Article in English | MEDLINE | ID: mdl-35849643

ABSTRACT

Multidimensional genetic programming represents candidate solutions as sets of programs, and thereby provides an interesting framework for exploiting building block identification. Towards this goal, we investigate the use of machine learning as a way to bias which components of programs are promoted, and propose two semantic operators to choose where useful building blocks are placed during crossover. A forward stagewise crossover operator we propose leads to significant improvements on a set of regression problems, and produces state-of-the-art results in a large benchmark study. We discuss this architecture and others in terms of their propensity for allowing heuristic search to utilize information during the evolutionary process. Finally, we look at the collinearity and complexity of the data representations that result from these architectures, with a view towards disentangling factors of variation in application.

15.
J Biomed Inform ; 85: 189-203, 2018 09.
Article in English | MEDLINE | ID: mdl-30031057

ABSTRACT

Feature selection plays a critical role in biomedical data mining, driven by increasing feature dimensionality in target problems and growing interest in advanced but computationally expensive methodologies able to model complex associations. Specifically, there is a need for feature selection methods that are computationally efficient, yet sensitive to complex patterns of association, e.g. interactions, so that informative features are not mistakenly eliminated prior to downstream modeling. This paper focuses on Relief-based algorithms (RBAs), a unique family of filter-style feature selection algorithms that have gained appeal by striking an effective balance between these objectives while flexibly adapting to various data characteristics, e.g. classification vs. regression. First, this work broadly examines types of feature selection and defines RBAs within that context. Next, we introduce the original Relief algorithm and associated concepts, emphasizing the intuition behind how it works, how feature weights generated by the algorithm can be interpreted, and why it is sensitive to feature interactions without evaluating combinations of features. Lastly, we include an expansive review of RBA methodological research beyond Relief and its popular descendant, ReliefF. In particular, we characterize branches of RBA research, and provide comparative summaries of RBA algorithms including contributions, strategies, functionality, time complexity, adaptation to key data characteristics, and software availability.
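The intuition described here (reward features that differ at an instance's nearest miss, penalize features that differ at its nearest hit) can be made concrete with a minimal sketch of the original Relief algorithm for binary classification; the data are toys, and ReliefF's k-neighbor averaging is omitted:

```python
def relief(X, y):
    """Original Relief feature weighting for binary classification with
    continuous features, iterating once over every instance."""
    n, d = len(X), len(X[0])
    # Normalize feature differences by each feature's observed range.
    ranges = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    def dist(a, b):
        return sum(abs(a[j] - b[j]) / ranges[j] for j in range(d))
    weights = [0.0] * d
    for i, target in enumerate(X):
        hits = [X[k] for k in range(n) if y[k] == y[i] and k != i]
        misses = [X[k] for k in range(n) if y[k] != y[i]]
        near_hit = min(hits, key=lambda other: dist(target, other))
        near_miss = min(misses, key=lambda other: dist(target, other))
        for j in range(d):
            weights[j] += (abs(target[j] - near_miss[j])
                           - abs(target[j] - near_hit[j])) / ranges[j] / n
    return weights

# Feature 0 separates the two classes; feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [1.0, 0.8], [0.9, 0.2], [1.1, 0.5]]
y = [0, 0, 0, 1, 1, 1]
w = relief(X, y)
print(w[0] > w[1])   # True: the informative feature receives the larger weight
```

Because weights come from neighborhoods rather than single-feature statistics, interacting features can score well without enumerating feature combinations, which is the sensitivity the review emphasizes.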


Subjects
Algorithms , Computational Biology/methods , Data Mining/methods , Humans , Models, Statistical , Regression Analysis , Software
16.
BioData Min ; 10: 36, 2017.
Article in English | MEDLINE | ID: mdl-29238404

ABSTRACT

BACKGROUND: The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. RESULTS: The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. From this study, we find that existing benchmarks lack the diversity to properly benchmark machine learning algorithms, and there are several gaps in benchmarking problems that still need to be considered. CONCLUSIONS: This work represents another important step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.
