1.
FAccT 23 (2023) ; 2023: 1599-1608, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37990734

ABSTRACT

Developing AI tools that preserve fairness is of critical importance, especially in high-stakes applications such as those in healthcare. However, health AI models' overall prediction performance is often prioritized over the possible biases such models could have. In this study, we show one possible approach to mitigate bias concerns by having healthcare institutions collaborate through a federated learning (FL) paradigm, which is a popular choice in healthcare settings. While FL methods with an emphasis on fairness have been previously proposed, their underlying model and local implementation techniques, as well as their possible applications to the healthcare domain, remain largely underinvestigated. Therefore, we propose a comprehensive FL approach with adversarial debiasing and a fair aggregation method, suitable to various fairness metrics, in the healthcare domain where electronic health records are used. Not only does our approach explicitly mitigate bias as part of the optimization process, but an FL-based paradigm would also implicitly help with addressing data imbalance and increasing the data size, offering a practical solution for healthcare applications. We empirically demonstrate our method's superior performance on multiple experiments simulating large-scale real-world scenarios and compare it to several baselines. Our method has achieved promising fairness performance with the lowest impact on overall discrimination performance (accuracy). Our code is available at https://github.com/healthylaife/FairFedAvg.
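
To make the aggregation step in this abstract concrete, here is a minimal sketch of plain federated averaging, in which a server combines client parameters weighted by local sample size. It is not the paper's fair aggregation method, which is not described in enough detail here to reproduce. The function name, the parameter layout, and the two-hospital numbers are all hypothetical.

```python
from typing import Dict, List


def federated_average(client_weights: List[Dict[str, List[float]]],
                      client_sizes: List[int]) -> Dict[str, List[float]]:
    """Sample-size-weighted average of client parameters (plain FedAvg).

    Assumption: every client reports the same parameter names and shapes.
    """
    total = sum(client_sizes)
    averaged: Dict[str, List[float]] = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n / total
                for w, n in zip(client_weights, client_sizes))
            for i in range(length)
        ]
    return averaged


# Hypothetical parameters from two institutions with 100 and 300 records.
clients = [{"layer": [1.0, 2.0]}, {"layer": [3.0, 4.0]}]
sizes = [100, 300]
global_model = federated_average(clients, sizes)
```

A fairness-aware variant would presumably replace the sample-size weights with weights that also depend on a fairness metric measured at each client; the repository linked in the abstract is the authoritative source for how the authors do this.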

2.
medRxiv ; 2023 Jun 12.
Article in English | MEDLINE | ID: mdl-37398451

ABSTRACT

Background: Understanding social determinants of health (SDOH) that may be risk factors for childhood obesity is important to developing targeted interventions to prevent obesity. Prior studies have examined these risk factors, mostly examining obesity as a static outcome variable. Objectives: This study aimed to identify distinct subpopulations based on BMI percentile classification or changes in BMI percentile classifications over time and explore these longitudinal associations with neighborhood-level SDOH factors in children from 0 to 7 years of age. Methods: Using Latent Class Growth (Mixture) Modelling (LCGMM) we identify distinct BMI% classification groups in children from 0 to 7 years of age. We used multinomial logistic regression to study associations between SDOH factors with each BMI% classification group. Results: From the study cohort of 36,910 children, five distinct BMI% classification groups emerged: always having obesity (n=429; 1.16%), overweight most of the time (n=15,006; 40.65%), increasing BMI% (n=9,060; 24.54%), decreasing BMI% (n=5,058; 13.70%), and always normal weight (n=7,357; 19.89%). Compared to children in the decreasing BMI% and always normal weight groups, children in the other three groups were more likely to live in neighborhoods with higher rates of poverty, unemployment, crowded households, and single-parent households, and lower rates of preschool enrollment. Conclusions: Neighborhood-level SDOH factors have significant associations with children's BMI% classification and changes in classification over time. This highlights the need to develop tailored obesity interventions for different groups to address the barriers faced by communities that can impact the weight and health of the children living within them.

3.
Article in English | MEDLINE | ID: mdl-37021857

ABSTRACT

Obesity is a major health problem, increasing the risk of various major chronic diseases, such as diabetes, cancer, and stroke. While the role of obesity identified by cross-sectional BMI recordings has been heavily studied, the role of BMI trajectories is much less explored. In this study, we use a machine learning approach to subtype individuals' risk of developing 18 major chronic diseases by using their BMI trajectories extracted from a large and geographically diverse EHR dataset capturing the health status of around two million individuals for a period of six years. We define nine new interpretable and evidence-based variables based on the BMI trajectories to cluster the patients into subgroups using the k-means clustering method. We thoroughly review each cluster's characteristics in terms of demographic, socioeconomic, and physiological measurement variables to specify the distinct properties of the patients in the clusters. In our experiments, the direct relationship of obesity with diabetes, hypertension, Alzheimer's, and dementia has been re-established and distinct clusters with specific characteristics for several of the chronic diseases have been found to be conforming or complementary to the existing body of knowledge.

4.
Proc Mach Learn Res ; 219: 167-185, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38344396

ABSTRACT

Sleep apnea in children is a major health problem affecting one to five percent of children (in the US). If not treated in a timely manner, it can also lead to other physical and mental health issues. Pediatric sleep apnea has different clinical causes and characteristics than adult sleep apnea. Despite a large group of studies dedicated to studying adult apnea, pediatric sleep apnea has been studied in a much more limited fashion. Relatedly, at-home sleep apnea testing tools and algorithmic methods for automatic detection of sleep apnea are widely present for adults, but not children. In this study, we target this gap by presenting a machine learning-based model for detecting apnea events from commonly collected sleep signals. We show that our method outperforms state-of-the-art methods across two public datasets, as determined by the F1-score and AUROC measures. Additionally, we show that using two of the signals that are easier to collect at home (ECG and SpO2) can also achieve very competitive results, potentially addressing the concerns about collecting various sleep signals from children outside the clinic. Therefore, our study can greatly inform ongoing progress toward increasing the accessibility of pediatric sleep apnea testing and improving the timeliness of the treatment interventions.

5.
Proc AAAI Conf Artif Intell ; 36(11): 12510-12516, 2022.
Article in English | MEDLINE | ID: mdl-36312212

ABSTRACT

Various types of machine learning techniques are available for analyzing electronic health records (EHRs). For predictive tasks, most existing methods either explicitly or implicitly divide these time-series datasets into predetermined observation and prediction windows. Patients have different lengths of medical history and the desired predictions (for purposes such as diagnosis or treatment) are required at different times in the future. In this paper, we propose a method that uses a sequence-to-sequence generator model to transfer an input sequence of EHR data to a sequence of user-defined target labels, providing the end-users with "flexible" observation and prediction windows to define. We use adversarial and semi-supervised approaches in our design, where the sequence-to-sequence model acts as a generator and a discriminator distinguishes between the actual (observed) and generated labels. We evaluate our models through an extensive series of experiments using two large EHR datasets from adult and pediatric populations. In an obesity predicting case study, we show that our model can achieve superior results in flexible-window prediction tasks, after being trained once and even with large missing rates on the input EHR data. Moreover, using a number of attention analysis experiments, we show that the proposed model can effectively learn more relevant features in different prediction tasks.

6.
Article in English | MEDLINE | ID: mdl-35756858

ABSTRACT

Childhood obesity is a major public health challenge. Early prediction and identification of the children at an elevated risk of developing childhood obesity may help in engaging earlier and more effective interventions to prevent and manage obesity. Most existing predictive tools for childhood obesity primarily rely on traditional regression-type methods using only a few hand-picked features and without exploiting longitudinal patterns of children's data. Deep learning methods allow the use of high-dimensional longitudinal datasets. In this paper, we present a deep learning model designed for predicting future obesity patterns from generally available items on children's medical history. To do this, we use a large unaugmented electronic health records dataset from a large pediatric health system in the US. We adopt a general LSTM network architecture and train our proposed model using both static and dynamic EHR data. To add interpretability, we have additionally included an attention layer to calculate the attention scores for the timestamps and rank features of each timestamp. Our model is used to predict obesity for ages 3 to 20 years using data from 1 to 3 years in advance. We compare the performance of our LSTM model with a series of existing studies in the literature and show that it outperforms them in most age ranges.

7.
J Am Med Dir Assoc ; 23(12): 1977-1983.e1, 2022 12.
Article in English | MEDLINE | ID: mdl-35594943

ABSTRACT

OBJECTIVES: This paper uses deep (machine) learning techniques to develop and test how motor behaviors, derived from location and movement sensor tracking data, may be associated with falls, delirium, and urinary tract infections (UTIs) in long-term care (LTC) residents. DESIGN: Longitudinal observational study. SETTING AND PARTICIPANTS: A total of 23 LTC residents (81,323 observations) with cognitive impairment or dementia in 2 northeast Department of Veterans Affairs LTC facilities. METHODS: More than 18 months of continuous (24/7) monitoring of motor behavior and activity levels used objective radiofrequency identification sensor data to track and record movement data. Occurrence of acute events was recorded each week. Unsupervised deep learning models were used to classify motor behaviors into 5 clusters; supervised decision tree algorithms used these clusters to predict acute health events (falls, delirium, and UTIs) the week before the week of the event. RESULTS: Motor behaviors were classified into 5 categories (Silhouette score = 0.67), and these were significantly different from each other. Motor behavior classifications were sensitive and specific to falls, delirium, and UTI predictions 1 week before the week of the event (sensitivity range = 0.88-0.91; specificity range = 0.71-0.88). CONCLUSION AND IMPLICATIONS: Intraindividual changes in motor behaviors predict some of the most common and detrimental acute events in LTC populations. Study findings suggest real-time locating system sensor data and machine learning techniques may be used in clinical applications to effectively prevent falls and lead to the earlier recognition of risk for delirium and UTIs in this vulnerable population.


Subjects
Deep Learning , Dementia , United States , Humans , Aged , Long-Term Care
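
The sensitivity and specificity ranges reported in the abstract above follow the standard confusion-matrix definitions. A minimal sketch, using hypothetical counts that are not taken from the study:

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true events that were flagged.
    specificity = TN / (TN + FP): share of non-events correctly left unflagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical weekly predictions: 10 weeks with an event, 10 weeks without.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=8, fp=2)
```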
8.
Proc Mach Learn Res ; 182: 853-873, 2022 Aug.
Article in English | MEDLINE | ID: mdl-37538125

ABSTRACT

With the growing availability of Electronic Health Records (EHRs), many deep learning methods have been developed to leverage such datasets in medical prediction tasks. Notably, transformer-based architectures have proven to be highly effective for EHRs. Transformer-based architectures are generally very effective in "transferring" the acquired knowledge from very large datasets to smaller target datasets through their comprehensive "pre-training" process. However, to work efficiently, they still rely on the target datasets for the downstream tasks, and if the target dataset is (very) small, the performance of downstream models can degrade rapidly. In biomedical applications, it is common to only have access to small datasets, for instance, when studying rare diseases, invasive procedures, or using restrictive cohort selection processes. In this study, we present CEHR-GAN-BERT, a semi-supervised transformer-based architecture that leverages both in- and out-of-cohort patients to learn better patient representations in the context of few-shot learning. The proposed method opens new learning opportunities where only a few hundred samples are available. We extensively evaluate our method on four prediction tasks and three public datasets, showing the ability of our model to achieve improvements upwards of 5% on all performance metrics (including AUROC and F1 Score) on the tasks that use fewer than 200 annotated patients during the training process.

9.
Proc Mach Learn Res ; 193: 326-342, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36686987

ABSTRACT

Obesity is a major public health concern. Multidisciplinary pediatric weight management programs are considered standard treatment for children with obesity who are not able to be successfully managed in the primary care setting. Despite their great potential, high dropout rates (referred to as attrition) are a major hurdle in delivering successful interventions. Predicting attrition patterns can help providers reduce the alarmingly high rates of attrition (up to 80%) by engaging in earlier and more personalized interventions. Previous work has mainly focused on finding static predictors of attrition on smaller datasets and has achieved limited success in effective prediction. In this study, we have collected a five-year comprehensive dataset of 4,550 children from diverse backgrounds receiving treatment at four pediatric weight management programs in the US. We then developed a machine learning pipeline to predict (a) the likelihood of attrition, and (b) the change in body-mass index (BMI) percentile of children, at different time points after joining the weight management program. Our pipeline is greatly customized for this problem using advanced machine learning techniques to process longitudinal data, smaller-size data, and interrelated prediction tasks. The proposed method showed strong prediction performance as measured by AUROC scores (average AUROC of 0.77 for predicting attrition, and 0.78 for predicting weight outcomes).

10.
Proc Mach Learn Res ; 193: 311-325, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36686986

ABSTRACT

An increasing amount of research is being devoted to applying machine learning methods to electronic health record (EHR) data for various clinical purposes. This growing area of research has exposed the challenges of the accessibility of EHRs. MIMIC is a popular, public, and free EHR dataset in a raw format that has been used in numerous studies. The absence of standardized preprocessing steps can be, however, a significant barrier to the wider adoption of this rare resource. Additionally, this absence can reduce the reproducibility of the developed tools and limit the ability to compare the results among similar studies. In this work, we provide a greatly customizable pipeline to extract, clean, and preprocess the data available in the fourth version of the MIMIC dataset (MIMIC-IV). The pipeline also presents an end-to-end wizard-like package supporting predictive model creations and evaluations. The pipeline covers a range of clinical prediction tasks which can be broadly classified into four categories - readmission, length of stay, mortality, and phenotype prediction. The tool is publicly available at https://github.com/healthylaife/MIMIC-IV-Data-Pipeline.

11.
ACM BCB ; 2021, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34604866

ABSTRACT

Working with electronic health records (EHRs) is known to be challenging for several reasons. These reasons include not having: 1) similar lengths (per visit), 2) the same number of observations (per patient), and 3) complete entries in the available records. These issues hinder the performance of the predictive models created using EHRs. In this paper, we approach these issues by presenting a model for the combined task of imputing and predicting values for irregularly observed, varying-length EHR data with missing entries. Our proposed model (dubbed Bi-GAN) uses a bidirectional recurrent network in a generative adversarial setting. In this architecture, the generator is a bidirectional recurrent network that receives the EHR data and imputes the existing missing values. The discriminator attempts to discriminate between the actual and the imputed values generated by the generator. Using the input data in its entirety, Bi-GAN learns how to impute missing elements in-between (imputation) or outside of the input time steps (prediction). Our method has three advantages over the state-of-the-art methods in the field: (a) one single model performs both the imputation and prediction tasks; (b) the model can perform predictions using time series of varying length with missing data; (c) it does not require knowing the observation and prediction time windows during training and can be used for predictions with different observation and prediction window lengths, for short- and long-term predictions. We evaluate our model on two large EHR datasets to impute and predict body mass index (BMI) values and show its superior performance in both settings.

12.
Smart Health (Amst) ; 21, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34568534

ABSTRACT

Type 2 diabetes, a prevalent chronic disease worldwide, increases the risk of serious health consequences including heart and kidney disease. Forecasting diabetes progression can inform disease management strategies, thereby potentially reducing the likelihood or severity of its consequences. We use continuous glucose monitoring and actigraphy data from 54 individuals with Type 2 diabetes to predict their hemoglobin A1c, HDL cholesterol, LDL cholesterol, and triglyceride levels one year later. We use a combination of convolutional and recurrent neural networks to develop a deep neural network architecture that can learn the dynamic patterns in different sensors' data and combine those patterns with additional demographic and lab data. To further demonstrate the generalizability of our models, we also evaluate their performance using an independent public dataset of individuals with Type 1 diabetes. In addition to diabetes, our approach could be useful for other serious and chronic physical illnesses, where dynamic (e.g., from multiple sensors) and static (e.g., demographic) data are used for creating predictive models.

13.
BMC Med Inform Decis Mak ; 21(1): 5, 2021 01 06.
Article in English | MEDLINE | ID: mdl-33407390

ABSTRACT

BACKGROUND: Cardiovascular disease (CVD) is the leading cause of death in the United States (US). Better cardiovascular health (CVH) is associated with CVD prevention. Predicting future CVH levels may help providers better manage patients' CVH. We hypothesized that CVH measures can be predicted based on previous measurements from longitudinal electronic health record (EHR) data. METHODS: The Guideline Advantage (TGA) dataset was used and contained EHR data from 70 outpatient clinics across the US. We studied predictions of 5 CVH submetrics: smoking status (SMK), body mass index (BMI), blood pressure (BP), hemoglobin A1c (A1C), and low-density lipoprotein (LDL). We applied embedding techniques and long short-term memory (LSTM) networks to predict future CVH category levels from all the previous CVH measurements of 216,445 unique patients for each CVH submetric. RESULTS: The LSTM model performance was evaluated by the area under the receiver operating characteristic curve (AUROC): the micro-average AUROC was 0.99 for SMK prediction; 0.97 for BMI; 0.84 for BP; 0.91 for A1C; and 0.93 for LDL prediction. Model performance was not improved by using all 5 submetric measures compared with using single submetric measures. CONCLUSIONS: We suggest that future CVH levels can be predicted using previous CVH measurements for each submetric, which has implications for population cardiovascular health management. Predicting patients' future CVH levels might directly improve patients' CVH and thus quality of life, while also indirectly decreasing the burden and cost for clinical health systems caused by CVD and cancers.


Subjects
Cardiovascular Diseases , Electronic Health Records , Blood Pressure , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Cross-Sectional Studies , Health Status , Humans , Quality of Life , Risk Factors , United States/epidemiology
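
The abstract above reports micro-average AUROC for multi-class CVH categories. One common way to compute it, sketched below, is to flatten the one-vs-rest labels and scores of all classes into a single binary problem and take the AUROC of that pool. Whether the authors computed it exactly this way is an assumption; the function names and the three-patient example are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auroc(true_classes, class_scores, n_classes):
    """Pool every (sample, class) pair into one binary AUROC computation."""
    flat_labels, flat_scores = [], []
    for true_class, scores in zip(true_classes, class_scores):
        for k in range(n_classes):
            flat_labels.append(1 if k == true_class else 0)
            flat_scores.append(scores[k])
    return auroc(flat_labels, flat_scores)


# Hypothetical predicted probabilities for three patients and three categories.
true_classes = [0, 1, 2]
class_scores = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
micro = micro_average_auroc(true_classes, class_scores, n_classes=3)
```

The pairwise loop is quadratic in the number of pooled pairs, so it only suits small illustrations; a rank-based implementation would be the usual choice at the scale of the dataset described in the abstract.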
14.
Article in English | MEDLINE | ID: mdl-36684475

ABSTRACT

Machine learning algorithms have been widely used to capture the static and temporal patterns within electronic health records (EHRs). While many studies focus on the (primary) prevention of diseases, primordial prevention (preventing the factors that are known to increase the risk of a disease occurring) is still widely under-investigated. In this study, we propose a multi-target regression model leveraging transformers to learn the bidirectional representations of EHR data and predict the future values of 11 major modifiable risk factors of cardiovascular disease (CVD). Inspired by the proven results of pre-training in natural language processing studies, we apply the same principles on EHR data, dividing the training of our model into two phases: pre-training and fine-tuning. We use the fine-tuned transformer model in a "multi-target regression" theme. Following this theme, we combine the 11 disjoint prediction tasks by adding shared and target-specific layers to the model and jointly train the entire model. We evaluate the performance of our proposed method on a large publicly available EHR dataset. Through various experiments, we demonstrate that the proposed method obtains a significant improvement (12.6% MAE on average across all 11 different outputs) over the baselines.

15.
Am J Physiol Regul Integr Comp Physiol ; 315(2): R256-R266, 2018 08 01.
Article in English | MEDLINE | ID: mdl-29341825

ABSTRACT

Easy access to high-energy food has been linked to high rates of obesity in the world. Understanding the way that access to palatable (high fat or high calorie) food can lead to overconsumption is essential for both preventing and treating obesity. Although the body of studies focused on the effects of high-energy diets is growing, our understanding of how different factors contribute to food choices is not complete. In this study, we present a mathematical model that can predict rats' calorie intake on a high-energy diet based on their ingestive behavior on a standard chow diet. Specifically, we propose an equation that describes the relation between the body weight (W), energy density (E), time elapsed from the start of diet (T), and daily calorie intake (C). We tested our model on two independent data sets. Our results show that the suggested model can predict the calorie intake patterns with high accuracy. Additionally, the only free parameter of our proposed equation (ρ), which is unique to each animal, has a strong correlation with the animal's calorie intake.


Subjects
Animal Behavior , Energy Intake , Energy Metabolism , Feeding Behavior , Biological Models , Nutritive Value , Animal Feed , Animals , Body Weight , Food Preferences , Male , Sprague-Dawley Rats , Time Factors
16.
PLoS One ; 12(5): e0178348, 2017.
Article in English | MEDLINE | ID: mdl-28542615

ABSTRACT

Computational models have gained popularity as a predictive tool for assessing proposed policy changes affecting dietary choice. Specifically, they have been used for modeling dietary changes in response to economic interventions, such as price and income changes. Herein, we present a novel addition to this type of model by incorporating habitual behaviors that drive individuals to maintain or conform to prior eating patterns. We examine our method in a simulated case study of food choice behaviors of low-income adults in the US. We use data from several national datasets, including the National Health and Nutrition Examination Survey (NHANES), the US Bureau of Labor Statistics and the USDA, to parameterize our model and develop predictive capabilities in 1) quantifying the influence of prior diet preferences when food budgets are increased and 2) simulating the income elasticities of demand for four food categories. Food budgets can increase because of greater affordability (due to food aid and other nutritional assistance programs), or because of higher income. Our model predictions indicate that low-income adults consume unhealthy diets when they have highly constrained budgets, but that even after budget constraints are relaxed, these unhealthy eating behaviors are maintained. Specifically, diets in this population, before and after changes in food budgets, are characterized by relatively low consumption of fruits and vegetables and high consumption of fat. The model results for income elasticities also show almost no change in consumption of fruit and fat in response to changes in income, which is in agreement with data from the World Bank's International Comparison Program (ICP). Hence, the proposed method can be used in assessing the influences of habitual dietary patterns on the effectiveness of food policies.


Subjects
Choice Behavior , Costs and Cost Analysis , Feeding Behavior/psychology , Food/economics , Adult , Aged , Costs and Cost Analysis/statistics & numerical data , Female , Humans , Male , Middle Aged , Theoretical Models , Nutrition Policy , Nutrition Surveys , Poverty/psychology , Poverty/statistics & numerical data , United States , Young Adult
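
The abstract above simulates income elasticities of demand. For reference, the textbook definition is the percentage change in quantity consumed divided by the percentage change in income. The sketch below uses that simple point formula with hypothetical numbers; the paper's own estimation procedure is not described here and may differ.

```python
def income_elasticity(q_before: float, q_after: float,
                      income_before: float, income_after: float) -> float:
    """Percentage change in quantity divided by percentage change in income."""
    pct_quantity = (q_after - q_before) / q_before
    pct_income = (income_after - income_before) / income_before
    return pct_quantity / pct_income


# Hypothetical household: income rises 10%, fruit consumption rises 5%.
elasticity = income_elasticity(q_before=10.0, q_after=10.5,
                               income_before=1000.0, income_after=1100.0)
```

An elasticity near zero, as the abstract reports for fruit and fat, means consumption barely responds to income changes.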
17.
SSM Popul Health ; 3: 211-218, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29349218

ABSTRACT

Social networks as well as neighborhood environments have been shown to affect obesity-related behaviors including energy intake and physical activity. Accordingly, harnessing social networks to improve targeting of obesity interventions may be promising to the extent this leads to social multiplier effects and wider diffusion of intervention impact on populations. However, the literature evaluating network-based interventions has been inconsistent. Computational methods like agent-based models (ABM) provide researchers with tools to experiment in a simulated environment. We develop an ABM to compare conventional targeting methods (random selection, based on individual obesity risk, and vulnerable areas) with network-based targeting methods. We adapt a previously published and validated model of network diffusion of obesity-related behavior. We then build social networks among agents using a more realistic approach. We calibrate our model first against national-level data. Our results show that network-based targeting may lead to greater population impact. We also present a new targeting method that outperforms other methods in terms of intervention effectiveness at the population level.

18.
J Nutr ; 146(11): 2304-2311, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27655757

ABSTRACT

BACKGROUND: The price of food has long been considered one of the major factors that affects food choices. However, the price metric (e.g., the price of food per calorie or the price of food per gram) that individuals predominantly use when making food choices is unclear. Understanding which price metric is used is especially important for studying individuals with severe budget constraints because food price then becomes even more important in food choice. OBJECTIVE: We assessed which price metric is used by low-income individuals in deciding what to eat. METHODS: With the use of data from NHANES and the USDA Food and Nutrient Database for Dietary Studies, we created an agent-based model that simulated an environment representing the US population, wherein individuals were modeled as agents with a specific weight, age, and income. In our model, agents made dietary food choices while meeting their budget limits with the use of 1 of 3 different metrics for decision making: energy cost (price per calorie), unit price (price per gram), and serving price (price per serving). The food consumption patterns generated by our model were compared to 3 independent data sets. RESULTS: The food choice behaviors observed in 2 of the data sets were found to be closest to the simulated dietary patterns generated by the price per calorie metric. The behaviors observed in the third data set were equidistant from the patterns generated by price per calorie and price per serving metrics, whereas results generated by the price per gram metric were further away. CONCLUSIONS: Our simulations suggest that dietary food choice based on price per calorie best matches actual consumption patterns and may therefore be the most salient price metric for low-income populations.


Subjects
Computer Simulation , Energy Intake , Food/economics , Theoretical Models , Poverty , Costs and Cost Analysis , Decision Making , Humans , Nutritive Value , United States
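
The three price metrics compared in the abstract above (price per calorie, per gram, and per serving) can rank the same foods differently. The sketch below illustrates this with two hypothetical food items; the `FoodItem` fields, the prices, and the nutrient values are invented for illustration and do not come from NHANES or the USDA database.

```python
from dataclasses import dataclass


@dataclass
class FoodItem:
    name: str
    price: float     # price of the whole package, in dollars
    grams: float     # package weight
    calories: float  # total energy in the package
    servings: float  # number of servings in the package


def price_metrics(item: FoodItem) -> dict:
    """Compute the three candidate price metrics for one item."""
    return {
        "per_calorie": item.price / item.calories,
        "per_gram": item.price / item.grams,
        "per_serving": item.price / item.servings,
    }


def cheapest(items, metric: str) -> str:
    """Name of the item an agent would pick if it minimized one metric."""
    return min(items, key=lambda it: price_metrics(it)[metric]).name


# Hypothetical items: an energy-dense snack and a low-energy-density fruit.
chips = FoodItem("chips", price=3.0, grams=200, calories=1000, servings=7)
apples = FoodItem("apples", price=3.0, grams=1000, calories=520, servings=5)
choices = {m: cheapest([chips, apples], m)
           for m in ("per_calorie", "per_gram", "per_serving")}
```

With these made-up numbers, minimizing price per calorie or per serving favors the energy-dense item, while minimizing price per gram favors the fruit, which is the kind of divergence an agent-based comparison of the metrics relies on.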