Results 1 - 9 of 9
1.
Sci Rep ; 14(1): 8745, 2024 04 16.
Article in English | MEDLINE | ID: mdl-38627439

ABSTRACT

Accurately predicting patients' risk for specific medical outcomes is paramount for effective healthcare management and personalized medicine. While a substantial body of literature addresses the prediction of diverse medical conditions, existing models predominantly focus on singular outcomes, limiting their scope to one disease at a time. However, clinical reality often entails patients concurrently facing multiple health risks across various medical domains. In response to this gap, our study proposes a novel multi-risk framework adept at simultaneous risk prediction for multiple clinical outcomes, including diabetes, mortality, and hypertension. Leveraging a concise set of features extracted from patients' cardiorespiratory fitness data, our framework minimizes computational complexity while maximizing predictive accuracy. Moreover, we integrate a state-of-the-art instance-based interpretability technique into our framework, providing users with comprehensive explanations for each prediction. These explanations afford medical practitioners invaluable insights into the primary health factors influencing individual predictions, fostering greater trust and utility in the underlying prediction models. Our approach thus stands to significantly enhance healthcare decision-making processes, facilitating more targeted interventions and improving patient outcomes in clinical practice. Our prediction framework utilizes an automated machine learning framework, Auto-Weka, to optimize machine learning models and hyper-parameter configurations for the simultaneous prediction of three medical outcomes: diabetes, mortality, and hypertension. Additionally, we employ a local interpretability technique to elucidate predictions generated by our framework. These explanations manifest visually, highlighting key attributes contributing to each instance's prediction for enhanced interpretability. 
Using automated machine learning techniques, the models simultaneously predict hypertension, mortality, and diabetes risks, utilizing only nine patient features. They achieved an average AUC of 0.90 ± 0.001 on the hypertension dataset, 0.90 ± 0.002 on the mortality dataset, and 0.89 ± 0.001 on the diabetes dataset through tenfold cross-validation. Additionally, the models demonstrated strong performance with an average AUC of 0.89 ± 0.001 on the hypertension dataset, 0.90 ± 0.001 on the mortality dataset, and 0.89 ± 0.001 on the diabetes dataset using bootstrap evaluation with 1000 resamples.
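
The bootstrap evaluation described above can be illustrated with a minimal sketch. The abstract does not say how the AUC or its spread was computed, so everything here is an assumption: the function names `auc` and `bootstrap_auc` are hypothetical, the AUC uses the standard rank-based (Mann-Whitney) identity, and the spread is reported as a sample standard deviation over resamples.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive scored above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_resamples=1000, seed=0):
    """Mean and standard deviation of the AUC over bootstrap resamples."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip degenerate resamples that contain a single class
        values.append(auc(ys, [scores[i] for i in idx]))
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / max(len(values) - 1, 1)
    return mean, var ** 0.5
```

Calling `bootstrap_auc(labels, scores, n_resamples=1000)` on held-out predictions returns a `(mean, std)` pair comparable in form to the "AUC ± spread" figures quoted in the abstract.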


Subjects
Cardiorespiratory Fitness , Diabetes Mellitus , Hypertension , Humans , Machine Learning
2.
Sci Rep ; 12(1): 983, 2022 01 19.
Article in English | MEDLINE | ID: mdl-35046488

ABSTRACT

Governments pay agencies to control the activities of farmers who receive governmental support. Field visits are costly and highly time-consuming; hence remote sensing is widely used for monitoring farmers' activities. Nowadays, the vast amount of data available from the Sentinel mission has significantly boosted research in agriculture. Estonia is among the first countries to take advantage of this data source to automate the detection of mowing and ploughing events across the country. Although techniques that rely on optical data for monitoring agricultural events are favourable, the availability of such data during the growing season is limited. Thus, alternative data sources have to be evaluated. In this paper, we developed a deep learning model with an integrated reject option for detecting grassland mowing events using time series of Sentinel-1 and Sentinel-2 optical images acquired from 2000 fields in Estonia in 2018 during the vegetative season. The rejection mechanism is based on a threshold over the prediction confidence of the proposed model. The proposed model significantly outperforms the state-of-the-art technique and achieves event accuracy of 73.3% and end of season accuracy of 94.8%.
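
A reject option based on a confidence threshold, as described above, can be sketched in a few lines. This is an illustration only, not the paper's model: the function name `predict_with_reject` and the default threshold of 0.8 are assumptions, and the input is taken to be a list of per-class probabilities from any classifier.

```python
def predict_with_reject(probabilities, threshold=0.8):
    """Return the index of the most probable class, or None to abstain.

    `threshold` is a hypothetical value; in practice it would be tuned on
    validation data to trade coverage against accuracy.
    """
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < threshold:
        return None  # confidence too low: reject instead of predicting
    return best
```

For example, `predict_with_reject([0.55, 0.45])` abstains, while `predict_with_reject([0.1, 0.9])` returns class `1`. Rejected cases would typically be routed to manual inspection.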

3.
BMC Med Inform Decis Mak ; 19(1): 146, 2019 07 29.
Article in English | MEDLINE | ID: mdl-31357998

ABSTRACT

BACKGROUND: Although complex machine learning models commonly outperform traditional simple interpretable models, clinicians find it hard to understand and trust these complex models due to the lack of intuition and explanation of their predictions. The aim of this study is to demonstrate the utility of various model-agnostic explanation techniques for machine learning models, with a case study analyzing the outcomes of a random forest model for predicting individuals at risk of developing hypertension based on cardiorespiratory fitness data. METHODS: The dataset used in this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Five global interpretability techniques (Feature Importance, Partial Dependence Plot, Individual Conditional Expectation, Feature Interaction, Global Surrogate Models) and two local interpretability techniques (Local Surrogate Models, Shapley Value) were applied to show how interpretability techniques can help clinical staff better understand and trust the outcomes of machine learning-based predictions. RESULTS: Several experiments have been conducted and reported. The results show that different interpretability techniques can provide different insights into model behavior: global interpretations enable clinicians to understand the entire conditional distribution modeled by the trained response function, whereas local interpretations promote the understanding of small parts of the conditional distribution for specific instances. CONCLUSIONS: Various interpretability techniques can vary in their explanations for the behavior of the machine learning model.
Global interpretability techniques have the advantage that they can generalize over the entire population, while local interpretability techniques focus on giving explanations at the level of instances. Both methods can be equally valid depending on the application need. Both are effective for assisting clinicians in the medical decision process; however, clinicians will always hold the final say on accepting or rejecting the outcome of the machine learning models and their explanations based on their domain expertise.


Subjects
Hypertension/diagnosis , Machine Learning , Cardiorespiratory Fitness , Datasets as Topic , Exercise Test , False Positive Reactions , Female , Humans , Male , Risk Factors
4.
Int J Cardiol ; 288: 140-147, 2019 08 01.
Article in English | MEDLINE | ID: mdl-30685103

ABSTRACT

OBJECTIVE: In-hospital length of stay (LOS) is expected to increase as the complexity of cardiovascular diseases increases and the population ages. This will affect healthcare systems, especially with the current situation of decreased bed capacity and increasing costs. Therefore, accurately predicting LOS would have a positive impact on healthcare metrics. The aim of this study is to develop a machine learning-based model approach for predicting in-hospital LOS for cardiac patients. DESIGN: Using electronic medical records, we retrospectively extracted all records of patient visits admitted under the adult cardiology service. Admission diagnosis and primary treating physician were reviewed to verify selection criteria. A predictive machine learning-based model approach was applied to incorporate simple baseline health data at admission time to predict LOS. Patients were divided into three groups based on their LOS: short (<3 days), intermediate (3-5 days) and long (>5 days). An information gain algorithm was utilized to select the most relevant attributes. Only attributes with information gain of more than zero were used in model building. Four different machine learning techniques were evaluated and their diagnostic accuracy measures were compared. SETTING: The dataset of this study included adult patients who were admitted between 2008 and 2016 in King Abdulaziz Cardiac Center (KACC). The center is located in King Abdulaziz Medical City Complex in Riyadh, the capital of Saudi Arabia. PARTICIPANTS (DATASET): A total of 16,414 consecutive inpatient visits for 12,769 unique patients (mean age of 58.8 ± 16 years, of whom 68.2% were males) between 2008 and 2016 were included. The study cohort had a high prevalence of cardiovascular risk factors (hypertension 56%, diabetes 56%, dyslipidemia 52%, obesity 33% and smoking 24%). The most common admitting diagnosis was acute coronary syndrome (36%).
RESULTS: The variables with the highest impact on the prediction of in-hospital LOS were heart rate on admission, systolic and diastolic blood pressure on admission, age and insurance status (eligibility). Among the machine learning models, the Random Forest (RF) model outperformed all others (sensitivity 0.80, accuracy 0.80, and AUROC 0.94). CONCLUSION: We showed that machine learning methods provide accurate prediction of LOS for cardiac patients. This can be used in clinical bed management and resource allocation.
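
The LOS grouping and the information-gain filter described in the DESIGN section can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, attributes are treated as already discrete (the abstract does not say how continuous variables such as heart rate were handled), and information gain is the usual entropy reduction.

```python
import math
from collections import Counter


def los_group(days):
    """Map a length of stay in days to the three classes used in the abstract."""
    if days < 3:
        return "short"
    if days <= 5:
        return "intermediate"
    return "long"


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(attribute_values, labels):
    """Entropy of the labels minus their expected entropy after splitting."""
    total = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder
```

An attribute would then be kept for model building only if `information_gain(values, labels) > 0`, mirroring the selection rule stated in the abstract.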


Subjects
Electronic Health Records/statistics & numerical data , Heart Diseases/therapy , Inpatients/statistics & numerical data , Length of Stay/statistics & numerical data , Machine Learning , Female , Heart Diseases/diagnosis , Heart Diseases/epidemiology , Humans , Male , Middle Aged , Morbidity/trends , Prognosis , ROC Curve , Retrospective Studies , Saudi Arabia/epidemiology
5.
PLoS One ; 13(4): e0195344, 2018.
Article in English | MEDLINE | ID: mdl-29668729

ABSTRACT

This study evaluates and compares the performance of different machine learning techniques in predicting individuals who are at risk of developing hypertension, and who are likely to benefit most from interventions, using cardiorespiratory fitness data. The dataset of this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. The variables of the dataset include information on vital signs, diagnosis and clinical laboratory measurements. Six machine learning techniques were investigated: LogitBoost (LB), Bayesian Network classifier (BN), Locally Weighted Naive Bayes (LWB), Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Tree Forest (RTF). Using different validation methods, the RTF model has shown the best performance (AUC = 0.93) and outperformed all other machine learning techniques examined in this study. The results have also shown that it is critical to carefully explore and evaluate the performance of machine learning models using various model evaluation methods, as the prediction accuracy can differ significantly.


Subjects
Cardiorespiratory Fitness/physiology , Exercise Test/methods , Hypertension/etiology , Machine Learning , Adolescent , Adult , Aged , Aged, 80 and over , Area Under Curve , Bayes Theorem , Databases, Factual , Female , Humans , Male , Middle Aged , Neural Networks, Computer , Support Vector Machine , Young Adult
6.
BMC Med Inform Decis Mak ; 17(1): 174, 2017 Dec 19.
Article in English | MEDLINE | ID: mdl-29258510

ABSTRACT

BACKGROUND: Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of this study is to present an evaluation and comparison of how machine learning techniques can be applied to medical records of cardiorespiratory fitness and how the various techniques differ in their ability to predict medical outcomes (e.g. mortality). METHODS: We use data of 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). In order to handle the imbalanced dataset used, the Synthetic Minority Over-Sampling Technique (SMOTE) is used. RESULTS: Two sets of experiments have been conducted, with and without the SMOTE sampling technique. On average over different evaluation metrics, the SVM classifier has shown the lowest performance while other models like BN, BC and DT performed better. The RF classifier has shown the best performance (AUC = 0.97) among all models trained using the SMOTE sampling. CONCLUSIONS: The results show that various ML techniques can vary significantly in their performance across the different evaluation metrics. Nor is it necessarily the case that a more complex ML model achieves higher prediction accuracy. The prediction performance of all models trained with SMOTE is much better than the performance of models trained without SMOTE.
The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.
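
The core idea of SMOTE, creating synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as below. This is a simplified illustration and not the implementation used in the study: the function name `smote`, the random choice of base sample, and the defaults `k=5` and `seed=0` are assumptions, and all features are assumed numeric.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic
```

Each synthetic point lies on the line segment between two real minority samples, which is why oversampling of this kind should be applied to the training split only, so that evaluation data stay free of synthetic records.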


Subjects
Cardiorespiratory Fitness , Classification , Exercise Test , Machine Learning , Mortality , Adult , Aged , Datasets as Topic , Female , Follow-Up Studies , Humans , Male , Middle Aged , Prognosis
7.
Am J Cardiol ; 120(11): 2078-2084, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-28951020

ABSTRACT

Previous studies have demonstrated that cardiorespiratory fitness is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of the analysis is to compare the prediction of 10-year all-cause mortality (ACM) using statistical logistic regression (LR) and ML approaches in a cohort of patients who underwent exercise stress testing. We included 34,212 patients (55% males, mean age 54 ± 13 years) free of coronary artery disease or heart failure who underwent exercise treadmill stress testing between 1991 and 2009 and had complete 10-year follow-up. The primary outcome of this analysis was ACM at 10 years. The probability of 10-year ACM was calculated using statistical LR and ML, and the accuracy of these methods was calculated and compared. A total of 3,921 patients died at 10 years. Using statistical LR, the sensitivity to predict ACM was 44.9% (95% confidence interval [CI] 43.3% to 46.5%), whereas the specificity was 93.4% (95% CI 93.1% to 93.7%). The sensitivity of ML to predict ACM was 87.4% (95% CI 86.3% to 88.4%), whereas the specificity was 97.2% (95% CI 97.0% to 97.4%). The ML approach was associated with improved model discrimination (area under the curve for ML [0.923 (95% CI 0.917 to 0.928)]) compared with statistical LR (0.836 [95% CI 0.829 to 0.846], p<0.0001). In conclusion, our analysis demonstrates that ML provides better accuracy and discrimination of the prediction of ACM among patients undergoing stress testing.
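
For readers comparing the sensitivity and specificity figures above, the two measures follow directly from confusion-matrix counts. The sketch below uses a hypothetical function name and is not taken from the paper; any counts passed to it in an example are illustrative, not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of actual events that were predicted.
    specificity = TN / (TN + FP): share of non-events correctly ruled out.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)
```

With illustrative counts of 90 true positives, 10 false negatives, 80 true negatives and 20 false positives, the function returns a sensitivity of 0.9 and a specificity of 0.8.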


Subjects
Cardiorespiratory Fitness , Cardiovascular Diseases/diagnosis , Exercise Test/methods , Exercise Tolerance/physiology , Forecasting , Machine Learning , Risk Assessment/methods , Algorithms , Cardiovascular Diseases/mortality , Cause of Death/trends , Female , Humans , Male , Michigan/epidemiology , Middle Aged , Predictive Value of Tests , Retrospective Studies
8.
Int J Cardiol ; 228: 214-218, 2017 Feb 01.
Article in English | MEDLINE | ID: mdl-27865188

ABSTRACT

BACKGROUND: Prior studies showed mixed results on the association of digoxin use with all-cause mortality (ACM). The aim of this analysis is to identify the impact of digoxin use on ACM in a contemporary heart failure (HF) cohort treated with guideline-based therapy. METHODS: We included 2298 consecutive patients seen in an HF clinic between 2000 and 2015. A patient was considered a digoxin user if he or she received digoxin at any point during the enrollment period in the HF clinic. Patients were matched based on digoxin utility using propensity matching in 2-3:1 fashion. The primary outcome was ACM. RESULT: Of 2298 patients, 325 digoxin users were matched with 750 non-digoxin users. The matched cohort did not differ in demographics and clinical variables except for worse HF symptomatology and increased prevalence of atrial fibrillation. Overall, the prevalence of the use of guideline-suggested therapies was 96%. After a median follow-up duration of 4 years (IQR 2-6 years), digoxin use was associated with increased ACM (21.8% versus 12.9%, unadjusted HR=1.81; 95% CI=1.33 to 2.45; p=0.001). This association remained significant after adjusting for the propensity score, atrial fibrillation, ejection fraction, and New York HF Class (HR=1.74; 95% CI=1.20 to 2.38; p<0.0001). CONCLUSION: In this analysis of well-treated HF patients, digoxin was associated with increased ACM. Further randomized controlled trials are needed to determine whether digoxin therapy should be used in well-treated HF patients. Until then, routine use of digoxin in clinical practice should be discouraged.


Subjects
Cardiotonic Agents/therapeutic use , Digoxin/therapeutic use , Heart Failure, Systolic/drug therapy , Heart Failure, Systolic/mortality , Adult , Aged , Chronic Disease , Cohort Studies , Female , Heart Failure, Systolic/complications , Humans , Male , Middle Aged , Propensity Score , Survival Rate , Treatment Outcome
9.
Springerplus ; 5(1): 665, 2016.
Article in English | MEDLINE | ID: mdl-27350905

ABSTRACT

A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations, including reachability, pattern matching and shortest path, where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation and performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine for G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores while other parts of the query plan, as necessary, are evaluated via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets, in addition to its ability to outperform Apache Giraph, a popular distributed graph processing system, by orders of magnitude.
