ABSTRACT
Type 2 diabetes mellitus (T2DM) is one of the most common metabolic diseases in the world and poses a significant public health challenge. Early detection and management of this metabolic disorder are crucial to prevent complications and improve outcomes. This paper aims to find core differences between male and female markers for detecting T2DM from clinical and anthropometric features, and to identify ranges for the potential biomarkers that can serve as a pre-diagnostic tool, using machine learning (ML) models while excluding glucose-related biomarkers. We used a dataset containing clinical and anthropometric variables from patients diagnosed with T2DM and from patients without T2DM as controls. We applied three feature selection techniques to identify relevant biomarker models: an improved recursive feature elimination (RFE) that evaluates every feature set, from all features down to a single feature, with the Akaike information criterion (AIC) to find the optimal output; Least Absolute Shrinkage and Selection Operator (LASSO) with glmnet; and Genetic Algorithms (GA) with GALGO, with forward selection (FS) applied to the GALGO output. We then compared the resulting feature sets using the AIC to measure the performance of each technique and to collect the optimal set of global features. Next, five ML models were implemented and compared to identify the most accurate and interpretable one: logistic regression (LR), artificial neural network (ANN), support vector machine (SVM), k-nearest neighbors (KNN), and nearest centroid (Nearcent). The models were then combined in an ensemble to provide a more robust approximation. The results showed that potential biomarkers such as systolic blood pressure (SBP) and triglycerides are together significantly associated with T2DM.
This approach also identified triglycerides, cholesterol, and diastolic blood pressure as biomarkers that differ between male and female subjects, differences that have not been previously reported in the literature. The most accurate configuration was feature selection with RFE, using random forest (RF) as the estimator and improved with the AIC, which achieved an accuracy of 0.8820. In conclusion, this study demonstrates the potential of ML models to identify candidate biomarkers for the early detection of T2DM without glucose-related biomarkers, as well as differences between male and female anthropometric and clinical profiles. These findings may help to improve early detection and management of T2DM by accounting for those differences, potentially reducing healthcare costs and improving personalized patient care. Further research is needed to validate these potential biomarker ranges in other populations and clinical settings.
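As a minimal sketch of the AIC comparison described above, the following Python snippet scores candidate feature subsets with AIC = 2k - 2 ln(L) and keeps the one with the lowest value. The subset names, log-likelihoods, and parameter counts are hypothetical placeholders chosen for illustration. They are not results from this study, and the study's own implementation may differ.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_subset_by_aic(candidates):
    """Return (name, AIC) for the candidate with the lowest AIC.

    `candidates` maps a subset name to (log_likelihood, n_params).
    """
    scored = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    best = min(scored, key=scored.get)
    return best, scored[best]


# Hypothetical fitted models, one per step of an elimination procedure.
# Values are illustrative only.
candidates = {
    "all_features": (-210.0, 12),   # AIC = 24 + 420 = 444
    "six_features": (-212.5, 6),    # AIC = 12 + 425 = 437
    "single_feature": (-240.0, 2),  # AIC = 4 + 480 = 484
}

best_name, best_aic = best_subset_by_aic(candidates)
print(best_name, best_aic)
```

The smaller subset wins here because the penalty saved by dropping six parameters outweighs the small loss in log-likelihood, which is the trade-off the AIC is meant to capture.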
ABSTRACT
Atrial fibrillation (AF) is the most commonly diagnosed arrhythmia in clinical practice; its prevalence increases with age, and its initial stage is paroxysmal atrial fibrillation (PAF). This pathology usually triggers hemodynamic disorders that can lead to cerebrovascular accidents (CVA), causing morbidity and even death. The aim of this study is to predict the onset of PAF episodes so that preventive measures can be taken. The PhysioNet AFPDB prediction database was used to extract 77 heart rate variability (HRV) features using time domain, geometrical analysis, Poincaré plot, nonlinear analysis, detrended fluctuation analysis, autoregressive modeling, fast Fourier transform (FFT), Lomb-Scargle periodogram, wavelet packet transform (WPT), and bispectrum measurements. The number of features was reduced using the near-zero value, correlation, and recursive feature elimination (RFE) methods for time windows of 1, 2, 5, 10, and 30 min. Feature selection was performed using backward selection, genetic algorithm, analysis of variance (ANOVA), and non-dominated sorting genetic algorithm (NSGA-III) methods. Random forest, conditional random forest, k-nearest neighbor (KNN), and support vector machine (SVM) classification algorithms were then applied and evaluated using 10-fold cross-validation. The proposed method achieved a precision of 93.24% with a 5-min window and 89.21% with a 2-min window, improving on the PAF prediction performance reported in similar studies in the literature.
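To illustrate what time-domain HRV feature extraction can look like, the sketch below computes four standard measures (mean RR, SDNN, RMSSD, and pNN50) from a list of RR intervals in milliseconds. The abstract does not list which of the 77 features were used, so this choice of measures, the function name, and the sample intervals are assumptions made for illustration only.

```python
import math
import statistics


def time_domain_hrv(rr_ms):
    """Compute basic time-domain HRV measures from RR intervals (ms)."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences between adjacent RR intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    squared = [d * d for d in diffs]
    return {
        "mean_rr": statistics.fmean(rr_ms),
        # SDNN: sample standard deviation of the RR intervals.
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd": math.sqrt(statistics.fmean(squared)),
        # pNN50: percentage of successive differences above 50 ms.
        "pnn50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }


# Illustrative RR series, not taken from the AFPDB recordings.
features = time_domain_hrv([800, 810, 790, 850, 800])
print(features)
```

In a windowed analysis such as the 1 to 30 min windows mentioned above, a function like this would be applied to the RR intervals falling inside each window.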
ABSTRACT
BACKGROUND: Pelvic floor pressure distribution profiles, obtained by a novel instrumented non-deformable probe, were used as the input to a feature extraction, selection, and classification approach to test their potential for an automatic diagnostic system for objective female urinary incontinence assessment. We tested the performance of different feature selection approaches and different classifiers, and sought to establish the group of features that provides the greatest discrimination capability between continent and incontinent women. METHODS: The available data for evaluation consisted of intravaginal spatiotemporal pressure profiles acquired from 24 continent and 24 incontinent women while performing four pelvic floor maneuvers: the maximum contraction maneuver, Valsalva maneuver, endurance maneuver, and wave maneuver. Feature extraction was guided by previous studies on the characterization of pressure profiles in the vaginal canal, in which the extracted features were tested for repeatability. Feature selection was achieved through a combination of a ranking method and a complete non-exhaustive subset search algorithm: branch and bound and recursive feature elimination. Three classifiers were tested: k-nearest neighbors (k-NN), support vector machine (SVM), and logistic regression. RESULTS: None of the classifiers outperformed the others; however, k-NN was statistically inferior in one of the maneuvers. The best result was obtained by applying recursive feature elimination to the features extracted from all the maneuvers, resulting in 77.1% test accuracy, 74.1% precision, and 83.3% recall, using SVM. Moreover, the best feature subset, obtained by observing the selection frequency of every single feature during the application of branch and bound, was used directly for classification, reaching 95.8% accuracy.
Although not at the level required by an automatic system, the results show the potential of pelvic floor pressure distribution profile data and provide insights into the aspects of pelvic floor function that contribute to urinary incontinence.
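As a worked check on how the reported accuracy, precision, and recall relate to each other, the sketch below computes them from confusion-matrix counts. The counts are inferred, not reported: they assume that incontinent women are the positive class and that the percentages come from predictions pooled over all 48 women. Under those assumptions, 20 true positives, 7 false positives, 4 false negatives, and 17 true negatives reproduce the SVM figures.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Inferred counts (an assumption, see the text above): 24 incontinent
# women as positives and 24 continent women as negatives.
svm_metrics = classification_metrics(tp=20, fp=7, fn=4, tn=17)
for name, value in svm_metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under the same assumption, the 95.8% accuracy reported for the branch and bound subset would correspond to 46 of 48 women classified correctly.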
ABSTRACT
Brain-computer interfaces (BCIs) represent an alternative for patients whose cognitive functions are preserved but who are unable to communicate via conventional means. A commonly used BCI paradigm is based on the detection of event-related potentials, particularly the P300, embedded in the electroencephalogram (EEG). In order to transfer laboratory-tested BCIs into systems that can be used at home, it is relevant to investigate whether a limited set of EEG channels can be selected that works for most subjects and across different sessions without a significant decrease in performance. In this work, two strategies for channel selection for a single-trial P300 BCI were evaluated and compared. The first strategy was tailored specifically to each subject, whereas the second aimed at finding a subject-independent set of channels. In both strategies, genetic algorithms (GAs) and recursive feature elimination algorithms were used. The classification stage was performed using a linear discriminant. A dataset of EEG recordings from 18 healthy subjects was used to test the proposed configurations. Performance indexes were calculated to evaluate the system. Results showed that a fixed subset of four subject-independent EEG channels selected using a GA provided the best compromise between BCI setup and single-trial system performance.
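The GA-based channel selection described above can be sketched as a search over fixed-size channel subsets. The snippet below is a simplified illustration, not the study's implementation: the function names, the montage size of 16 channels, the GA settings, and the toy fitness function are all assumptions. In the actual setting, the fitness would be something like the cross-validated single-trial performance of the linear discriminant on the selected channels.

```python
import random


def select_channels(n_channels, k, fitness, pop_size=20,
                    generations=30, mutation_rate=0.2, seed=0):
    """Search for a k-channel subset that maximizes `fitness` with a simple GA."""
    rng = random.Random(seed)

    def random_individual():
        # An individual is a sorted tuple of k distinct channel indices.
        return tuple(sorted(rng.sample(range(n_channels), k)))

    def crossover(a, b):
        # The child draws k channels from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, k)))

    def mutate(ind):
        # With some probability, swap one channel for an unused one.
        if rng.random() >= mutation_rate:
            return ind
        genes = list(ind)
        unused = [c for c in range(n_channels) if c not in genes]
        genes[rng.randrange(k)] = rng.choice(unused)
        return tuple(sorted(genes))

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0]]              # elitism: keep the best so far
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
    return max(population, key=fitness)


# Toy stand-in for classifier performance: hypothetical per-channel scores.
toy_scores = [((7 * i) % 16) / 16 for i in range(16)]


def toy_fitness(subset):
    return sum(toy_scores[c] for c in subset)


best = select_channels(n_channels=16, k=4, fitness=toy_fitness)
print(best, toy_fitness(best))
```

Fixing the subset size at four mirrors the four-channel result reported above. A subject-independent variant would evaluate the fitness on data pooled or averaged across subjects rather than on one subject at a time.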