Search | BVS Regional Portal (test)

1.

Fusing ¹H NMR and Raman experimental data for the improvement of wine recognition models.

Hategan, Ariana Raluca; David, Maria; Pirnau, Adrian; Cozar, Bogdan; Cinta-Pinzaru, Simona; Guyon, Francois; Magdas, Dana Alina.

Food Chem ; 458: 140245, 2024 Jun 26.

Article in English | MEDLINE | ID: mdl-38954957

ABSTRACT

The present study proposes new wine recognition models based on Artificial Intelligence (AI) applied to the mid-level data fusion of ¹H NMR and Raman data. A supervised machine learning method, Support Vector Machines (SVMs), was applied to classify wine samples with respect to cultivar, vintage, and geographical origin. Because combining the two data sources generated a high-dimensional input space, a feature selection algorithm was employed to identify the most relevant discriminant markers for each wine classification criterion before SVM modeling. The proposed data processing strategy classified the wine sample set with accuracies of up to 100% both in cross-validation and on an independent test set, and highlighted the efficiency of ¹H NMR and Raman data fusion, as opposed to single-source data, for differentiating wine by cultivar and vintage.

2.

Chaotic RIME optimization algorithm with adaptive mutualism for feature selection problems.

Abdel-Salam, Mahmoud; Hu, Gang; Çelik, Emre; Gharehchopogh, Farhad Soleimanian; El-Hasnony, Ibrahim M.

Comput Biol Med ; 179: 108803, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38955125

ABSTRACT

The RIME optimization algorithm is a recently developed physics-based algorithm for solving optimization problems. RIME has performed well in various fields and domains. Nevertheless, like many swarm-based optimization algorithms, RIME suffers from several limitations, including a poor balance between exploration and exploitation. In addition, the likelihood of falling into local optima is high, and the convergence speed still needs improvement. Hence, there is room to enhance the search mechanism so that the search agents can discover new solutions. The authors propose an adaptive chaotic version of the RIME algorithm named ACRIME, which incorporates four main improvements: an intelligent population initialization using chaotic maps, a novel adaptive modified Symbiotic Organism Search (SOS) mutualism phase, a novel mixed mutation strategy, and a restart strategy. The main goal of these improvements is to increase population diversity, achieve a better balance between exploration and exploitation, and improve RIME's local and global search abilities. The study assesses the effectiveness of ACRIME on the standard CEC2005 and CEC2019 benchmark functions. ACRIME is also applied as a feature selection method to fourteen datasets to test its applicability to real-world problems, and to a real-world COVID-19 classification problem to test its applicability and performance further. The proposed algorithm is compared with classical and advanced metaheuristics, and its performance is assessed using statistical tests such as the Wilcoxon rank-sum and Friedman rank tests. The study demonstrates that ACRIME is highly competitive and often outperforms competing algorithms. It discovers the optimal subset of features, enhancing classification accuracy and minimizing the number of features employed. This study primarily focuses on improving the balance between exploration and exploitation and extending the scope of local search.
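
The two goals named at the end of this abstract (higher classification accuracy, fewer features) are often folded into a single objective when a metaheuristic is used as a wrapper for feature selection. The following is a minimal sketch of that common weighted-sum form, not the fitness function of ACRIME itself; the function name, the weight `alpha`, and the error rates are illustrative assumptions.

```python
def subset_fitness(mask, error_rate, alpha=0.99):
    """Score a candidate feature subset; lower is better.

    mask:       list of 0/1 flags, one per feature (1 = feature selected).
    error_rate: classification error in [0, 1] measured with the selected features.
    alpha:      hypothetical trade-off weight between error and subset size (assumed).
    """
    if not any(mask):
        # An empty subset cannot be used to train a classifier.
        return float("inf")
    selected_ratio = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_ratio


# Two candidates with the same (made-up) error rate: the smaller subset scores better.
print(subset_fitness([1, 1, 1, 1], error_rate=0.10))
print(subset_fitness([1, 0, 0, 1], error_rate=0.10))
```

With a weight close to 1, the subset size only acts as a tie-breaker between subsets of similar accuracy.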

3.

Multi-step ahead forecasting of electrical conductivity in rivers by using a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model enhanced by Boruta-XGBoost feature selection algorithm.

Karbasi, Masoud; Ali, Mumtaz; Bateni, Sayed M; Jun, Changhyun; Jamei, Mehdi; Farooque, Aitazaz Ahsan; Yaseen, Zaher Mundher.

Sci Rep ; 14(1): 15051, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38951605

ABSTRACT

Electrical conductivity (EC) is widely recognized as one of the most essential water quality metrics for predicting salinity and mineralization. In the current research, the EC of two Australian rivers (Albert River and Barratta Creek) was forecasted for up to 10 days ahead using a novel deep learning algorithm (Convolutional Neural Network combined with Long Short-Term Memory, CNN-LSTM). The Boruta-XGBoost feature selection method was used to determine the significant inputs (lagged time series data) to the model. To benchmark the Boruta-XGB-CNN-LSTM models, three machine learning approaches were used: multi-layer perceptron neural network (MLP), K-nearest neighbour (KNN), and extreme gradient boosting (XGBoost). Different statistical metrics, such as the correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE), were used to assess the models' performance. Of the 10 years of data for both rivers, 7 years (2012-2018) were used as a training set and 3 years (2019-2021) were used for testing the models. In forecasting EC one day ahead, Boruta-XGB-CNN-LSTM outperformed the other machine learning models on the test dataset at both stations (R = 0.9429, RMSE = 45.6896, MAPE = 5.9749 for Albert River, and R = 0.9215, RMSE = 43.8315, MAPE = 7.6029 for Barratta Creek). Given its better performance in both rivers, this model was then used to forecast EC 3-10 days ahead. The results showed that the Boruta-XGB-CNN-LSTM model is very capable of forecasting the EC for the next 10 days, although its performance decreased slightly as the forecasting horizon increased from 3 to 10 days. The results of this study show that the Boruta-XGB-CNN-LSTM model can be used as a good soft computing method for accurately predicting how the EC will change in rivers.
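
To make the terms in this abstract concrete, the sketch below shows how lagged inputs for a given forecast horizon can be built from a series, and how R, RMSE, and MAPE are conventionally computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the toy EC values, and the naive "persistence" forecast are all hypothetical.

```python
import math


def make_lagged_samples(series, n_lags, horizon=1):
    """Build (inputs, targets): each input holds the n_lags most recent values,
    each target is the value `horizon` steps after the last input value."""
    inputs, targets = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t + horizon - 1])
    return inputs, targets


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (sd_o * sd_p)


def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


# Made-up daily EC readings; the "forecast" simply repeats the last observed lag.
ec = [410.0, 420.0, 415.0, 430.0, 445.0, 440.0, 450.0]
X, y = make_lagged_samples(ec, n_lags=3, horizon=1)
naive = [window[-1] for window in X]
print(pearson_r(y, naive), rmse(y, naive), mape(y, naive))
```

A persistence forecast like this is a useful baseline: a learned model is only informative if it beats it on the held-out years.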

4.

DNN-BP: a novel framework for cuffless blood pressure measurement from optimal PPG features using deep learning model.

Raju, S M Taslim Uddin; Dipto, Safin Ahmed; Hossain, Md Imran; Chowdhury, Md Abu Shahid; Haque, Fabliha; Nashrah, Ayesha Tun; Nishan, Araf; Khan, Md Mahamudul Hasan; Hashem, M M A.

Med Biol Eng Comput ; 2024 Jul 04.

Article in English | MEDLINE | ID: mdl-38963467

ABSTRACT

Continuous blood pressure (BP) provides essential information for monitoring one's health condition. However, BP is currently monitored using uncomfortable cuff-based devices, which do not support continuous BP monitoring. This paper introduces a blood pressure monitoring algorithm based only on photoplethysmography (PPG) signals, using a deep neural network (DNN). The PPG signals are obtained from 125 unique subjects with 218 records and filtered using signal processing algorithms to reduce the effects of noise, such as baseline wandering and motion artifacts. The proposed algorithm is based on pulse wave analysis of PPG signals: it extracts features from various domains of the PPG signals and maps them to BP values. Four feature selection methods are applied, yielding four feature subsets. An ensemble feature selection technique is then proposed to obtain the optimal feature set based on majority voting scores across the four feature subsets. DNN models combined with the ensemble feature selection technique outperformed previously reported approaches that rely only on the PPG signal in estimating systolic blood pressure (SBP) and diastolic blood pressure (DBP). The coefficient of determination (R²) and mean absolute error (MAE) of the proposed algorithm are 0.962 and 2.480 mmHg, respectively, for SBP and 0.955 and 1.499 mmHg, respectively, for DBP. The proposed approach meets the Association for the Advancement of Medical Instrumentation (AAMI) standard for SBP and DBP estimations. Additionally, according to the British Hypertension Society standard, the results attained Grade A for both SBP and DBP estimations. The study concludes that BP can be estimated more accurately using the optimal feature set and DNN models. The proposed algorithm has the potential to enable mobile healthcare devices to monitor BP continuously.
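
The ensemble step described here, combining four feature subsets by voting, can be sketched as follows. This is only an illustration of vote counting across subsets; the feature names, the vote threshold, and the function name are hypothetical, and the abstract does not state how the authors score or break ties.

```python
from collections import Counter


def majority_vote_features(subsets, min_votes):
    """Keep every feature that appears in at least `min_votes` of the given subsets."""
    # set(subset) ensures a feature is counted once per selection method.
    votes = Counter(feature for subset in subsets for feature in set(subset))
    return sorted(feature for feature, count in votes.items() if count >= min_votes)


# Four hypothetical subsets, one per feature selection method.
subsets = [
    ["systolic_peak_time", "pulse_width", "heart_rate"],
    ["systolic_peak_time", "pulse_area", "heart_rate"],
    ["systolic_peak_time", "pulse_width", "heart_rate", "dicrotic_notch_time"],
    ["pulse_width", "pulse_area", "systolic_peak_time"],
]
print(majority_vote_features(subsets, min_votes=3))
```

Raising `min_votes` trades a smaller, more consensual feature set against the risk of discarding features that only some methods can detect.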

5.

Predicting dyslipidemia incidence: unleashing machine learning algorithms on Lifestyle Promotion Project data.

Naderian, Senobar; Nikniaz, Zeinab; Farhangi, Mahdieh Abbasalizad; Nikniaz, Leila; Sama-Soltani, Taha; Rostami, Parisa.

BMC Public Health ; 24(1): 1777, 2024 Jul 03.

Article in English | MEDLINE | ID: mdl-38961394

ABSTRACT

BACKGROUND: Dyslipidemia, characterized by variations in plasma lipid profiles, poses a global health threat linked to millions of deaths annually. OBJECTIVES: This study focuses on predicting dyslipidemia incidence using machine learning methods, addressing the crucial need for early identification and intervention. METHODS: The dataset, derived from the Lifestyle Promotion Project (LPP) in East Azerbaijan Province, Iran, undergoes a comprehensive preprocessing, merging, and null handling process. Target selection involves five distinct dyslipidemia-related variables. Normalization techniques and three feature selection algorithms are applied to enhance predictive modeling. RESULTS: The results underscore the potential of different machine learning algorithms, specifically the multi-layer perceptron neural network (MLP), which reached higher performance metrics, such as accuracy, F1 score, sensitivity, and specificity, than the other machine learning methods. Among the other algorithms, Random Forest also showed remarkable accuracies and outperformed K-Nearest Neighbors (KNN) in metrics such as precision, recall, and F1 score. The study's emphasis on feature selection detected meaningful patterns among the five target variables related to dyslipidemia, indicating fundamental commonalities among dyslipidemia-related factors. Features such as waist circumference, serum vitamin D, blood pressure, sex, age, diabetes, and physical activity were related to dyslipidemia. CONCLUSION: These results collectively highlight the complex nature of dyslipidemia and its connections with numerous factors, strengthening the importance of applying machine learning methods to understand and predict its incidence precisely.

Subjects

Dyslipidemias; Machine Learning; Humans; Dyslipidemias/epidemiology; Incidence; Iran/epidemiology; Male; Female; Life Style; Algorithms; Health Promotion/methods; Middle Aged; Adult

6.

Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm for feature selection.

Zhang, Li; Chen, XiaoBo.

Sci Rep ; 14(1): 15413, 2024 Jul 04.

Article in English | MEDLINE | ID: mdl-38965341

ABSTRACT

Feature selection is a hot problem in machine learning. Swarm intelligence algorithms play an essential role in feature selection due to their excellent optimisation ability. The Chimp Optimisation Algorithm (CHoA) is a new type of swarm intelligence algorithm. It has quickly won widespread attention in the academic community due to its fast convergence speed and easy implementation. However, CHoA has specific challenges in balancing local and global search, limiting its optimisation accuracy and leading to premature convergence, thus affecting the algorithm's performance on feature selection tasks. This study proposes Social coevolution and Sine chaotic opposition learning Chimp Optimization Algorithm (SOSCHoA). SOSCHoA enhances inter-population interaction through social coevolution, improving local search. Additionally, it introduces sine chaotic opposition learning to increase population diversity and prevent local optima. Extensive experiments on 12 high-dimensional classification datasets demonstrate that SOSCHoA outperforms existing algorithms in classification accuracy, convergence, and stability. Although SOSCHoA shows advantages in handling high-dimensional datasets, there is room for future research and optimization, particularly concerning feature dimensionality reduction.
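
The two ingredients named in "sine chaotic opposition learning" have simple textbook forms, sketched below: a sine chaotic map that generates values in [0, 1], and the opposite point of a candidate solution within its bounds. How SOSCHoA actually combines them is not given in the abstract, so the population set-up at the end is an assumption made for illustration, as are all names and parameter values.

```python
import math


def sine_map_sequence(x0, length, a=1.0):
    """Iterate the sine chaotic map x_{k+1} = a * sin(pi * x_k).

    For x0 in (0, 1) and 0 < a <= 1 every value stays in [0, 1].
    """
    values, x = [], x0
    for _ in range(length):
        x = a * math.sin(math.pi * x)
        values.append(x)
    return values


def opposite(position, lower, upper):
    """Opposition-based learning: reflect each coordinate inside its bounds."""
    return [lo + up - x for x, lo, up in zip(position, lower, upper)]


# Assumed usage: chaotic values scale into the search range, then each
# candidate is paired with its opposite to widen the initial coverage.
lower, upper = [0.0, 0.0], [10.0, 10.0]
chaos = sine_map_sequence(0.3, length=2)
candidate = [lo + c * (up - lo) for c, lo, up in zip(chaos, lower, upper)]
print(candidate, opposite(candidate, lower, upper))
```

In a typical opposition-based scheme the better of each candidate and its opposite is kept, which is one way to increase diversity without extra random sampling.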

7.

Comprehensive application of AI algorithms with TCR NGS data for glioma diagnosis.

Zhou, Kaiyue; Xiao, Zhengliang; Liu, Qi; Wang, Xu; Huo, Jiaxin; Wu, Xiaoqi; Zhao, Xiaoxiao; Feng, Xiaohan; Fu, Baoyi; Xu, Pengfei; Deng, Yunyun; Xiao, Wenwen; Sun, Tao; Da, Lin.

Sci Rep ; 14(1): 15361, 2024 Jul 04.

Article in English | MEDLINE | ID: mdl-38965388

ABSTRACT

T-cell receptor (TCR) detection can examine the extent of T-cell immune responses. This article therefore analyzed characteristic data of glioma obtained by DNA-based TCR high-throughput sequencing, in order to predict the disease with fewer biomarkers and higher accuracy. We downloaded data online and obtained six TCR-related diversity indices to establish a multidimensional classification system. By comparing the actual presence of the 602 correlated sequences, we obtained two-dimensional and multidimensional datasets. Multiple classification methods were applied to both datasets, with the classification accuracy on the multidimensional data slightly lower than that on the two-dimensional datasets. This study reduced the TCR β sequences through feature selection methods such as RFECV (Recursive Feature Elimination with Cross-Validation). Consequently, using only the presence of three sequences, a classification AUC of 96.67% can be achieved. The combination of the three correlated TCR clones obtained at a source data threshold of 0.1 is: CASSLGGNTEAFF_TRBV12_TRBJ1-1, CASSYSDTGELFF_TRBV6_TRBJ2-2, and CASSLTGNTEAFF_TRBV12_TRBJ1-1. At 0.001, the combination is: CASSLGETQYF_TRBV12_TRBJ2-5, CASSLGGNQPQHF_TRBV12_TRBJ1-5, and CASSLSGNTIYF_TRBV12_TRBJ1-3. This method can serve as a potential diagnostic and therapeutic tool, facilitating diagnosis and treatment of glioma and other cancers.

Subjects

Algorithms; Glioma; High-Throughput Nucleotide Sequencing; Receptors, Antigen, T-Cell; Glioma/genetics; Glioma/diagnosis; Humans; High-Throughput Nucleotide Sequencing/methods; Receptors, Antigen, T-Cell/genetics; Brain Neoplasms/genetics; Brain Neoplasms/diagnosis

8.

A privacy-preserving platform oriented medical healthcare and its application in identifying patients with candidemia.

Yuan, Siyi; Xu, Song; Lu, Xiao; Chen, Xiangyu; Wang, Yao; Bao, Renyi; Sun, Yunbo; Xiao, Xiongjian; Su, Longxiang; Long, Yun; Li, Linfeng; He, Huaiwu.

Sci Rep ; 14(1): 15589, 2024 Jul 06.

Article in English | MEDLINE | ID: mdl-38971879

ABSTRACT

Federated learning (FL) has emerged as a significant method for developing machine learning models across multiple devices without centralized data collection. Candidemia, a critical but rare disease in ICUs, poses challenges in early detection and treatment. The goal of this study is to develop a privacy-preserving federated learning framework for predicting candidemia in ICU patients. This approach aims to enhance the accuracy of antifungal drug prescriptions and patient outcomes. This study involved the creation of four predictive FL models for candidemia using data from ICU patients across three hospitals in China. The models were designed to prioritize patient privacy while aggregating learnings across different sites. A unique ensemble feature selection strategy was implemented, combining the strengths of XGBoost's feature importance and statistical test p values. This strategy aimed to optimize the selection of relevant features for accurate predictions. The federated learning models demonstrated significant improvements over locally trained models, with a 9% increase in the area under the curve (AUC) and a 24% rise in true positive ratio (TPR). Notably, the FL models excelled in the combined TPR + TNR metric, which is critical for feature selection in candidemia prediction. The ensemble feature selection method proved more efficient than previous approaches, achieving comparable performance. The study successfully developed a set of federated learning models that significantly enhance the prediction of candidemia in ICU patients. By leveraging a novel feature selection method and maintaining patient privacy, the models provide a robust framework for improved clinical decision-making in the treatment of candidemia.
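
Because candidemia is rare, plain accuracy can look good for a model that never flags a case, which is presumably why the abstract stresses TPR and the combined TPR + TNR metric. The sketch below computes both rates from binary labels; the function name and the toy labels are illustrative assumptions, and nothing here reproduces the study's federated models.

```python
def tpr_plus_tnr(y_true, y_pred):
    """Return (TPR, TNR, TPR + TNR) for binary labels where 1 marks the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against classes that are absent from y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr, tpr + tnr


# Made-up labels: 2 positive cases among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10           # 80% accurate, but detects no case
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
print(tpr_plus_tnr(y_true, always_negative))
print(tpr_plus_tnr(y_true, one_hit))
```

A classifier that ignores the positive class scores exactly 1.0 on TPR + TNR, so values above 1.0 indicate discrimination beyond that trivial baseline.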

Subjects

Candidemia; Intensive Care Units; Machine Learning; Humans; Candidemia/drug therapy; Candidemia/diagnosis; Antifungal Agents/therapeutic use; China; Male; Female; Delivery of Health Care

9.

Characterizing efficient feature selection for single-cell expression analysis.

Cho, Juok; Baik, Bukyung; Nguyen, Hai C T; Park, Daeui; Nam, Dougu.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38975891

ABSTRACT

Unsupervised feature selection is a critical step for efficient and accurate analysis of single-cell RNA-seq data. Previous benchmarks used two different criteria to compare feature selection methods: (i) the proportion of ground-truth marker genes included in the selected features and (ii) the accuracy of cell clustering using ground-truth cell types. Here, we systematically compare the performance of 11 feature selection methods on both criteria. We first demonstrate the discordance between these criteria and suggest using the latter. We then compare the distribution of the mean expression of selected genes between feature selection methods. We show that lowly expressed genes exhibit very high coefficients of variation and are mostly excluded by high-performance methods. In particular, high-deviation- and high-expression-based methods outperform the methods widely used in the Seurat package in clustering cells and data visualization. We further show that they also enable a clear separation of the same cell type from different tissues as well as accurate estimation of cell trajectories.

Subjects

Single-Cell Analysis; Single-Cell Analysis/methods; Cluster Analysis; Humans; Gene Expression Profiling/methods; Algorithms; Computational Biology/methods; Sequence Analysis, RNA/methods; RNA-Seq/methods

10.

Smart decision support system for keratoconus severity staging using corneal curvature and thinnest pachymetry indices.

Muhsin, Zahra J; Qahwaji, Rami; AlShawabkeh, Mo'ath; AlRyalat, Saif Aldeen; Al Bdour, Muawyah; Al-Taee, Majid.

Eye Vis (Lond) ; 11(1): 28, 2024 Jul 08.

Article in English | MEDLINE | ID: mdl-38978067

ABSTRACT

BACKGROUND: This study proposes a decision support system created in collaboration with machine learning experts and ophthalmologists for detecting keratoconus (KC) severity. The system employs an ensemble machine model and minimal corneal measurements. METHODS: A clinical dataset is initially obtained from Pentacam corneal tomography imaging devices, which undergoes pre-processing and addresses imbalanced sampling through the application of an oversampling technique for minority classes. Subsequently, a combination of statistical methods, visual analysis, and expert input is employed to identify Pentacam indices most correlated with severity class labels. These selected features are then utilized to develop and validate three distinct machine learning models. The model exhibiting the most effective classification performance is integrated into a real-world web-based application and deployed on a web application server. This deployment facilitates evaluation of the proposed system, incorporating new data and considering relevant human factors related to the user experience. RESULTS: The performance of the developed system is experimentally evaluated, and the results revealed an overall accuracy of 98.62%, precision of 98.70%, recall of 98.62%, F1-score of 98.66%, and F2-score of 98.64%. The application's deployment also demonstrated precise and smooth end-to-end functionality. CONCLUSION: The developed decision support system establishes a robust basis for subsequent assessment by ophthalmologists before potential deployment as a screening tool for keratoconus severity detection in a clinical setting.
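
The F1 and F2 scores in the RESULTS are both instances of the F-beta score, F_beta = (1 + beta²) · P · R / (beta² · P + R), where beta > 1 weights recall more heavily than precision. The short sketch below applies that standard formula to the reported overall precision and recall; it is a consistency check only, since the abstract does not say how the metrics were averaged across severity classes.

```python
def f_beta(precision, recall, beta):
    """Standard F-beta score; beta = 1 gives F1, beta = 2 favours recall."""
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Overall precision and recall as reported in the abstract (98.70% and 98.62%).
precision, recall = 0.9870, 0.9862
print(round(100 * f_beta(precision, recall, 1), 2))  # F1, in percent
print(round(100 * f_beta(precision, recall, 2), 2))  # F2, in percent
```

Plugging in the reported values gives figures that agree with the reported F1-score and F2-score to two decimal places.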

11.

Neuroimage Analysis Methods and Artificial Intelligence Techniques for Reliable Biomarkers and Accurate Diagnosis of Schizophrenia: Achievements Made by Chinese Scholars Around the Past Decade.

Du, Yuhui; Niu, Ju; Xing, Ying; Li, Bang; Calhoun, Vince D.

Schizophr Bull ; 2024 Jul 09.

Article in English | MEDLINE | ID: mdl-38982882

ABSTRACT

BACKGROUND AND HYPOTHESIS: Schizophrenia (SZ) is characterized by significant cognitive and behavioral disruptions. Neuroimaging techniques, particularly magnetic resonance imaging (MRI), have been widely utilized to investigate biomarkers of SZ, distinguish SZ from healthy conditions or other mental disorders, and explore biotypes within SZ or across SZ and other mental disorders, which aim to promote the accurate diagnosis of SZ. In China, research on SZ using MRI has grown considerably in recent years. STUDY DESIGN: The article reviews advanced neuroimaging and artificial intelligence (AI) methods using single-modal or multimodal MRI to reveal the mechanism of SZ and promote accurate diagnosis of SZ, with a particular emphasis on the achievements made by Chinese scholars around the past decade. STUDY RESULTS: Our article focuses on the methods for capturing subtle brain functional and structural properties from the high-dimensional MRI data, the multimodal fusion and feature selection methods for obtaining important and sparse neuroimaging features, the supervised statistical analysis and classification for distinguishing disorders, and the unsupervised clustering and semi-supervised learning methods for identifying neuroimage-based biotypes. Crucially, our article highlights the characteristics of each method and underscores the interconnections among various approaches regarding biomarker extraction and neuroimage-based diagnosis, which is beneficial not only for comprehending SZ but also for exploring other mental disorders. CONCLUSIONS: We offer a valuable review of advanced neuroimage analysis and AI methods primarily focused on SZ research by Chinese scholars, aiming to promote the diagnosis, treatment, and prevention of SZ, as well as other mental disorders, both within China and internationally.

12.

Predicting performance of students by optimizing tree components of random forest using genetic algorithm.

Chen, Mengyao; Liu, Zhengqi.

Heliyon ; 10(12): e32570, 2024 Jun 30.

Article in English | MEDLINE | ID: mdl-38975140

ABSTRACT

Predicting student academic performance remains a problem because of the limitations of existing methods, specifically low generalizability and lack of interpretability. This study suggests a new approach that addresses these problems and provides more reliable predictions. The proposed approach combines information gain (IG) and the Laplacian score (LS) for feature selection. In this scheme, the combination of IG and LS is used to rank features, and a Sequential Forward Selection mechanism is then used to determine the most relevant indicators. In addition, a combination of the random forest algorithm with a genetic algorithm is introduced for multi-class classification. This approach aims to attain higher accuracy and reliability than current techniques. The case study shows that the proposed strategy can predict student performance with an average accuracy of 93.11%, a minimum improvement of 2.25% over the baseline methods. The findings were further confirmed by analyzing different evaluation metrics (Accuracy, Precision, Recall, F-Measure) to demonstrate the efficiency of the proposed mechanism.
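
Sequential Forward Selection, mentioned above, is a greedy search: starting from an empty set, it repeatedly adds the single feature that most improves a score and stops when no addition helps. The sketch below shows that generic procedure only. The scoring function is a stand-in (in practice it would be something like cross-validated accuracy), and the feature names and gain values are invented for illustration; the IG and LS ranking step is not reproduced.

```python
def sequential_forward_selection(features, score_fn):
    """Greedy forward search: add the best-scoring feature until the score stops improving."""
    selected, best_score = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate every remaining feature when added to the current selection.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining candidate improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Hypothetical per-feature usefulness with a small penalty for each feature kept.
GAINS = {"attendance": 0.30, "quiz_avg": 0.25, "forum_posts": 0.01}


def toy_score(subset):
    return sum(GAINS[f] for f in subset) - 0.05 * len(subset)


print(sequential_forward_selection(list(GAINS), toy_score))
```

In this toy run the weakly informative feature is rejected because its gain does not cover the per-feature penalty, which mirrors how the search prunes indicators that add little predictive value.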

13.

Hybrid YSGOA and neural networks based software failure prediction in cloud systems.

Kaur, Ramandeep; Vaithiyanathan, Revathi.

Sci Rep ; 14(1): 16035, 2024 Jul 11.

Article in English | MEDLINE | ID: mdl-38992079

ABSTRACT

In the realm of cloud computing, ensuring the dependability and robustness of software systems is paramount. The intricate and evolving nature of cloud infrastructures, however, presents substantial obstacles in the pre-emptive identification and rectification of software anomalies. This study introduces an innovative methodology that amalgamates hybrid optimization algorithms with Neural Networks (NN) to refine the prediction of software malfunctions. The core objective is to augment the purity metric of our method across diverse operational conditions. This is accomplished through the utilization of two distinct optimization algorithms: the Yellow Saddle Goat Fish Algorithm (YSGA), which is instrumental in the discernment of pivotal features linked to software failures, and the Grasshopper Optimization Algorithm (GOA), which further polishes the feature compilation. These features are then processed by Neural Networks (NN), capitalizing on their proficiency in deciphering intricate data patterns and interconnections. The NNs are integral to the classification of instances predicated on the ascertained features. Our evaluation, conducted using the Failure-Dataset-OpenStack database and MATLAB Software, demonstrates that the hybrid optimization strategy employed for feature selection significantly curtails complexity and expedites processing.

14.

Exploiting the Role of Features for Antigens-Antibodies Interaction Site Prediction.

Quadrini, Michela; Ferrari, Carlo.

Methods Mol Biol ; 2780: 303-325, 2024.

Article in English | MEDLINE | ID: mdl-38987475

ABSTRACT

Antibodies are a class of proteins that recognize and neutralize pathogens by binding to their antigens. They are the most significant category of biopharmaceuticals for both diagnostic and therapeutic applications. Understanding how antibodies interact with their antigens plays a fundamental role in drug and vaccine design and helps to comprehend the complex antigen binding mechanisms. Computational methods for predicting antibody-antigen interaction sites are of great value given the overall cost of experimental methods. Machine learning methods and deep learning techniques have obtained promising results. In this work, we predict antibody interaction interface sites by applying HSS-PPI, a hybrid method defined to predict the interface sites of general proteins. The approach abstracts the proteins in terms of a hierarchical representation and uses a graph convolutional network to classify the amino acids as interface or non-interface. Moreover, we equip the amino acids with different sets of physicochemical features, together with structural ones, to describe the residues. Analyzing the results, we observe that the structural features play a fundamental role in the amino acid descriptions. We compare the obtained performances, evaluated using standard metrics, with those obtained with SVM with 3D Zernike descriptors, Parapred, Paratome, and Antibody i-Patch.

Subjects

Computational Biology; Computational Biology/methods; Antigens/immunology; Binding Sites, Antibody; Antibodies/immunology; Antibodies/chemistry; Humans; Antigen-Antibody Complex/chemistry; Antigen-Antibody Complex/immunology; Protein Binding; Machine Learning; Databases, Protein; Algorithms

15.

Efficient Generalized Electroencephalography-Based Drowsiness Detection Approach with Minimal Electrodes.

Zayed, Aymen; Belhadj, Nidhameddine; Ben Khalifa, Khaled; Bedoui, Mohamed Hedi; Valderrama, Carlos.

Sensors (Basel) ; 24(13), 2024 Jun 30.

Article in English | MEDLINE | ID: mdl-39001037

ABSTRACT

Drowsiness is a main factor for various costly defects, even fatal accidents in areas such as construction, transportation, industry and medicine, due to the lack of monitoring vigilance in the mentioned areas. The implementation of a drowsiness detection system can greatly help to reduce the defects and accident rates by alerting individuals when they enter a drowsy state. This research proposes an electroencephalography (EEG)-based approach for detecting drowsiness. EEG signals are passed through a preprocessing chain composed of artifact removal and segmentation to ensure accurate detection followed by different feature extraction methods to extract the different features related to drowsiness. This work explores the use of various machine learning algorithms such as Support Vector Machine (SVM), the K nearest neighbor (KNN), the Naive Bayes (NB), the Decision Tree (DT), and the Multilayer Perceptron (MLP) to analyze EEG signals sourced from the DROZY database, carefully labeled into two distinct states of alertness (awake and drowsy). Segmentation into 10 s intervals ensures precise detection, while a relevant feature selection layer enhances accuracy and generalizability. The proposed approach achieves high accuracy rates of 99.84% and 96.4% for intra (subject by subject) and inter (cross-subject) modes, respectively. SVM emerges as the most effective model for drowsiness detection in the intra mode, while MLP demonstrates superior accuracy in the inter mode. This research offers a promising avenue for implementing proactive drowsiness detection systems to enhance occupational safety across various industries.
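
The segmentation step, cutting each recording into 10 s intervals before feature extraction, can be sketched as below. This is a minimal illustration of non-overlapping windowing; the function name, the 512 Hz sampling rate, and the synthetic signal are assumptions made for the example, and the abstract does not say whether the authors use overlapping windows.

```python
def segment_signal(samples, sampling_rate_hz, window_s=10.0):
    """Split a 1-D signal into consecutive, non-overlapping windows of window_s seconds.

    A trailing remainder shorter than one full window is dropped.
    """
    window = int(round(sampling_rate_hz * window_s))
    return [samples[i:i + window] for i in range(0, len(samples) - window + 1, window)]


# Assumed example: 35 s of a synthetic single-channel signal sampled at 512 Hz.
rate_hz = 512
signal = [0.0] * (35 * rate_hz)
segments = segment_signal(signal, rate_hz)
print(len(segments), len(segments[0]))
```

Each resulting segment then becomes one labeled example (awake or drowsy) for feature extraction and classification.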

Subjects

Electroencephalography; Sleep Stages; Support Vector Machine; Humans; Electroencephalography/methods; Sleep Stages/physiology; Algorithms; Electrodes; Signal Processing, Computer-Assisted; Bayes Theorem; Machine Learning

16.

Optimizing IoT Intrusion Detection Using Balanced Class Distribution, Feature Selection, and Ensemble Machine Learning Techniques.

Musthafa, Muhammad Bisri; Huda, Samsul; Kodera, Yuta; Ali, Md Arshad; Araki, Shunsuke; Mwaura, Jedidah; Nogami, Yasuyuki.

Sensors (Basel) ; 24(13), 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-39001072

ABSTRACT

Internet of Things (IoT) devices are leading to advancements in innovation, efficiency, and sustainability across various industries. However, as the number of connected IoT devices increases, the risk of intrusion becomes a major concern in IoT security. To prevent intrusions, it is crucial to implement intrusion detection systems (IDSs) that can detect and prevent such attacks. IDSs are a critical component of cybersecurity infrastructure. They are designed to detect and respond to malicious activities within a network or system. Traditional IDS methods rely on predefined signatures or rules to identify known threats, but these techniques may struggle to detect novel or sophisticated attacks. Implementing IDSs with machine learning (ML) and deep learning (DL) techniques has been proposed to improve their ability to detect attacks, which would enhance overall cybersecurity posture and resilience. However, ML and DL techniques face several issues that may affect the models' performance and effectiveness, such as overfitting and the effect of unimportant features on finding meaningful patterns. To ensure better performance and reliability of machine learning models in IDSs when dealing with new and unseen threats, the models need to be optimized. This can be done by addressing overfitting and implementing feature selection. In this paper, we propose a scheme to optimize IoT intrusion detection by using class balancing and feature selection for preprocessing. We evaluated the scheme on the UNSW-NB15 dataset and the NSL-KDD dataset by implementing two different ensemble models: one using a support vector machine (SVM) with bagging and another using long short-term memory (LSTM) with stacking. The performance results and the confusion matrix show that LSTM stacking with analysis of variance (ANOVA) feature selection is the superior model for classifying network attacks. It achieves accuracies of 96.92% and 99.77% and overfitting values of 0.33% and 0.04% on the two datasets, respectively. The model's ROC curve also has a sharp bend, with AUC values of 0.9665 and 0.9971 for the UNSW-NB15 dataset and the NSL-KDD dataset, respectively.
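
ANOVA feature selection, as named in this abstract, ranks each feature by a one-way ANOVA F-statistic: the variance of the feature's class means relative to its variance within classes. The sketch below computes that statistic from scratch for a single feature; the function name and the toy traffic values are hypothetical, and the class balancing and model training steps of the paper are not shown.

```python
def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n_total, k = len(values), len(groups)
    grand_mean = sum(values) / n_total
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf")  # classes are perfectly separated by this feature
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up example: label 0 = normal traffic, 1 = attack.
labels = [0, 0, 0, 1, 1, 1]
informative = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # class means differ
uninformative = [1.0, 5.0, 3.0, 1.0, 5.0, 3.0]  # identical distribution per class
print(anova_f_score(informative, labels), anova_f_score(uninformative, labels))
```

Features are then sorted by F-score and only the top-ranked ones are passed to the classifier, which removes inputs whose values do not differ between classes.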

17.

Traffic Feature Selection and Distributed Denial of Service Attack Detection in Software-Defined Networks Based on Machine Learning.

Han, Daoqi; Li, Honghui; Fu, Xueliang; Zhou, Shuncheng.

Sensors (Basel) ; 24(13), 2024 Jul 04.

Article in English | MEDLINE | ID: mdl-39001123

ABSTRACT

As 5G technology becomes more widespread, the significant improvement in network speed and connection density has introduced more challenges to network security. In particular, distributed denial of service (DDoS) attacks have become more frequent and complex in software-defined network (SDN) environments. The complexity and diversity of 5G networks produce many unnecessary features, which may introduce noise into the detection process of an intrusion detection system (IDS) and reduce the generalization ability of the model. This paper aims to improve the performance of the IDS in 5G networks, especially in terms of detection speed and accuracy. It proposes a feature selection (FS) method that filters the most representative and distinguishing features out of network traffic data to improve the robustness and detection efficiency of the IDS. To confirm the suggested method's efficacy, this paper uses four common machine learning (ML) models to evaluate the InSDN, CICIDS2017, and CICIDS2018 datasets and conducts real-time DDoS attack detection on a simulation platform. According to the experimental results, the suggested FS technique can meet 5G network requirements for a fast and reliable IDS, sharply reducing detection time while preserving or improving DDoS detection accuracy.

18.

Role of different omics data in the diagnosis of schizophrenia disorder: A machine learning study.

Varathan, Aarthy; Senthooran, Suntharalingam; Jeyananthan, Pratheeba.

Schizophr Res ; 271: 38-46, 2024 Jul 13.

Article in English | MEDLINE | ID: mdl-39003990

ABSTRACT

Schizophrenia is a serious mental disorder that affects millions of people worldwide. This disorder slowly disintegrates thinking ability and changes patients' behaviour. These patients show psychotic symptoms such as hallucinations, delusions, thought disorder and movement disorder. These symptoms are shared with other psychiatric disorders such as bipolar disorder, major depressive disorder and mood spectrum disorder. As patients require immediate treatment, a timely diagnosis is critical. This study explores the use of omics data in the diagnosis of schizophrenia. Transcriptome, miRNA and epigenome data are used to diagnose patients with schizophrenia with the aid of machine learning algorithms. Because the data are high-dimensional, mutual information and feature importance are used independently to select relevant features for the study. The selected sets of features (biomarkers) are used individually with different machine learning algorithms, and their performances are compared to select the best-performing model. This study shows that the top 140 miRNA features selected using mutual information, together with support vector machines, give the highest accuracy (0.86 ± 0.07) in the diagnosis of schizophrenia. All reported accuracies are validated using 5-fold cross-validation. They are further validated using leave-one-out cross-validation, and those accuracies are reported in the supplementary material.
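The mutual-information ranking described in this abstract can be sketched as follows. This is a simplified illustration, not the study's code: it assumes discrete feature values (real expression data would first need discretization or a continuous estimator), and the names `mutual_information` and `top_k_by_mi` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def top_k_by_mi(rows, labels, k):
    """Rank features by mutual information with the label and keep k."""
    n_features = len(rows[0])
    scores = [
        mutual_information([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: feature 0 copies the label, feature 1 is independent.
rows = [[0, 0], [0, 1], [1, 0], [1, 1]]
labels = [0, 0, 1, 1]
selected, scores = top_k_by_mi(rows, labels, k=1)
print(selected, scores)
```

In a setting like the one reported, `k` would be the number of retained features (140 in the abstract), and the retained columns would then be passed to the classifier inside each cross-validation fold.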

19.

Cross-Combination Analyses of Random Forest Feature Selection and Decision Tree Model for Predicting Intraoperative Hypothermia in Total Joint Arthroplasty.

Long, Keyu; Guo, Donghua; Deng, Lu; Shen, Haiyan; Zhou, Feiyang; Yang, Yan.

J Arthroplasty ; 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-39004384

ABSTRACT

BACKGROUND: In total joint arthroplasty patients, intraoperative hypothermia (IOH) is associated with perioperative complications and an increased economic burden. Previous models have some limitations and mainly focus on regression modeling. Random forest (RF) algorithms and decision tree modeling are effective for eliminating irrelevant features and making predictions, which helps accelerate modeling and reduce application difficulty. METHODS: We conducted this prospective observational study using convenience sampling and collected data from 327 total joint arthroplasty patients in a tertiary hospital from March 4, 2023 to September 11, 2023. Of those, 229 patients were assigned to the training set and 98 to the testing set. The Chi-square, Mann-Whitney U, and t-tests were used for baseline analyses. Feature variables were selected using the RF algorithms, and the decision tree model was trained on 229 examples and validated on 98. The sensitivity, specificity, recall, F1 score, and area under the curve (AUC) were used to test the model's performance. RESULTS: The RF algorithms identified the preheating time, the volume of flushing fluids, the intraoperative infusion volume, the anesthesia time, the surgical time, and the core temperature after intubation as risk factors for IOH. The decision tree was grown to five levels with nine terminal nodes. The overall incidence of IOH was 42.13%. The sensitivity, specificity, recall, F1 score, and AUC were 0.651, 0.907, 0.916, 0.761, and 0.810, respectively. The model indicated strong internal consistency and predictive ability. CONCLUSIONS: The preheating time, the volume of flushing fluids, the intraoperative infusion volume, the anesthesia time, the surgical time, and the core temperature after intubation could accurately predict IOH in total joint arthroplasty patients. By monitoring these factors, the clinical staff could achieve early detection and intervention of IOH in total joint arthroplasty patients.
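For readers unfamiliar with the performance measures listed in this abstract, the sketch below shows how they are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate; also called recall
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)    # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for a 100-patient test set (not the study's data).
metrics = binary_metrics(tp=30, fp=15, tn=45, fn=10)
print(metrics)
```

The AUC is not included here because it requires the model's predicted scores rather than a single confusion matrix.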

20.

A two-tier feature selection method for predicting mortality risk in ICU patients with acute kidney injury.

Liu, Mengqing; Fan, Zhiping; Gao, Yu; Mubonanyikuzo, Vivens; Wu, Ruiqian; Li, Wenjin; Xu, Naiyue; Liu, Kun; Zhou, Liang.

Sci Rep ; 14(1): 16794, 2024 Jul 22.

Article in English | MEDLINE | ID: mdl-39039115

ABSTRACT

Acute kidney injury (AKI) is one of the most important lethal factors for patients admitted to intensive care units (ICUs), and timely high-risk prognostic assessment and intervention are essential to improving patient prognosis. In this study, a stacking model using the MIMIC-III dataset with a two-tier feature selection approach was developed to predict the risk of in-hospital mortality in ICU patients admitted for AKI. External validation was performed using the separate MIMIC-IV and eICU-CRD datasets. The area under the curve (AUC) was calculated using the stacking model, and features were selected using the Boruta and XGBoost feature selection methods. This study compares the performance of a stacking model using two-tier feature selection with that of a model using single-tier feature selection (XGBoost: 0.85; Boruta: 0.83; two-tier: 0.91). The predictive effectiveness of the stacking model was further validated by using different datasets (Validation 1: 0.83; Validation 2: 0.85) and by comparing it with a simpler model and traditional clinical scores (SOFA: 0.65; APACHE IV: 0.61). In addition, this study combined interpretable techniques and causal inference to analyze the causal relationship between features and predicted outcomes.
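One plausible reading of a two-tier feature selection scheme is sketched below. The abstract does not say how the two tiers are combined, so the ordering here (a first selector confirms candidates, a second importance score ranks the survivors) is an assumption, as are the function name `two_tier_select`, the feature names, and the scores.

```python
def two_tier_select(confirmed, importance, k):
    """Keep features confirmed by a first selector, then take the k
    survivors with the highest importance from a second selector."""
    # Tier 1: drop anything the first selector did not confirm.
    survivors = [f for f in importance if f in confirmed]
    # Tier 2: rank the survivors by the second selector's importance.
    survivors.sort(key=lambda f: importance[f], reverse=True)
    return survivors[:k]


# Hypothetical inputs: a confirmed set (as a Boruta-style selector might
# return) and importance scores (as a gradient-boosting model might return).
confirmed = {"creatinine", "urine_output", "age", "lactate"}
importance = {
    "creatinine": 0.30,
    "age": 0.10,
    "lactate": 0.25,
    "heart_rate": 0.20,
    "urine_output": 0.15,
}
chosen = two_tier_select(confirmed, importance, k=3)
print(chosen)  # ['creatinine', 'lactate', 'urine_output']
```

Here `heart_rate` is dropped despite its high importance because the first tier did not confirm it, which is the behavior that distinguishes a two-tier scheme from either selector alone.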

Subjects

Acute Kidney Injury , Hospital Mortality , Intensive Care Units , Humans , Acute Kidney Injury/mortality , Male , Female , Prognosis , Middle Aged , Aged , Risk Assessment/methods , Area Under Curve , Risk Factors
