ABSTRACT
Background: Infections caused by antibiotic-resistant bacteria pose a major challenge to modern healthcare. This systematic review evaluates the efficacy of machine learning (ML) approaches in predicting antimicrobial resistance (AMR) in critical pathogens (CP), considering Whole Genome Sequencing (WGS) and antimicrobial susceptibility testing (AST). Methods: The search covered databases including PubMed/MEDLINE, EMBASE, Web of Science, SCOPUS, and SCIELO, from their inception until June 2024. The review protocol was officially registered on PROSPERO (CRD42024543099). Results: The review included 26 papers, analyzing data from 104,141 microbial samples. Random Forest (RF), XGBoost, and logistic regression (LR) emerged as the top-performing models, with mean Area Under the Receiver Operating Characteristic Curve (AUC) values of 0.89, 0.87, and 0.87, respectively. RF showed superior performance with AUC values ranging from 0.66 to 0.97, while XGBoost and LR showed similar performance with AUC values ranging from 0.83 to 0.91 and 0.76 to 0.96, respectively. Most studies indicate that integrating WGS and AST data into ML models enhances predictive performance, improves antibiotic stewardship, and provides valuable clinical decision support. ML shows significant promise for predicting AMR by integrating WGS and AST data in CP. Standardized guidelines are needed to ensure consistency in future research.
Subjects
Drug Resistance, Bacterial , Machine Learning , Microbial Sensitivity Tests , Whole Genome Sequencing , Humans , Drug Resistance, Bacterial/genetics , Anti-Bacterial Agents/therapeutic use , Anti-Bacterial Agents/pharmacology , Bacteria/drug effects , Bacteria/genetics
ABSTRACT
Urban Heat Islands are a major environmental and public health concern because they raise temperatures in urban areas. This study used satellite imagery and machine learning to analyze the spatial and temporal patterns of land surface temperature distribution in the Metropolitan Area of Merida (MAM), Mexico, from 2001 to 2021. The results show that land surface temperature has increased in the MAM over the study period, while the urban footprint has expanded. The study also found a high correlation (r > 0.8) between changes in land surface temperature and land cover classes (urbanization/deforestation). If the current urbanization trend continues, the difference between the land surface temperature of the MAM and its surroundings is expected to reach 3.12 °C ± 1.11 °C by the year 2030. Hence, the findings of this study suggest that the Urban Heat Island effect is a growing problem in the MAM and highlight the importance of satellite imagery and machine learning for monitoring and developing mitigation strategies.
ABSTRACT
The integration of machine learning (ML) with edge computing and wearable devices is rapidly advancing healthcare applications. This study systematically maps the literature in this emerging field, analyzing 171 studies and focusing on 28 key articles after rigorous selection. The research explores the key concepts, techniques, and architectures used in healthcare applications involving ML, edge computing, and wearable devices. The analysis reveals a significant increase in research over the past six years, particularly in the last three years, covering applications such as fall detection, cardiovascular monitoring, and disease prediction. The findings highlight a strong focus on neural network models, especially Convolutional Neural Networks (CNNs) and Long Short-Term Memory Networks (LSTMs), and diverse edge computing platforms like Raspberry Pi and smartphones. Despite the diversity in approaches, the field is still nascent, indicating considerable opportunities for future research. The study emphasizes the need for standardized architectures and the further exploration of both hardware and software to enhance the effectiveness of ML-driven healthcare solutions. The authors conclude by identifying potential research directions that could contribute to continued innovation in healthcare technologies.
Subjects
Machine Learning , Neural Networks, Computer , Wearable Electronic Devices , Humans , Delivery of Health Care , Smartphone , Monitoring, Physiologic/instrumentation , Monitoring, Physiologic/methods
ABSTRACT
Muscle tone is defined as the resistance to passive stretch, but this definition is often criticized for its ambiguity since some suggest it is related to a state of preparation for movement. Muscle tone is primarily regulated by the central nervous system, and individuals with neurological disorders may lose the ability to control normal tone and can exhibit abnormalities. Currently, these abnormalities are mostly evaluated using subjective scales, highlighting a lack of objective assessment methods in the literature. This study aimed to use surface electromyography (sEMG) and machine learning (ML) for the objective classification and characterization of the full spectrum of muscle tone in the upper limb. Data were collected from thirty-nine individuals, including spastic, healthy, hypotonic and rigid subjects. All of the classifiers applied achieved high accuracy, with the best reaching 96.12%, in differentiating muscle tone. These results underscore the potential of the proposed methodology as a more reliable and quantitative method for evaluating muscle tone abnormalities, aiming to address the limitations of traditional subjective assessments. Additionally, the main features impacting the classifiers' performance were identified, which can be utilized in future research and in the development of devices that can be used in clinical practice.
Subjects
Electromyography , Machine Learning , Muscle Tonus , Humans , Electromyography/methods , Male , Adult , Female , Muscle Tonus/physiology , Muscle, Skeletal/physiology , Young Adult
ABSTRACT
Protamines play a critical role in DNA compaction and stabilization in sperm cells, significantly influencing male fertility and various biotechnological applications. Traditionally, identifying these proteins is a challenging and time-consuming process due to their species-specific variability and complexity. Leveraging advancements in computational biology, we present PROTA, a novel tool that combines machine learning (ML) and deep learning (DL) techniques to predict protamines with high accuracy. For the first time, we integrate Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of protamine prediction. Our methodology evaluated multiple ML models, including Light Gradient-Boosting Machine (LIGHTGBM), Multilayer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBOOST), k-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), and Radial Basis Function-Support Vector Machine (RBF-SVM). During ten-fold cross-validation on our training dataset, the MLP model with GAN-augmented data demonstrated superior performance metrics: 0.997 accuracy, 0.997 F1 score, 0.998 precision, 0.997 sensitivity, and 1.0 AUC. In the independent testing phase, this model achieved 0.999 accuracy, 0.999 F1 score, 1.0 precision, 0.999 sensitivity, and 1.0 AUC. These results establish PROTA, accessible via a user-friendly web application. We anticipate that PROTA will be a crucial resource for researchers, enabling the rapid and reliable prediction of protamines, thereby advancing our understanding of their roles in reproductive biology, biotechnology, and medicine.
Subjects
Deep Learning , Machine Learning , Protamines , Protamines/metabolism , Computational Biology/methods , Support Vector Machine , Humans , Software
ABSTRACT
OBJECTIVES: To predict palatally impacted maxillary canines based on maxilla measurements through supervised machine learning techniques. MATERIALS AND METHODS: The maxilla images from 138 patients were analysed to investigate intermolar width, interpremolar width, interpterygoid width, maxillary length, maxillary width, nasal cavity width and nostril width, obtained through cone beam computed tomography scans. The predictive models were built using the following machine learning algorithms: Adaboost Classifier, Decision Tree, Gradient Boosting Classifier, K-Nearest Neighbours (KNN), Logistic Regression, Multilayer Perceptron Classifier (MLP), Random Forest Classifier and Support Vector Machine (SVM). A 5-fold cross-validation approach was employed to validate each model. Metrics such as area under the curve (AUC), accuracy, recall, precision and F1 Score were calculated for each model, and ROC curves were constructed. RESULTS: The predictive model included four variables (two dental and two skeletal measurements). The interpterygoid width and nostril width showed the largest effect sizes. The Gradient Boosting Classifier algorithm exhibited the best metrics, with AUC values of 0.91 [CI95% = 0.74-0.98] for test data and 0.89 [CI95% = 0.86-0.94] for cross-validation. The nostril width variable demonstrated the highest importance across all tested algorithms. CONCLUSION: The use of maxillary measurements, through supervised machine learning techniques, is a promising method for predicting palatally impacted maxillary canines. Among the models evaluated, both the Gradient Boosting Classifier and the Random Forest Classifier demonstrated the best performance metrics, with accuracy and AUC values exceeding 0.8, indicating strong predictive capability.
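The 5-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_fold_indices`, the shuffling seed, and the way the remainder is spread across folds are assumptions made for illustration; only the sample size (138 patients) and the number of folds (5) come from the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper: every sample lands in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 138 patients and 5 folds, as stated in the abstract.
folds = list(k_fold_indices(138, k=5))
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

Each model would be fitted on `train_idx` and scored on `test_idx` once per fold, and the per-fold metrics averaged; the abstract does not say whether the folds were stratified by class.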
ABSTRACT
Urochloa grasses are widely used forages in the Neotropics and are gaining importance in other regions due to their role in meeting the increasing global demand for sustainable agricultural practices. High-throughput phenotyping (HTP) is important for accelerating Urochloa breeding programs focused on improving forage and seed yield. While RGB imaging has been used for HTP of vegetative traits, the assessment of phenological stages and seed yield using image analysis remains unexplored in this genus. This work presents a dataset of 2,400 high-resolution RGB images of 200 Urochloa hybrid genotypes, captured over seven months and covering both vegetative and reproductive stages. Images were manually labelled as vegetative or reproductive, and a subset of 255 reproductive stage images were annotated to identify 22,340 individual racemes. This dataset enables the development of machine learning and deep learning models for automated phenological stage classification and raceme identification, facilitating HTP and accelerated breeding of Urochloa spp. hybrids with high seed yield potential.
ABSTRACT
Objective: This study evaluates machine learning algorithms' effectiveness in classifying Parkinson's disease and Huntington's disease based on biomarker data obtained non-invasively from patients and healthy controls. Methods: Datasets containing biomarker data (x, y, and z values of accelerometers) from sensors were collected from Parkinson's disease, Huntington's disease patients, and healthy controls. An automatic selection model method was implemented for disease classification, using a unique Mexican database of human gait biomarkers, which we consider the only one of its kind. Random forest, random subspace method, and K-star algorithms were employed, with parameters optimized through an automated model selection. Results: The study achieved a 0.893 precision rate for Parkinson's disease and Huntington's disease using the random subspace method. The findings underscore the potential of machine learning techniques in medical diagnosis, particularly in neurological disorders. Conclusion: The automatic selection model method demonstrated efficacy in classifying Parkinson's disease and Huntington's disease based on non-invasive biomarker data. This research contributes to advancing non-invasive diagnostic approaches in neurological disorders, highlighting the significance of machine learning in healthcare.
ABSTRACT
BACKGROUND: Battling malaria's morbidity and mortality rates demands innovative methods related to malaria diagnosis. Thick blood smears (TBS) are the gold standard for diagnosing malaria, but their coloration quality is dependent on supplies and adherence to standard protocols. Machine learning has been proposed to automate diagnosis, but the impact of smear coloration on parasite detection has not yet been fully explored. METHODS: To develop Coloration Analysis in Malaria (CAM), an image database containing 600 images was created. The database was randomly divided into training (70%), validation (15%), and test (15%) sets. Nineteen feature vectors were studied based on variances, correlation coefficients, and histograms (specific variables from histograms, full histograms, and principal components from the histograms). The Machine Learning Matlab Toolbox was used to select the best candidate feature vectors and machine learning classifiers. The candidate classifiers were then tuned for validation and tested to ultimately select the best one. RESULTS: This work introduces CAM, a machine learning system designed for automatic TBS image quality analysis. The results demonstrated that the cubic SVM classifier outperformed others in classifying coloration quality in TBS, achieving a true negative rate of 95% and a true positive rate of 97%. CONCLUSIONS: An image-based approach was developed to automatically evaluate the coloration quality of TBS. This finding highlights the potential of image-based analysis to assess TBS coloration quality. CAM is intended to function as a supportive tool for analyzing the coloration quality of thick blood smears.
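The random 70%/15%/15% division described in the methods can be sketched as follows. The abstract reports using the Matlab toolbox, so this Python version is only an illustration; the function name `split_dataset`, the seed, and the integer image identifiers are assumptions.

```python
import random


def split_dataset(items, fractions=(0.70, 0.15, 0.15), seed=42):
    """Shuffle items and cut them into training, validation, and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # The test subset takes whatever remains, so no image is dropped.
    test = shuffled[n_train + n_val:]
    return train, val, test


# 600 images, as stated in the abstract; the integer IDs are placeholders.
image_ids = range(600)
train, val, test = split_dataset(image_ids)
```

With 600 images this gives 420 training, 90 validation, and 90 test images.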
Subjects
Image Processing, Computer-Assisted , Machine Learning , Image Processing, Computer-Assisted/methods , Humans , Malaria , Color
ABSTRACT
Breast cancer is a highly heterogeneous disease characterized by different subtypes arising from molecular alterations that give the disease different phenotypes, clinical behaviors, and prognoses. The ncRNA-derived micropeptides (MPs) represent a novel layer of complexity in cancer research, since they can be biologically active and have potential both as biomarkers and in therapeutics. However, few large-scale studies address the expression of these peptides at the peptidomics level or evaluate their functions and potential in peptide-based therapeutics for breast cancer. In this study, we aim to deepen the landscape of ncRNA-derived MPs in breast cancer subtypes and advance the comprehension of the relevance of these molecules to the disease. First, we constructed a dataset of 16,349 unique putative MP sequences by integrating two previously published lists of predicted ncRNA-derived MPs. We evaluated their expression in high-throughput mass spectrometry data of breast tumor samples from different subtypes. Next, we applied several machine and deep learning tools, such as AntiCP 2.0, MULocDeep, PEPstrMOD, Peptipedia, and PreAIP, to predict their functions, cellular localization, tertiary structure, physicochemical features, and other properties related to therapeutics. We identified 58 peptides expressed in breast tissue, including 27 differentially expressed MPs in tumor compared to non-tumor samples and MPs exhibiting tumor or subtype specificity. These peptides presented physicochemical features compatible with the canonical proteome and were predicted to influence the tumor immune environment and participate in cell communication, metabolism, and signaling processes. Also, some MPs presented potential as anti-cancer, anti-inflammatory, and anti-angiogenic molecules.
Our data demonstrate that MPs derived from ncRNAs have expression patterns associated with specific breast cancer subtypes and tumor specificity, thus highlighting their potential as biomarkers for molecular classification. We also reinforce the relevance of MPs as biologically active molecules that play a role in breast tumorigenesis, besides their potential in peptide-based therapeutics.
ABSTRACT
This research evaluates the application of advanced machine learning algorithms, specifically Random Forest and Gradient Boosting, for the imputation of missing data in solar energy generation databases and their impact on the size of green hydrogen production systems. The study demonstrates that the Random Forest model notably excels in harnessing solar data to optimize hydrogen production, achieving superior prediction accuracy with mean absolute error (MAE) of 0.0364, mean squared error (MSE) of 0.0097, root mean squared error (RMSE) of 0.0985, and a coefficient of determination (R²) of 0.9779. These metrics surpass those obtained from baseline models including linear regression and recurrent neural networks, highlighting the potential of accurate imputation to significantly enhance the efficiency and output of renewable energy systems. The findings advocate for the integration of robust data imputation methods in the design and operation of photovoltaic systems, contributing to the reliability and sustainability of energy resource management. Furthermore, this research makes significant contributions by showcasing the comparative performance of traditional machine learning models in handling data gaps, emphasizing the practical implications of data imputation on optimizing hydrogen production systems. By providing a detailed analysis and validation of the imputation models, this work offers valuable insights for future advancements in renewable energy technology.
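The four error metrics quoted in this abstract follow standard definitions, which the sketch below spells out. The function name `regression_metrics` and the toy observed/imputed values are hypothetical and are not taken from the study; only the reported MSE (0.0097) used in the final consistency check comes from the abstract.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for paired observed/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Toy, made-up values standing in for observed and imputed solar readings.
observed = [0.10, 0.40, 0.80, 0.60]
imputed = [0.12, 0.35, 0.83, 0.58]
metrics = regression_metrics(observed, imputed)

# RMSE is the square root of MSE, so the reported MSE of 0.0097 implies
# an RMSE close to the reported 0.0985.
rmse_from_reported_mse = math.sqrt(0.0097)
```

The last line shows that the reported MSE and RMSE are mutually consistent.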
ABSTRACT
Neutrophils are the innate immune system's first line of defense, and their storage organelles are essential to their function. The storage organelles are divided into three granule types, named azurophilic, specific, and gelatinase granules, plus a fourth component called secretory vesicles. The isolation of neutrophil granules is challenging, and the existing procedures rely on large sample volumes, about 400 mL of peripheral blood, precluding the use of multiple biological and technical replicates. Therefore, the aim of this study was to develop a miniaturized isolation of neutrophil granules (MING) method, using biochemical assays, mass spectrometry-based proteomics and a machine learning approach to investigate the protein content of these organelles. Neutrophils were isolated from 40 mL of blood collected from three apparently healthy volunteers and disrupted using nitrogen cavitation; the organelles were fractionated with a discontinuous 3-layer Percoll density gradient. The isolation was proven successful and allowed for a reasonable separation of neutrophil storage organelles using a gradient approximately 37 times smaller than the methods described in the literature. Moreover, mass spectrometry-based proteomics identified 368 proteins in at least 3 of the 5 analyzed samples, and using a machine learning strategy aligned with markers from the literature, the localization of 50 proteins was predicted with confidence. When using markers determined within our dataset by a clusterization tool, the localization of 348 proteins was confidently determined. Importantly, this study was the first to investigate the proteome of neutrophil granules using technical and biological replicates, creating a reliable database for further studies.
ABSTRACT
INTRODUCTION AND OBJECTIVES: With the rising prevalence of pre-sarcopenia in metabolic dysfunction-associated steatotic liver disease (MASLD), this study aimed to develop and validate a machine learning-based model to identify pre-sarcopenia in the MASLD population. MATERIALS AND METHODS: A total of 571 MASLD subjects were screened from the National Health and Nutrition Examination Survey (NHANES) 2017-2018. This cohort was randomly divided into a training set and an internal testing set at a ratio of 7:3. Sixty-six MASLD subjects were collected from our institution as an external validation set. Four binary classifiers, namely Random Forest (RF), support vector machine, extreme gradient boosting, and logistic regression, were fitted to identify pre-sarcopenia. The best-performing model was further validated in the external validation set. Model performance was assessed in terms of discrimination and calibration. Shapley Additive explanations were used for model interpretability. RESULTS: The pre-sarcopenia rate was 17.51 % and 15.16 % in the NHANES cohort and the external validation set, respectively. RF outperformed the other models with an area under the receiver operating characteristic curve (AUROC) of 0.819 (95 %CI: 0.749, 0.889). When the six top-ranking features were retained as per variable importance, including weight-adjusted waist, sex, race, creatinine, education and alkaline phosphatase, a final RF model reached an AUROC of 0.824 (0.737, 0.910) and 0.732 (95 %CI: 0.529, 0.936) in the internal and external validation sets, respectively. Model robustness was confirmed in sensitivity analysis. The calibration curve and decision curve analysis confirmed good calibration and good clinical utility. CONCLUSIONS: This study proposed a user-friendly model using an explainable machine learning algorithm to predict pre-sarcopenia in the MASLD population. A web-based tool was provided for screening pre-sarcopenia in community and hospitalization settings.
ABSTRACT
Objective: To conduct a systematic review of external validation studies on the use of different Artificial Intelligence algorithms in breast cancer screening with mammography. Data source: Our systematic review was conducted and reported following the PRISMA statement, using the PubMed, EMBASE, and Cochrane databases with the search terms "Artificial Intelligence," "Mammography," and their respective MeSH terms. We filtered for publications in English from the past ten years (2014 - 2024). Study selection: A total of 1,878 articles were found in the databases used in the research. After removing duplicates (373) and excluding those that did not address our PICO question (1,475), 30 studies were included in this work. Data collection: The data from the studies were collected independently by five authors and were subsequently synthesized based on sample data, location, year, and their main results in terms of AUC, sensitivity, and specificity. Data synthesis: The Area Under the ROC Curve (AUC) and sensitivity of Artificial Intelligence used independently were similar to those of radiologists. When it was used in conjunction with radiologists, statistically higher accuracy in mammogram evaluation was reported compared to the assessment by radiologists alone. Conclusion: AI algorithms have emerged as a means to complement and enhance the performance and accuracy of radiologists. They also assist less experienced professionals in detecting possible lesions. Furthermore, this tool can be used to complement and improve the analyses conducted by medical professionals.
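The study-selection counts in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the intermediate "screened" figure is derived here rather than reported in the abstract.

```python
# Counts taken from the abstract's study-selection statement.
identified = 1878
duplicates_removed = 373
excluded_not_pico = 1475

# Records left after de-duplication (derived, not reported in the abstract).
screened = identified - duplicates_removed

# Studies remaining after excluding those that did not address the PICO question.
included = screened - excluded_not_pico
```

The result matches the 30 included studies that the abstract reports.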
Subjects
Artificial Intelligence , Breast Neoplasms , Mammography , Mammography/methods , Humans , Female , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer/methods , Sensitivity and Specificity , Algorithms , Validation Studies as Topic
ABSTRACT
OBJECTIVE: This study introduces the complete blood count (CBC), a standard prenatal screening test, as a biomarker for diagnosing preeclampsia with severe features (sPE), employing machine learning models. METHODS: We used a boosting machine learning model fed with synthetic data generated through a new methodology called DAS (Data Augmentation and Smoothing). Using data from a Brazilian study including 132 pregnant women, we generated 3,552 synthetic samples for model training. To improve interpretability, we also provided a ridge regression model. RESULTS: Our boosting model obtained an AUROC of 0.90±0.10, sensitivity of 0.95, and specificity of 0.79 to differentiate sPE and non-PE pregnant women, using the CBC parameters of neutrophil count, mean corpuscular hemoglobin (MCH), and the aggregate index of systemic inflammation (AISI). In addition, we provided a ridge regression equation using the same three CBC parameters, which is fully interpretable and achieved an AUROC of 0.79±0.10 to differentiate the two groups. Moreover, we also showed that a monocyte count lower than 490/mm³ yielded a sensitivity of 0.71 and specificity of 0.72. CONCLUSION: Our study showed that ML-powered CBC could be used as a biomarker for sPE diagnosis support. In addition, we showed that a low monocyte count alone could be an indicator of sPE. SIGNIFICANCE: Although preeclampsia has been extensively studied, no laboratory biomarker with favorable cost-effectiveness has been proposed. Using artificial intelligence, we proposed to use the CBC, a low-cost, fast, and widely available blood test, as a biomarker for sPE.
Subjects
Biomarkers , Machine Learning , Pre-Eclampsia , Humans , Pre-Eclampsia/diagnosis , Pre-Eclampsia/blood , Female , Pregnancy , Biomarkers/blood , Blood Cell Count/methods , Adult , Sensitivity and Specificity , Brazil , Severity of Illness Index , ROC Curve , Prenatal Diagnosis/methods
ABSTRACT
INTRODUCTION: Interictal epileptiform discharges (IEDs) in electroencephalograms (EEGs) are an important biomarker for epilepsy. Currently, the gold standard for IED detection is the visual analysis performed by experts. However, this process is expert-biased, and time-consuming. Developing fast, accurate, and robust detection methods for IEDs based on EEG may facilitate epilepsy diagnosis. We aim to assess the performance of deep learning (DL) and classic machine learning (ML) algorithms in classifying EEG segments into IED and non-IED categories, as well as distinguishing whether the entire EEG contains IED or not. METHODS: We systematically searched PubMed, Embase, and Web of Science following PRISMA guidelines. We excluded studies that only performed the detection of IEDs instead of binary segment classification. Risk of Bias was evaluated with Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2). Meta-analysis with the overall area under the Summary Receiver Operating Characteristic (SROC), sensitivity, and specificity as effect measures, was performed with R software. RESULTS: A total of 23 studies, comprising 3,629 patients, were eligible for synthesis. Eighteen models performed discharge-level classification, and 6 whole-EEG classification. For the IED-level classification, 3 models were validated in an external dataset with more than 50 patients and achieved a sensitivity of 84.9 % (95 % CI: 82.3-87.2) and a specificity of 68.7 % (95 % CI: 7.9-98.2). Five studies reported model performance using both internal validation (cross-validation) and external datasets. The meta-analysis revealed higher performance for internal validation, with 90.4 % sensitivity and 99.6 % specificity, compared to external validation, which showed 78.1 % sensitivity and 80.1 % specificity. CONCLUSION: Meta-analysis showed higher performance for models validated with resampling methods compared to those using external datasets. 
Only a minority of models use more robust validation techniques, which often leads to overfitting.
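Sensitivity and specificity, the effect measures pooled in this meta-analysis, are computed from confusion-matrix counts. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and do not come from any of the included studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true IED segments that were flagged.
    specificity = TN / (TN + FP): share of non-IED segments correctly rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration: 100 IED and 100 non-IED segments.
sens, spec = sensitivity_specificity(tp=85, fn=15, tn=70, fp=30)
```

A model that looks strong under cross-validation can lose sensitivity or specificity on an external dataset, which is the gap this meta-analysis reports.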
ABSTRACT
Climate change brings a range of challenges and opportunities to shrimp fisheries globally. The case of the Colombian Pacific Ocean (CPO) is notable due to the crucial role of shrimps in the economy, supporting livelihoods for numerous families. However, the potential impacts of climate change on the distribution of shrimps loom large, making it urgent to scrutinize the prospective alterations that might unfold across the CPO. Employing the Species Distribution Modeling approach under Global Circulation Model scenarios, we predicted the current and future potential distributions of five commercially important shrimps (Litopenaeus occidentalis, Xiphopenaeus riveti, Solenocera agassizii, Penaeus brevirostris, and Penaeus californiensis) based on an annual cycle, and considering the decades 2030 and 2050 under the Shared Socioeconomic Pathways SSP 2.6, SSP 4.5, SSP 7.0, and SSP 8.5. The Bathymetric Projection Method was utilized to obtain spatiotemporal ocean bottom predictors, giving the models more realism for reliable habitat predictions. Six spatiotemporal attributes were computed to gauge the changes in these distributions: area, depth range, spatial aggregation, percentage suitability change, gain or loss of areas, and seasonality. L. occidentalis and X. riveti exhibited favorable shifts during the first half of the year for both decades and all scenarios, but unfavorable changes during the latter half of the year, primarily influenced by projected modifications in bottom salinity and bottom temperature. Conversely, for S. agassizii, P. brevirostris, and P. californiensis, predominantly negative changes surfaced across all months, decades, and scenarios, primarily driven by precipitation. These changes pose both threats and opportunities to shrimp fisheries in the CPO. However, their effects are not uniform across space and time. Instead, they form a mosaic of complex interactions that merit careful consideration when seeking practical solutions.
These findings hold potential utility for informed decision-making, climate change mitigation, and adaptive strategies within the context of shrimp fisheries management in the CPO.
Subjects
Climate Change , Fisheries , Penaeidae , Animals , Pacific Ocean , Colombia , Penaeidae/physiology , Ecosystem
ABSTRACT
The current gold standard for detecting Chikungunya Virus (CHIKV) is an invasive and costly molecular biology procedure. Consequently, the search for a non-invasive, more cost-effective, reagent-free, and sustainable method for the detection of CHIKV infection is imperative for public health. The portable Fourier-transform infrared coupled with Attenuated Total Reflection (ATR-FTIR) platform has been applied to discriminate systemic diseases using saliva; however, salivary diagnostic applications in viral diseases are less explored. The study aimed to identify unique vibrational modes of salivary infrared profiles to detect CHIKV infection using chemometrics and artificial intelligence algorithms. Thus, we intradermally challenged interferon-gamma gene knockout C57/BL6 mice with CHIKV (20 µl, 1 × 10⁵ PFU/ml, n = 6) or vehicle (20 µl, n = 7). Saliva and serum samples were collected on day 3 (due to the peak of viremia). CHIKV infection was confirmed by Real-time PCR in the serum of CHIKV-infected mice. The best pattern classification showed a sensitivity of 83%, specificity of 86%, and accuracy of 85% using support vector machine (SVM) algorithms. Our results suggest that the salivary ATR-FTIR platform can discriminate CHIKV infection with the potential to be applied as a non-invasive, sustainable, and cost-effective detection tool for this emerging disease.
Subjects
Algorithms , Artificial Intelligence , Chikungunya Fever , Chikungunya virus , Saliva , Animals , Saliva/virology , Chikungunya Fever/diagnosis , Chikungunya Fever/virology , Chikungunya virus/isolation & purification , Chikungunya virus/genetics , Mice , Spectroscopy, Fourier Transform Infrared/methods , Mice, Inbred C57BL , Mice, Knockout
ABSTRACT
BACKGROUND: ML predictive models have shown their capability to improve risk prediction and assist medical decision-making; nevertheless, there is a lack of accurate systems for the early identification of future rapid CKD progressors in Colombia and even in South America. OBJECTIVE: The purpose of this study was to develop a series of interpretable machine learning models that predict GFR at 6 months, 9 months, and 12 months. STUDY DESIGN AND SETTING: Over 29,000 CKD patients stage 1 to 3b (estimated GFR, <60 mL/min/1.73 m²) with an average of 3-year follow-up data were included. We used the machine learning extreme gradient boosting (XGBoost) to build three models to predict the next eGFR. Models were internally and externally validated. In addition, we included SHapley Additive exPlanation (SHAP) values to offer interpretable global and local prediction models. RESULTS: All models showed a good performance in development and external validation. However, the 6-month XGBoost prediction model showed the best performance in internal (MAE average = 6.07; RMSE = 78.87), and in external validation (MAE average = 6.45, RMSE = 18.94). The top 3 most influential features that pushed the predicted eGFR value to lower values were the interpolated values for eGFR and creatinine, and eGFR at baseline. CONCLUSION: In the current study we have developed and validated machine learning models to predict the next eGFR value at different intervals. Furthermore, we attempted to address the need for prediction explanation by offering transparent predictions.
ABSTRACT
BACKGROUND: Artificial intelligence (AI) algorithms for the detection of retinoblastoma (RB) by fundus image analysis have been proposed as a potentially effective technique to facilitate diagnosis and screening programs. However, doubts remain about the accuracy of the technique, the best type of AI for this situation, and its feasibility for everyday use. Therefore, we performed a systematic review and meta-analysis to evaluate this issue. METHODS: Following PRISMA 2020 guidelines, a comprehensive search of MEDLINE, Embase, ClinicalTrials.gov and IEEEX databases identified 494 studies whose titles and abstracts were screened for eligibility. We included diagnostic studies that evaluated the accuracy of AI in identifying retinoblastoma based on fundus imaging. Univariate and bivariate analysis was performed using the random effects model. The study protocol was registered in PROSPERO under CRD42024499221. RESULTS: Six studies with 9902 fundus images were included, of which 5944 (60%) had confirmed RB. Only one dataset used a semi-supervised machine learning (ML) based method, all other studies used supervised ML, three using architectures requiring high computational power and two using more economical models. The pooled analysis of all models showed a sensitivity of 98.2% (95% CI: 0.947-0.994), a specificity of 98.5% (95% CI: 0.916-0.998) and an AUC of 0.986 (95% CI: 0.970-0.989). Subgroup analyses comparing models with high and low computational power showed no significant difference (p=0.824). CONCLUSIONS: AI methods showed a high precision in the diagnosis of RB based on fundus images with no significant difference when comparing high and low computational power models, suggesting a viability of their use. Validation and cost-effectiveness studies are needed in different income countries. 
Subpopulations should also be analyzed, as AI may be useful as an initial screening tool in populations at high risk for RB, serving as a bridge to the pediatric ophthalmologist or ocular oncologist, who are scarce globally. KEY MESSAGES: What is known: Retinoblastoma is the most common intraocular cancer in childhood, and diagnostic delay is the main factor leading to a poor prognosis. Machine learning techniques offer reliable methods for screening and diagnosis of retinal diseases. What is new: The meta-analysis of the diagnostic accuracy of artificial intelligence methods for diagnosing retinoblastoma based on fundus images showed a sensitivity of 98.2% (95% CI: 0.947-0.994) and a specificity of 98.5% (95% CI: 0.916-0.998). There was no statistically significant difference in the diagnostic accuracy of high and low computational power models. The overall performance of supervised machine learning was better than that of unsupervised learning, although few studies were available on the second type.