ABSTRACT
Purpose: To compare the refractive prediction error of Hill-radial basis function (Hill-RBF) 3.0 with those of three conventional formulas and 11 combination methods in eyes with short axial lengths. Methods: The refractive prediction error was calculated using four formulas (Hoffer Q, SRK-T, Haigis, and Hill-RBF) and 11 combination methods (the average of two or more formulas). The absolute error was determined, and the proportion of eyes within 0.25-diopter (D) increments of absolute error was analyzed. Furthermore, the intraclass correlation coefficient of each method was computed to evaluate the agreement between target refractive error and postoperative spherical equivalent. Results: This study included 87 eyes. Based on the refractive prediction error, the Hoffer Q formula exhibited the largest myopic errors, followed by SRK-T, Hill-RBF, and Haigis. Among all the methods, the Haigis and Hill-RBF combination yielded a mean refractive prediction error closest to zero. The SRK-T and Hill-RBF combination showed the lowest mean absolute error, whereas the Hoffer Q, SRK-T, and Haigis combination had the lowest median absolute error. Hill-RBF exhibited the highest intraclass correlation coefficient, whereas SRK-T showed the lowest. Haigis and Hill-RBF, as well as the combination of both, demonstrated the lowest proportion of refractive surprises (absolute error >1.00 D). Among the individual formulas, Hill-RBF had the highest success rate (absolute error ≤0.50 D). Moreover, among all the methods, the SRK-T and Hill-RBF combination exhibited the highest success rate. Conclusions: Hill-RBF showed accuracy comparable to or surpassing that of conventional formulas in eyes with short axial lengths. Using and combining several formulas in cataract surgery for eyes with short axial lengths may help reduce the incidence of refractive surprises.
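The error metrics named in this abstract can be illustrated with a minimal Python sketch. It assumes the common convention that the signed prediction error is the achieved postoperative spherical equivalent minus the predicted refraction, and that a combination method is the per-eye average of the formula predictions. The function names and the four-eye data set are hypothetical and are not study data.

```python
from statistics import mean, median


def prediction_errors(predicted, achieved):
    """Signed error per eye: achieved postoperative SE minus predicted refraction (D)."""
    return [a - p for p, a in zip(predicted, achieved)]


def combine(*formula_predictions):
    """Combination method: per-eye average of the predictions of two or more formulas."""
    return [mean(values) for values in zip(*formula_predictions)]


def summarize(errors):
    """Summary statistics of the kind reported in the abstract."""
    abs_err = [abs(e) for e in errors]
    n = len(abs_err)
    return {
        "mean_error": mean(errors),
        "mean_abs_error": mean(abs_err),
        "median_abs_error": median(abs_err),
        "within_0.50_D": sum(a <= 0.50 for a in abs_err) / n,      # success rate
        "surprise_gt_1.00_D": sum(a > 1.00 for a in abs_err) / n,  # refractive surprises
    }


# Hypothetical predictions (D) for four eyes; not study data.
srk_t = [-0.50, -0.25, -0.75, 0.00]
hill_rbf = [-0.25, -0.25, -0.25, -0.50]
achieved = [-0.25, -1.00, 0.75, -0.25]

combined = combine(srk_t, hill_rbf)
summary = summarize(prediction_errors(combined, achieved))
print(summary)
```

A negative signed error under this convention corresponds to a myopic outcome relative to the prediction.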
ABSTRACT
The mythological story of Epimetheus and Prometheus, as told by Plato, serves as an introduction to the meaning of artificial intelligence (AI). In this myth, man, unlike other creatures, is endowed with a divine gift: the ability to create tools. AI represents a revolutionary advance, replacing human intellectual labour and emphasising its ability to autonomously generate new knowledge. In the scientific field, AI is speeding up peer review processes and increasing the efficiency of manuscript evaluation, while also contributing creative elements such as rewriting, translating or creating illustrations. However, its use must be ethical, limited to an assisting role, and subject to expert oversight to prevent errors and misuse. AI, an evolving divine tool, requires critical study and application of each of its advances.
ABSTRACT
Purpose: Different deep-learning models have been employed to aid in the diagnosis of musculoskeletal pathologies. The diagnosis of tendon pathologies could particularly benefit from applying these technologies. The objective of this study is to assess the performance of deep-learning models in diagnosing tendon pathologies using various imaging modalities. Methods: A meta-analysis was conducted, with searches performed on MEDLINE/PubMed, SCOPUS, Cochrane Library, Lilacs, and SciELO. The QUADAS-2 tool was employed to assess the quality of the studies. Diagnostic measures, such as sensitivity, specificity, diagnostic odds ratio, positive and negative likelihood ratios, area under the curve, and summary receiver operating characteristic, were pooled using a random-effects model. Heterogeneity and subgroup analyses were also conducted. All statistical analyses and plots were generated using the R software package. The PROSPERO ID is CRD42024506491. Results: Eleven deep-learning models from six articles were analyzed. In the random-effects models, the sensitivity and specificity of the algorithms for detecting tendon conditions were 0.910 (95% CI: 0.865; 0.940) and 0.954 (95% CI: 0.909; 0.977), respectively. The PLR, NLR, lnDOR, and AUC estimates were 37.075 (95% CI: 4.654; 69.496), 0.114 (95% CI: 0.056; 0.171), 5.160 (95% CI: 4.070; 6.250; P < 0.001), and 96%, respectively. Conclusion: The deep-learning algorithms demonstrated a high level of accuracy in detecting tendon anomalies. The overall robust performance suggests their potential application as a valuable complementary tool for diagnosis from medical images.
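For readers unfamiliar with the diagnostic measures listed above, the following Python sketch computes them for a single 2×2 table. It is only an illustration of the definitions: the pooled estimates in the abstract come from random-effects models across studies, which this sketch does not implement, and the counts are hypothetical.

```python
import math


def diagnostic_measures(tp, fp, fn, tn):
    """Per-study diagnostic accuracy measures from a 2x2 confusion table.

    Assumes no cell makes a denominator zero (e.g. specificity of exactly 1
    would need a continuity correction, which is not applied here).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                          # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "PLR": plr,
        "NLR": nlr,
        "DOR": dor,
        "lnDOR": math.log(dor),
    }


# Hypothetical counts for one study; not taken from the meta-analysis.
measures = diagnostic_measures(tp=90, fp=5, fn=10, tn=95)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

The diagnostic odds ratio equals (TP × TN) / (FP × FN), which is why it can also be obtained as PLR divided by NLR.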
ABSTRACT
Urochloa grasses are widely used forages in the Neotropics and are gaining importance in other regions due to their role in meeting the increasing global demand for sustainable agricultural practices. High-throughput phenotyping (HTP) is important for accelerating Urochloa breeding programs focused on improving forage and seed yield. While RGB imaging has been used for HTP of vegetative traits, the assessment of phenological stages and seed yield using image analysis remains unexplored in this genus. This work presents a dataset of 2,400 high-resolution RGB images of 200 Urochloa hybrid genotypes, captured over seven months and covering both vegetative and reproductive stages. Images were manually labelled as vegetative or reproductive, and a subset of 255 reproductive stage images were annotated to identify 22,340 individual racemes. This dataset enables the development of machine learning and deep learning models for automated phenological stage classification and raceme identification, facilitating HTP and accelerated breeding of Urochloa spp. hybrids with high seed yield potential.
ABSTRACT
OBJECTIVES: To predict palatally impacted maxillary canines based on maxilla measurements through supervised machine learning techniques. MATERIALS AND METHODS: The maxilla images from 138 patients were analysed to investigate intermolar width, interpremolar width, interpterygoid width, maxillary length, maxillary width, nasal cavity width and nostril width, obtained through cone beam computed tomography scans. The predictive models were built using the following machine learning algorithms: AdaBoost Classifier, Decision Tree, Gradient Boosting Classifier, K-Nearest Neighbours (KNN), Logistic Regression, Multilayer Perceptron Classifier (MLP), Random Forest Classifier and Support Vector Machine (SVM). A 5-fold cross-validation approach was employed to validate each model. Metrics such as area under the curve (AUC), accuracy, recall, precision and F1 score were calculated for each model, and ROC curves were constructed. RESULTS: The predictive model included four variables (two dental and two skeletal measurements). The interpterygoid width and nostril width showed the largest effect sizes. The Gradient Boosting Classifier algorithm exhibited the best metrics, with AUC values ranging from 0.91 (95% CI: 0.74-0.98) for test data to 0.89 (95% CI: 0.86-0.94) for cross-validation. The nostril width variable demonstrated the highest importance across all tested algorithms. CONCLUSION: The use of maxillary measurements, through supervised machine learning techniques, is a promising method for predicting palatally impacted maxillary canines. Among the models evaluated, both the Gradient Boosting Classifier and the Random Forest Classifier demonstrated the best performance metrics, with accuracy and AUC values exceeding 0.8, indicating strong predictive capability.
ABSTRACT
BACKGROUND: Studies are exploring ways to improve medication adherence, with sentiment analysis (SA) being an underutilized innovation in pharmacy. This technique uses artificial intelligence (AI) and natural language processing to assess text for underlying feelings and emotions. AIM: This study aimed to evaluate the use of two SA models, Valence Aware Dictionary for Sentiment Reasoning (VADER) and Emotion English DistilRoBERTa-base (DistilRoBERTa), for the identification of patients' sentiments and emotions towards their pharmacotherapy. METHOD: A dataset containing 320,095 anonymized patients' reports of experiences with their medication was used. VADER assessed sentiment polarity on a scale from −1 (negative) to +1 (positive). DistilRoBERTa classified emotions into seven categories: anger, disgust, fear, joy, neutral, sadness, and surprise. Performance metrics for the models were obtained using the sklearn.metrics module of scikit-learn in Python. RESULTS: VADER demonstrated an overall accuracy of 0.70. For negative sentiments, it achieved a precision of 0.68, recall of 0.80, and an F1-score of 0.73, while for positive sentiments, it had a precision of 0.73, recall of 0.59, and an F1-score of 0.65. The AUC for the ROC curve was 0.90. DistilRoBERTa analysis showed that higher ratings for medication effectiveness, ease of use, and satisfaction corresponded with more positive emotional responses. These results were consistent with VADER's sentiment analysis, confirming the reliability of both models. CONCLUSION: VADER and DistilRoBERTa effectively analyzed patients' sentiments towards pharmacotherapy, providing valuable information. These findings encourage studies of SA in clinical pharmacy practice, paving the way for more personalized and effective patient care strategies.
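The class-wise metrics reported above follow standard definitions, sketched below in plain Python. The study itself obtained them with scikit-learn's sklearn.metrics module; the helper names and the toy labels here are hypothetical and serve only to show the arithmetic.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treating `positive` as the target label."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, not the study data).
y_true = ["neg", "neg", "neg", "pos", "pos", "pos"]
y_pred = ["neg", "neg", "pos", "neg", "pos", "pos"]
p_neg, r_neg = precision_recall(y_true, y_pred, positive="neg")

# Plugging in the rounded precision and recall from the abstract gives
# F1 values close to the reported 0.73 and 0.65 (small differences are
# expected because the inputs are already rounded).
f1_negative = f1_score(0.68, 0.80)
f1_positive = f1_score(0.73, 0.59)
print(round(f1_negative, 3), round(f1_positive, 3))
```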
ABSTRACT
BACKGROUND: Artificial intelligence models are increasingly gaining popularity among patients and healthcare professionals. While it is impossible to restrict patients' access to different sources of information on the Internet, healthcare professionals need to be aware of the quality of the content available across different platforms. OBJECTIVE: To investigate the accuracy and completeness of Chat Generative Pretrained Transformer (ChatGPT) in addressing frequently asked questions related to the management and treatment of female urinary incontinence (UI), compared to recommendations from guidelines. METHODS: This is a cross-sectional study. Two researchers developed 14 frequently asked questions related to UI, which were entered into the ChatGPT platform on September 16, 2023. The accuracy (scores from 1 to 5) and completeness (scores from 1 to 3) of ChatGPT's answers were assessed individually by two experienced researchers in the Women's Health field, following the recommendations proposed by the guidelines for UI. RESULTS: Most of the answers were classified as "more correct than incorrect" (n = 6), followed by "more incorrect than correct" (n = 3), "approximately equal correct and incorrect" (n = 2), "nearly all correct" (n = 2), and "correct" (n = 1). Regarding the appropriateness, most of the answers were classified as adequate, as they provided the minimum information expected to be classified as correct. CONCLUSION: These results showed inconsistency in the accuracy of answers generated by ChatGPT when compared with scientific guidelines. Almost none of the answers provided the complete content expected or reported in previous guidelines, which raises a concern for healthcare professionals and the scientific community about using artificial intelligence in patient counseling.
ABSTRACT
The increasing use of plastics in rural environments has led to concerns about agricultural plastic waste (APW). However, the plasticulture information gap hinders waste management planning and may lead to plastic residue leakage into the environment with consequent microplastic formation. The location and estimated quantity of the APW are crucial for territorial planning and public policies regarding land use and waste management. Agri-plastic remote detection has attracted increased attention but requires a consensus approach, particularly for mapping plastic-mulched farmlands (PMFs) scattered across vast areas. This article tests whether a streamlined time-series approach minimizes PMF confusion with the background using less processing. Based on the literature, we performed an extensive assessment of machine learning techniques and investigated the importance of features in mapping tomato PMF. We evaluated pixel-based and object-based classifications in harmonized Sentinel-2 level-2A images, added plastic indices, and compared six classifiers. The best result was an overall accuracy of 99.7%, obtained with pixel-based classification using the multilayer perceptron (MLP) classifier. The 3-time series with a 30-day composite exhibited increased accuracy and decreased background confusion, and was a viable alternative for overcoming the impact of cloud cover on images at certain times of the year in our study area, which makes it a potentially reliable methodology for APW mapping in future studies. To our knowledge, the presented PMF map is the first for Latin America. This represents a first step toward promoting the circularity of all agricultural plastic in the region, minimizing the impacts of degradation on the environment.
ABSTRACT
STUDY QUESTION: What are the implications of the presence of cytoplasmic strings (Cyt-S) and of their quantity and dynamics for the pre-implantation development of human blastocysts? SUMMARY ANSWER: Cyt-S are common in human embryos and are associated with faster blastocyst development, larger expansion, and better morphological quality. WHAT IS KNOWN ALREADY: Cyt-S are dynamic cellular projections connecting inner cell mass and trophectoderm (TE) cells that can be observed during blastocyst expansion. Their prevalence in human embryos has been estimated to be between 44% and 93%. Data relevant to their clinical implications and role in development are lacking, limited, or controversial. STUDY DESIGN, SIZE, DURATION: Retrospective study conducted at a single IVF center between May 2013 and November 2014 and involving 124 pre-implantation genetic testing for aneuploidy cycles in a time-lapse incubator with ≥1 blastocyst biopsied and vitrified (N = 370 embryos assessed). These cycles resulted in 87 vitrified-warmed single-euploid blastocyst transfers. PARTICIPANTS/MATERIALS, SETTING, METHODS: ICSI, continuous blastocyst culture (Days 5-7), TE biopsy of fully expanded blastocysts without Day 3 zona pellucida drilling, qPCR to assess uniform full-chromosome aneuploidies, and vitrification were all performed. Only vitrified-warmed euploid single-embryo transfers were conducted. Blastocyst morphological quality was defined according to Gardner's criteria. The AI-based software CHLOE™ (Fairtility) automatically registered timings from time of starting blastulation (tSB) to biopsy (t-biopsy, i.e. blastocyst full-expansion) as hours-post-insemination (hpi), embryo area (including zona pellucida, in µm²), and spontaneous blastocyst collapses. One senior embryologist manually annotated Cyt-S presence, quantity, timings, and type (thick cell-to-cell connections and/or threads). All significant associations were confirmed through regression analyses.
All couples', cycles', and embryos' main features were also tested for associations with Cyt-S presence, quantity, and dynamics. MAIN RESULTS AND THE ROLE OF CHANCE: About 94.3% of the patients (N = 117/124) had ≥1 embryo with Cyt-S. Out of a total of 370 blastocysts, 55 degenerated between blastulation and full-expansion (N = 55/370, 14.9%). The degeneration rate among embryos with ≥1 Cyt-S was 10.8% (N = 33/304), significantly lower than that of embryos without Cyt-S (33.3%, N = 22/66, P < 0.01). Of the remaining 315 viable blastocysts analyzed, 86% (N = 271/315; P < 0.01) had ≥1 Cyt-S, on average 3.5 ± 2.1 per embryo (range 1-13). The first Cyt-S per viable embryo appeared at 115.3 ± 12.5 hpi (85.7-157.7), corresponding to 10.5 ± 5.8 h (0.5-31) after tSB. Overall, we analyzed 937 Cyt-S showing a mean duration of 3.8 ± 2.7 h (0.3-20.9). Cyt-S were mostly threads (N = 508/937, 54.2%) or thick cell-to-cell connections becoming threads (N = 382/937, 40.8%) rather than thick bridges (N = 47/937, 5.0%). The presence and quantity of Cyt-S were significantly associated with developmentally faster (on average 6-12 h faster) and more expanded (on average a 2700 µm² larger blastocyst area at t-biopsy) embryos. Also, the presence and duration of Cyt-S were associated with better morphology. Lastly, while euploidy rates were comparable between blastocysts with and without Cyt-S, all euploid blastocysts transferred from the latter group failed to implant (N = 10). LIMITATIONS, REASONS FOR CAUTION: Cyt-S presence and dynamics were assessed manually on seven focal planes from video frames recorded every 15 min. The patients included were mostly of advanced maternal age. Only associations could be reported, not causation or consequences. Lastly, larger datasets are required to better assess Cyt-S associations with clinical outcomes. WIDER IMPLICATIONS OF THE FINDINGS: Cyt-S are common during human blastocyst expansion, suggesting their physiological implication in this process.
Their presence, quantity and dynamics mirror embryo viability and morphological quality, yet their role is still unknown. Future basic science studies are encouraged to describe the molecular nature and biophysical properties of Cyt-S, and Artificial Intelligence tools should aid these studies by incorporating Cyt-S assessment. STUDY FUNDING/COMPETING INTEREST(S): None. TRIAL REGISTRATION NUMBER: N/A.
ABSTRACT
OBJECTIVES: To validate the performance of Mirai, a mammography-based deep learning model, in predicting breast cancer risk over a 1-5-year period in Mexican women. METHODS: This retrospective single-center study included mammograms from Mexican women who underwent screening mammography between January 2014 and December 2016. For women with consecutive mammograms during the study period, only the initial mammogram was included. Pathology and imaging follow-up served as the reference standard. Model performance in the entire dataset was evaluated, including the concordance index (C-index) and area under the receiver operating characteristic curve (AUC). Mirai's performance in terms of AUC was also compared between mammography systems (Hologic versus IMS). Clinical utility was evaluated by determining a cutoff point for Mirai's continuous risk index based on identifying the top 10% of patients in the high-risk category. RESULTS: Of 3110 patients (median age 52.6 years ± 8.9), throughout the 5-year follow-up period, 3034 patients remained cancer-free, while 76 patients developed breast cancer. Mirai achieved a C-index of 0.63 (95% CI: 0.6-0.7) for the entire dataset. Mirai achieved a higher mean C-index in the Hologic subgroup (0.63 [95% CI: 0.5-0.7]) versus the IMS subgroup (0.55 [95% CI: 0.4-0.7]). With a Mirai index score > 0.029 (10% threshold) to identify high-risk individuals, individuals in the high-risk group had nearly three times the risk of developing breast cancer compared to those in the low-risk group. CONCLUSIONS: Mirai showed moderate performance in predicting future breast cancer among Mexican women. CRITICAL RELEVANCE STATEMENT: Prospective efforts should refine and apply the Mirai model, especially to minority populations and women aged between 30 and 40 years who are currently not targeted for routine screening. KEY POINTS: The applicability of AI models to non-White, minority populations remains understudied.
The Mirai model is linked to future cancer events in Mexican women. Further research is needed to enhance model performance and establish usage guidelines.
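The clinical-utility step described above (a cutoff that isolates roughly the top 10% of risk scores, followed by a comparison of cancer incidence above and below it) can be sketched in Python as follows. This is an illustrative sketch under simple assumptions, not the authors' procedure: it ignores follow-up time and censoring, and the function names, scores, and outcomes are hypothetical.

```python
def top_fraction_cutoff(scores, fraction=0.10):
    """Return a cutoff such that about `fraction` of scores lie strictly above it."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    if k >= len(ranked):
        raise ValueError("need more patients than the size of the high-risk group")
    # With tied scores, fewer than k patients may end up strictly above the cutoff.
    return ranked[k]


def relative_risk(scores, events, cutoff):
    """Cumulative incidence above the cutoff divided by incidence at or below it.

    Assumes at least one event in the low-risk group (otherwise division by zero).
    """
    high = [e for s, e in zip(scores, events) if s > cutoff]
    low = [e for s, e in zip(scores, events) if s <= cutoff]
    return (sum(high) / len(high)) / (sum(low) / len(low))


# Hypothetical cohort of 20 patients: risk scores 0.01 .. 0.20, four cancers.
scores = [i / 100 for i in range(1, 21)]
events = [0] * 20
for idx in (0, 5, 10, 19):
    events[idx] = 1

cutoff = top_fraction_cutoff(scores, fraction=0.10)
rr = relative_risk(scores, events, cutoff)
print(cutoff, rr)
```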
ABSTRACT
BACKGROUND: Hepatocellular carcinoma (HCC) is a prevalent tumor with high mortality rates. Computed tomography (CT) is crucial in the non-invasive diagnosis of HCC. Recent advancements in artificial intelligence (AI) have shown significant potential in medical imaging analysis. However, developing these AI algorithms is hindered by the scarcity of comprehensive, publicly available liver imaging datasets. OBJECTIVES: This study aims to detail the tools, data organization, and database structuring used in creating HepatIA, a medical imaging annotation platform and database at a Brazilian tertiary teaching hospital. HepatIA supports liver disease AI research at the institution. MATERIAL AND METHODS: The authors collected baseline characteristics and CT scans of 656 patients from 2008 to 2021. The database, designed using PostgreSQL and implemented with Django and Vue.js, includes 692 CT volumes from a four-phase abdominal CT protocol. Radiologists made segmentation annotations using the OHIF medical image viewer, incorporating MONAI Label for pre-annotation segmentation models. The annotation process included detailed descriptions of liver morphology and nodule characteristics. RESULTS: The HepatIA database currently includes healthy individuals and those with liver diseases such as HCC and cirrhosis. The database dashboard facilitates user interaction with intuitive plots and histograms. Key patient demographics include 64% males and an average age of 56.89 years. The database supports various filters for detailed searches, enhancing research capabilities. CONCLUSION: A comprehensive data structure was successfully created and integrated with the IT systems of a teaching hospital, enabling research on deep learning algorithms applied to abdominal CT scans for investigating hepatic lesions such as HCC.
Subjects
Artificial Intelligence, Hepatocellular Carcinoma, Factual Databases, Teaching Hospitals, Liver Neoplasms, Tertiary Care Centers, X-Ray Computed Tomography, Humans, Liver Neoplasms/diagnostic imaging, Hepatocellular Carcinoma/diagnostic imaging, X-Ray Computed Tomography/methods, Male, Female, Middle Aged, Brazil, Aged, Adult, Algorithms
ABSTRACT
Background/Objectives: The prevalence of metabolic syndrome (MetS) is increasing worldwide, and an increasing number of cases are diagnosed in younger age groups. This study aimed to propose predictive models based on demographic, anthropometric, and non-invasive clinical variables to predict MetS in adolescents. Methods: A total of 2064 adolescents aged 18-19 from São Luís-Maranhão, Brazil were enrolled. Demographic, anthropometric, and clinical variables were considered, and three criteria for diagnosing MetS were employed: Cook et al., De Ferranti et al. and the International Diabetes Federation (IDF). A feed-forward artificial neural network (ANN) was trained to predict MetS. Accuracy, sensitivity, and specificity were calculated to assess the ANN's performance. The ROC curve was constructed, and the area under the curve was analyzed to assess the discriminatory power of the networks. Results: The prevalence of MetS in adolescents ranged from 5.7% to 12.3%. The ANN that used the Cook et al. criterion performed best in predicting MetS. ANN 5, which included age, sex, waist circumference, weight, and systolic and diastolic blood pressure, showed the best performance and discriminatory power (sensitivity, 89.8%; accuracy, 86.8%). ANN 3 considered the same variables, except for weight, and exhibited good sensitivity (89.0%) and accuracy (87.0%). Conclusions: Using non-invasive measures allows for predicting MetS in adolescents, thereby guiding the flow of care in primary healthcare and optimizing the management of public resources.
ABSTRACT
OBJECTIVE: This study introduces the complete blood count (CBC), a standard prenatal screening test, as a biomarker for diagnosing preeclampsia with severe features (sPE), employing machine learning models. METHODS: We used a boosting machine learning model fed with synthetic data generated through a new methodology called DAS (Data Augmentation and Smoothing). Using data from a Brazilian study including 132 pregnant women, we generated 3,552 synthetic samples for model training. To improve interpretability, we also provided a ridge regression model. RESULTS: Our boosting model obtained an AUROC of 0.90±0.10, sensitivity of 0.95, and specificity of 0.79 in differentiating sPE from non-PE pregnant women, using the CBC parameters neutrophil count, mean corpuscular hemoglobin (MCH), and the aggregate index of systemic inflammation (AISI). In addition, we provided a ridge regression equation using the same three CBC parameters, which is fully interpretable and achieved an AUROC of 0.79±0.10 in differentiating the two groups. Moreover, we also showed that a monocyte count lower than 490/mm³ yielded a sensitivity of 0.71 and specificity of 0.72. CONCLUSION: Our study showed that ML-powered CBC could be used as a biomarker for sPE diagnosis support. In addition, we showed that a low monocyte count alone could be an indicator of sPE. SIGNIFICANCE: Although preeclampsia has been extensively studied, no laboratory biomarker with favorable cost-effectiveness has been proposed. Using artificial intelligence, we proposed to use the CBC, a low-cost, fast, and widely available blood test, as a biomarker for sPE.
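Two of the CBC-derived quantities mentioned above are simple enough to sketch in Python. The AISI is commonly defined as neutrophils × monocytes × platelets divided by lymphocytes; the abstract does not restate the formula or the units used, so both are assumptions here. The single-feature rule applies the 490/mm³ monocyte threshold from the abstract. Function names and input values are hypothetical, and the sketch does not reproduce the boosting or ridge models.

```python
def aisi(neutrophils, monocytes, platelets, lymphocytes):
    """Aggregate index of systemic inflammation (common definition, assumed here).

    All counts are assumed to share the same unit (e.g. 10^9 cells/L).
    """
    return neutrophils * monocytes * platelets / lymphocytes


def low_monocyte_flag(monocytes_per_mm3, threshold=490):
    """Single-feature rule from the abstract: monocyte count below 490/mm^3."""
    return monocytes_per_mm3 < threshold


# Hypothetical CBC values; not patient data.
example_aisi = aisi(neutrophils=4.0, monocytes=0.5, platelets=250.0, lymphocytes=2.0)
flag_low = low_monocyte_flag(400)
flag_normal = low_monocyte_flag(490)
print(example_aisi, flag_low, flag_normal)
```

Such a rule is a screening aid only; the abstract reports a sensitivity of 0.71 and specificity of 0.72 for it.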
Subjects
Biomarkers, Machine Learning, Pre-Eclampsia, Humans, Pre-Eclampsia/diagnosis, Pre-Eclampsia/blood, Female, Pregnancy, Biomarkers/blood, Blood Cell Count/methods, Adult, Sensitivity and Specificity, Brazil, Severity of Illness Index, ROC Curve, Prenatal Diagnosis/methods
ABSTRACT
Objective: To conduct a systematic review of external validation studies on the use of different Artificial Intelligence algorithms in breast cancer screening with mammography. Data source: Our systematic review was conducted and reported following the PRISMA statement, using the PubMed, EMBASE, and Cochrane databases with the search terms "Artificial Intelligence," "Mammography," and their respective MeSH terms. We restricted the search to publications in English from the past ten years (2014-2024). Study selection: A total of 1,878 articles were found in the databases searched. After removing duplicates (373) and excluding those that did not address our PICO question (1,475), 30 studies were included in this work. Data collection: The data from the studies were collected independently by five authors and subsequently synthesized based on sample data, location, year, and their main results in terms of AUC, sensitivity, and specificity. Data synthesis: When Artificial Intelligence was used independently, the Area Under the ROC Curve (AUC) and sensitivity were similar to those of radiologists. When it was used in conjunction with radiologists, statistically higher accuracy in mammogram evaluation was reported compared to the assessment by radiologists alone. Conclusion: AI algorithms have emerged as a means to complement and enhance the performance and accuracy of radiologists. They also assist less experienced professionals in detecting possible lesions. Furthermore, this tool can be used to complement and improve the analyses conducted by medical professionals.
Subjects
Artificial Intelligence, Breast Neoplasms, Mammography, Mammography/methods, Humans, Female, Breast Neoplasms/diagnostic imaging, Early Detection of Cancer/methods, Sensitivity and Specificity, Algorithms, Validation Studies as Topic
ABSTRACT
The integration of Intelligent Virtual Assistants into everyday life marks a milestone in the history of human-machine communication. Due to their interactive characteristics, they are increasingly being appropriated and developed for caregiving purposes, especially in the field of mental health. This article aims to understand whether and how the Brazilian regulatory debate provides tools to address the challenges and concerns of these Artificial Intelligence systems concerning mental health. Through a document analysis, we map examples of Intelligent Virtual Assistants' applications to mental health to identify risks to users' rights and evaluate whether the current and the proposed Brazilian legislation offer sufficient protection to address these risks. Through a critical approach, we highlight the inadequacy of current Brazilian legislation and the need to expand the debate on how to balance the potential risks and benefits of these technologies.
Subjects
Humans, Socioeconomic Factors, Artificial Intelligence, Technological Development, Mental Health, Communications Media, Legislation as Topic, Technology, Algorithms, Communication, Congresses as Topic, Handheld Computers, Internet Access, Internet of Things
ABSTRACT
This study aimed to critically evaluate the information provided by ChatGPT on the role of lactate in fatigue and muscle pain during physical exercise. We entered the prompt "What is the cause of fatigue and pain during exercise?" in ChatGPT versions 3.5 and 4o. In both versions, ChatGPT associated muscle fatigue with glycogen depletion and "lactic acid" accumulation, whereas pain was linked to processes such as inflammation and microtrauma. We deepened the investigation with ChatGPT 3.5, using user feedback to question the accuracy of the information about lactate. The response was then reformulated, presenting the scientific debate about the true role of lactate in physical exercise and debunking the idea that it is the primary cause of muscle fatigue and pain. We also created a "well-crafted prompt," which included persona identification and thematic characterization, resulting in much more accurate information in both the ChatGPT 3.5 and 4o models, ranging from the physiological process of lactate to its true role in physical exercise. The results indicated that the accuracy of the responses provided by ChatGPT can vary depending on the data available in its database and, more importantly, on how the question is formulated. Therefore, it is indispensable that educators guide their students in using the AI tool to mitigate risks of misinformation. NEW & NOTEWORTHY: Generative artificial intelligence (AI), exemplified by ChatGPT, provides immediate and easily accessible answers about lactate and exercise. However, the reliability of this information may fluctuate, contingent upon the scope and intricacy of the knowledge derived from the training process before the most recent update. Furthermore, a deep understanding of the basic principles of human physiology becomes crucial for the effective correction and safe use of this technology.
Assuntos
Exercício Físico , Ácido Láctico , Fadiga Muscular , Mialgia , Humanos , Ácido Láctico/sangue , Ácido Láctico/metabolismo , Exercício Físico/fisiologia , Fadiga Muscular/fisiologia , Mialgia/fisiopatologia , Mialgia/etiologia , Mialgia/metabolismo , Fisiologia/educação
RESUMO
Incorporation of dermoscopy and artificial intelligence (AI) is improving healthcare professionals' ability to diagnose melanoma earlier, but these algorithms often suffer from a "black box" issue, where decision-making processes are not transparent, limiting their utility for training healthcare providers. To address this, an automated approach for generating melanoma imaging biomarker cues (IBCs), which mimics the screening cues used by expert dermoscopists, was developed. This study created a one-minute learning environment where dermatologists adopted a sensory cue integration algorithm to combine a single IBC with a risk score built on many IBCs, then immediately tested their performance in differentiating melanoma from benign nevi. Ten participants evaluated 78 dermoscopic images, comprising 39 melanomas and 39 nevi, first without IBCs and then with IBCs. Participants classified each image as melanoma or nevus in both experimental conditions, enabling direct comparative analysis through paired data. With IBCs, average sensitivity improved significantly from 73.69% to 81.57% (p = 0.0051), and the average specificity improved from 60.50% to 67.25% (p = 0.059) for the diagnosis of melanoma. The index of discriminability (d') increased significantly by 0.47 (p = 0.002). Therefore, the incorporation of IBCs can significantly improve physicians' sensitivity in melanoma diagnosis. While more research is needed to validate this approach across other healthcare providers, its use may positively impact melanoma screening practices.
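The index of discriminability mentioned above is conventionally defined in signal detection theory as d' = z(hit rate) - z(false-alarm rate), where z is the inverse of the standard normal cumulative distribution. The sketch below applies that textbook definition to the group-average sensitivity and specificity quoted in the abstract. It is only an illustration: the function name `d_prime` is an assumption, and the abstract does not say how the authors computed d' (it may have been calculated per participant and then averaged), so the result need not reproduce the reported change of 0.47 exactly.

```python
from statistics import NormalDist


def d_prime(sensitivity: float, specificity: float) -> float:
    """Textbook d': z(hit rate) - z(false-alarm rate).

    Both inputs are proportions strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf          # inverse standard normal CDF
    hit_rate = sensitivity            # melanomas correctly called melanoma
    false_alarm_rate = 1.0 - specificity  # nevi wrongly called melanoma
    return z(hit_rate) - z(false_alarm_rate)


# Group averages from the abstract, expressed as proportions.
without_ibc = d_prime(0.7369, 0.6050)
with_ibc = d_prime(0.8157, 0.6725)
change = with_ibc - without_ibc

print(f"d' without IBCs: {without_ibc:.2f}")
print(f"d' with IBCs:    {with_ibc:.2f}")
print(f"change in d':    {change:.2f}")
```

Because d' combines sensitivity and specificity on a common scale, it can rise even when only one of the two improves significantly, which is consistent with the pattern of p-values reported above.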
RESUMO
INTRODUCTION: Pain associated with temporomandibular dysfunction (TMD) is often confused with odontogenic pain, which poses a challenge in endodontic diagnosis. Validated screening questionnaires can aid in the identification and differentiation of the source of pain. Therefore, this study aimed to develop a virtual assistant based on artificial intelligence using natural language processing techniques to automate the initial screening of patients with tooth pain. METHODS: The PAINe chatbot was developed in Python (Python Software Foundation, Beaverton, OR) using the PyCharm (JetBrains, Prague, Czech Republic) environment, the openai library to integrate the ChatGPT 4 API (OpenAI, San Francisco, CA), and the Streamlit library (Snowflake Inc, San Francisco, CA) for interface construction. The validated TMD Pain Screener questionnaire and 1 question regarding the current pain intensity were integrated into the chatbot to perform the differential diagnosis of TMD in patients with tooth pain. The accuracy of the responses was evaluated in 50 random scenarios to compare the chatbot with the validated questionnaire. The kappa coefficient was calculated to assess the level of agreement between the chatbot responses and the validated questionnaire. RESULTS: The chatbot achieved an accuracy rate of 86% and a substantial level of agreement (κ = 0.70). Most responses were clear and provided adequate information about the diagnosis. CONCLUSIONS: The implementation of a virtual assistant using natural language processing based on large language models for initial differential diagnosis screening of patients with tooth pain demonstrated substantial agreement between validated questionnaires and the chatbot. This approach emerges as a practical and efficient option for screening these patients.
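To make the agreement statistic concrete, the following sketch computes Cohen's kappa from a square agreement table, using the standard formula κ = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The abstract does not report the underlying cell counts, so the 2×2 table below is HYPOTHETICAL: it was chosen only so that 43 of 50 scenarios agree (86%), and the function name `cohen_kappa` is likewise an assumption for illustration.

```python
def cohen_kappa(table):
    """Cohen's kappa from a square agreement table.

    Rows are the categories assigned by one rater (e.g. the questionnaire),
    columns the categories assigned by the other (e.g. the chatbot).
    """
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / total
    # Chance agreement: sum over categories of the product of the marginals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts for 50 scenarios (not the study's data):
# rows = questionnaire (TMD positive, TMD negative), columns = chatbot.
table = [
    [27, 4],
    [3, 16],
]
kappa = cohen_kappa(table)
print(f"kappa = {kappa:.2f}")
```

With these made-up counts, observed agreement is 0.86, chance agreement is 0.524, and kappa is about 0.71. Kappa falls below raw accuracy because part of the agreement would be expected by chance alone.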
RESUMO
Amelogenesis imperfecta (AI) is a genetic disease characterized by poor formation of tooth enamel. AI occurs due to mutations, especially in AMEL, ENAM, KLK4, MMP20, and FAM83H, associated with changes in matrix proteins, matrix proteases, cell-matrix adhesion proteins, and transport proteins of enamel. Due to the wide variety of phenotypes, the diagnosis of AI is complex, requiring a genetic test to characterize it better. Thus, there is a demand for developing low-cost, noninvasive, and accurate platforms for AI diagnostics. This case-control pilot study aimed to test whether salivary vibrational modes obtained by attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy, together with machine learning algorithms (linear discriminant analysis [LDA], random forest, and support vector machine [SVM]), could be used to discriminate AI from control subjects based on changes in salivary components. The best-performing SVM algorithm discriminated AI from matched control subjects with a sensitivity of 100%, specificity of 79%, and accuracy of 88%. The five vibrational modes with the highest feature importance in the Shapley Additive Explanations (SHAP) analysis of this best-performing SVM algorithm were 1010 cm⁻¹, 1013 cm⁻¹, 1002 cm⁻¹, 1004 cm⁻¹, and 1011 cm⁻¹, suggesting this pre-validated salivary infrared spectral area as a potential biomarker for AI screening. In summary, ATR-FTIR spectroscopy and machine learning algorithms can be used on saliva samples to discriminate AI and should be further explored as a screening tool.
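The three performance figures quoted above follow from a classifier's confusion matrix. The sketch below shows the standard definitions; the function name `screening_metrics` and the counts passed to it are HYPOTHETICAL, chosen only to keep the arithmetic easy, since the abstract does not report the number of subjects in each cell.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: affected subjects classified correctly / incorrectly.
    tn/fp: control subjects classified correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)            # share of affected subjects detected
    specificity = tn / (tn + fp)            # share of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# HYPOTHETICAL counts (not the study's data): 10 affected, 10 controls.
sens, spec, acc = screening_metrics(tp=10, fn=0, tn=8, fp=2)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

A profile with perfect sensitivity and lower specificity, as in this toy case, means no affected subject is missed but some controls are flagged, which is usually the preferred trade-off for a screening test that is followed by confirmatory testing.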
Assuntos
Amelogênese Imperfeita , Aprendizado de Máquina , Saliva , Humanos , Amelogênese Imperfeita/diagnóstico , Amelogênese Imperfeita/genética , Amelogênese Imperfeita/metabolismo , Saliva/metabolismo , Saliva/química , Espectroscopia de Infravermelho com Transformada de Fourier/métodos , Feminino , Estudos de Casos e Controles , Masculino , Algoritmos , Adulto , Máquina de Vetores de Suporte , Projetos Piloto , Análise Discriminante , Biomarcadores , Triagem/métodos , Adolescente , Adulto Jovem
RESUMO
OBJECTIVE: To investigate the performance of ChatGPT in the differential diagnosis of oral and maxillofacial diseases. METHODS: Thirty-seven oral and maxillofacial lesion findings were presented to ChatGPT-3.5 and -4, 18 dental surgeons trained in oral medicine/pathology (OMP), 23 general dental surgeons (DDS), and 16 dental students (DS) for differential diagnosis. Additionally, a group of 15 general dentists was asked to describe 11 cases to the ChatGPT versions. The ChatGPT-3.5, ChatGPT-4, and human primary and alternative diagnoses were rated by 2 independent investigators on a 4-point Likert scale. The consistency of ChatGPT-3.5 and -4 was evaluated with regenerated inputs. RESULTS: Moderate consistency of outputs was observed for ChatGPT-3.5 and -4 in providing primary (κ = 0.532 and κ = 0.533, respectively) and alternative (κ = 0.337 and κ = 0.367, respectively) hypotheses. The mean rate of correct diagnoses was 64.86% for ChatGPT-3.5, 80.18% for ChatGPT-4, 86.64% for OMP, 24.32% for DDS, and 16.67% for DS. The mean correct primary hypothesis rates were 45.95% for ChatGPT-3.5, 61.80% for ChatGPT-4, 82.28% for OMP, 22.72% for DDS, and 15.77% for DS. The mean correct diagnosis rate for ChatGPT-3.5 with standard descriptions was 64.86%, compared to 45.95% with participants' descriptions. For ChatGPT-4, the mean was 80.18% with standard descriptions and 61.80% with participants' descriptions. CONCLUSION: ChatGPT-4 demonstrates an accuracy comparable to that of specialists in providing differential diagnoses for oral and maxillofacial diseases. The consistency of ChatGPT in providing diagnostic hypotheses for oral disease cases is moderate, representing a weakness for clinical application. The quality of case documentation and descriptions significantly affects the performance of ChatGPT.
CLINICAL RELEVANCE: General dentists, dental students and specialists in oral medicine and pathology may benefit from ChatGPT-4 as an auxiliary method to define differential diagnosis for oral and maxillofacial lesions, but its accuracy is dependent on precise case descriptions.