Results 1 - 20 of 1,909
1.
Ophthalmol Sci ; 5(1): 100599, 2025.
Article in English | MEDLINE | ID: mdl-39346574

ABSTRACT

Purpose: To evaluate the capabilities of Chat Generative Pre-Trained Transformer (ChatGPT), as a large language model (LLM), for diagnosing glaucoma using the Ocular Hypertension Treatment Study (OHTS) dataset, and to compare the diagnostic capability of ChatGPT 3.5 and ChatGPT 4.0. Design: Prospective data collection study. Participants: A total of 3170 eyes of 1585 subjects from the OHTS were included in this study. Methods: We selected demographic, clinical, ocular, visual field, optic nerve head photo, and history of disease parameters of each participant and developed case reports by converting tabular data into textual format based on information from both eyes of all subjects. We then developed a procedure using the application programming interface of ChatGPT, an LLM-based chatbot, to automatically input prompts into a chat box. This was followed by querying 2 different generations of ChatGPT (versions 3.5 and 4.0) regarding the underlying diagnosis of each subject. We then evaluated the output responses based on several objective metrics. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC), accuracy, specificity, sensitivity, and F1 score. Results: Chat Generative Pre-Trained Transformer 3.5 achieved an AUC of 0.74, accuracy of 66%, specificity of 64%, sensitivity of 85%, and F1 score of 0.72. Chat Generative Pre-Trained Transformer 4.0 obtained an AUC of 0.76, accuracy of 87%, specificity of 90%, sensitivity of 61%, and F1 score of 0.92. Conclusions: The accuracy of ChatGPT 4.0 in diagnosing glaucoma based on input data from the OHTS was promising. The overall accuracy of ChatGPT 4.0 was higher than that of ChatGPT 3.5; however, ChatGPT 3.5 was more sensitive than ChatGPT 4.0. In its current form, ChatGPT may serve as a useful tool for exploring the disease status of ocular hypertensive eyes when specific data are available for analysis. In the future, leveraging LLMs with multimodal capabilities, allowing for the integration of imaging and diagnostic testing as part of the analyses, could further enhance diagnostic capabilities and accuracy. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
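
A minimal sketch of the querying-and-scoring pipeline described above, assuming the openai and scikit-learn packages; the prompt wording, label parsing, and the case_reports/y_true inputs are illustrative assumptions, not the study's actual code:

# Send a textual case report to the ChatGPT API, parse a yes/no glaucoma
# call, and score the calls against ground-truth labels.
from openai import OpenAI
from sklearn.metrics import roc_auc_score, accuracy_score, f1_score, confusion_matrix

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def diagnose(case_report: str, model: str = "gpt-4") -> int:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": f"{case_report}\n\nDoes this patient have glaucoma? Answer Yes or No."}],
        temperature=0,
    )
    return 1 if resp.choices[0].message.content.strip().lower().startswith("yes") else 0

# case_reports (list of strings) and y_true (0/1 labels) are assumed inputs
preds = [diagnose(r) for r in case_reports]
tn, fp, fn, tp = confusion_matrix(y_true, preds).ravel()
print("AUC", roc_auc_score(y_true, preds),
      "accuracy", accuracy_score(y_true, preds),
      "sensitivity", tp / (tp + fn), "specificity", tn / (tn + fp),
      "F1", f1_score(y_true, preds))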

2.
Ophthalmol Sci ; 5(1): 100600, 2025.
Article in English | MEDLINE | ID: mdl-39346575

ABSTRACT

Objective: Large language models such as ChatGPT have demonstrated significant potential in question-answering within ophthalmology, but there is a paucity of literature evaluating their ability to generate clinical assessments and discussions. The objectives of this study were to (1) assess the accuracy of assessments and plans generated by ChatGPT and (2) evaluate ophthalmologists' ability to distinguish between responses generated by clinicians versus ChatGPT. Design: Cross-sectional mixed-methods study. Subjects: Sixteen ophthalmologists from a single academic center, of which 10 were board-eligible and 6 were board-certified, were recruited to participate in this study. Methods: Prompt engineering was used to ensure that ChatGPT generated discussions in the style of the ophthalmologist author of the Medical College of Wisconsin Ophthalmic Case Studies. Cases where ChatGPT accurately identified the primary diagnoses were included and then paired. Masked human-generated and ChatGPT-generated discussions were sent to participating ophthalmologists to identify the author of the discussions. Response confidence was assessed using a 5-point Likert scale, and subjective feedback was manually reviewed. Main Outcome Measures: Accuracy of ophthalmologist identification of discussion author, as well as subjective perceptions of human-generated versus ChatGPT-generated discussions. Results: Overall, ChatGPT correctly identified the primary diagnosis in 15 of 17 (88.2%) cases. Two cases were excluded from the paired comparison due to hallucinations or fabrications of non-user-provided data. Ophthalmologists correctly identified the author in 77.9% ± 26.6% of the 13 included cases, with a mean Likert scale confidence rating of 3.6 ± 1.0. No significant differences in performance or confidence were found between board-certified and board-eligible ophthalmologists. Subjectively, ophthalmologists found that discussions written by ChatGPT tended to be more generic, include irrelevant information, hallucinate more frequently, and exhibit distinct syntactic patterns (all P < 0.01). Conclusions: Large language models have the potential to synthesize clinical data and generate ophthalmic discussions. While these findings have exciting implications for artificial intelligence-assisted health care delivery, more rigorous real-world evaluation of these models is necessary before clinical deployment. Financial Disclosures: The author(s) have no proprietary or commercial interest in any materials discussed in this article.

3.
Digit Health ; 10: 20552076241284771, 2024.
Article in English | MEDLINE | ID: mdl-39386109

ABSTRACT

Purpose: Large language models (LLMs) are deep learning models designed to comprehend and generate meaningful responses, and they have gained public attention in recent years. The purpose of this study is to evaluate and compare the performance of LLMs in answering questions regarding breast cancer in the Chinese context. Material and Methods: ChatGPT, ERNIE Bot, and ChatGLM were chosen to answer 60 questions related to breast cancer posed by two oncologists. Responses were scored as comprehensive, correct but inadequate, mixed with correct and incorrect data, completely incorrect, or unanswered. The accuracy, length, and readability of answers from the different models were compared using statistical software. Results: ChatGPT answered 60 questions, with 40 (66.7%) comprehensive answers and six (10.0%) correct but inadequate answers. ERNIE Bot answered 60 questions, with 34 (56.7%) comprehensive answers and seven (11.7%) correct but inadequate answers. ChatGLM generated 60 answers, with 35 (58.3%) comprehensive answers and six (10.0%) correct but inadequate answers. The differences in the chosen accuracy metrics among the three LLMs did not reach statistical significance, although only ChatGPT demonstrated a sense of human compassion. The accuracy of the three models in answering questions regarding breast cancer treatment was the lowest, with an average of 44.4%. ERNIE Bot's responses were significantly shorter than those of ChatGPT and ChatGLM (p < .001 for both). The readability scores of the three models showed no statistical significance. Conclusions: In the Chinese context, the capabilities of ChatGPT, ERNIE Bot, and ChatGLM are similar in answering breast cancer-related questions at present. These three LLMs may serve as adjunct informational tools for breast cancer patients in the Chinese context, offering guidance for general inquiries. However, for highly specialized issues, particularly in the realm of breast cancer treatment, LLMs cannot deliver reliable performance, and they should be used under the supervision of healthcare professionals.
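
The category and length comparisons reported above can be sketched as follows; the contingency counts come from the abstract, the length arrays are placeholders, and the abstract does not name the exact tests, so a chi-square test and a Mann-Whitney U are assumed here:

# Compare answer-quality categories across the three models, then compare
# answer lengths between two of them.
from scipy.stats import chi2_contingency, mannwhitneyu

# rows: ChatGPT, ERNIE Bot, ChatGLM; cols: comprehensive vs. all other ratings
table = [[40, 20], [34, 26], [35, 25]]
chi2, p, dof, _ = chi2_contingency(table)
print(f"accuracy categories: chi2={chi2:.2f}, p={p:.3f}")  # not significant

# ernie_lengths / chatgpt_lengths: placeholder word counts, one per answer
u, p_len = mannwhitneyu(ernie_lengths, chatgpt_lengths)
print(f"ERNIE Bot vs ChatGPT length: U={u:.0f}, p={p_len:.4f}")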

5.
Sci Rep ; 14(1): 22800, 2024 10 01.
Article in English | MEDLINE | ID: mdl-39354005

ABSTRACT

With growing technology, it is easier for students to accomplish their assignments with little or no effort. This study assesses attitudes, opinions, perceptions, and perceived obstacles regarding ChatGPT among healthcare students at King Saud University, Riyadh, Saudi Arabia. A cross-sectional study among healthcare students was conducted from the beginning of January to April 2024 via social media platforms using a prevalidated questionnaire. The study included students from the Colleges of Pharmacy, Nursing, and Emergency Medical Services (EMS). A total of 354 students were included, and the data were analyzed using the Statistical Package for the Social Sciences (SPSS). Among the healthcare students, 39.8% (n = 141) were between 23 and 24 years old, and most were male (68.4%; n = 242). The majority (91.2%; n = 323) were familiar with the term "ChatGPT", and 75.1% (n = 266) were comfortable to some extent using ChatGPT in their academic activity. Although only 22% (n = 78) stated they had used ChatGPT in their academic work, 29.5% perceived ChatGPT as a tool for information gathering. In addition, 66.7% showed positive attitudes toward ChatGPT, 61.3% perceived that ChatGPT may increase productivity, and 84.7% felt it has a positive impact on education. The mean perception score was significantly higher among female students (22.64 ± 6.36) than male students (21.71 ± 4.66) (t = 1.551; p = 0.001). Similarly, the mean perception score was higher among pharmacy students (22.99 ± 5.48) compared with other healthcare students (F = 4.941; p = 0.008). The findings indicate that most healthcare students are familiar with ChatGPT, hold positive perceptions of it, and believe it may increase productivity and have a positive impact on education. We therefore recommend that future studies concentrate on enhancing the ethical and successful application of AI tools such as ChatGPT in the academic setting.
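
A brief sketch of the reported comparisons (the study used SPSS; this mirrors them with scipy, and the score arrays are placeholders):

# Sex difference in perception scores (independent t-test) and difference
# across the three colleges (one-way ANOVA).
from scipy.stats import ttest_ind, f_oneway

t, p = ttest_ind(female_scores, male_scores)  # placeholder arrays
print(f"t = {t:.3f}, p = {p:.3f}")

f, p = f_oneway(pharmacy_scores, nursing_scores, ems_scores)
print(f"F = {f:.3f}, p = {p:.3f}")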


Subjects
Perception, Humans, Saudi Arabia, Male, Female, Cross-Sectional Studies, Young Adult, Surveys and Questionnaires, Adult, Attitude of Health Personnel, Students, Pharmacy/psychology, Students/psychology, Students, Health Occupations/psychology
6.
Br J Clin Pharmacol ; 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39359001

ABSTRACT

Drug-drug interactions (DDIs) present a significant health burden, compounded by clinician time constraints and poor patient health literacy. We assessed the ability of ChatGPT (a generative artificial intelligence-based large language model) to predict DDIs in a real-world setting. Demographics, diagnoses, and prescribed medicines for 120 hospitalized patients were input through three standardized prompts to ChatGPT version 3.5 and compared against pharmacist DDI evaluation to estimate diagnostic accuracy. Area under the receiver operating characteristic curve and inter-rater reliability (Cohen's and Fleiss' kappa coefficients) were calculated. ChatGPT's responses differed based on prompt wording style, with higher sensitivity for prompts mentioning 'drug interaction'. Confusion matrices displayed low true positive and high true negative rates, and there was minimal agreement between ChatGPT and pharmacists (Cohen's kappa values 0.077-0.143). Low sensitivity values suggest a lack of success in identifying DDIs by ChatGPT, and further development is required before it can reliably assess potential DDIs in real-world scenarios.
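
A minimal sketch of the agreement analysis described above, assuming binary per-patient DDI flags; the variable names are placeholders, not the study's data:

# Cohen's kappa between ChatGPT's DDI calls and the pharmacist reference,
# plus sensitivity/specificity from the confusion matrix.
from sklearn.metrics import cohen_kappa_score, confusion_matrix

kappa = cohen_kappa_score(gpt_flags, pharmacist_flags)  # 0/1 lists, assumed
tn, fp, fn, tp = confusion_matrix(pharmacist_flags, gpt_flags).ravel()
print(f"kappa = {kappa:.3f}")  # the study reported 0.077-0.143
print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")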

7.
Cureus ; 16(8): e68307, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39350844

ABSTRACT

Introduction The study assesses the readability of AI-generated brochures for common emergency medical conditions such as heart attack, anaphylaxis, and syncope. The study thus compares AI-generated responses for patient information guides of common emergency medical conditions produced by ChatGPT and Google Gemini. Methodology Brochures for each condition were created by both AI tools. Readability was assessed using the Flesch-Kincaid Calculator, evaluating word count, sentence count, and ease of understanding. Reliability was measured using the Modified DISCERN Score. The similarity between AI outputs was determined using Quillbot. Statistical analysis was performed with R (v4.3.2). Results ChatGPT and Gemini produced brochures with no statistically significant differences in word count (p = 0.2119), sentence count (p = 0.1276), readability (p = 0.3796), or reliability (p = 0.7407). However, ChatGPT provided more detailed content, with 32.4% more words (582.80 vs. 440.20) and 51.6% more sentences (67.00 vs. 44.20). In addition, Gemini's brochures were slightly easier to read, with a higher ease score (50.62 vs. 41.88). Reliability varied by topic, with ChatGPT scoring higher for Heart Attack (4 vs. 3) and Choking (3 vs. 2) and Google Gemini scoring higher for Anaphylaxis (4 vs. 3) and Drowning (4 vs. 3), highlighting the need for topic-specific evaluation. Conclusions AI-generated brochures from ChatGPT and Gemini are comparable for patient information on emergency medical conditions, with no statistically significant differences in readability or reliability between the two tools.
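
The readability assessment can be approximated with the textstat package, which implements the Flesch formulas used by the online calculator; the brochure file below is hypothetical:

# Word count, sentence count, and Flesch Reading Ease for one brochure.
import textstat

brochure = open("chatgpt_heart_attack_brochure.txt").read()  # hypothetical file
print("words:", textstat.lexicon_count(brochure))
print("sentences:", textstat.sentence_count(brochure))
print("Flesch Reading Ease:", textstat.flesch_reading_ease(brochure))  # higher = easier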

8.
Cureus ; 16(8): e68298, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39350878

ABSTRACT

GPT-4 Vision (GPT-4V) represents a significant advancement in multimodal artificial intelligence, enabling text generation from images without specialized training. This marks the transformation of ChatGPT from a large language model (LLM) into GPT-4's promised large multimodal model (LMM). As these AI models continue to advance, they may enhance radiology workflow and aid with decision support. This technical note explores potential GPT-4V applications in radiology and evaluates performance for sample tasks. GPT-4V capabilities were tested using images from the web, personal and institutional teaching files, and hand-drawn sketches. Prompts evaluated scientific figure analysis, radiologic image reporting, image comparison, handwriting interpretation, sketch-to-code, and artistic expression. In this limited demonstration of GPT-4V's capabilities, it showed promise in classifying images, counting entities, comparing images, and deciphering handwriting and sketches. However, it exhibited limitations in detecting some fractures, discerning changes in lesion size, accurately interpreting complex diagrams, and consistently characterizing radiologic findings. Artistic expression responses were coherent. While GPT-4V may eventually assist with tasks related to radiology, current reliability gaps highlight the need for continued training and improvement before consideration for any medical use by the general public and ultimately clinical integration. Future iterations could enable a virtual assistant to discuss findings, improve reports, extract data from images, and provide decision support based on guidelines, white papers, and appropriateness criteria. Human expertise remains essential for safe practice, and partnerships between physicians, researchers, and technology leaders are necessary to safeguard against risks like bias and privacy concerns.
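
A minimal sketch of the kind of multimodal call discussed above, using the openai package; the model name, prompt, and image URL are assumptions, and this is illustrative only, not for clinical use:

# One message combining a text prompt with an image URL, sent to a
# vision-capable model.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the radiologic findings in this image."},
            {"type": "image_url", "image_url": {"url": "https://example.org/chest_xray.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)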

9.
Front Artif Intell ; 7: 1393903, 2024.
Article in English | MEDLINE | ID: mdl-39351510

ABSTRACT

Introduction: Recent advances in generative Artificial Intelligence (AI) and Natural Language Processing (NLP) have led to the development of Large Language Models (LLMs) and AI-powered chatbots like ChatGPT, which have numerous practical applications. Notably, these models assist programmers with coding queries, debugging, solution suggestions, and guidance on software development tasks. Despite known issues with the accuracy of ChatGPT's responses, its comprehensive and articulate language continues to attract frequent use. This indicates potential for ChatGPT to support educators and serve as a virtual tutor for students. Methods: To explore this potential, we conducted a comprehensive analysis comparing the emotional content of ChatGPT and human answers to 2000 questions sourced from Stack Overflow (SO). The emotional aspects of the answers were examined to understand how the emotional tone of AI responses compares to that of human responses. Results: Our analysis revealed that ChatGPT's answers are generally more positive than human responses. In contrast, human answers often exhibit emotions such as anger and disgust. Significant differences were observed in emotional expression between ChatGPT and human responses, particularly for anger, disgust, and joy. Human responses displayed a broader emotional spectrum than ChatGPT, suggesting greater emotional variability among humans. Discussion: The findings highlight a distinct emotional divergence between ChatGPT and human responses, with ChatGPT exhibiting a more uniformly positive tone and humans displaying a wider range of emotions. This variance underscores the need for further research into the role of emotional content in AI and human interactions, particularly in educational contexts where emotional nuances can impact learning and communication.
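
A sketch of how such an emotion comparison could be run; the classifier model is an assumption (the paper's exact tooling is not named here), and so_answers/gpt_answers are placeholder lists of answer texts:

# Score each answer with an off-the-shelf emotion classifier and tally the
# predicted labels per source.
from collections import Counter
from transformers import pipeline

clf = pipeline("text-classification",
               model="j-hartmann/emotion-english-distilroberta-base")

def emotion_counts(answers):
    return Counter(clf(a, truncation=True)[0]["label"] for a in answers)

print("human:", emotion_counts(so_answers))    # e.g. anger, disgust, joy...
print("ChatGPT:", emotion_counts(gpt_answers))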

10.
JMIR Form Res ; 8: e51383, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39353189

ABSTRACT

BACKGROUND: Generative artificial intelligence (AI) and large language models, such as OpenAI's ChatGPT, have shown promising potential in supporting medical education and clinical decision-making, given their vast knowledge base and natural language processing capabilities. As a general-purpose AI system, ChatGPT can complete a wide range of tasks, including differential diagnosis, without additional training. However, the specific application of ChatGPT in learning and applying a series of specialized, context-specific tasks mimicking the workflow of a human assessor, such as administering a standardized assessment questionnaire, inputting assessment results in a standardized form, and interpreting assessment results strictly following credible, published scoring criteria, has not been thoroughly studied. OBJECTIVE: This exploratory study aims to evaluate and optimize ChatGPT's capabilities in administering and interpreting the Sour Seven Questionnaire, an informant-based delirium assessment tool. Specifically, the objectives were to train ChatGPT-3.5 and ChatGPT-4 to understand and correctly apply the Sour Seven Questionnaire to clinical vignettes using prompt engineering, assess the performance of these AI models in identifying and scoring delirium symptoms against scores from human experts, and refine and enhance the models' interpretation and reporting accuracy through iterative prompt optimization. METHODS: We used prompt engineering to train ChatGPT-3.5 and ChatGPT-4 models on the Sour Seven Questionnaire, a tool for assessing delirium through caregiver input. Prompt engineering is a methodology used to enhance the AI's processing of inputs by meticulously structuring the prompts to improve accuracy and consistency in outputs. In this study, prompt engineering involved creating specific, structured commands that guided the AI models in understanding and applying the assessment tool's criteria accurately to clinical vignettes. This approach also included designing prompts to explicitly instruct the AI on how to format its responses, ensuring they were consistent with clinical documentation standards. RESULTS: Both ChatGPT models demonstrated promising proficiency in applying the Sour Seven Questionnaire to the vignettes, despite initial inconsistencies and errors. Performance notably improved through iterative prompt engineering, enhancing the models' capacity to detect delirium symptoms and assign scores. Prompt optimizations included adjusting the scoring methodology to accept only definitive "Yes" or "No" responses, revising the evaluation prompt to mandate responses in a tabular format, and guiding the models to adhere to the 2 recommended actions specified in the Sour Seven Questionnaire. CONCLUSIONS: Our findings provide preliminary evidence supporting the potential utility of AI models such as ChatGPT in administering standardized clinical assessment tools. The results highlight the significance of context-specific training and prompt engineering in harnessing the full potential of these AI models for health care applications. Despite the encouraging results, broader generalizability and further validation in real-world settings warrant additional research.
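
A condensed illustration of the structured-prompt approach described in the methods; the wording below is hypothetical, not the study's actual prompt, and the real Sour Seven items and scoring weights must come from the published instrument:

# A system prompt enforcing yes/no scoring and tabular output, paired with a
# clinical vignette as the user message.
SYSTEM_PROMPT = """You are scoring the Sour Seven Questionnaire, an
informant-based delirium screen. For each of the 7 symptoms, answer only
"Yes" or "No" based strictly on the vignette, then give the weighted score
for each item. Report your output as a table with columns:
Item | Yes/No | Score. Finish with the total score and the recommended
action defined by the questionnaire's published cutoffs."""

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": vignette_text},  # clinical vignette, assumed
]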


Subjects
Delirium, Humans, Delirium/diagnosis, Surveys and Questionnaires, Artificial Intelligence
11.
Cureus ; 16(9): e68521, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39364514

ABSTRACT

Background There has been a significant increase in cervical fusion procedures, both anterior and posterior, across the United States. Despite this upward trend, limited research exists on adherence to evidence-based medicine (EBM) guidelines for cervical fusion, highlighting a gap between recommended practices and surgeon preferences. Additionally, patients are increasingly utilizing large language models (LLMs) to aid in decision-making. Methodology This observational study evaluated the capacity of four LLMs, namely, Bard, BingAI, ChatGPT-3.5, and ChatGPT-4, to adhere to EBM guidelines, specifically the 2023 North American Spine Society (NASS) cervical fusion guidelines. Ten clinical vignettes were created based on NASS recommendations to determine when fusion was indicated. This novel approach assessed LLM performance in a clinical decision-making context without requiring institutional review board approval, as no human subjects were involved. Results No LLM achieved complete concordance with NASS guidelines, though ChatGPT-4 and Bing Chat exhibited the highest adherence at 60%. Discrepancies were notably observed in scenarios involving head-drop syndrome and pseudoarthrosis, where all LLMs failed to align with NASS recommendations. Additionally, only 25% of LLMs agreed with NASS guidelines for fusion in cases of cervical radiculopathy and as an adjunct to facet cyst resection. Conclusions The study underscores the need for improved LLM training on clinical guidelines and emphasizes the importance of considering the nuances of individual patient cases. While LLMs hold promise for enhancing guideline adherence in cervical fusion decision-making, their current performance indicates a need for further refinement and integration with clinical expertise to ensure optimal patient care. This study contributes to understanding the role of AI in healthcare, advocating for a balanced approach that leverages technological advancements while acknowledging the complexities of surgical decision-making.

12.
J Med Internet Res ; 26: e51635, 2024 Oct 04.
Article in English | MEDLINE | ID: mdl-39365643

ABSTRACT

Hospital pharmacy plays an important role in ensuring medical care quality and safety, especially in the areas of drug information retrieval, therapy guidance, and drug-drug interaction management. ChatGPT is a powerful artificial intelligence language model that can generate natural-language texts. Here, we explored the applications of and reflections on ChatGPT in hospital pharmacy, where it may enhance the quality and efficiency of pharmaceutical care. We also explored ChatGPT's prospects in hospital pharmacy and discussed its working principle, diverse applications, and practical cases in daily operations and scientific research. Meanwhile, the challenges and limitations of ChatGPT, such as data privacy, ethical issues, bias and discrimination, and human oversight, are discussed. ChatGPT is a promising tool for hospital pharmacy, but it requires careful evaluation and validation before it can be integrated into clinical practice. Some suggestions for future research and development of ChatGPT in hospital pharmacy are provided.


Subjects
Pharmacy Service, Hospital, Humans, Artificial Intelligence, Natural Language Processing
13.
Int Wound J ; 21(10): e70055, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39353602

ABSTRACT

Diabetic foot ulcers (DFUs) are a growing public health problem, paralleling the increasing incidence of diabetes. While prevention is the most effective treatment for DFUs, selecting the optimal treatment in established cases remains a challenge. Health sciences have greatly benefited from the integration of artificial intelligence (AI) applications across various fields. Regarding amputations in DFUs, both the literature and clinical practice have mainly focused on strategies to prevent amputation and identify avoidable risk factors. However, there are very limited data on assistive parameters/tools that can be used to determine the level of amputation. This study investigated how well ChatGPT, in its recently released version 4o, matches the amputation-level selection of an experienced team in this field. For this purpose, clinical photographs from patients who underwent amputations due to diabetic foot ulcers between May 2023 and May 2024 were submitted to the ChatGPT-4o program. The AI was tasked with recommending an appropriate amputation level based on these clinical photographs. Data from a total of 60 patients were analysed, with a median age of 64.5 years (range: 41-91). According to the Wagner Classification, 32 patients (53.3%) had grade 4 ulcers, 16 patients (26.6%) had grade 5 ulcers, 10 patients (16.6%) had grade 3 ulcers, and 2 patients (3.3%) had grade 2 ulcers. A one-to-one correspondence between the AI tool's recommended amputation level and the level actually performed was observed in 50 of 60 cases (83.3%). In the remaining 10 cases, discrepancies were noted, with the AI consistently recommending a more proximal level of amputation than what was performed. The inter-rater agreement analysis between the actual surgeries and the AI tool's recommendations yielded a Cohen's kappa coefficient of 0.808 (SD: 0.055, 95% CI: 0.701-0.916), indicating substantial agreement. Relying solely on clinical photographs, ChatGPT-4o demonstrates decisions that are largely consistent with those of an experienced team in determining the optimal level of amputation for DFUs, with the exception of hindfoot amputations.
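
A sketch of the agreement statistic above, with a bootstrap confidence interval (the paper's exact CI method is not stated here); the level lists are placeholders:

# Cohen's kappa between performed and AI-recommended amputation levels,
# with a percentile-bootstrap 95% CI.
import numpy as np
from sklearn.metrics import cohen_kappa_score

performed = np.array(performed_levels)  # e.g. ["toe", "transmetatarsal", ...]
suggested = np.array(ai_levels)

print("kappa:", cohen_kappa_score(performed, suggested))

rng = np.random.default_rng(0)
boot = []
for _ in range(2000):
    i = rng.integers(0, len(performed), len(performed))  # resample cases
    boot.append(cohen_kappa_score(performed[i], suggested[i]))
print("95% CI:", np.percentile(boot, [2.5, 97.5]))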


Subjects
Amputation, Surgical, Artificial Intelligence, Diabetic Foot, Humans, Diabetic Foot/surgery, Amputation, Surgical/methods, Amputation, Surgical/statistics & numerical data, Aged, Male, Female, Middle Aged, Aged, 80 and over, Adult
14.
Cureus ; 16(10): e70640, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39359332

ABSTRACT

This editorial explores the recent advancements in generative artificial intelligence with the newly released OpenAI o1-Preview, comparing its capabilities to the traditional ChatGPT (GPT-4) model, particularly in the context of healthcare. While ChatGPT has shown many applications for general medical advice and patient interactions, OpenAI o1-Preview introduces new features with advanced reasoning skills using a chain-of-thought process that could enable users to tackle more complex medical queries such as genetic disease discovery, multi-system or complex disease care, and medical research support. The article explores some of the new model's potential and other aspects that may affect its usage, such as slower response times due to its extensive reasoning approach, yet highlights its potential for reducing hallucinations and offering more accurate outputs for complex medical problems. Ethical challenges, data diversity, access equity, and transparency are also discussed, identifying key areas for future research, including optimizing the use of both models in tandem for healthcare applications. The editorial concludes by advocating for collaborative exploration of all large language models (LLMs), including the novel OpenAI o1-Preview, to fully utilize their transformative potential in medicine and healthcare delivery. This model, with its advanced reasoning capabilities, presents an opportunity to empower healthcare professionals, policymakers, and computer scientists to work together in transforming patient care, accelerating medical research, and enhancing healthcare outcomes. By optimizing the use of several LLM models in tandem, healthcare systems may enhance efficiency and precision, as well as mitigate previous LLM challenges, such as ethical concerns, access disparities, and technical limitations, steering toward a new era of artificial intelligence (AI)-driven healthcare.

15.
J Med Internet Res ; 26: e58831, 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39352738

ABSTRACT

BACKGROUND: Artificial intelligence and the language models derived from it, such as ChatGPT, offer immense possibilities, particularly in the field of medicine. It is already evident that ChatGPT can provide adequate and, in some cases, expert-level responses to health-related queries and advice for patients. However, it is currently unknown how patients perceive these capabilities, whether they can derive benefit from them, and whether potential risks, such as harmful suggestions, are detected by patients. OBJECTIVE: This study aims to clarify whether patients can get useful and safe health care advice from an artificial intelligence chatbot assistant. METHODS: This cross-sectional study was conducted using 100 publicly available health-related questions from 5 medical specialties (trauma, general surgery, otolaryngology, pediatrics, and internal medicine) from a web-based platform for patients. Responses generated by ChatGPT-4.0 and by an expert panel (EP) of experienced physicians from the aforementioned web-based platform were packed into 10 sets consisting of 10 questions each. The blinded evaluation was carried out by patients regarding empathy and usefulness (assessed through the question: "Would this answer have helped you?") on a scale from 1 to 5. As a control, evaluation was also performed by 3 physicians in each respective medical specialty, who were additionally asked about the potential harm of the response and its correctness. RESULTS: In total, 200 sets of questions were submitted by 64 patients (mean 45.7, SD 15.9 years; 29/64, 45.3% male), resulting in 2000 evaluated answers each for ChatGPT and the EP. ChatGPT scored higher in terms of empathy (4.18 vs 2.7; P<.001) and usefulness (4.04 vs 2.98; P<.001). Subanalysis revealed a small bias in the empathy ratings given by women in comparison with men (4.46 vs 4.14; P=.049). Ratings of ChatGPT were high regardless of the participant's age. The same highly significant results were observed in the evaluation by the respective specialist physicians, and ChatGPT also scored significantly higher in correctness (4.51 vs 3.55; P<.001). Specialists rated the usefulness (3.93 vs 4.59) and correctness (4.62 vs 3.84) significantly lower in potentially harmful responses from ChatGPT (P<.001); this was not the case among patients. CONCLUSIONS: The results indicate that ChatGPT is capable of supporting patients in health-related queries better than physicians, at least in terms of written advice through a web-based platform. In this study, ChatGPT's responses had a lower percentage of potentially harmful advice than the web-based EP. However, it is crucial to note that this finding is based on a specific study design and may not generalize to all health care settings. Alarmingly, patients are not able to independently recognize these potential dangers.
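
A sketch of how the rating comparison could be computed; the abstract does not name the test, so a Mann-Whitney U on the 1-to-5 ratings is assumed here, with placeholder rating lists:

# Compare per-answer patient empathy ratings for ChatGPT vs. the expert panel.
from statistics import mean
from scipy.stats import mannwhitneyu

u, p = mannwhitneyu(gpt_empathy, ep_empathy, alternative="greater")
print(f"empathy: {mean(gpt_empathy):.2f} vs {mean(ep_empathy):.2f} (p={p:.4f})")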


Subjects
Physician-Patient Relations, Humans, Cross-Sectional Studies, Male, Female, Adult, Middle Aged, Artificial Intelligence, Physicians/psychology, Internet, Empathy, Surveys and Questionnaires
16.
Dig Dis Sci ; 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39354272

ABSTRACT

Artificial Intelligence and Natural Language Processing technology have demonstrated significant promise across several domains within the medical and healthcare sectors, with numerous applications in healthcare. One of the primary challenges in implementing ChatGPT in healthcare is the requirement for precise and up-to-date data. Where sensitive medical information is involved, it is imperative to carefully address concerns regarding privacy and security when using GPT in the healthcare sector. This paper outlines ChatGPT and its relevance in the healthcare industry. It discusses the important aspects of ChatGPT's workflow and highlights the usual features of ChatGPT specifically designed for the healthcare domain. The present review uses the ChatGPT model within the research domain to investigate disorders associated with the hepatic system. This review demonstrates the possible use of ChatGPT in supporting researchers and clinicians in analyzing and interpreting liver-related data, thereby improving disease diagnosis, prognosis, and patient care.

17.
Article in English | MEDLINE | ID: mdl-39356355

ABSTRACT

OBJECTIVE: To investigate the accuracy of information provided by ChatGPT-4o to patients about tracheotomy. METHODS: Twenty common patient questions about tracheotomy were presented to ChatGPT-4o twice, at a 7-day interval. The accuracy, clarity, relevance, completeness, referencing, and usefulness of responses were assessed by a board-certified otolaryngologist and a board-certified intensive care unit (ICU) practitioner with the Quality Analysis of Medical Artificial Intelligence (QAMAI) tool. The interrater reliability and the stability of the ChatGPT-4o responses were evaluated with the intraclass correlation coefficient (ICC) and Pearson correlation analysis. RESULTS: The total QAMAI scores were 22.85 ± 4.75 for the intensive care practitioner and 21.45 ± 3.95 for the otolaryngologist, corresponding to moderate-to-high accuracy. The otolaryngologist and the ICU practitioner showed high interrater agreement (ICC 0.807; 95% CI: 0.655-0.911). The highest QAMAI scores were found for clarity and completeness of explanations; the scores for accuracy of information and referencing were the lowest. The information related to post-laryngectomy tracheostomy remained incomplete or erroneous, and ChatGPT-4o did not provide references for its responses. The stability analysis showed high consistency across regenerated responses. CONCLUSION: The accuracy of ChatGPT-4o is moderate-to-high in providing information related to tracheotomy. However, patients using ChatGPT-4o need to be cautious about information related to tracheotomy care, steps, and the differences between temporary and permanent tracheotomies.
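
A sketch of the interrater analysis, assuming the pingouin package; the score lists are placeholders, and the ICC model/type used by the paper is not specified here:

# ICC between the two raters' QAMAI totals across the 20 responses.
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    "response": list(range(20)) * 2,
    "rater": ["otolaryngologist"] * 20 + ["icu_practitioner"] * 20,
    "score": otol_scores + icu_scores,  # two placeholder lists of 20 totals
})
icc = pg.intraclass_corr(data=df, targets="response", raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])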

18.
Medeni Med J ; 39(3): 221-229, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39350577

ABSTRACT

Objective: Acute post-streptococcal glomerulonephritis (APSGN) is a common cause of acute glomerulonephritis in children. The condition may present as acute nephritic and/or nephrotic syndrome and rarely as rapidly progressive glomerulonephritis. ChatGPT (OpenAI, San Francisco, California, United States of America) is a chatbot supported by artificial intelligence (AI). In this study, we evaluated whether AI can be used in the follow-up of patients with APSGN. Methods: The clinical characteristics of patients with APSGN were noted from patient records. Twelve questions about APSGN were directed to ChatGPT 3.5, and the accuracy of the answers was evaluated by the researchers. Then, the clinical features of the patients were transferred to ChatGPT 3.5, and its follow-up management of the patients was examined. Results: The study included 11 patients with an average age of 9.08±3.96 years. Eight (72.7%) patients had elevated creatinine and 10 (90.9%) had hematuria and/or proteinuria. Anti-streptolysin O was high in all patients (955±353 IU/mL), and C3 was low in 9 (81.8%) patients (0.56±0.34 g/L). Hypertensive encephalopathy, nephrotic syndrome, and rapidly progressive glomerulonephritis were observed in three patients. Normal creatinine levels were achieved in all patients. Questions assessing the definition, epidemiologic characteristics, pathophysiologic mechanisms, diagnosis, and treatment of APSGN were answered correctly by ChatGPT 3.5. ChatGPT 3.5 diagnosed APSGN in all patients, and the treatment steps applied by clinicians were similarly recommended by the model. Conclusions: The insights and recommendations offered by ChatGPT for patients with APSGN can be an asset in the care and management of patients. With AI applications, clinicians can review treatment decisions and create more effective treatment plans.

19.
Hu Li Za Zhi ; 71(5): 7-13, 2024 Oct.
Article in Chinese | MEDLINE | ID: mdl-39350704

ABSTRACT

Artificial intelligence (AI) is driving global change, and the implementation of generative AI in higher education is inevitable. AI language models such as the chat generative pre-trained transformer (ChatGPT) hold the potential to revolutionize the delivery of nursing education in the future. Nurse educators play a crucial role in preparing nursing students for a future technology-integrated healthcare system. While the technology has limitations and potential biases, the emergence of ChatGPT presents both opportunities and challenges. It is critical for faculty to be familiar with the capabilities and limitations of this model to foster effective, ethical, and responsible utilization of AI technology while preparing students in advance for the dynamic and rapidly advancing landscape of nursing and healthcare. Therefore, this article was written to present a strengths, weaknesses, opportunities, and threats (SWOT) analysis of integrating ChatGPT into nursing education, providing a guide for implementing ChatGPT in nursing education and offering a well-rounded assessment to help nurse educators make informed decisions.


Subjects
Artificial Intelligence, Education, Nursing, Humans
20.
Hu Li Za Zhi ; 71(5): 21-28, 2024 Oct.
Article in Chinese | MEDLINE | ID: mdl-39350706

ABSTRACT

The current uses, potential risks, and practical recommendations for using chat generative pre-trained transformers (ChatGPT) in systematic reviews (SRs) and meta-analyses (MAs) are reviewed in this article. The findings of prior research suggest that, for tasks such as literature screening and information extraction, ChatGPT can match or exceed the performance of human experts. However, for complex tasks such as risk of bias assessment, its performance remains significantly limited, underscoring the critical role of human expertise. The use of ChatGPT as an adjunct tool in SRs and MAs requires careful planning and the implementation of strict quality control and validation mechanisms to mitigate potential errors such as those arising from artificial intelligence (AI) 'hallucinations'. This paper also provides specific recommendations for optimizing human-AI collaboration in SRs and MAs. Assessing the specific context of each task and implementing the most appropriate strategies are critical when using ChatGPT in support of research goals. Furthermore, transparency regarding the use of ChatGPT in research reports is essential to maintaining research integrity. Close attention to ethical norms, including issues of privacy, bias, and fairness, is also imperative. Finally, from a human-centered perspective, this paper emphasizes the importance of researchers cultivating continuous self-iteration, prompt engineering skills, critical thinking, cross-disciplinary collaboration, and ethical awareness, with the goals of: continuously optimizing human-AI collaboration models within reasonable and compliant norms, enhancing the complex-task performance of AI tools such as ChatGPT, and, ultimately, achieving greater efficiency through technological innovation while upholding scientific rigor.
