Diagnostic accuracy of ChatGPT-4 in light of data derived from patient’s clinical and imaging findings in gastrointestinal surgery

Author :  

Year-Number: 2025-3
Publication Date: 2025-09-29 18:58:33.0
Language: English
Subject: General Surgery; Gastrointestinal Surgery
Number of pages: 28-42

Abstract

Aim: This study aims to evaluate the diagnostic accuracy (DA) of ChatGPT-4 in gastrointestinal surgery (GIS) cases.

Methods: A total of 231 GIS cases dated between October 1, 2021 and January 10, 2024 were sourced from the International Journal of Surgery Case Reports. ChatGPT-4 was used to generate diagnoses based on the patients' medical histories and imaging findings. Each case was categorized by anatomical location (esophagus, stomach, small intestine, or colon) and assigned to an emergency or elective group.

Results: ChatGPT-4 generated 2–10 differential diagnoses per case (median: 5, mean: 4.8±1.2). The number of differential diagnoses was not significantly associated with age or gender (p=0.687 and p=0.862, respectively). The distribution of anatomical locations differed significantly between emergency and elective cases (p<0.001). Overall, ChatGPT-4 achieved 83.1% accuracy for differential diagnoses and 57.1% for final diagnoses (p<0.001 and p=0.035). For cases from the trained period, accuracy was 82.9% (differential) and 59.1% (final); for cases from the non-trained period, it was 83.6% (differential) and 52.2% (final).

Conclusion: In this case-based evaluation of 231 GIS cases, ChatGPT-4 was substantially more accurate at generating differential diagnoses (DA 83.1%) than at establishing final diagnoses (DA 57.1%). Final-diagnosis accuracy was slightly higher for the trained period than for the non-trained period (59.1% vs. 52.2%), whereas differential-diagnosis accuracy was similar in both periods (82.9% vs. 83.6%). These findings indicate that ChatGPT-4 may serve as a valuable clinical decision support tool in the early diagnostic stage by broadening the differential diagnosis list, while its limited accuracy in final diagnosis calls for cautious integration into practice.

Keywords


                                                                                                                                                                                                        