Assessing the Accuracy of AI Language Models in Providing Information on Urinary Incontinence: A Comparative Study

Burhan Coşkun; Omer Bayrak; Gokhan Ocakoglu; Halit Mustafa Acar; Onur Kaygisiz

doi:10.29228/ejhh.71797

Year-Number: 2023-3

Publication Date: 2023-08-22 08:58:37.0

Language: English

Subject: Urology

Pages: 61-70

Abstract

Objective: To assess the accuracy and comprehensiveness of health information on urinary incontinence generated by different large language models (LLMs).

Methods: Using the website www.answerthepublic.com, we retrieved the most frequently searched questions related to urinary incontinence. After exclusion criteria were applied, the selected questions were categorized into definition/diagnosis, causes, treatment, complications, and others, and entered into three LLMs: GPT-3.5, GPT-4, and BARD. Two urologists rated the outputs for accuracy and comprehensiveness using a Likert scale.

Results: Of the initial 630 questions, 38 were selected for analysis. GPT-4 demonstrated superior performance, with 73.68% of its responses achieving the maximum accuracy score, significantly outperforming GPT-3.5 (42.11%) and BARD (28.95%). In terms of comprehensiveness, GPT-4 also led with a score of 71.05%, whereas GPT-3.5 and BARD scored 36.84% and 28.95%, respectively. For the 'causes' category, GPT-4 provided significantly more comprehensive responses.

Conclusion: While all LLMs generated relevant health information on urinary incontinence, GPT-4 showed superior accuracy and comprehensiveness. However, because these models can generate incorrect information, they should be used with caution.
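The percentages in the Results can be read as shares of the 38 analyzed questions. The sketch below is a plausibility check only: the abstract does not report raw counts, so the counts it derives are inferred under the assumption that every percentage (including the comprehensiveness scores) is the fraction of the 38 responses that received the maximum rating. The function names `implied_count` and `percent_of` are illustrative, not taken from the study.

```python
TOTAL_QUESTIONS = 38  # number of questions analyzed, per the Results

# Percentages as reported in the abstract.
reported = {
    "accuracy": {"GPT-4": 73.68, "GPT-3.5": 42.11, "BARD": 28.95},
    "comprehensiveness": {"GPT-4": 71.05, "GPT-3.5": 36.84, "BARD": 28.95},
}


def implied_count(percent, total=TOTAL_QUESTIONS):
    """Nearest whole number of responses matching a reported percentage."""
    return round(percent / 100 * total)


def percent_of(count, total=TOTAL_QUESTIONS):
    """Percentage of `total`, rounded to two decimals as in the abstract."""
    return round(100 * count / total, 2)


# Inferred counts (an assumption, not data reported by the authors).
inferred = {
    metric: {model: implied_count(p) for model, p in scores.items()}
    for metric, scores in reported.items()
}

for metric, counts in inferred.items():
    for model, count in counts.items():
        print(f"{metric:18s} {model:8s} {count}/{TOTAL_QUESTIONS} "
              f"= {percent_of(count)}%")
```

Under this assumption, each reported percentage corresponds to a whole number of responses out of 38, which is consistent with the figures being proportions of maximum-rated answers.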
