Aim: Artificial intelligence chatbots (AICs) have become increasingly common in the health field, allowing individuals to gather information about their health conditions. However, concerns remain regarding the accuracy, reliability, ethics, and security of the medical information these AICs provide. This study aims to assess the reliability and quality of the information provided by AICs about neuromyelitis optica spectrum disorder (NMOSD), its symptoms, and its treatment options, and to discuss the potential benefits and disadvantages of these tools for NMOSD patients.
Methods: Three AICs (ChatGPT, Gemini, and Perplexity) were queried using the most frequently searched NMOSD keyword identified via Google Trends. Their responses were assessed by three independent examiners for readability, understandability, actionability, reliability, and transparency.
Results: Based on the Coleman-Liau index, the responses were difficult to read and suited to a professional-level audience. Perplexity achieved the highest PEMAT-P understandability score (50%), compared with Gemini (40%) and ChatGPT (40%). For PEMAT-P actionability, Gemini scored highest (48%) and ChatGPT lowest (37%). The reliability of the responses ranged from poor to fair. Treatment information quality, assessed with the DISCERN instrument, was lowest for ChatGPT and highest for Perplexity. None of the AICs addressed treatment side effects, the potential consequences of forgoing treatment, effects on quality of life, or shared decision-making.
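For context, the Coleman-Liau index estimates the U.S. school grade level needed to understand a text from letter and sentence counts alone. The standard formula (included here for reference; it is not reproduced in the study's reporting) is

CLI = 0.0588L - 0.296S - 15.8,

where L is the average number of letters per 100 words and S is the average number of sentences per 100 words; scores of 13 or above correspond to college-level text, consistent with the finding that the responses were suited to professionals. PEMAT-P scores, by comparison, report the percentage of applicable criteria a material satisfies; thresholds of roughly 70% are commonly taken to indicate adequate understandability or actionability, well above the 37% to 50% range observed here.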
Conclusions: The accuracy and reliability of these technologies must be addressed before they are fully integrated into the medical field. Patients should critically evaluate information obtained from AI chatbots and be cautious about relying on them alone for health-related decisions.