New research suggests that supplying supporting evidence alongside health-related questions can confuse ChatGPT, the AI-powered chatbot, and reduce the accuracy of its answers. The researchers observed a drop in accuracy when evidence was included in the prompt and hypothesise that the extra material introduces too much noise for the model to process effectively.
The study, conducted by researchers from the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and The University of Queensland (UQ), Australia, examined how ChatGPT responded to health-related questions with and without supporting evidence. The team posed 100 questions to the chatbot in both formats and compared its responses to known correct answers based on existing medical knowledge.
The findings showed that ChatGPT achieved 80% accuracy on question-only prompts, but accuracy fell to 63% when evidence was included. The team could not pinpoint the cause of the decline, but suggested that the added evidence may introduce noise that interferes with the chatbot's reasoning.
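The article does not include the team's evaluation code, but the two prompt conditions it describes can be sketched in outline. The snippet below is illustrative only: the sample question, prompt wording, model name, and yes/no scoring rule are assumptions, and it uses the OpenAI Python client rather than anything published by the CSIRO/UQ researchers.

```python
# Hypothetical sketch of the question-only vs. question-plus-evidence comparison.
# Data, prompts, and scoring are illustrative assumptions, not the study's code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Toy item standing in for the study's 100 health questions with known answers.
questions = [
    {
        "question": "Can zinc supplements cure the common cold?",
        "evidence": "A 2012 trial reported shorter cold duration with zinc lozenges.",
        "correct": "no",
    },
]

def ask(prompt: str) -> str:
    """Send one prompt to the model and return its answer, lower-cased."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt + " Answer yes or no."}],
    )
    return resp.choices[0].message.content.strip().lower()

def accuracy(include_evidence: bool) -> float:
    """Fraction of answers that match the known correct label."""
    correct = 0
    for item in questions:
        prompt = item["question"]
        if include_evidence:
            prompt += " Evidence: " + item["evidence"]
        if ask(prompt).startswith(item["correct"]):
            correct += 1
    return correct / len(questions)

print("question-only accuracy:", accuracy(include_evidence=False))
print("with-evidence accuracy:", accuracy(include_evidence=True))
```

In a setup like this, the only variable that changes between runs is whether the evidence sentence is appended to the prompt, which is the contrast the study's 80% versus 63% figures refer to.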
Given the increasing reliance on AI tools like ChatGPT for health-related information, the researchers emphasised the need for further research to understand and mitigate the risks of using large language models online. They also stressed the importance of informing the public about these risks and of improving the accuracy of AI-generated responses in health contexts.