A Multi-Chatbot Evaluation Framework for Knee MRI Diagnosis Assistance

Date

2025-10-26

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Knee injuries, particularly abnormalities of the anterior cruciate ligament (ACL) and the meniscus, are frequently diagnosed using MRI scans. MRI interpretation typically requires expert knowledge, and that expertise is not always accessible. Recently, researchers have begun applying Large Language Models (LLMs) in the medical domain to assist with diagnostic interpretation tasks. Here, we investigate the potential of LLM-based chatbots to assist and augment the diagnostic interpretation of knee MRI images. Specifically, we compare ChatGPT-4o, Gemini 2.5 Flash, and Claude Sonnet 4 on their ability to identify three findings: ACL injury, meniscal tear, and abnormality of any type. Using MRI slices as visual input, we evaluated the interpretations produced by these multimodal chatbots against ground-truth labels provided by professional radiologists. Our findings illustrate each chatbot's relative strengths and weaknesses in medical image analysis and contribute evidence towards the development of AI-augmented workflows for medical imaging and radiology.
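
The comparison of chatbot outputs against radiologist labels can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state which metrics were reported, so accuracy, sensitivity, and specificity are assumptions here, and the name `score_binary`, the task keys, and the toy labels are hypothetical placeholders rather than study data.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    accuracy: float
    sensitivity: float
    specificity: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a class is absent from the labels.
    return numerator / denominator if denominator else float("nan")


def score_binary(truth: list[int], pred: list[int]) -> Scores:
    """Score binary predictions (1 = finding present) against ground truth."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return Scores(
        accuracy=_ratio(tp + tn, len(pairs)),
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
    )


# Hypothetical toy labels for the three tasks named in the abstract.
# These values are illustrative only and are NOT results from the study.
radiologist = {
    "acl": [1, 0, 0, 0],
    "meniscus": [1, 1, 0, 0],
    "abnormal": [1, 1, 1, 0],
}
chatbot = {
    "acl": [1, 0, 0, 1],
    "meniscus": [1, 0, 0, 1],
    "abnormal": [1, 1, 1, 0],
}

# One Scores object per task; repeat per chatbot to compare models.
results = {task: score_binary(radiologist[task], chatbot[task]) for task in radiologist}
for task, scores in results.items():
    print(task, scores)
```

Scoring each task separately keeps a model's behaviour on ACL injury, meniscal tear, and general abnormality visible, rather than folding the three into one aggregate number.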

Keywords

AI, Chatbot, Large Language Models, Medical Imaging, Healthcare, Medical Decision-Making

Source

TIPTEKNO 2025 - Medical Technologies Congress, Proceedings -- 2025 Medical Technologies Congress, TIPTEKNO 2025 -- 26 October 2025 through 28 October 2025 -- Gazi Magusa -- 217812
