Generative AI’s application in medical diagnostics has attracted significant attention recently, and numerous studies have explored its potential. Because these studies use varying evaluation metrics, however, a holistic analysis is needed to gauge how AI might be integrated into real-world medical practice and how its performance compares with that of human practitioners.
A research team led by Dr. Hirotaka Takita and Associate Professor Daiju Ueda of the Graduate School of Medicine at Osaka Metropolitan University conducted a meta-analysis of the diagnostic capabilities of generative AI. They reviewed 83 research articles published between June 2018 and June 2024, spanning a wide range of medical fields. ChatGPT was the most frequently evaluated large language model (LLM) in their analysis.
The comparative analysis showed that medical professionals achieved a diagnostic accuracy 15.8% higher than that of generative AI, whose accuracy averaged 52.1%. Notably, the most recent generative AI models sometimes reached accuracy comparable to that of non-specialist physicians.
Dr. Takita remarked, “This research illustrates that generative AI’s diagnostic performance is on par with that of non-specialist doctors. Its utilization in medical education could be advantageous, aiding non-specialist doctors and enhancing diagnostic support, particularly in regions with insufficient medical resources.” He emphasized the need for more comprehensive investigations, including evaluations in complex clinical scenarios, performance assessments using real medical records, improvements in the transparency of AI decision-making, and validation across diverse patient populations, to further elucidate AI capabilities.
The findings of this study have been published in npj Digital Medicine.
Source: www.sciencedaily.com