Research finds open-source AI rivals GPT-4, paving the way for secure, in-house AI diagnostic tools.
A new study from Harvard Medical School has found that an open-source artificial intelligence (AI) model performed on par with GPT-4 — one of the leading proprietary AI models — in diagnosing complex medical cases. The findings, published in JAMA Health Forum, suggest that physicians may soon have more options for integrating AI into clinical decision-making while maintaining greater control over patient data.
In recent years, proprietary AI models — developed by companies like OpenAI and Google — have been the dominant players in AI-assisted diagnostics. These closed-source models operate on external servers, meaning that hospitals and clinicians must send patient data outside their networks to use them.
In contrast, open-source AI models are freely available and can be customized for specific clinical environments. They can run on a hospital’s own servers, offering more data privacy and the flexibility to adapt to a practice’s unique patient population. However, open-source models have traditionally lagged behind their proprietary counterparts in performance — until now.
Researchers evaluated Meta’s Llama 3.1 405B, an open-source AI model, against GPT-4 by testing both models on 92 complex diagnostic cases published in The New England Journal of Medicine. The study found that:
These results indicate that open-source AI models are closing the gap and could soon offer physicians a viable alternative to proprietary models.
For primary care physicians, practice owners and administrators, the choice between proprietary and open-source AI comes down to three key factors:
“To our knowledge, this is the first time an open-source AI model has matched the performance of GPT-4 on such challenging cases as assessed by physicians,” said Arjun Manrai, PhD, senior author of the study and an assistant professor of biomedical informatics in the Blavatnik Institute of Harvard Medical School. “It really is stunning that the Llama models caught up so quickly with the leading proprietary model. Patients, care providers, and hospitals stand to gain from this competition.”
As AI continues to evolve, the study highlights a growing opportunity for hospitals and private practices to explore open-source alternatives that balance diagnostic accuracy with data security and customization. While proprietary models still offer convenience, the emergence of high-performing open-source AI could shift the landscape of AI-assisted medicine in the coming years.
For now, researchers emphasize that AI should serve as a “copilot” rather than a replacement for physician judgment.
“Used wisely and incorporated responsibly in current health infrastructure, AI tools could be invaluable copilots for busy clinicians and serve as trusted diagnostic aides to enhance both the accuracy and speed of diagnosis,” Manrai said. “But it remains crucial that physicians help drive these efforts to make sure AI works for them.”