Study finds AI sees patterns in data and jumps to conclusions
Artificial intelligence may be transforming health care, particularly in the realm of medical imaging, but it is not infallible.
By identifying patterns invisible to human eyes, AI has the potential to enhance the accuracy and scope of diagnostic tools. However, a recent study published in Scientific Reports underscores the risks associated with this powerful technology, specifically a phenomenon known as “shortcut learning.”
Researchers from Dartmouth analyzed over 25,000 knee X-rays from the National Institutes of Health-funded Osteoarthritis Initiative. They found that AI models could predict implausible traits, such as whether patients abstained from eating refried beans or drinking beer. While these predictions have no medical basis, researchers say they highlight the ability of AI to exploit subtle, unintended patterns in the data—patterns unrelated to medical insights.
“While AI has the potential to transform medical imaging, we must be cautious,” said Peter Schilling, M.D., senior author of the study, an orthopedic surgeon at Dartmouth Health’s Dartmouth Hitchcock Medical Center, and an assistant professor at Dartmouth’s Geisel School of Medicine. “These models can see patterns humans cannot, but not all patterns they identify are meaningful or reliable. It’s crucial to recognize these risks to prevent misleading conclusions and ensure scientific integrity.”
The study found that AI algorithms often relied on confounding variables such as differences in X-ray equipment or site-specific markers to make predictions, rather than medically relevant features. Even when researchers attempted to eliminate these biases, the models adapted by learning new hidden patterns.
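To illustrate the kind of confounding the researchers describe, consider a minimal, hypothetical sketch (not the study’s actual code or data) in which a synthetic “site artifact,” standing in for equipment differences or site-specific markers, happens to correlate with a medically implausible label. The model appears to predict the label well, but only because it latches onto the artifact; the hypothetical feature names and the use of numpy and scikit-learn are assumptions made for the example.

```python
# Hypothetical sketch of "shortcut learning": a model seems to predict a
# medically implausible label, but only via a site-specific artifact that
# correlates with it. Not the study's code; all names and data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# "Medical" features: pure noise with respect to the implausible label.
medical_features = rng.normal(size=(n, 20))

# Each image comes from one of two collection sites; suppose the implausible
# label (e.g., a dietary habit recorded at enrollment) is simply more common
# at site 1 for reasons unrelated to the images themselves.
site = rng.integers(0, 2, size=n)
label = (rng.random(n) < np.where(site == 1, 0.8, 0.2)).astype(int)

# A site-specific artifact leaks into the features
# (different equipment, markers, exposure settings, and so on).
site_artifact = site[:, None] + 0.1 * rng.normal(size=(n, 1))
X = np.hstack([medical_features, site_artifact])

X_train, X_test, y_train, y_test = train_test_split(
    X, label, test_size=0.3, random_state=0
)

# With the artifact present, the model looks predictive.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Accuracy with the site artifact:", round(model.score(X_test, y_test), 3))

# Drop the artifact column and the apparent signal collapses, showing the
# model was exploiting the confounder rather than anything medical.
model_clean = LogisticRegression(max_iter=1000).fit(X_train[:, :20], y_train)
print("Accuracy on medical features alone:",
      round(model_clean.score(X_test[:, :20], y_test), 3))
```

In this toy setting, removing the single artifact column exposes the shortcut; the study’s point is that with real imaging data such confounders are subtler and, once one is removed, a model may simply find another.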
Brandon Hill, a co-author of the study and a machine learning scientist at Dartmouth Hitchcock, emphasized the extent of the problem. “This goes beyond bias from clues of race or gender. We found the algorithm could even learn to predict the year an X-ray was taken. It’s pernicious—when you prevent it from learning one of these elements, it will instead learn another it previously ignored. This danger can lead to some really dodgy claims, and researchers need to be aware of how readily this happens when using this technique.”
Researchers say the findings underscore the importance of implementing rigorous evaluation standards in AI-driven medical research. Without deeper scrutiny, reliance on standard algorithms could result in misleading clinical insights and inappropriate treatment recommendations.
“The burden of proof just goes way up when it comes to using models for the discovery of new patterns in medicine,” Hill said. “Part of the problem is our own bias. It is incredibly easy to fall into the trap of presuming that the model ‘sees’ the same way we do. In the end, it doesn’t.”
Hill likened AI to an “alien intelligence,” noting, “You want to say the model is ‘cheating,’ but that anthropomorphizes the technology. It learned a way to solve the task given to it, but not necessarily how a person would. It doesn’t have logic or reasoning as we typically understand it.”