AI models struggle to draw analogies the way humans do when analyzing complex patterns, which makes their use in real-world decision-making risky. (Image credit: imaginima/Getty Images)
We know that artificial intelligence (AI) doesn't think the way humans do, but new research shows how this difference can shape AI decision-making, with real-world consequences humans may not be prepared for.
A study published in February 2025 in the journal Transactions on Machine Learning Research analyzed how effectively large language models (LLMs) can generate analogies.
The results showed that on both simple letter-string analogies and digit-matrix tasks (completing a matrix by identifying the missing number), humans performed well while the AI's performance dropped sharply.
When testing the robustness of humans and AI models on text-based analogy tasks, the study found that the models were susceptible to answer-order effects (their responses varied with the order in which items were presented in the experiment) and may have been more likely to paraphrase than to reason.
Overall, the study found that AI models lack true "zero-shot" ability, in which a learner observes samples from classes it never saw during training and predicts which class they belong to.
Study co-author Martha Lewis, an assistant professor of neurosymbolic artificial intelligence at the University of Amsterdam, gave an example of AI failing to reason by analogy as effectively as humans on letter-string problems.
“The analogies with strings of letters are, ‘If abcd goes to abce, where does ijkl go?’ Most people will answer, ‘ijkm,’ and [the AI] tends to answer that, too,” Lewis told Live Science. “But another problem might be, ‘If abbcd goes to abcd, where does ijkkl go?’ People tend to answer, ‘ijkl,’ because the pattern is to remove the repeating element. But GPT-4 tends to get [these] problems wrong.”
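The two rules in Lewis's examples are easy to state explicitly. A minimal sketch (not from the study; the function names are my own) showing the transformations a human infers from each example pair:

```python
def apply_successor_rule(probe):
    # Rule inferred from "abcd -> abce": replace the last letter
    # with its alphabetic successor.
    return probe[:-1] + chr(ord(probe[-1]) + 1)

def apply_remove_repeat_rule(probe):
    # Rule inferred from "abbcd -> abcd": delete the letter that
    # appears twice in a row, keeping one copy.
    out = []
    for ch in probe:
        if not (out and out[-1] == ch):
            out.append(ch)
    return "".join(out)

print(apply_successor_rule("ijkl"))      # ijkm
print(apply_remove_repeat_rule("ijkkl")) # ijkl
```

The point of the study is that humans abstract these rules from a single example pair and transfer them to new strings, whereas GPT-4 often fails on the second, less common pattern.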
Why it matters that AI can't think like humans
Lewis noted that while we can abstract from specific patterns to more general rules, LLMs lack this ability. “They’re good at recognizing and matching patterns, but they’re not good at generalizing those patterns.”
Most AI applications rely to some degree on volume — the more training data, the more patterns can be uncovered. But Lewis emphasized that pattern matching and abstraction are not the same thing. “It’s more about how the data is used than what’s in the data,” she added.
To illustrate the implications: AI is increasingly used in the legal field for research, case-law analysis, and sentencing recommendations. But with a weaker ability to draw analogies, it may fail to recognize how legal precedents apply to cases that differ slightly from the ones it has seen.
Because this lack of robustness could affect real-world outcomes, the study argues that AI systems should be evaluated not only for accuracy but also for the robustness of their cognitive abilities.
Drew Turney
Drew is a freelance journalist with 20 years of experience.
Source: www.livescience.com