(Image credit: SEAN GLADWELL/Getty Images)
While the ways humans and artificial intelligence (AI) systems think differ greatly, recent research has shown that AI sometimes makes decisions as irrationally as we do.
In nearly half of the situations examined in the new study, ChatGPT exhibited many of the most common human decision-making biases. The results, published April 8 in the journal Manufacturing & Service Operations Management, are the first to evaluate ChatGPT’s behavior across 18 known cognitive biases found in human psychology.
The study's authors, representing five academic institutions in Canada and Australia, tested OpenAI's GPT-3.5 and GPT-4 — the two large language models (LLMs) on which ChatGPT is based — and found that while they were “impressively consistent” in their inferences, they were not immune to human flaws.
That consistency, the authors say, has both positive and negative consequences.
“Managers will benefit most from using these tools for tasks with clear, formulaic decisions,” said lead study author Yang Chen, an assistant professor of operations management at Ivey Business School. “However, if you apply them to subjective or preference-based decisions, proceed with caution.”
The study took well-known human biases such as risk aversion, overconfidence, and the endowment effect (where we place more value on items we own) and built them into the prompts given to ChatGPT to see whether it would fall into the same traps as humans.
Rational decisions – sometimes
The researchers posed hypothetical questions to the LLMs, drawn both from classic psychology experiments and from real-world commercial contexts such as inventory management and supplier negotiations. The goal was to find out not only whether the AI would mimic human biases, but also whether it would do so when the questions came from different business domains.
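The paper's exact prompts are not reproduced in this article, but the setup can be pictured roughly as follows. The sketch below is an illustration, not the authors' protocol: it poses one risk-preference scenario in an abstract "psychology" framing and in an "operations" framing and sends both to GPT-3.5 and GPT-4 through the OpenAI Python SDK. The prompt wording, model identifiers, and comparison loop are assumptions made for the example.

```python
# Illustrative sketch only: prompt text, model names, and the comparison logic
# are assumptions for demonstration; the study's real pipeline is not shown here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The same risk-preference scenario framed two ways: as an abstract psychology
# task and as an operational (inventory-management) business decision.
FRAMINGS = {
    "psychology": (
        "Choose one option. A: a guaranteed gain of $500. "
        "B: a 50% chance of gaining $1,000 and a 50% chance of gaining nothing. "
        "Answer with A or B and briefly explain why."
    ),
    "operations": (
        "You manage inventory for a retailer. Option A: a contract that guarantees "
        "$500 in extra profit this quarter. Option B: a promotion with a 50% chance "
        "of $1,000 in extra profit and a 50% chance of no extra profit. "
        "Answer with A or B and briefly explain why."
    ),
}

def ask(model: str, prompt: str) -> str:
    """Send a single framed decision problem to the model and return its reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce run-to-run variation when comparing framings
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    for model in ("gpt-3.5-turbo", "gpt-4"):
        for framing, prompt in FRAMINGS.items():
            print(f"--- {model} / {framing} framing ---")
            print(ask(model, prompt))
```

Both options in this toy scenario have the same expected value ($500), so a purely expected-value-maximizing agent would be indifferent; a consistent preference for the sure option across both framings is the kind of certainty-seeking, risk-averse behavior the study describes.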
GPT-4 outperformed GPT-3.5 when solving problems with clear mathematical solutions, showing fewer errors in probabilistic and logical situations. However, in subjective scenarios, such as whether to take a risky option to realize a gain, the chatbot often mirrored the irrational preferences characteristic of humans.
“GPT-4 exhibits a stronger certainty bias than even humans,” the researchers note in the paper, referring to the AI’s tendency to seek safer and more predictable outcomes when solving ambiguous problems.
More importantly, the chatbots’ behavior remained largely consistent, regardless of whether the questions were framed as abstract psychological tasks or as operational business processes. The study concluded that the biases identified were not simply the result of learned examples, but part of the way the AI reasoned.
One striking finding of the study was that GPT-4 sometimes amplified human errors. “In the confirmation bias test, GPT-4 always gave biased responses,” the study’s authors wrote. It also showed a more pronounced hot-hand fallacy (the tendency to expect patterns in randomness) than GPT-3.5.
On the other hand, ChatGPT managed to avoid some common human biases, including base-rate neglect (where we ignore statistical information in favor of case-specific information) and the sunk-cost fallacy (where decisions are influenced by costs already incurred, leading to judgment being clouded by irrelevant information).
According to the authors, ChatGPT’s human-like biases arise from training data that contains cognitive biases and heuristics that are characteristic of humans. These tendencies are amplified during fine-tuning, especially when human feedback favors plausible responses over rational ones. When faced with more ambiguous tasks, the AI leans more toward human reasoning patterns than pure logic.
“If you need accurate, unbiased decision support, use these tools in areas where you would already trust a calculator,” the researchers advise. For subjective or strategic decisions, they recommend keeping a human in the loop, even if that only means adjusting prompts to correct for known biases.
Source: www.livescience.com