The AI's high average scores are due to the fact that LLMs are trained on vast amounts of content from the internet, allowing them to effectively reproduce popular humorous sayings. (Image credit: Kilito Chan/Getty Images)
Don't be too quick to quit your day job, as new research suggests artificial intelligence (AI) may be funnier than you.
In a recent study assessing the co-creation capabilities of large language models (LLMs), internet memes generated by OpenAI’s GPT-4o model scored higher on average for humor, creativity, and shareability than memes created by humans alone or by humans with the help of a chatbot. However, when it came to the quality of the highest-scoring memes, human-generated humor still prevailed.
The results of the study were published on the preprint server arXiv on January 20 and presented at the 30th International Conference on Intelligent User Interfaces, held from March 24 to 27 in Cagliari, Italy.
Commenting on the results on the Bluesky social network, Ethan Mollick, a professor and co-director of the Generative AI Lab at the Wharton School of the University of Pennsylvania, said: “I regret to report that the meme Turing test has been passed.”
The original Turing test was proposed in 1950 by British mathematician Alan Turing as a standard for machine intelligence: if a judge could not distinguish between a human and a machine in conversation, the machine could be said to exhibit human-level intelligence.
While the study did not assess whether AI-generated memes were indistinguishable from human-generated memes, it does raise interesting questions about how we perceive creativity, especially given that participants often rated AI-generated content more positively.
Meme machine learning
The researchers from KTH Royal Institute of Technology, Ludwig Maximilian University of Munich and Technical University of Darmstadt did not set out to demonstrate the AI’s comedy abilities. Instead, they decided to investigate co-creation, specifically how LLMs can support humans in creative tasks such as writing jokes.
They chose meme creation as an ideal test case, given its combination of cultural references, sarcasm, and low stakes. Memes are typically images with captions that play on familiar situations or pop culture. They have become a sort of common internet shorthand, used to joke about or react to contemporary events in an easily digestible and often irreverent format.
“The complexity of humor makes it a rich area for studying the dynamics of co-creation, as participants must navigate these nuances to create content that resonates with others,” the researchers wrote in their paper.
The experiment consisted of two stages. In the first stage, the researchers recruited 124 participants and divided them into two groups: one worked individually, and the other worked with the help of an AI-powered chatbot.
Over three rounds, participants captioned classic meme templates on the topics of work, food, and sports, including the Futurama Fry, Doge, and Boromir ("One does not simply walk into Mordor") templates. The AI-assisted group could use the chatbot to generate ideas, but its members were responsible for choosing the best ones and creating the final memes themselves.
The all-human group created 335 memes, while 307 were developed by hybrid human-AI teams. Another 150 memes were created by GPT-4o for comparison.
In the second stage, 98 people rated the memes on humor, creativity, and shareability. The memes were randomly mixed, so the evaluators did not know who or what had created them. In all three categories, the AI-generated memes came out on top on average.
“Interestingly, memes created solely by AI performed better on average than memes created solely by humans and memes created
Source: www.livescience.com