Labeling AI-generated science posts may boost the credibility of false claims
What happens when you tell people that a piece of scientific information was generated by artificial intelligence? The intuitive answer is that they become more skeptical. The actual answer, according to a new experimental study, is more troubling: they become more skeptical of the true information and less skeptical of the false information.
The crossover nobody expected
Teng Lin, a PhD candidate at the University of Chinese Academy of Social Sciences in Beijing, and colleague Yiqing Zhang designed an experiment to test whether AI disclosure labels achieve what regulators intend them to do. They recruited 433 participants through the Credamo platform between March and May 2024 and presented them with social media posts formatted in the style of Weibo, the Chinese microblogging platform.
The posts came in four varieties: accurate scientific information with an AI label, accurate information without a label, misinformation with an AI label, and misinformation without a label. The texts were adapted using GPT-4 from items published by China's Science Rumour Debunking Platform, and the researchers independently verified their accuracy. Participants rated the credibility of each post on a 1-to-5 scale.
The results were stark. When true scientific content carried an AI label, participants rated it as less credible. When false scientific content carried the same label, they rated it as more credible. Teng calls this the "truth-falsity crossover effect."
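To make the crossover concrete: in a two-by-two design like this one, the effect shows up as an interaction between veracity and labeling, where the label pushes credibility down for true posts and up for false posts. The sketch below simulates such a pattern and tests the interaction with an ordinary least squares model; the cell means, sample sizes, and choice of analysis are illustrative assumptions for exposition, not the study's data or the authors' actual method.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_per_cell = 100  # illustrative cell size, not taken from the study

# Simulate a 2x2 design: veracity (true/false) x label (AI label / no label).
# The cell means are made up to mimic a crossover: labeling lowers ratings
# for true posts and raises them for false posts.
cells = [
    ("true",  "none", 3.8),
    ("true",  "ai",   3.3),
    ("false", "none", 2.4),
    ("false", "ai",   2.9),
]
rows = []
for veracity, label, mean in cells:
    ratings = np.clip(rng.normal(mean, 0.8, n_per_cell), 1, 5)  # 1-to-5 scale
    rows.append(pd.DataFrame({"veracity": veracity, "label": label,
                              "credibility": ratings}))
data = pd.concat(rows, ignore_index=True)

# A crossover appears as a significant veracity x label interaction term:
# the label's effect on credibility has opposite signs for true and false posts.
model = smf.ols("credibility ~ C(veracity) * C(label)", data=data).fit()
print(model.summary().tables[1])
```

With simulated numbers like these, the interaction coefficient is large and significant while the label's average (main) effect can be close to zero, which is why a blanket "labels make people more cautious" summary would miss the pattern the researchers report.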
Why skepticism does not cut evenly
The asymmetry is the key finding here. If AI labels simply made people more cautious across the board, that might be defensible as a net positive. Instead, the labels appear to redistribute credibility in a way that actively harms public understanding.
Why would a label increase trust in false content? One possible mechanism is that the AI label introduces a secondary signal that overwhelms the content itself. People who are already uncertain about a scientific claim may interpret the label as a marker of sophistication or computational authority, lending the content an unearned sheen of credibility. For true content, the opposite dynamic appears to play out: people who are already skeptical of AI may discount accurate information simply because a machine produced it.
The researchers also examined whether attitudes toward AI moderated the effect. Participants with more negative views of AI penalized true information even more heavily when it was labeled. But even among AI skeptics, the credibility boost for false claims was only partially reduced and varied by topic.
Algorithm aversion is not a uniform shield
This nuance matters for policy. A common assumption in regulatory circles is that so-called "algorithm aversion," the documented tendency of some people to distrust algorithmic outputs, would act as a natural corrective. If people are wary of AI-generated content, the thinking goes, labeling it should help them discount unreliable information.
The data suggest otherwise. Algorithm aversion does not produce a blanket rejection of AI content. It produces an asymmetric reaction that is, in this experimental context, worse than no label at all.
What regulators might consider instead
The study has direct implications for jurisdictions that are implementing or considering AI transparency requirements. The European Union's AI Act, for example, includes provisions for labeling AI-generated content. Similar proposals are under discussion in multiple countries.
Teng and Zhang offer two recommendations, both of which they emphasize need further validation. The first is a dual-labeling approach: instead of simply marking content as AI-generated, the label could include a disclaimer noting that the information has not been independently verified, or carry a risk warning. The second is a graded labeling system that calibrates the strength of disclosure to the risk level of the content. Medical or health-related information, for instance, might warrant a stronger warning than a post about consumer technology.
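To illustrate what a graded system could look like in practice, here is a hypothetical configuration sketch that maps content risk tiers to progressively stronger disclosure text. The tier names, wording, and mapping are assumptions made for illustration; they are not drawn from the study or from any existing regulation.

```python
from enum import Enum


class RiskTier(Enum):
    LOW = "low"        # e.g. consumer technology posts
    MEDIUM = "medium"  # general science claims
    HIGH = "high"      # medical or health-related information


# Hypothetical mapping from risk tier to disclosure text; the wording is
# illustrative only and combines the dual-labeling idea (an unverified-content
# disclaimer) with graded strength.
LABEL_TEXT = {
    RiskTier.LOW: "AI-generated content.",
    RiskTier.MEDIUM: "AI-generated content. Not independently verified.",
    RiskTier.HIGH: ("AI-generated content. Not independently verified. "
                    "Consult a qualified professional before acting on it."),
}


def label_for(tier: RiskTier) -> str:
    """Return the disclosure text a post would carry for a given risk tier."""
    return LABEL_TEXT[tier]


print(label_for(RiskTier.HIGH))
```

Whether any such scheme avoids the crossover effect is exactly the kind of question the authors say still needs experimental validation.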
Limits of the evidence
The study was conducted on a single platform format (Weibo-style posts) with Chinese-language participants. Whether the crossover effect replicates across different platforms, languages, cultures, and content types remains an open question. The sample size of 433, while adequate for the experimental design, limits the ability to detect smaller effects or to disaggregate results by demographic subgroups.
The study also tested only a binary label: either the post was marked as AI-generated or it was not. More nuanced labeling strategies were discussed in the paper but not experimentally tested.
Still, the core finding is clear enough to give policymakers pause. When a transparency measure designed to protect the public from misinformation ends up making misinformation more credible, the measure needs rethinking before it gets scaled.