Wikipedia Bans AI-Generated Text in Articles Under New Editing Policy


In brief

Wikipedia now prohibits editors from using large language models to generate or rewrite article content.
The policy still allows limited AI-assisted copyediting if editors review the changes and no new content is introduced.
The rule reflects growing concerns about hallucinations, fabricated sources, and accuracy in AI-generated text.

Wikipedia editors have moved to restrict how artificial intelligence can be used on the platform, adopting a policy update that bans the use of large language models to write or rewrite articles.

The new guideline reflects growing concern within the Wikipedia community that AI-generated text can conflict with the platform’s standards, particularly around verifiability and reliable sourcing.

“Text generated by large language models often violates several of Wikipedia’s core content policies,” the policy update reads. “For this reason, the use of LLMs to generate or rewrite article content is prohibited, save for the exceptions given below.”

The policy still allows limited use of AI tools, such as suggesting basic copy edits to an editor’s own writing, provided the system does not introduce new information. However, editors are advised to review those suggestions carefully.



While the new policy does not spell out penalties for using AI-generated content, under Wikipedia’s guidelines around disclosure, repeated misuse constitutes a “pattern of disruptive editing” and may lead to a block or ban. Wikipedia does give editors a path to reinstate their accounts through an appeal process.

“Blocks can be reversed with the agreement of the blocking admin, an override by other admins in the case that the block was clearly unjustifiable, or (in very rare cases) on appeal to the Arbitration Committee,” Wikipedia said.

According to Emily M. Bender, a professor of linguistics at the University of Washington, some uses of language models in editing tools may be reasonable, but drawing a clear boundary between editing and generating text can be difficult.

“So one of the things that you can do with a language model is build a very good spell checker, for example,” Bender told Decrypt. “I think it’s reasonable to say it’s fine to run a spell checker over edits. And if you are doing the next level up, a grammar checker, that can also be fine.”

Bender said the challenge comes when systems move beyond correcting grammar and begin altering or generating content, noting that large language models lack the kind of accountability that human contributors bring to collaborative knowledge projects.

“Using large language models to produce synthetic text, it is a fundamental property of these systems that there is no accountability, no connection to what someone believes or stands behind,” she said. “When we speak, we speak based on what we believe and what we are accountable for, not based on some objective notion of truth. And that’s not there for large language models.”

Bender said widespread use of AI-generated edits could also affect the site’s reputation.

“If people are instead taking shortcuts and making something that looks like a Wikipedia edit or article and putting it there, then that degrades the overall value and reputation of the site,” she said.

Joseph Reagle, an associate professor of communication studies at Northeastern University who studies Wikipedia’s culture and governance, said the community’s response reflects longstanding concerns about accuracy and sourcing.

“Wikipedia is wary of AI generated prose,” Reagle told Decrypt. “They take the accurate characterizations of what reliable sources state about a topic seriously. AI has had serious limitations on that front, such as ‘hallucinated’ claims and fabricated sources.”

Reagle said Wikipedia’s core policies also shape how editors view AI tools, noting that many large language models have been trained on Wikipedia content. In October, the Wikimedia Foundation said human visits to Wikipedia fell about 8% year over year as search engines and chatbots increasingly provide answers directly on their platforms, rather than sending users to the site.

In January, the Wikimedia Foundation announced agreements with AI companies, including Microsoft, Google, Amazon, and Meta, allowing them to use Wikipedia material through its Enterprise product, a commercial service designed for large-scale reuse of its content.

“While the use of Wikipedia content is permitted by Wikipedia’s licenses, there’s still some antipathy among Wikipedians about services that appropriate the content of communities and then place unwanted demands on those communities to deal with the consequent glut of AI slop,” Reagle said.

Despite the prohibition on using LLMs, Wikipedia does permit AI tools to translate articles from other language editions into English, provided editors verify the translation against the original text. The policy also warns editors not to rely on writing style alone to identify AI-generated content, and instead to weigh whether the material complies with Wikipedia’s core policies, along with the contributor’s editing history.

“Some editors may have similar writing styles to LLMs,” the update says. “More evidence than just stylistic or linguistic signs is needed to justify sanctions, and it is best to consider the text’s compliance with core content policies and recent edits by the editor in question.”
