Technology

Publishers sue Meta over alleged AI training on pirated books

Meta is accused of training Llama on millions of pirated books and journal articles, a case that could force AI makers to pay for the data they ingest.

Lisa Park · 5/6/2026

Published 09:44 PM

Five major publishers and novelist Scott Turow have taken Meta Platforms to Manhattan federal court, accusing the company of training its Llama models on millions of copyrighted books and journal articles without permission or payment. The complaint, filed Tuesday, lists Elsevier, Cengage, Hachette Book Group, Macmillan Publishers and McGraw Hill as the publisher plaintiffs and says the material included textbooks, scientific articles and novels such as The Fifth Season and The Wild Robot.

The lawsuit goes further by naming Meta founder and CEO Mark Zuckerberg directly, alleging he “personally authorized and actively encouraged” the infringement. The complaint says the copied material came from pirate sites and unauthorized copies, and it seeks class-action status and unspecified monetary damages. At stake is not just whether Meta crossed a legal line, but whether a company can build a lucrative AI product on books, classroom materials and other writing that authors and publishers say should have been licensed.

Meta said training AI on copyrighted material can qualify as fair use and vowed to fight the case aggressively. The company’s response places it squarely inside the industry’s central legal fight: whether training itself is transformative enough to count as fair use when the system can compete with the original works. That question matters most to writers and publishers whose livelihoods depend on licensing fees, sales and the long-term value of their catalogs, especially in textbook publishing where one unauthorized dataset can undercut years of paid work.

The case also lands amid a wider copyright clash across artificial intelligence. Judges have already signaled that the law is unsettled: the first two federal judges to weigh the broader issue reached diverging rulings last year. In February 2025, a federal judge in Delaware rejected a training-related fair-use defense in Thomson Reuters v. Ross Intelligence, one of the first reported rulings to come down against an AI company on that issue. And in September 2025, a federal judge preliminarily approved Anthropic’s $1.5 billion settlement with authors, a deal that underscored how expensive these cases can become.

For Meta, the lawsuit reaches beyond one model or one complaint. Llama is part of the company’s broader AI ambition, and the outcome could shape whether frontier systems keep relying on broad readings of fair use or whether they must start paying for the books, articles and other human work that power them. For authors and publishers, the case is a test of whether copyright still has force in an economy increasingly built on automated extraction.

This article was produced by Prism’s automated news system from verified source data, official records, and press releases, then run through automated quality and moderation checks before publishing. The system is built and supervised by the people who set the standards it runs under. Read our full AI policy.
