Publishers sue Meta over alleged AI training on pirated books
Meta is accused of training Llama on millions of pirated books and journal articles, a case that could force AI makers to pay for the data they ingest.

Five major publishers and novelist Scott Turow have taken Meta Platforms to Manhattan federal court, accusing the company of training its Llama models on millions of copyrighted books and journal articles without permission or payment. The complaint, filed Tuesday, lists Elsevier, Cengage, Hachette Book Group, Macmillan Publishers and McGraw Hill as the publisher plaintiffs and says the material included textbooks, scientific articles and novels such as The Fifth Season and The Wild Robot.
The lawsuit goes further by naming Meta founder and CEO Mark Zuckerberg directly, alleging he “personally authorized and actively encouraged” the infringement. The complaint says the copied material came from pirate sites and unauthorized copies, and it seeks class-action status and unspecified monetary damages. At stake is not just whether Meta crossed a legal line, but whether a company can build a lucrative AI product on books, classroom materials and other writing that authors and publishers say should have been licensed.
Meta said training AI on copyrighted material can qualify as fair use and vowed to fight the case aggressively. The company’s response places it squarely inside the industry’s central legal fight: whether training itself is transformative enough to count as fair use when the system can compete with the original works. That question matters most to writers and publishers whose livelihoods depend on licensing fees, sales and the long-term value of their catalogs, especially in textbook publishing where one unauthorized dataset can undercut years of paid work.
The case also lands amid a wider copyright clash across artificial intelligence. The law remains unsettled: the first two federal judges to weigh the broader issue reached diverging rulings last year. In February 2025, a federal judge in Delaware rejected a training-related fair-use defense in Thomson Reuters v. Ross Intelligence, one of the first reported rulings to come down against an AI company on that issue. And in September 2025, a federal judge preliminarily approved Anthropic’s $1.5 billion settlement with authors, a deal that underscored how expensive these cases can become.
For Meta, the lawsuit reaches beyond one model or one complaint. Llama is part of the company’s broader AI ambition, and the outcome could shape whether frontier systems keep relying on broad readings of fair use or whether they must start paying for the books, articles and other human work that power them. For authors and publishers, the case is a test of whether copyright still has force in an economy increasingly built on automated extraction.
