In a significant ruling, a US judge has decided that using literary works to train AI models does not violate copyright law. The decision follows a case brought last year by three authors, a novelist and two non-fiction writers, who claimed that Anthropic had unlawfully used their work to train its Claude AI model. Judge William Alsup described the company's use of the texts as "exceedingly transformative" and therefore permissible under current US copyright law.
However, the judge also rejected Anthropic's request to dismiss the case, so the firm now faces trial over allegations that it used pirated content to build its vast library of training material. Anthropic, which counts tech giants Amazon and Alphabet among its backers, could face damages of up to $150,000 for each work implicated. According to the judge's findings, the company holds a repository of more than seven million pirated books.
This verdict comes at a time when legal challenges related to Large Language Models (LLMs) are on the rise. On the central question of whether copying works for AI training breaches copyright, the judge found that the copying reasonably necessary to train the LLM was transformative. Importantly, the authors did not claim that the training process produced "infringing knockoffs" of their work, a detail that might have shifted the case considerably.
Other disputes have emerged across the industry, affecting journalism, music, and video. Disney and Universal, for example, have taken legal action against the AI image generator Midjourney over alleged piracy, while the BBC is weighing its own options over unauthorized use of its content. In response, some AI companies are now seeking licensing agreements with original creators and publishers.
Although Judge Alsup acknowledged Anthropic’s fair use argument, he also stressed that the firm crossed a line by retaining pirated books in what he described as a “central library of all the books in the world.” Anthropic welcomed the recognition of its transformative use but disagreed with the decision to try the case over the acquisition and use of specific texts. The company remains confident in its legal position and is exploring its options.
The case, featuring works by mystery thriller author Andrea Bartz and non-fiction writers Charles Graeber and Kirk Wallace Johnson, highlights the broader impact such legal battles can have on content creators and the future of AI training methods.