
Senate Investigates Meta’s Use of Pirated Books for AI Training

July 17, 2025

At a recent Senate Judiciary Subcommittee hearing, Senator Josh Hawley raised concerns about Meta’s decision to use pirated books to train its AI models. More than 200 terabytes of copyrighted material were used without compensating the authors, and internal emails reveal that several employees were uneasy about the ethics of the practice.

This development forms part of a broader probe into AI companies’ handling of copyrighted materials. Just weeks ago, a San Francisco ruling allowed firms like Meta and Anthropic to continue using books without explicit permission, complicating the balance between innovation and authors’ rights. Legal scholar Bhamati Viswanathan noted that while not all data is obtained unlawfully, there is unmistakable evidence of pirated content being used.

Authors such as David Baldacci have testified about the unauthorised use of their work, and legal expert Maxwell Pritt highlighted the troubling use of so-called ‘shadow libraries’. According to Pritt, Meta had initially considered licensing the material but chose pirated content instead to meet tight deadlines. Professor Edward Lee also cited a recent ruling by Judge Vince Chhabria, noting that although harm to the market was not clearly shown in that case, there remains potential for further legal disputes.

If you’ve ever struggled with the fine line between tech innovation and safeguarding creative rights, these revelations will strike a chord. The case is a timely reminder that as technology evolves, so too must our approach to fairness and legal accountability.
