
Google continues using publisher content for search AI training

May 8, 2025

Although publishers have opted 80 billion training tokens out of Google DeepMind’s data, Google keeps using their content to power its search AI systems. At a recent court hearing in Washington, Eli Collins, Vice President at Google DeepMind, confirmed that the opt-out policy applies only to DeepMind; other teams, particularly those in web search, aren’t bound by it.

Diana Aguilar from the US Department of Justice asked if the Gemini AI model, now part of the search organisation, could train on data that publishers had chosen to opt out of. Collins replied, “Correct — for use in search.” This practice fuels features like “AI Overviews,” which place AI-generated responses above standard search results, potentially cutting into traffic for original content sites.

An internal document from 2024 revealed that of the 160 billion tokens earmarked for training, half (80 billion) were removed after publishers opted out. Nevertheless, Collins’ comments suggest that this data still finds its way into Google’s broader search AI systems, effectively bypassing what publishers intended by opting out.

This issue is a pivotal point in the ongoing antitrust lawsuit, which calls for Google to divest its Chrome browser and stop paying partners to lock in Google as the default search engine. The Department of Justice argues that these same restrictions should extend to Google’s AI products, including Gemini, given their close ties with Google’s search dominance.

For anyone who has wrestled with the complexities of content scraping and fair use, this development points to a shift in how premium training data might soon be treated in the market. It could challenge the long-standing practice of scraping publicly available web content, a practice already questioned in a recent case against Meta.

If you’re keeping an eye on AI and tech regulations, it’s worth watching how these practices evolve and what they mean for content creators and publishers alike.
