Artificial intelligence chatbots occasionally mirror narratives that align with Chinese state perspectives, especially when tackling sensitive topics. The American Security Project (ASP) recently dug into how Chinese Communist Party (CCP) censorship and disinformation are seeping into the training data of leading models, including OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Gemini, DeepSeek’s R1, and xAI’s Grok.
If you’ve ever struggled to ensure that your digital tools are both balanced and reliable, you’ll understand the challenge here. Microsoft’s Copilot, for instance, has come under scrutiny for often presenting CCP-backed viewpoints in a seemingly credible tone, while Grok tends to take a more critical stance toward Chinese state narratives. This contrast underscores the varied approaches across the tech landscape.
The core issue is the vast datasets these models are trained on. Astroturfing, in which state-sponsored agents generate content under false identities, floods these datasets with biased material that is hard for developers to filter out. When state media then amplifies that content, keeping AI outputs factual and impartial becomes a significant hurdle.
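To make the filtering problem concrete, here is a minimal sketch of a provenance-based filter, assuming each training document carries a source URL and that a curated blocklist of state-media domains is available; the domain names and document format below are invented for illustration and are not drawn from the ASP report.

```python
from urllib.parse import urlparse

# Hypothetical blocklist of state-media domains; a real pipeline would
# rely on curated, regularly updated feeds rather than a hard-coded set.
STATE_MEDIA_DOMAINS = {"example-state-outlet.cn", "example-front-site.com"}

def is_flagged(doc: dict) -> bool:
    """Return True if the document's source URL resolves to a flagged domain."""
    host = urlparse(doc.get("url", "")).netloc.lower()
    # Match the domain itself or any of its subdomains.
    return any(host == d or host.endswith("." + d) for d in STATE_MEDIA_DOMAINS)

def filter_corpus(docs):
    """Yield only documents whose provenance is not on the blocklist."""
    return (doc for doc in docs if not is_flagged(doc))

corpus = [
    {"url": "https://example-state-outlet.cn/story", "text": "..."},
    {"url": "https://news.example.org/report", "text": "..."},
]
print([doc["url"] for doc in filter_corpus(corpus)])  # keeps only the second document
```

The sketch also shows why astroturfing is so effective: content posted under false identities on ordinary platforms carries no state-media provenance, so a domain blocklist alone cannot catch it.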
Navigating these challenges is particularly tricky for companies operating globally. In China, strict regulations require AI chatbots to promote ‘socialist values’ and ‘positive energy’, with severe consequences for deviations. As a result, topics like ‘Tiananmen Square’ or ‘democracy’ may be scrubbed from the conversation entirely, so responses can differ markedly depending on the language of the query.
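As a simplified illustration of how such scrubbing can work at the output layer, the sketch below applies a naive keyword filter of the kind a compliance layer might impose; the blocked terms and refusal text are invented for this example and do not describe any vendor’s actual implementation.

```python
# Illustrative only: a naive keyword-based output filter. Separate term
# lists per language hint at why answers can diverge by query language.
BLOCKED_TERMS = {
    "en": ["tiananmen"],
    "zh": ["六四"],  # shorthand for June 4th, widely censored on Chinese platforms
}

def compliance_filter(response: str, lang: str) -> str:
    """Replace a response with a deflection if it contains a blocked term."""
    lowered = response.lower()
    if any(term in lowered for term in BLOCKED_TERMS.get(lang, [])):
        return "Let's talk about something else."  # deflect instead of answering
    return response

print(compliance_filter("The Tiananmen Square protests of 1989...", "en"))
```

Because each language carries its own term list, the same underlying model can answer freely in one language while deflecting in another.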
The ASP investigation also revealed a fascinating language divide. When asked in English about COVID-19’s origins, models like ChatGPT, Gemini, and Grok leaned toward the widely supported theory of zoonotic transmission, while still nodding to the lab-leak debate. In contrast, when the same questions were posed in Chinese, the narrative shifted, with responses often describing the outbreak as an ‘unsolved mystery.’
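This kind of cross-lingual comparison is straightforward to reproduce. The sketch below assumes the official openai Python client with an OPENAI_API_KEY set in the environment; the model name and prompts are placeholders rather than the ASP’s actual methodology.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The same question posed in English and in Chinese.
PROMPTS = {
    "en": "What do scientists currently believe about the origins of COVID-19?",
    "zh": "科学家目前如何看待新冠病毒的起源？",
}

def ask(prompt: str, model: str = "gpt-4o") -> str:
    """Send a single-turn prompt and return the model's reply."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

for lang, prompt in PROMPTS.items():
    print(f"--- {lang} ---\n{ask(prompt)}\n")
```

Running the same probe across several models and languages, then comparing the answers side by side, is essentially the kind of experiment the ASP describes.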
A similar pattern emerged with Hong Kong’s civil liberties. English prompts tended to elicit acknowledgements of diminished freedoms, while responses to Chinese prompts downplayed these concerns, hewing closer to CCP narratives. Notably, on the Tiananmen Square Massacre, only Grok, when queried in English, pointed to the documented military violence against civilians; responses in Chinese were further sanitised.
At its core, the ASP report is a reminder that AI alignment is only as good as its training data. As someone who values reliable technology, you know the risks that misaligned AI can pose—not just to industries but to the very fabric of democratic institutions and national security. Broadening access to verified data is a key step towards mitigating these challenges.