
Global AI Safety Report: Experts Urge Enhanced Research and Collaboration

May 14, 2025

Over 100 leading AI experts from eleven countries have joined forces to call for deeper research into controlling general-purpose AI systems. The gathering, held at the Singapore Conference on AI in April 2025, produced the Singapore Consensus on Global AI Safety Research Priorities, which emphasises oversight of the creation of new AI models.

The report zeroes in on general-purpose AI (GPAI): systems capable of a wide range of cognitive tasks, from language processing to autonomous decision-making. By setting political debates aside, it lays out a clear agenda for technical progress.

It outlines three key research areas: risk assessment, building trustworthy systems, and post-deployment control. If you have ever grappled with the unpredictability of new technology, you will appreciate this straightforward breakdown, which is designed to nurture a trusted ecosystem that supports innovation while minimising societal risks.

Risk assessment is the starting point. Here, experts advocate for standardised audit techniques and benchmarks to measure AI-related dangers and social impacts. Developing precise, repeatable methods for gauging risk thresholds is certainly challenging, but it’s a critical step—one that borrows tried-and-tested approaches from industries like nuclear safety and aviation.
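
The report itself stops at the level of research priorities, but to make the idea concrete, here is a minimal sketch of what comparing audited benchmark scores against pre-agreed risk thresholds could look like. The risk categories, class names, and numbers below are hypothetical, not taken from the report.

```python
from dataclasses import dataclass

@dataclass
class RiskThreshold:
    """A pre-agreed, auditable limit for one risk category (hypothetical)."""
    category: str
    max_score: float  # benchmark score above which the model is flagged

def audit(scores: dict[str, float], thresholds: list[RiskThreshold]) -> list[str]:
    """Return the risk categories in which a model exceeds its agreed threshold."""
    return [t.category for t in thresholds
            if scores.get(t.category, 0.0) > t.max_score]

# Illustrative numbers only; real thresholds would come from standardised audits.
thresholds = [RiskThreshold("cyber-offense", 0.20),
              RiskThreshold("bio-uplift", 0.05)]
scores = {"cyber-offense": 0.31, "bio-uplift": 0.02}
print(audit(scores, thresholds))  # ['cyber-offense']
```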

Early detection of dangerous capabilities is another focus. Whether the concern is support for cyberattacks or biological threats, the report recommends “uplift studies” that examine whether AI systems might inadvertently boost the effectiveness of malicious users.
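
To illustrate what an uplift study measures, consider a hypothetical trial: compare how often participants with model access complete a benign proxy task against a control group working without it. The figures in this sketch are invented for illustration only.

```python
def uplift(assisted_successes: int, assisted_n: int,
           control_successes: int, control_n: int) -> float:
    """Relative uplift: how much more often the AI-assisted group succeeds
    at the proxy task than the control group without model access."""
    assisted_rate = assisted_successes / assisted_n
    control_rate = control_successes / control_n
    return assisted_rate / control_rate

# Hypothetical trial with 120 participants per arm.
print(f"uplift factor: {uplift(54, 120, 30, 120):.2f}")  # -> 1.80
```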

The second pillar of this initiative involves building trustworthy systems. This means carefully defining desired behaviours and regularly checking that systems perform as expected. Even minor errors in setting human goals can lead to problems like reward hacking or deceptive outcomes. To address these issues, researchers are exploring robust training methods, targeted model editing, and even designing agent-free or capacity-limited models.

Post-deployment control is the final area under review. Once AI systems go live, maintaining oversight becomes crucial. Traditional monitoring, emergency protocols, and enhanced surveillance measures form part of a scalable strategy to keep systems under control. Research into system “corrigibility”—ensuring that systems remain amenable to correction—further supports this effort.
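
None of this is specified as code in the consensus, but a toy sketch helps convey the shape of the idea: a deployed model wrapped in a runtime monitor that logs violations and triggers an emergency stop the system cannot override. The MonitoredAgent class and its policy check are hypothetical stand-ins, not anything drawn from the report.

```python
import logging
from typing import Callable, Optional

logging.basicConfig(level=logging.INFO)

class MonitoredAgent:
    """Wraps a deployed model with a runtime policy check and a shutdown hook."""

    def __init__(self, model: Callable[[str], str],
                 violates_policy: Callable[[str], bool]):
        self.model = model                      # callable: prompt -> response
        self.violates_policy = violates_policy  # callable: response -> bool
        self.halted = False

    def respond(self, prompt: str) -> Optional[str]:
        if self.halted:
            return None
        response = self.model(prompt)
        if self.violates_policy(response):
            self.emergency_stop(reason="policy violation")
            return None
        return response

    def emergency_stop(self, reason: str) -> None:
        """Corrigibility in miniature: the agent cannot refuse this correction."""
        logging.warning("Halting deployed system: %s", reason)
        self.halted = True

# Toy usage with a stand-in model and policy.
agent = MonitoredAgent(model=lambda p: p.upper(),
                       violates_policy=lambda r: "FORBIDDEN" in r)
print(agent.respond("hello"))      # 'HELLO'
print(agent.respond("forbidden"))  # triggers the emergency stop, returns None
```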

The report goes beyond individual systems by recommending broader measures to monitor the entire AI ecosystem. With strategies like model tracking, watermarking, comprehensive logging infrastructure, and stronger authentication standards, the goal is to effectively manage issues such as deepfakes and oversee open-source models responsibly.
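
Real watermarking and authentication schemes are far more sophisticated than anything shown here, but purely as an illustration of the logging and authentication side, generated content could be tied to an authenticated provenance record that downstream platforms verify. The key handling and record format below are simplifications invented for this sketch.

```python
import hashlib, hmac, json, time

SIGNING_KEY = b"shared-secret-for-illustration-only"  # real systems need proper key management

def provenance_record(content: bytes, model_id: str) -> dict:
    """Create an authenticated log entry tying generated content to its source model."""
    digest = hashlib.sha256(content).hexdigest()
    payload = json.dumps({"model": model_id, "sha256": digest,
                          "timestamp": int(time.time())}, sort_keys=True)
    tag = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify(record: dict) -> bool:
    """Check that a record was produced by a holder of the signing key."""
    expected = hmac.new(SIGNING_KEY, record["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["tag"])

record = provenance_record(b"generated image bytes", model_id="example-model-v1")
print(verify(record))  # True
```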

Notably, the safety measures outlined are designed to benefit all stakeholders, including competitors. By establishing common technical risk thresholds, the AI community can counteract potential risks collectively.

Edited by prominent figures such as Yoshua Bengio, Stuart Russell, and Max Tegmark, and enriched by contributions from respected institutions like Tsinghua, Berkeley, MILA, and OpenAI, this consensus underscores the importance of shared safety standards and inter-organisational collaboration in advancing AI safety.
