A new study from the research organisation METR reveals a surprising trend among seasoned open-source developers. Despite a strong belief that AI coding tools accelerate work, developers who used them actually took longer to finish their tasks.
In a trial conducted in early 2025, 16 experienced developers tackled 246 tasks, split between AI-assisted work (using tools like Cursor Pro with models such as Claude 3.5 and 3.7 Sonnet) and traditional coding without AI help. While the developers initially expected a 24% speed boost, the reality was the opposite: tasks took 19% longer when AI was in the mix. In concrete terms, a task that would take an hour without assistance ended up taking roughly 71 minutes with it, where the developers had forecast about 46.
The study used a randomised controlled trial (RCT), widely considered the gold standard for assessing cause and effect. By recording task durations and adjusting for task difficulty using the developers' own time estimates, the researchers were able to isolate the impact of AI. Even more intriguingly, after finishing their work, developers still believed AI had made them 20% faster, seemingly missing the extra time that had crept into their workflow.
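To make that adjustment concrete: the study's exact statistical model isn't spelled out here, but a common approach is to regress log task duration on an AI-assignment flag while controlling for the developer's own pre-task estimate as a difficulty proxy. The sketch below is illustrative only, using synthetic data and hypothetical variable names rather than the study's real records:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the study's per-task records (illustrative only):
# each row has the developer's pre-task time estimate, a flag for whether
# AI tools were allowed, and the observed completion time.
rng = np.random.default_rng(seed=0)
n_tasks = 246
estimate = rng.lognormal(mean=3.0, sigma=0.6, size=n_tasks)   # minutes
ai_allowed = rng.integers(0, 2, size=n_tasks)                 # 1 = AI-assisted
# Simulate a 19% slowdown on AI-assisted tasks, plus per-task noise.
noise = np.exp(0.3 * rng.standard_normal(n_tasks))
duration = estimate * noise * np.where(ai_allowed == 1, 1.19, 1.0)

df = pd.DataFrame({
    "log_duration": np.log(duration),
    "log_estimate": np.log(estimate),
    "ai_allowed": ai_allowed,
})

# Regress log duration on the AI flag, controlling for estimated difficulty;
# exponentiating the coefficient gives the multiplicative effect on time.
fit = smf.ols("log_duration ~ ai_allowed + log_estimate", data=df).fit()
effect = np.exp(fit.params["ai_allowed"]) - 1
print(f"Estimated effect of AI on task time: {effect:+.1%}")  # roughly +19%
```

Controlling for the developer's own estimate is what allows an effect like "19% longer" to be attributed to the AI condition itself, rather than to harder tasks happening to land in the AI group.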
The explanation lies in how AI tools shift the shape of the work. Rather than simply coding faster, developers found themselves spending more time on activities like prompting, reviewing AI output, and waiting on generations. One developer pointed out that in mature, complex projects with high quality standards, such extra oversight is almost inevitable. For new projects or rapid prototypes, by contrast, these tools might indeed offer a noticeable boost.
These findings highlight the need for more nuanced ways to evaluate generative AI's real impact. Traditional benchmarks measure isolated tasks, but real-world development involves the context, review, and quality standards that those benchmarks miss. If you've ever felt that a tool isn't quite living up to its promise, this study suggests it may be worth rethinking how we measure AI's benefits in everyday development work.