The integration of advanced AI tools into software development workflows has been heralded as a major productivity boost, yet recent empirical research suggests the picture is more complicated in practice. A randomized controlled trial conducted in early 2025 measured the real-world impact of state-of-the-art AI tools on experienced open-source developers. Over the course of the study, 16 seasoned contributors, averaging five years of experience on the mature codebases they maintained, completed hundreds of tasks, some with access to AI tools and others without.
The AI tools in question were frontier offerings as of early 2025: the Cursor Pro editor paired with Claude 3.5/3.7 Sonnet. Before beginning the tasks, developers confidently predicted that AI would cut their completion times by roughly 24%, and domain experts were even more optimistic, with forecasts of around a 39% speedup from economists and 38% from machine-learning researchers. The post-trial findings revealed a starkly different reality: developers actually took 19% longer on tasks when AI tools were allowed.
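To make the size of that miscalibration concrete, here is a back-of-the-envelope sketch. Only the 24% predicted speedup and 19% observed slowdown come from the study; the 60-minute baseline task is a hypothetical chosen purely for illustration.

```python
# Illustrative arithmetic only: the 24% and 19% figures are from the study;
# the 60-minute baseline is a hypothetical task duration without AI.
baseline_minutes = 60.0

predicted_with_ai = baseline_minutes * (1 - 0.24)  # forecast: 45.6 minutes
observed_with_ai = baseline_minutes * (1 + 0.19)   # measured: 71.4 minutes

# The ratio of observed to predicted time captures how far off the forecasts were.
miscalibration = observed_with_ai / predicted_with_ai  # ~1.57x

print(f"Predicted with AI: {predicted_with_ai:.1f} min")
print(f"Observed with AI:  {observed_with_ai:.1f} min")
print(f"Forecasts were off by a factor of ~{miscalibration:.2f}")
```

In other words, tasks developers expected to finish in about three quarters of the usual time instead took nearly 1.6 times longer than they had predicted.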
This paradoxical outcome challenges prevailing assumptions about AI's immediate benefits in skilled programming contexts. The study examined 20 plausible contributing factors, ranging from project complexity and tooling familiarity to quality standards and human-AI collaboration dynamics. While experimental limitations cannot be fully ruled out, the slowdown persisted across conditions, suggesting the issue lies in the nuanced interaction between developers and AI tools rather than in the experimental design itself.
These findings underline the importance of cautious optimism when integrating AI into high-skill domains like software engineering. Rather than a blanket productivity solution, AI may introduce new cognitive and coordination costs that offset, or even outweigh, its intended benefits, at least until tools and workflows evolve to better support expert users.