Artificial intelligence does not always make programmers more efficient, especially on open source projects. A new study by Model Evaluation and Threat Research (METR) found that, in practice, experienced developers took 19% longer to complete tasks when they used AI tools.
This is reported by Business • Media
Disappointment in AI Productivity: Experiment Results
During the experiment, 16 professional developers worked on real tasks, from bug fixing to code refactoring, in large open source repositories. Half of the tasks were performed with AI tools such as Claude and Cursor Pro, while the other half were completed using traditional methods. Although the programmers expected a productivity boost of up to 24%, tasks done with AI were actually completed more slowly.
The main reasons for the time loss were the need to verify the generated code, wait for responses, and work around the tools' poor grasp of project context. In 56% of cases, developers had to manually refine the code suggested by the AI, and 9% of working time was spent solely on validating the AI’s responses.
“Screen recordings showed that while AI accelerates code writing and testing, this advantage is offset by the time spent formulating queries, checking results, and waiting for generation.”
AI Limitations in Complex Projects
Researchers emphasize that most popular benchmarks are based on simplified tasks, while in real projects, programmers deal with millions of lines of code and years of change history. In such conditions, understanding hidden dependencies, quality standards, and unwritten code requirements is critically important – and here AI currently shows weak results.
As a result, the researchers concluded that current AI tools are ineffective at complex tasks in mature projects, where quality matters more than speed. However, METR believes that the situation may improve as models advance, particularly Claude 3.7.
Overall, the study shows that while AI is already useful for certain aspects of programming, its practical use in large, complex projects remains limited for now. Developers and companies are advised to keep these limitations in mind and not to expect too much from AI-driven code automation.