Microsoft Tests AI Agents in Magentic Marketplace and Identifies Their Flaws


Microsoft, in collaboration with Arizona State University, conducted a series of experiments with leading artificial intelligence models in a purpose-built simulation environment called Magentic Marketplace. The platform allowed researchers to explore how AI agents behave in competitive and cooperative conditions and to identify their main weaknesses.

This was reported by Business • Media.

Conducted Experiments and Results

During the trials, hundreds of AI agents interacted on a digital marketplace: customer agents performed tasks such as placing food orders, while business agents competed with one another for deals. The source code of the Magentic Marketplace simulation is open, so third-party teams can replicate and extend the research.

The tests demonstrated that modern AI models, particularly GPT-4o, GPT-5, and Gemini 2.5 Flash, are vulnerable to manipulation. Researchers found that agents could be influenced to favor certain sellers, which calls their autonomy into question. In addition, as the number of options available to an agent increased, its performance declined significantly due to cognitive overload.

Collaboration and Autonomy Issues in AI

Another significant issue was the agents’ inability to collaborate effectively without clear instructions. When the models received detailed step-by-step guidance, their performance improved, but even then they remained limited in their ability to allocate roles and make decisions independently.

“The key question is whether autonomous systems can effectively interact and negotiate without human oversight.”

According to Ece Kamar, head of the AI Frontiers Lab at Microsoft Research, the results of the experiment indicate a significant gap between the current capabilities of AI agents and the level of autonomous operation expected of them. Despite advances in generative AI, fully autonomous agent systems capable of making complex decisions in real-world environments remain a long way off.

It was previously reported that the team at nof1.ai organized a competition for trading crypto assets among six artificial intelligence models.