Discover the Latest Features of Claude Opus 4.8 AI Model

Anthropic has announced the release of an updated version of its flagship artificial intelligence model — Claude Opus 4.8. The model has received a number of improvements compared to the previous version, including enhanced performance in key benchmarks and advancements in transparency when working with code and generating responses.

This is reported by Business • Media

Improvements in Benchmarks and Functionality

Claude Opus 4.8 demonstrates significant progress in tests that evaluate the model’s ability to correct real errors in code and solve complex tasks. Specifically, in SWE-Bench Pro, the model achieved a score of 69.2% (compared to 64.3% in version 4.7), surpassing not only its predecessor but also its main competitor — OpenAI GPT-5.5, which scored 58.6%. In the OSWorld test (evaluating real tasks within operating systems), Claude Opus 4.8 scored 83.4%. In the GDPval-AA intelligence benchmark, the score was 1890 points, significantly higher than the previous version.
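To put the SWE-Bench Pro figures above in perspective, the gaps can be worked out directly from the three reported scores. The snippet below is only an illustrative sketch: the dictionary and helper names are invented for this example, and the only inputs are the percentages quoted in the article.

```python
# SWE-Bench Pro scores as reported in the article (percent of tasks solved).
swe_bench_pro = {
    "Claude Opus 4.8": 69.2,
    "Claude Opus 4.7": 64.3,
    "GPT-5.5": 58.6,
}


def gap_in_points(scores, model, baseline):
    """Difference between two scores, in percentage points (hypothetical helper)."""
    return round(scores[model] - scores[baseline], 1)


def relative_gain(scores, model, baseline):
    """Improvement relative to the baseline score, in percent (hypothetical helper)."""
    return round(100 * (scores[model] - scores[baseline]) / scores[baseline], 1)


# Gap to the previous version and to the main competitor, in percentage points.
print(gap_in_points(swe_bench_pro, "Claude Opus 4.8", "Claude Opus 4.7"))  # 4.9
print(gap_in_points(swe_bench_pro, "Claude Opus 4.8", "GPT-5.5"))          # 10.6

# Relative improvement over version 4.7.
print(relative_gain(swe_bench_pro, "Claude Opus 4.8", "Claude Opus 4.7"))  # 7.6
```

In other words, the reported lead is about 4.9 percentage points over version 4.7 and 10.6 points over GPT-5.5, assuming all three scores were measured under comparable conditions.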

At the same time, in specialized tests, particularly Terminal-Bench 2.1, Claude Opus 4.8 still lags behind GPT-5.5. However, in Humanity’s Last Exam — a comprehensive set of 2500 scientific questions — the model achieved 49.8% without tools and 57.9% with them, outperforming three main competitors.

Among expert feedback, the claim from Linkup stands out, stating that Claude Opus 4.8 is the only model that managed to pass all cases within the Super-Agent benchmark while maintaining pricing comparable to the previous version and GPT-5.5.

One of the main advantages is improved honesty: the model is four times less likely to hide its own errors in code and is less prone to making unverified claims. On cybersecurity, however, the company emphasized that Opus 4.8 does not outperform the closed model Mythos Preview.

“We tested the model on a set of cybersecurity tests, some of which we used for the first time in the system card. When operating without security measures, Opus 4.8 demonstrates somewhat higher capabilities than Claude Opus 4.7; with security measures, its performance is comparable. It still significantly lags behind Mythos Preview in cybersecurity capabilities,” the report on the model states.

On sensitive topics, the model shows the same results as Opus 4.7, although it acknowledges opposing viewpoints more often in political discussions. The report also notes a slight decrease in satisfaction with its responses.

Innovations and Company Prospects

Claude Opus 4.8 has received new features, including Dynamic Workflows in Claude Code. The model can now break down complex tasks into parts using subagents, allowing for higher quality work within a single session, and the results undergo additional verification. This capability is already available to users on the Enterprise, Team, and Max plans.

Another innovation is the ability to select the computation volume in the model selector: from Low to Max, with a default value of High. This affects the depth of responses and token consumption, and the feature is available for all pricing plans.

Fast Mode is now three times cheaper, allowing for faster query execution without sacrificing performance. Users can also refine and supplement queries while a task is running, without Claude re-reading the entire context each time.

Anthropic has also increased the query limits in Claude Code and announced preparations for the public release of the Mythos family, scheduled for the coming weeks. These models were previously considered too dangerous for open launch.

The release of Claude Opus 4.8 and the announcement of Mythos come as Anthropic prepares for an initial public offering (IPO). The company recently completed a Series H funding round, raising $65 billion at a valuation of $965 billion. That is more than double its previous valuation from February and even exceeds the market value of OpenAI.
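The funding figures above allow two quick back-of-the-envelope checks. This is a rough sketch rather than a financial analysis: the variable names are invented here, and the stake calculation assumes the $965 billion figure is a post-money valuation, which the article does not state.

```python
# Figures as reported in the article, in billions of US dollars.
raised_billion = 65.0
valuation_billion = 965.0

# "More than double the previous valuation" means the February valuation
# must have been below half of the new one.
previous_valuation_upper_bound = valuation_billion / 2

# ASSUMPTION: the $965 billion is a post-money valuation. Under that
# assumption, the new investors' combined stake is raised / valuation.
implied_stake_percent = round(100 * raised_billion / valuation_billion, 1)

print(previous_valuation_upper_bound)  # 482.5
print(implied_stake_percent)           # 6.7
```

Under these assumptions, the February valuation would have been below $482.5 billion, and the round would correspond to roughly a 6.7% stake in the company.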

Anthropic will use the funds raised to scale and develop its own high-performance computing capabilities. The company’s recent achievements, including the release of Opus 4.8 and the announcement of Mythos, intensify its competition with OpenAI, although neither company has disclosed a timeline for its upcoming IPO.
