Anthropic has released a new study showing that modern AI models can identify vulnerabilities in smart contracts running on the Ethereum and BNB Chain blockchains. Claude Sonnet 4.5, Claude Opus 4.5, and GPT-5 were tested against the SCONE-bench dataset, which covers incidents from 2020 to 2025.
This is reported by Business • Media.
AI Effectiveness at Finding Exploits in Smart Contracts
In the experiments, the models were able to reproduce working exploits for roughly half of the recorded hacking incidents; the total value of assets held in the affected contracts at the time of the attacks is estimated to have exceeded $550 million. Special attention went to contracts hacked after March 2025, a period outside the models' training data. On that subset, the AI agents identified 19 vulnerabilities out of 34, equivalent to about $4.6 million in potential losses.

Importantly, the models had no prior information about these cases and even uncovered new types of vulnerabilities. The best result came from Claude Opus 4.5, which generated exploits for 17 cases (50% of the sample), worth approximately $4.5 million. Together, the three models identified 19 vulnerabilities, 55.8% of the test set, with the same estimated $4.6 million in losses.
Open Benchmark and Future Prospects
Anthropic also ran additional tests on recently deployed contracts to assess the AI's ability to find previously unknown issues. In this round of testing, the models identified two "zero-day" vulnerabilities, confirming their capability to detect defects without historical data or prior signals.
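Anthropic has not published the specific defects here, but a classic example of the kind of bug automated auditors are asked to catch is reentrancy, where a contract sends funds before updating its own state. The sketch below is a minimal Python simulation of that ordering bug (all names are hypothetical and nothing in it is taken from SCONE-bench):

```python
# Illustrative only: a Python simulation of a reentrancy-style defect,
# a well-known smart-contract bug class. Not code from the Anthropic study.

class VulnerableVault:
    """Pays out before zeroing the balance -- the classic reentrancy ordering bug."""
    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total += amount

    def withdraw(self, user, callback):
        amount = self.balances.get(user, 0)
        if amount > 0:
            callback(amount)          # external call happens FIRST ...
            self.balances[user] = 0   # ... state is updated only AFTER
            self.total -= amount

vault = VulnerableVault()
vault.deposit("attacker", 100)
vault.deposit("victim", 900)

drained = []
def reenter(amount):
    # The attacker's callback calls withdraw again while the
    # balance has not yet been zeroed.
    drained.append(amount)
    if len(drained) < 3:
        vault.withdraw("attacker", reenter)

vault.withdraw("attacker", reenter)
print(sum(drained))   # 300: three withdrawals of a single 100 deposit
```

On a real chain the fix is the checks-effects-interactions pattern: update the balance before making the external call, so a re-entrant withdrawal sees a zero balance.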
"The study does not aim to exploit vulnerabilities but is focused on creating tools to assess the ability of AI systems to recognize defects in code," Anthropic said.
The company plans to make SCONE-bench an open standard for testing and comparing the capabilities of large language models in the field of smart contract security. The authors emphasize that such tools can significantly aid in development and auditing, allowing critical errors to be identified before the code is deployed on the blockchain.
At the same time, Anthropic stresses that the study covers only a sample of historical contracts under controlled conditions and therefore does not reflect the full picture of risks. The company will continue to expand the benchmark and explore how AI can support teams working on the security of blockchain protocols.