- OpenAI and Paradigm launched EVMbench to evaluate AI systems' ability to handle Ethereum smart contract vulnerabilities.
- The benchmark uses 120 real-world audit findings and evaluates detection, patching and exploit capabilities in controlled environments.
- Early results show significant performance differences between GPT-5.3-Codex and GPT-5, highlighting rapid model advancement.
OpenAI has unveiled EVMbench, a smart contract security benchmark developed alongside crypto investment firm Paradigm to test artificial intelligence agents on Ethereum vulnerabilities. The framework is intended to determine whether AI systems can detect, exploit and fix serious flaws in Ethereum smart contracts.
Because smart contracts are typically immutable once deployed, errors can have lasting financial consequences. OpenAI said such contracts routinely secure more than US$100 billion (AU$141 billion) in open-source crypto assets, increasing the importance of rigorous security evaluation as AI coding capabilities advance.
Measuring AI Performance
The dataset underpinning EVMbench consists of 120 curated vulnerabilities drawn from 40 professional audits, with most sourced from open audit competitions including Code4rena. Additional scenarios stem from security auditing work for Tempo, a purpose-built Layer-1 blockchain designed to support high-throughput, low-cost stablecoin payments.
AI agents are assessed across three categories: detecting known vulnerabilities, patching contracts without compromising intended functionality, and executing exploit attempts inside a controlled blockchain environment. Exploit tasks are graded using deterministic transaction replay and on-chain checks.
In benchmark results, GPT-5.3-Codex achieved 72.2% in exploit mode, while GPT-5, released just over six months earlier, recorded 31.9%. OpenAI said the objective is to create a clear standard for evaluating AI systems in blockchain security as decentralised finance continues to grow.
The post OpenAI Launches EVMbench to Test AI’s Ability to Secure Ethereum Smart Contracts appeared first on Crypto News Australia.



