Case Study 1: The Optimism Fraud Proof Problem — Why No Fraud Proof Had Ever Been Submitted (Until Recently)

DataField.Dev

The Paradox of Unused Security

In the spring of 2024, the Optimism rollup had been operating on Ethereum mainnet for over two years. During that time, it had processed hundreds of millions of transactions and secured billions of dollars in user assets. On paper, its entire security model rested on a single premise: if anyone ever posted an invalid state root, an honest observer would submit a fraud proof during the 7-day challenge period, and the fraudulent state would be rejected.

There was just one problem. In over two years of operation, not a single fraud proof had ever been submitted. Not one. And for most of that period, the fraud proof system was not even functional.

This was not a secret. Optimism opened its mainnet to the public in December 2021 with what the team and the broader community openly called "training wheels." The rollup operated with a centralized sequencer that posted state roots to Ethereum, but the on-chain contract did not have a working fraud proof verification mechanism. Instead, a multisig (a set of trusted signers controlled by the Optimism Foundation) had the power to override any state root. If the sequencer posted something invalid, the multisig could intervene. If the multisig itself went rogue, there was no on-chain recourse.

The question that haunted the L2 security community was both technical and philosophical: Is a system secure if its safety mechanism has never been tested? And was it even accurate to call Optimism a "rollup" during this period, or was it something closer to a multisig-guarded sidechain?

The Technical Background

To understand why Optimism launched without fraud proofs, you need to understand the engineering difficulty involved.

A fraud proof system for an optimistic rollup must do something remarkably ambitious: it must allow Ethereum's Layer 1 to re-execute a disputed computation and determine who is correct — the sequencer who posted the state root, or the challenger who claims it is invalid. This means L1 must be able to emulate the rollup's execution environment.
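The core idea, re-executing the disputed computation and comparing the result with the claimed state root, can be illustrated with a toy model. This is a minimal sketch, not Optimism's implementation: the balance-map state, the `apply_transfer` transition, and the SHA-256 `state_root` commitment are all hypothetical stand-ins chosen for brevity.

```python
import hashlib
import json

def state_root(state: dict) -> str:
    """Hypothetical commitment: hash of the canonically serialized balances."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def apply_transfer(state: dict, tx: dict) -> dict:
    """Toy state transition: move `amount` from `sender` to `recipient`."""
    new_state = dict(state)
    if new_state.get(tx["sender"], 0) < tx["amount"]:
        return new_state  # an unaffordable transfer leaves the state unchanged
    new_state[tx["sender"]] -= tx["amount"]
    new_state[tx["recipient"]] = new_state.get(tx["recipient"], 0) + tx["amount"]
    return new_state

def adjudicate(pre_state: dict, tx: dict, claimed_root: str) -> str:
    """Re-execute the disputed transaction and report which party was right."""
    recomputed_root = state_root(apply_transfer(pre_state, tx))
    return "sequencer" if recomputed_root == claimed_root else "challenger"

pre = {"alice": 10, "bob": 0}
tx = {"sender": "alice", "recipient": "bob", "amount": 4}
honest = state_root({"alice": 6, "bob": 4})    # the correct post-state
forged = state_root({"alice": 6, "bob": 400})  # an invalid post-state

print(adjudicate(pre, tx, honest))  # sequencer
print(adjudicate(pre, tx, forged))  # challenger
```

The hard part in a real system is that `apply_transfer` is replaced by the rollup's full execution environment, and the re-execution has to run inside an L1 contract.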

Optimism's original architecture used the OVM (Optimistic Virtual Machine), a modified version of the EVM designed to be "provable" on L1. The OVM recompiled smart contracts so that certain operations (like reading block timestamps or interacting with L1 state) were intercepted and handled through a sandbox. In theory, a fraud proof would re-execute the disputed transaction inside this sandbox on L1.

In practice, the OVM was fiendishly complex. Every edge case in the EVM — every precompiled contract, every gas accounting subtlety, every interaction between storage and memory — had to be faithfully replicated in the on-chain sandbox. A single discrepancy between the L2 execution environment and the L1 fraud proof sandbox would mean that a legitimate fraud proof could fail, or that a fraudulent proof could succeed.

The Optimism team decided that shipping a functional L2 with training wheels was better than waiting years for a perfect fraud proof system. The multisig provided a safety net, and the team's reputation provided an additional (albeit informal) guarantee.

The Criticism

The decision was controversial. Critics raised several pointed objections:

The marketing mismatch. Optimism was frequently described (by the team, by media, by DeFi dashboards) as a "rollup." But the technical definition of a rollup includes the ability for users to verify the state and force-withdraw assets using on-chain proofs. Without a working fraud proof system, Optimism's security was equivalent to trusting the multisig. L2Beat, which tracks rollup security, accordingly classified Optimism as "Stage 0": full training wheels.

The precedent concern. If the largest optimistic rollups could operate without fraud proofs and still attract billions in TVL, what incentive was there to implement them at all? The training wheels were comfortable. Users clearly did not care (or did not understand the distinction). The fear was that fraud proofs would become a perpetual "coming soon" feature.

The trust assumption. The Optimism multisig was controlled by a small number of individuals, many of whom were employees or close associates of OP Labs. While these individuals were generally well-regarded in the Ethereum community, the security of billions of dollars rested on the assumption that none of them were compromised, coerced, or colluding. This was the exact type of trust assumption that blockchain technology was supposed to eliminate.

The verification gap. Even if the multisig was honest, the lack of a fraud proof mechanism meant that the system had no formal way to detect whether the sequencer was silently posting incorrect state roots. The multisig signers would need to independently verify every state root, and if they relied on the same software as the sequencer, a common-mode bug could affect both.

The Defense

Optimism's defenders offered counterarguments:

Pragmatic shipping. The alternative to launching without fraud proofs was not launching at all. The L2 ecosystem needed live systems to attract developers, users, and feedback. Waiting for perfect security before shipping would have delayed the entire L2 ecosystem by years.

Transparency. Unlike many projects that obscure their trust assumptions, Optimism was relatively transparent about its training wheels. The team published detailed documentation of the multisig setup, the upgrade process, and the roadmap for fraud proofs. Users who deposited funds did so with information about the risks that was, at least in theory, available to them.

Incremental security. The path to Stage 2 was always envisioned as incremental. First, launch with a multisig (Stage 0). Then, implement fraud proofs with a Security Council backstop (Stage 1). Finally, remove the training wheels entirely (Stage 2). This is how most complex systems are built — incrementally, with safety nets at each stage.

The broader ecosystem comparison. At the time of Optimism's launch, the alternative for users was either paying exorbitant L1 fees or using sidechains (like Polygon PoS) with even weaker security guarantees. Even with training wheels, Optimism arguably offered a better security profile than a sidechain with 100 validators.

The Resolution: Fault Proofs Go Live

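On June 10, 2024, Optimism activated its Fault Proof VM (FPVM) on mainnet, the first permissionless fraud proof system for the OP Stack. (Optimism's own documentation calls these "fault proofs"; this case study uses "fraud proof" throughout.) This was a milestone roughly two and a half years in the making.

The dispute process built around the FPVM works as follows: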

1. A proposer posts an output root (state root) to L1, along with a bond (a deposit that is forfeited if the claim is proven false).
2. During the challenge period, any challenger can dispute the output root by initiating a bisection game on L1.
3. The bisection game proceeds through multiple rounds, each one halving the disputed range of the execution trace, until the dispute is narrowed to a single instruction.
4. The disputed instruction is executed in the MIPS VM, an on-chain emulator of the MIPS instruction set architecture that runs a compiled version of the OP Stack's state transition function.
5. The game resolves: the loser forfeits their bond, and the correct output root is accepted.
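The bisection and single-step stages can be sketched in a few lines. This is a simplified model under stated assumptions: the one-register VM in `step`, the `commit` hash, and the fully materialized lists of commitments are hypothetical. In the real protocol the two parties alternate moves on L1 rather than exposing whole traces, and the single step is a MIPS instruction rather than this toy arithmetic.

```python
import hashlib
from typing import List, Optional

def commit(value: int) -> str:
    """Hypothetical state commitment: hash of the VM's single register."""
    return hashlib.sha256(str(value).encode()).hexdigest()

def step(value: int, index: int) -> int:
    """Toy VM instruction, standing in for one MIPS step."""
    return (value * 3 + index) % 1_000_003

def run_trace(start: int, steps: int, corrupt_at: Optional[int] = None) -> List[int]:
    """Return the state after each step; optionally corrupt one step's result."""
    states = [start]
    for i in range(steps):
        nxt = step(states[-1], i)
        if corrupt_at is not None and i == corrupt_at:
            nxt += 1  # a faulty proposer computes this one step wrongly
        states.append(nxt)
    return states

def bisect_dispute(proposer: List[str], challenger: List[str]) -> int:
    """Find i such that both sides agree on state i but disagree on state i + 1.

    Assumes the parties agree on the first commitment and disagree on the last.
    """
    lo, hi = 0, len(proposer) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if proposer[mid] == challenger[mid]:
            lo = mid  # still in agreement: the fault lies later
        else:
            hi = mid  # already diverged: the fault lies earlier
    return lo

def resolve(agreed_state: int, index: int, proposer_claim: str, challenger_claim: str) -> str:
    """Execute the single disputed step and see whose claim it matches."""
    correct = commit(step(agreed_state, index))
    if correct == proposer_claim:
        return "proposer"
    if correct == challenger_claim:
        return "challenger"
    return "neither"

honest_trace = run_trace(7, 1024)
faulty_trace = run_trace(7, 1024, corrupt_at=600)
proposer_claims = [commit(s) for s in faulty_trace]
challenger_claims = [commit(s) for s in honest_trace]

disputed = bisect_dispute(proposer_claims, challenger_claims)
winner = resolve(honest_trace[disputed], disputed,
                 proposer_claims[disputed + 1], challenger_claims[disputed + 1])
print(disputed, winner)  # 600 challenger
```

The point of the design is visible in the loop: a trace of 1,024 steps is narrowed down in about ten rounds, so L1 only ever has to execute one step.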

The use of MIPS (a well-understood, minimal instruction set) was a deliberate architectural choice. Rather than trying to prove EVM execution directly on L1 (the approach that proved too complex with the original OVM), the OP team compiled the entire state transition function into MIPS and proved individual MIPS instructions on L1. This made the on-chain verification component much simpler and easier to audit.

Within the first few months of deployment, the system was battle-tested through a bug bounty program and several deliberately triggered challenges. And then, inevitably, the first real fraud proof was submitted — not because of a malicious sequencer, but because of a bug in a third-party proposer that submitted an incorrect output root. The fraud proof system worked as designed: the incorrect output root was challenged and rejected, and the challenger received the proposer's bond.

The Philosophical Aftermath

The Optimism fraud proof saga raises questions that extend beyond any single rollup:

Can security be retroactive? Optimism operated for over two years without its primary safety mechanism. During that time, billions of dollars were at risk of multisig failure. The fact that no attack occurred does not retroactively make the system secure during that period — it makes the system lucky. Or does it? If the multisig was genuinely honest and competent, and the system was monitored by many eyes, perhaps the practical security was adequate even without formal fraud proofs.

What is the right standard for "good enough" security? Traditional finance operates with circuit breakers, insurance funds, and regulatory backstops — not mathematical proofs. The blockchain ethos demands trustlessness, but the path to trustlessness is not instantaneous. Is there a principled way to evaluate "how trustless is trustless enough" at each stage of development?

How should users evaluate L2 risk? The L2Beat Stage classification (0, 1, 2) provides a useful framework, but most users never check it. DeFi dashboards typically show TVL and yields, not security stages. This information asymmetry is a persistent problem in the L2 ecosystem.

Is the honest minority assumption practical? Now that fraud proofs work, the security depends on at least one honest party watching the chain and willing to submit a fraud proof. But who, specifically, is watching? Is it economically rational for a third party to run fraud-detecting infrastructure when the expected return (winning a bond) is low and the expected frequency of fraud is near zero? The mechanism works in theory, but its practical robustness depends on the presence of well-funded, always-online watchers.
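The watcher's incentive problem in the last question can be made concrete with a back-of-the-envelope expected-value calculation. Every number below is hypothetical and chosen only to show the shape of the argument; none is an actual Optimism bond size or operating cost.

```python
def watcher_expected_profit(
    fraud_probability_per_year: float,
    bond_reward: float,
    challenge_cost: float,
    infra_cost_per_year: float,
) -> float:
    """Expected yearly profit of running a watcher, in one currency unit.

    The reward is only collected if fraud occurs and the watcher wins the
    challenge; the infrastructure cost is paid regardless.
    """
    expected_reward = fraud_probability_per_year * (bond_reward - challenge_cost)
    return expected_reward - infra_cost_per_year

def break_even_probability(
    bond_reward: float, challenge_cost: float, infra_cost_per_year: float
) -> float:
    """Yearly fraud probability at which watching exactly pays for itself."""
    return infra_cost_per_year / (bond_reward - challenge_cost)

# Hypothetical inputs: 1% yearly chance of fraud, a 100,000 bond,
# 5,000 in challenge gas costs, 12,000 per year to run the infrastructure.
profit = watcher_expected_profit(0.01, 100_000, 5_000, 12_000)
threshold = break_even_probability(100_000, 5_000, 12_000)
print(f"{profit:.2f}")     # -11050.00
print(f"{threshold:.3f}")  # 0.126
```

Under these assumed figures, a purely profit-driven watcher loses money unless fraud is expected more than about once every eight years, which suggests why watchers in practice tend to be parties with their own funds at stake rather than bounty hunters.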

Lessons for the L2 Ecosystem

1. Training wheels are a valid engineering approach, but they must be temporary. Launching with a multisig and a clear roadmap to trustlessness is reasonable. A system that operates with a multisig indefinitely is not a rollup; it is a trusted bridge.
2. Transparency about trust assumptions is non-negotiable. Users cannot evaluate risk if they do not understand the system's actual security properties. The L2Beat classification system is one of the most important contributions to the ecosystem precisely because it makes these distinctions clear.
3. The first use of a safety mechanism is the most important. Until a fraud proof (or any safety mechanism) is tested in production, it is an untested theory. The controlled challenges and the first real fraud proof on Optimism were critical validation events.
4. Security is a spectrum, not a binary. The journey from Stage 0 to Stage 2 is not a flip of a switch. Each stage removes a trust assumption and adds a formal guarantee. Users should understand where on this spectrum their rollup falls.
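The Stage 0 to Stage 2 progression, as this case study describes it, reduces to a small decision rule. The sketch below encodes only that simplified description; the `RollupSecurity` type and its two fields are hypothetical, and L2Beat's actual criteria involve considerably more detail.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RollupSecurity:
    fraud_proofs_live: bool     # a working, permissionless proof system is on-chain
    council_can_override: bool  # a multisig or Security Council can still intervene

def stage(rollup: RollupSecurity) -> int:
    """Map the two trust assumptions onto the simplified stage ladder."""
    if not rollup.fraud_proofs_live:
        return 0  # security rests entirely on the trusted signers
    if rollup.council_can_override:
        return 1  # proofs work, but a backstop can still step in
    return 2      # training wheels removed

print(stage(RollupSecurity(fraud_proofs_live=False, council_can_override=True)))  # 0
print(stage(RollupSecurity(fraud_proofs_live=True, council_can_override=True)))   # 1
print(stage(RollupSecurity(fraud_proofs_live=True, council_can_override=False)))  # 2
```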

Discussion Questions

1. Would you have deposited assets into Optimism before fraud proofs were activated? What factors would influence your decision?
2. The Optimism team chose to launch without fraud proofs rather than delay for years. Was this the right decision? What would you have done differently?
3. How should DeFi dashboards and aggregators communicate L2 security properties to users? Should protocols be required to display their L2Beat Stage classification?
4. If no fraud proof is ever submitted in the normal course of operations (because the sequencer is honest), how can users be confident that the system would work if it were needed? Is there a way to regularly test the safety mechanism without disrupting the rollup?