Chapter 12 Key Takeaways: The Ethereum Virtual Machine
Core Concepts
1. The EVM is a deterministic, stack-based virtual machine
Every Ethereum node runs its own copy of the EVM. Given identical state and the same transaction, every node must produce identical results. The EVM achieves this through a stack-based architecture with a 256-bit word size, approximately 140 opcodes, and strict sandboxing that forbids all nondeterministic operations (I/O, randomness, network access). The stack holds up to 1,024 items, but only the top 16 are directly accessible.
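The stack model described above can be sketched in a few lines. This is a toy illustration only, not a real interpreter: the `StackMachine` class, the `run` helper, and the tuple-based program format are hypothetical names invented for this example, and only four opcodes are modelled.

```python
WORD_MASK = (1 << 256) - 1  # every stack item is a 256-bit word
STACK_LIMIT = 1024          # maximum stack depth
MAX_REACH = 16              # DUP1..DUP16 reach at most 16 items deep


class StackMachine:
    """Toy model of the EVM stack (hypothetical, for illustration)."""

    def __init__(self):
        self.stack = []

    def push(self, value):
        if len(self.stack) >= STACK_LIMIT:
            raise OverflowError("stack limit of 1,024 items exceeded")
        self.stack.append(value & WORD_MASK)  # wrap to 256 bits

    def pop(self):
        if not self.stack:
            raise IndexError("stack underflow")
        return self.stack.pop()

    def dup(self, n):
        if not 1 <= n <= MAX_REACH:
            raise ValueError("only the top 16 items are reachable")
        if n > len(self.stack):
            raise IndexError("stack underflow")
        self.push(self.stack[-n])


def run(program):
    """Execute a list of (opcode, *operands) tuples and return the final stack."""
    vm = StackMachine()
    for op, *args in program:
        if op == "PUSH":
            vm.push(args[0])
        elif op == "ADD":
            vm.push(vm.pop() + vm.pop())
        elif op == "MUL":
            vm.push(vm.pop() * vm.pop())
        elif op == "DUP":
            vm.dup(args[0])
        else:
            raise ValueError(f"unknown opcode {op}")
    return vm.stack


# (2 + 3), duplicated, then multiplied by itself
print(run([("PUSH", 2), ("PUSH", 3), ("ADD",), ("DUP", 1), ("MUL",)]))  # [25]
# Arithmetic wraps modulo 2**256
print(run([("PUSH", WORD_MASK), ("PUSH", 1), ("ADD",)]))                # [0]
```

Because the sketch has no source of input other than the program itself, running it twice on the same program always yields the same stack, which is the property every node depends on.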
2. The three data locations have dramatically different costs
- Stack: Free (cost is embedded in opcodes). Volatile. Limited to 1,024 items.
- Memory: Cheap for small sizes (3 gas per word), but expansion cost is quadratic. Volatile — resets each call.
- Storage: Extremely expensive (20,000 gas for new slot, 5,000 for update, 2,100 for cold read). Persistent across transactions. This is where state lives.
The single most important gas optimization principle: minimize storage writes. One SSTORE to a new slot (20,000 gas) costs as much as roughly 6,700 ADD operations (3 gas each).
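To make the "quadratic expansion" point concrete, the memory cost formula from this chapter, 3n + floor(n^2 / 512), can be evaluated directly. The function names below are hypothetical; the sketch assumes memory only grows, and that a call pays the difference between the new and old total cost.

```python
def memory_cost(words):
    """Total gas for a memory of `words` 32-byte words: 3n + floor(n^2 / 512)."""
    return 3 * words + words * words // 512


def expansion_cost(old_words, new_words):
    """Gas charged when memory grows from old_words to new_words."""
    if new_words <= old_words:
        return 0  # touching already-paid-for memory costs nothing extra
    return memory_cost(new_words) - memory_cost(old_words)


for n in (1, 32, 1024, 32768):
    print(n, memory_cost(n))
# 1 3
# 32 98
# 1024 5120
# 32768 2195456
```

At 32 words (1 KiB) the quadratic term contributes only 2 gas; at 32,768 words (1 MiB) it accounts for over 2 million of the roughly 2.2 million total.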
3. Gas costs reflect real network resource consumption
Gas pricing is not arbitrary. It tracks the hardware cost hierarchy: CPU operations (ADD: 3 gas) are cheapest, memory operations (MLOAD: 3 gas + expansion) are moderate, and disk/state operations (SSTORE: up to 20,000 gas) are expensive. The cold/warm distinction (EIP-2929) adds precision by charging more for first-time access (disk read) than cached access (memory hit). Gas costs are also a security mechanism — they prevent DoS attacks, ensure execution termination, and historically provided implicit reentrancy protection through the .transfer() gas stipend.
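The cold/warm distinction can be modelled as a per-transaction set of already-touched storage keys. The sketch below is a simplification under stated assumptions: `AccessTracker` is a hypothetical name, only SLOAD is modelled, and access lists and reverts are ignored.

```python
COLD_SLOAD = 2100  # first access to a slot within the transaction
WARM_SLOAD = 100   # any later access to the same slot


class AccessTracker:
    """Tracks which (address, slot) pairs have been touched in one transaction."""

    def __init__(self):
        self.accessed = set()

    def sload_cost(self, address, slot):
        key = (address, slot)
        if key in self.accessed:
            return WARM_SLOAD
        self.accessed.add(key)  # the slot is warm from now on
        return COLD_SLOAD


tx = AccessTracker()
print(tx.sload_cost("0xToken", 0))  # 2100 (cold)
print(tx.sload_cost("0xToken", 0))  # 100  (warm)
print(tx.sload_cost("0xToken", 1))  # 2100 (different slot, cold again)
```

A new `AccessTracker` per transaction mirrors the fact that warmth does not carry over between transactions.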
4. Solidity compiles to two distinct bytecode phases
Creation bytecode runs once during deployment: it executes the constructor, initializes state, then copies and returns the runtime bytecode. Runtime bytecode is stored permanently on-chain and contains the function dispatcher (matching 4-byte selectors from calldata to function bodies) and all function implementations. Understanding this two-phase model explains why constructors cannot be called after deployment and why contract code is immutable.
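The function dispatcher in runtime bytecode behaves much like a lookup on the first four bytes of calldata. The sketch below is illustrative only: the `SELECTORS` table and `dispatch` function are hypothetical, the two selectors are the well-known ERC-20 values hard-coded rather than computed (keccak256 is not in the Python standard library), and routing everything else to "fallback" is an assumption about how the contract was written.

```python
# Well-known ERC-20 selectors, hard-coded for the example.
SELECTORS = {
    bytes.fromhex("70a08231"): "balanceOf(address)",
    bytes.fromhex("a9059cbb"): "transfer(address,uint256)",
}


def dispatch(calldata: bytes) -> str:
    """Return the name of the function body the dispatcher would jump to."""
    if len(calldata) < 4:
        return "fallback"  # too short to contain a selector
    return SELECTORS.get(calldata[:4], "fallback")


print(dispatch(bytes.fromhex("70a08231") + bytes(32)))  # balanceOf(address)
print(dispatch(bytes.fromhex("deadbeef")))              # fallback
print(dispatch(b""))                                    # fallback
```

Note that nothing in this table refers to the constructor: it ran as part of the creation bytecode and is simply absent from the deployed code.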
5. Contract interaction opcodes have distinct security properties
- CALL: Executes code at another address in a new context (new stack, memory; target's storage).
- STATICCALL: Like CALL but enforces read-only — reverts on any state modification.
- DELEGATECALL: Executes the target's code in the caller's context: storage reads and writes hit the caller's storage, and msg.sender and msg.value are inherited unchanged from the calling frame. Powers proxy patterns but is dangerous if storage layouts are mismatched.
The call stack is limited to 1,024 frames. Failed calls push 0 on the stack; they do not automatically revert the caller.
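The difference between CALL and DELEGATECALL comes down to whose storage the executed code touches, and both report failure as a 0 rather than by reverting the caller. The following is a minimal model of that behaviour, not EVM semantics in full: `Account`, `call`, `delegatecall`, and the two sample "contracts" are hypothetical, Python callables stand in for bytecode, and gas and value transfer are ignored.

```python
class Account:
    def __init__(self, name, code=None):
        self.name = name
        self.code = code   # a Python callable standing in for bytecode
        self.storage = {}  # slot -> value


def _execute(code, storage, msg_sender):
    """Run code against a storage dict; roll back and return 0 on failure."""
    snapshot = dict(storage)
    try:
        code(storage, msg_sender)
        return 1
    except Exception:
        storage.clear()
        storage.update(snapshot)  # the failed frame's writes are discarded
        return 0                  # the caller keeps running


def call(caller, target):
    # New context: the target's storage, msg.sender is the caller.
    return _execute(target.code, target.storage, caller.name)


def delegatecall(caller, target, msg_sender):
    # Target's code, but the caller's storage and the inherited msg.sender.
    return _execute(target.code, caller.storage, msg_sender)


def set_owner(storage, msg_sender):
    storage[0] = msg_sender  # writes slot 0 of whichever storage it is given


def always_fails(storage, msg_sender):
    storage[0] = "garbage"
    raise RuntimeError("revert")


proxy = Account("proxy")
logic = Account("logic", set_owner)

print(call(proxy, logic), logic.storage, proxy.storage)   # 1 {0: 'proxy'} {}
print(delegatecall(proxy, logic, "alice"), proxy.storage)  # 1 {0: 'alice'}
print(call(proxy, Account("bad", always_fails)))           # 0
```

The second line shows why mismatched storage layouts are dangerous: the logic contract's idea of "slot 0" is applied to the proxy's storage without any check.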
6. ABI encoding standardizes all external contract communication
Function calls are identified by a 4-byte selector (first 4 bytes of keccak256 of the function signature). Parameters are encoded in 32-byte words: static types inline, dynamic types via offset/data pointers. Events encode indexed parameters as topics (enabling efficient filtering) and non-indexed parameters in the data field. Custom errors (Solidity 0.8.4+) use the same selector scheme, saving gas versus string error messages.
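For static types, the encoding described above is simply the selector followed by one left-padded 32-byte word per argument. A minimal sketch for `transfer(address,uint256)` follows; `encode_transfer` is a hypothetical helper, the selector `a9059cbb` is hard-coded as a known value rather than derived, and dynamic types (which need offsets) are deliberately out of scope.

```python
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")  # transfer(address,uint256)


def encode_transfer(to_address: str, amount: int) -> bytes:
    """Build calldata: 4-byte selector + two 32-byte words."""
    hex_addr = to_address[2:] if to_address.startswith("0x") else to_address
    addr = bytes.fromhex(hex_addr)
    if len(addr) != 20:
        raise ValueError("an address is exactly 20 bytes")
    addr_word = addr.rjust(32, b"\x00")     # left-pad to a full word
    amount_word = amount.to_bytes(32, "big")  # uint256, big-endian
    return TRANSFER_SELECTOR + addr_word + amount_word


calldata = encode_transfer("0x" + "11" * 20, 1000)
print(len(calldata))        # 68 = 4 + 32 + 32
print(calldata[:4].hex())   # a9059cbb
print(calldata[-32:].hex()) # 1000 as a 32-byte word, ending in 03e8
```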
7. The sandbox is the foundation of trustless execution
The EVM's inability to perform I/O, access files, generate randomness, or make network requests is not a limitation — it is the mechanism that makes decentralized consensus possible. Every restriction eliminates a source of nondeterminism. The oracle problem (needing external data on-chain) is a direct consequence of this design, solved through oracle networks rather than VM modification.
Common Misconceptions
| Misconception | Reality |
|---|---|
| "The EVM is slow because it's on a blockchain" | The EVM is slow by design — it prioritizes determinism and security over speed. Every node must produce identical results. |
| "Storage is just like a database" | Storage is a key-value mapping where every write costs 5,000-20,000 gas and persists forever on every full node. It is fundamentally different from a database. |
| "Solidity types like uint8 save gas" | At the EVM level, all arithmetic operates on 256-bit words. Smaller types add masking overhead. uint8 saves gas only in struct packing (multiple small values in one storage slot). |
| ".transfer() prevents reentrancy" | .transfer() forwards only 2,300 gas, which historically prevented SSTORE in callbacks. But gas cost changes (EIP-1283/2200) showed this is fragile. Use explicit reentrancy guards instead. |
| "Contract code is always immutable" | Proxy patterns (DELEGATECALL) allow logic changes. Before EIP-6780, metamorphic contracts (CREATE2 + SELFDESTRUCT) allowed actual code replacement at the same address. |
| "CREATE2 addresses are random" | CREATE2 addresses are deterministic — computed from deployer, salt, and init code hash. This is a feature (counterfactual deployment) and a former risk (metamorphic contracts). |
Rules of Thumb
- When estimating gas, count the SSTOREs. Storage operations typically dominate total gas cost for any function that modifies state. Everything else is rounding error by comparison.
- Cache storage reads in local variables. The first SLOAD costs 2,100 gas (cold). Reading the same slot again costs only 100 gas (warm), but reading a local variable on the stack costs 3 gas. For values read multiple times, `uint256 cached = stateVar;` saves gas.
- Pack structs carefully. Solidity packs multiple variables smaller than 256 bits into single storage slots. Ordering struct fields by size (smallest together) minimizes the number of slots and thus the number of SSTORE operations.
- Never rely on gas costs for security. Opcode gas costs change with network upgrades. Any security assumption that depends on a specific gas cost (like the 2,300 stipend preventing SSTORE) can be invalidated by a future EIP.
- DELEGATECALL is a loaded weapon. It executes foreign code with full access to your storage. Use established proxy patterns (OpenZeppelin's TransparentUpgradeableProxy or UUPSUpgradeable) rather than rolling your own.
- Check return values from CALL. At the EVM level, a failed CALL does not revert the caller. Solidity's high-level syntax adds checks automatically, but low-level `.call()` does not.
- Understand what lives where. If you cannot answer "is this variable on the stack, in memory, or in storage?" for every piece of data in your function, you do not yet understand your contract's gas profile.
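
Two of the rules above, caching storage reads and packing structs, lend themselves to quick back-of-the-envelope arithmetic. The sketch below is a rough model under simplifying assumptions: both function names are hypothetical, the read model counts only SLOAD and a 3-gas stack read per use, and the packing model places fields into 32-byte slots in declaration order without splitting a field across slots.

```python
def repeated_read_cost(reads, cached):
    """Approximate gas to use one storage value `reads` times in a function."""
    if reads == 0:
        return 0
    if cached:
        return 2100 + 3 * reads    # one cold SLOAD, then stack reads
    return 2100 + 100 * (reads - 1)  # one cold SLOAD, then warm SLOADs


def slots_used(field_sizes):
    """Count storage slots for fields given as byte sizes in declaration order."""
    slots = 0
    free = 0  # bytes still free in the current slot
    for size in field_sizes:
        if not 1 <= size <= 32:
            raise ValueError("field sizes must be 1..32 bytes")
        if size > free:  # does not fit: start a new slot
            slots += 1
            free = 32
        free -= size
    return slots


print(repeated_read_cost(5, cached=False))  # 2500
print(repeated_read_cost(5, cached=True))   # 2115
print(slots_used([1, 32, 1]))  # 3: uint8, uint256, uint8
print(slots_used([1, 1, 32]))  # 2: uint8, uint8, uint256
```

Under this model, reordering three fields removes one slot, and therefore one potential 20,000-gas SSTORE, without changing what the struct stores.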
Key Formulas and Numbers
| Item | Value |
|---|---|
| Stack depth limit | 1,024 items |
| Stack access limit | Top 16 items (DUP/SWAP) |
| Word size | 256 bits (32 bytes) |
| Memory cost formula | 3n + floor(n^2 / 512), where n = 32-byte words |
| SSTORE (new slot) | 20,000 gas |
| SSTORE (update) | 5,000 gas |
| SSTORE (clear) + refund | 5,000 gas - 4,800 refund = 200 net |
| SLOAD (cold) | 2,100 gas |
| SLOAD (warm) | 100 gas |
| CALL (cold) | 2,600 gas |
| CREATE / CREATE2 | 32,000 gas base |
| Code deployment fee | 200 gas per byte of runtime code |
| Max contract size | 24,576 bytes (EIP-170) |
| Call stack depth limit | 1,024 frames |
| Function selector | First 4 bytes of keccak256(canonical_signature) |
| Gas stipend (.transfer) | 2,300 gas |
| Refund cap | 20% of gas used by the transaction (EIP-3529) |
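
Several rows of this table combine naturally into quick estimates. The sketch below uses only the numbers listed above; the function names are hypothetical, and the deployment figure deliberately excludes the base transaction cost, calldata, and constructor execution, so it is a lower bound rather than a full estimate.

```python
CREATE_BASE = 32_000     # CREATE / CREATE2 base cost
CODE_DEPOSIT = 200       # gas per byte of runtime code
MAX_CODE_SIZE = 24_576   # EIP-170 limit in bytes


def deployment_cost(runtime_size):
    """Lower-bound gas for deploying `runtime_size` bytes of runtime code."""
    if runtime_size > MAX_CODE_SIZE:
        raise ValueError("runtime code exceeds the EIP-170 size limit")
    return CREATE_BASE + CODE_DEPOSIT * runtime_size


def capped_refund(gas_used, refund_counter):
    """Apply the EIP-3529 cap: at most one fifth (20%) of gas used."""
    return min(refund_counter, gas_used // 5)


print(deployment_cost(1_000))          # 232000
print(deployment_cost(MAX_CODE_SIZE))  # 4947200
print(capped_refund(100_000, 48_000))  # 20000 (capped)
print(capped_refund(100_000, 4_800))   # 4800  (under the cap)
```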
Bridge to Chapter 13
You now understand the machine. In Chapter 13, you will learn to program it. Solidity is a high-level language that compiles to the opcodes, stack operations, and storage patterns covered in this chapter. Every Solidity construct — mapping, require, modifier, event, inheritance — has a direct translation to EVM bytecode. When a Solidity function "does not work as expected," the EVM-level behavior is always the ground truth. The bytecode does not lie.