A detailed, visual, and simplified explanation of the Ethereum Yellow Paper. Thank you to Ethereum EVM Illustrated for the amazing illustrations.
Quick note: after The Merge, Ethereum has two services working together: the execution layer and the consensus layer. The Yellow Paper mostly defines the execution layer.
High Level about Ethereum Blockchain
This post focuses on post-Merge Ethereum, sometimes historically called Eth2.0. At a high level, Ethereum is a decentralized blockchain network: many nodes run across the world and communicate through peer-to-peer networks. You can see a live rough view of public nodes from Etherscan node tracker.
After The Merge, Ethereum is best understood as two cooperating layers:
- Execution layer: Handles transactions, smart contract execution, gas accounting, and state transitions.
- Consensus layer: Runs Proof of Stake, chooses block proposers, organizes attestations, and decides which blocks become part of the canonical chain.
A full node normally runs two clients:
- Execution client: Maintains the execution state, validates transactions, executes blocks, and exposes the JSON-RPC APIs used by wallets and applications.
- Consensus client: Participates in the beacon chain, follows consensus rules, and communicates with the execution client through the Engine API.
A validator node is a node that participates in Proof of Stake. To become a validator, an operator stakes 32 ETH and runs a validator client.
- Validator client: Holds validator duties, proposes blocks when selected, and attests to blocks proposed by other validators.
A simplified flow is:
- A user sends a transaction to a node.
- The execution client performs basic checks.
- If valid, the transaction is added to the local mempool and propagated through the execution-layer peer-to-peer network.
- For each slot, a validator is selected as the block proposer. The proposer can build a block locally or use an external block builder, such as MEV-Boost/Flashbots-style infrastructure, to optimize the block.
- The proposer includes an execution payload inside a beacon block.
- Other validators check the beacon block and attest to it.
- The execution client executes the payload, verifies the resulting state transition, and imports the block if valid.
- The process repeats for the next slot.
The image below illustrates this flow.
For a block to be considered strongly safe, we usually wait for finality. In normal conditions, Ethereum finalizes after two epochs, roughly 13-14 minutes.
Ethereum Blockchain
From this section onward, when we discuss the Ethereum blockchain from the Yellow Paper viewpoint, we are mainly talking about the Execution Layer: transactions, blocks, account state, and the state transition function.
“Ethereum, taken as a whole, can be viewed as a transaction-based state machine” - Ethereum Yellow Paper
It begins with the first state, the “genesis state”, and transitions state by state until it reaches the current state of the Ethereum chain.
Transactions are the valid links between two states, and each one carries a lot of information.
We can present this formally as:
$$\sigma_{t+1} \equiv \Upsilon(\sigma_t, T)$$
Where:
- $\Upsilon$ is the Ethereum state transition function. It represents the rules of computation.
- $\sigma$ is the state; $\sigma_t$ is the Ethereum state at a specific time $t$.
- $T$ is a transaction, e.g. a signed instruction from a user to transfer Ether or call a smart contract.
Together, this simple formula captures all of Ethereum's power: $\Upsilon$ allows components to carry out arbitrary computation, and $\sigma$ allows components to store arbitrary state between transactions.
A block ($B$) is a collection of transactions, a package of data.
The block-level state transition function ($\Pi$) applies a block's transactions in order. Blocks themselves are chained together using a cryptographic hash as a means of reference: each block records its transactions together with the hash of the previous block and an identifier for the final state.
From the viewpoint of the states, Ethereum can be seen as a state chain.
From the viewpoint of the implementation, Ethereum can also be seen as a chain of blocks, hence a blockchain.
From the viewpoint of the ledger, Ethereum can also be seen as a stack of transactions.
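To make the state-machine view concrete, here is a minimal Python sketch. It is a toy model, not the Yellow Paper's definition: the state is assumed to be a plain mapping from a name to a balance, a transaction is assumed to be a simple value transfer, and the names `upsilon` and `pi` are illustrative stand-ins for the transaction-level and block-level transition functions.

```python
from functools import reduce


def upsilon(state, tx):
    """Toy transaction-level transition: move tx['value'] from sender to recipient."""
    sender, recipient, value = tx["from"], tx["to"], tx["value"]
    if state.get(sender, 0) < value:
        raise ValueError("invalid transaction: insufficient balance")
    new_state = dict(state)  # each state is treated as an immutable snapshot
    new_state[sender] = new_state.get(sender, 0) - value
    new_state[recipient] = new_state.get(recipient, 0) + value
    return new_state


def pi(state, block):
    """Toy block-level transition: apply the block's transactions in order."""
    return reduce(upsilon, block["transactions"], state)


genesis = {"alice": 10, "bob": 0}
block = {
    "transactions": [
        {"from": "alice", "to": "bob", "value": 3},
        {"from": "bob", "to": "carol", "value": 1},
    ]
}
final_state = pi(genesis, block)
```

Because every transition returns a new mapping, the genesis state stays untouched, which mirrors the idea that each block leads from one well-defined state to the next.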
Value:
From $\Upsilon$ and $\sigma$ we can compute anything, but to limit and meter the computational power used within the network, there needs to be a common currency to price it.
Ethereum defines an intrinsic currency, Ether (ETH), along with several sub-denominations of Ether. Conversion: 1 Ether = $10^{18}$ Wei.
| Multiplier | Name |
|---|---|
| $10^0$ | Wei |
| $10^9$ | Gwei |
| $10^{12}$ | Szabo |
| $10^{15}$ | Finney |
| $10^{18}$ | Ether |
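The table above translates directly into a small conversion helper. The following is a minimal Python sketch; the function names `to_wei` and `from_wei` and the unit keys are illustrative choices, not part of any specification. It uses `Decimal` to avoid floating-point rounding, and it assumes amounts short enough to fit within Python's default decimal precision.

```python
from decimal import Decimal

# Multipliers from the denomination table, expressed in Wei.
WEI_PER_UNIT = {
    "wei": 10**0,
    "gwei": 10**9,
    "szabo": 10**12,
    "finney": 10**15,
    "ether": 10**18,
}


def to_wei(amount, unit):
    """Convert an amount in the given unit to an integer number of Wei."""
    wei = Decimal(str(amount)) * WEI_PER_UNIT[unit]
    if wei != wei.to_integral_value():
        raise ValueError("amount is not a whole number of Wei")
    return int(wei)


def from_wei(wei, unit):
    """Convert an integer number of Wei to a Decimal amount in the given unit."""
    return Decimal(wei) / Decimal(WEI_PER_UNIT[unit])
```

For example, `to_wei("1.5", "gwei")` gives the integer number of Wei for a gas price of 1.5 Gwei, and `from_wei(10**18, "ether")` converts one Ether's worth of Wei back to Ether.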
Since the system is decentralised, anyone can create a new block on top of an older existing block. When nodes disagree about the path from the root (genesis block) to the leaf (newest block), a fork is created: a new branch, possibly with different rules.
A fork can be created on purpose by developers who disagree with the current state of the chain, as with Ethereum Classic and the current Ethereum chain, or it can be a planned upgrade that adds and updates features.
The fork this blog mainly describes is the Shanghai fork, which starts at block 17034870.
Furthermore we can read more at Ethereum Roadmap.
Occasionally actors do not agree on a protocol change, and a permanent fork occurs. In order to distinguish between diverged blockchains, chain ID is introduced and the main Ethereum network will have chain id $\beta = 1$.
Block, State and Transaction:
World State:
Definition: the world state is a mapping from 160-bit addresses to account states. It is referenced through Ethereum’s trie structure rather than stored as one flat mapping on-chain.
A TRIE stores the global account state, and contract accounts may also have their own storage tries.
TRIE is an immutable data structure, so previous states can be referenced efficiently by their root hash.
Account state $\sigma[a]$:
Nonce: A scalar value equal to the number of transactions sent by an EOA or the number of contract creations performed by a contract. For account $a$ in state $\sigma$, denote it by $\sigma[a]_n$.
Balance: A scalar value equal to the number of wei held by the account. Denote it by $\sigma[a]_b$.
StorageRoot: A 256-bit hash of the root node of the trie that encodes the account’s storage contents. Denote it by $\sigma[a]_s$.
CodeHash: The hash of the EVM bytecode associated with this account. Denote it by $\sigma[a]_c$. If some byte sequence $b$ is the account code, then $KEC(b) = \sigma[a]_c$.
Here is the implementation in Go Ethereum (geth):
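In practice, we usually work with account mappings and code rather than manipulating trie roots directly, so we use the following equivalence:
$$\text{TRIE}\big(L_I^*(\sigma[a]_s)\big) \equiv \sigma[a]_s$$
where $L_I^*$ applies the following function to every key/value pair in the account's storage:
$$L_I\big((k, v)\big) \equiv \big(KEC(k), RLP(v)\big), \quad k \in \mathbb{B}_{32} \wedge v \in \mathbb{N}$$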
About the codeHash field: if $\sigma[a]_c = KEC(())$, then the account has no deployed code, so it is a simple account rather than a contract account.
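We define the world-state collapse function $L_s$:
$$L_s(\sigma) \equiv \{p(a) : \sigma[a] \neq \varnothing\}, \quad p(a) \equiv \big(KEC(a), RLP\big((\sigma[a]_n, \sigma[a]_b, \sigma[a]_s, \sigma[a]_c)\big)\big)$$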
$L_s$ is used together with the TRIE function to derive the identity, or root hash, of the world state.
An EMPTY account has no code, nonce 0, and balance 0.
A DEAD account is either absent from the world state or empty.
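The account fields and the EMPTY/DEAD definitions above can be sketched in Python as follows. This is an illustrative model, not client code: the `Account` class, the helper names, and the use of a plain dictionary for the world state are assumptions. The two constants are the widely published Keccak-256 hash of the empty byte string and the root hash of an empty trie.

```python
from dataclasses import dataclass

# KEC(()) : hash of empty code, and the root hash of an empty trie.
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
)
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
)


@dataclass(frozen=True)
class Account:
    nonce: int = 0                         # sigma[a]_n
    balance: int = 0                       # sigma[a]_b, in Wei
    storage_root: bytes = EMPTY_TRIE_ROOT  # sigma[a]_s
    code_hash: bytes = EMPTY_CODE_HASH     # sigma[a]_c


def has_code(account):
    """True when the code hash differs from the hash of empty code."""
    return account.code_hash != EMPTY_CODE_HASH


def is_empty(account):
    """EMPTY: no code, nonce 0, balance 0."""
    return not has_code(account) and account.nonce == 0 and account.balance == 0


def is_dead(world_state, address):
    """DEAD: absent from the world state, or present but empty."""
    account = world_state.get(address)
    return account is None or is_empty(account)
```

Note that an account with a non-zero balance but no code and nonce 0 is neither empty nor dead, which is why the balance check matters.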
EOAs vs Contract Accounts
| | EOA | Contract Account |
|---|---|---|
| Nonce | Yes | Yes |
| Balance | Yes | Yes |
| StorageRoot | Empty trie root | Storage trie root |
| CodeHash | Empty code hash by default; may be affected by EIP-7702 delegation semantics | Hash of deployed contract bytecode |
Note: EIP-7702 allows EOAs to delegate their execution to contract code, which makes them behave more like programmable accounts. The delegation stays in place until the account replaces or clears it.
Transaction:
Definition: a single cryptographically-signed instruction constructed by an actor external to the scope of Ethereum. There are several transaction types and two subtypes of transaction: message calls and the creation of new accounts with code (contract creation).
The Yellow Paper describes three transaction types, denoted 0, 1 and 2; later upgrades added types 3 and 4:
- Transaction type 0 (legacy): the original format from Ethereum's launch. It does not include newer features such as dynamic gas fees or access lists for smart contracts.
- Transaction type 1: follows EIP-2930 and introduces the `accessList` parameter, which specifies the addresses and storage keys the transaction expects to access.
- Transaction type 2: follows EIP-1559 and introduces `maxPriorityFeePerGas` and `maxFeePerGas`, which help transactors during periods of high network congestion. It is now the default transaction type because of its flexibility.
- Transaction type 3 (blob, `0x03`): follows EIP-4844 and handles large binary data more efficiently, which helps Layer 2 scaling for rollups such as zk-rollups.
- Transaction type 4: follows EIP-7702 and gives superpowers to EOAs by binding smart contract code to the code field of the EOA, making it behave like a smart contract account.
| Field | Meaning | Symbol | Transaction Type | Boundaries |
|---|---|---|---|---|
| type | Transaction type | $T_x$ | 0,1,2 | $\{0,1,2\}$ |
| nonce | Scalar: number of transactions sent by the sender. | $T_n$ | 0,1,2 | $\mathbb{N}_{256}$ |
| to | 160-bit address. Message call: recipient. Contract creation: empty value | $T_t$ | 0,1,2 | $\mathbb{B}_{0}$ or $\mathbb{B}_{20}$ |
| value | Number of Wei. | $T_v$ | 0,1,2 | $\mathbb{N}_{256}$ |
| r,s | Signature values, used to determine the sender. | $T_r$ and $T_s$ | 0,1,2 | $\mathbb{N}_{256}$ |
| accessList | List of access entries to warm up. Each access entry $E=(E_a,E_s)$ is a tuple of an account address and a list of storage keys. | $T_A$ | 1,2 | |
| chainId | Chain ID. Must be equal to the network chain ID $\beta$ | $T_c$ | 1,2 | $\beta$ |
| yParity | Signature Y parity | $T_y$ | 1,2 | $\{0,1\}$ |
| w | Combines chainId and yParity. $T_w=27+T_y$ or $T_w = 2\beta+35+T_y$ | $T_w$ | 0 | $\mathbb{N}_{256}$ |
| maxFeePerGas | Scalar = maximum number of Wei to be paid per unit of gas for all computation costs. | $T_m$ | 2 | $\mathbb{N}_{256}$ |
| maxPriorityFeePerGas | Scalar = maximum number of Wei to be paid per unit of gas to the block's fee recipient as an incentive to include the transaction | $T_f$ | 2 | $\mathbb{N}_{256}$ |
| gasPrice | Scalar = number of Wei to be paid per unit of gas | $T_p$ | 0,1 | $\mathbb{N}_{256}$ |
| init | Deployment code | $T_i$ | 0,1,2. Contract creation transaction | $\mathbb{B}$ |
| data | Calldata | $T_d$ | 0,1,2. Message call transaction | $\mathbb{B}$ |
We assume all components are interpreted by the RLP as integer values, with the exception of the access list $T_A$ and the arbitrary length byte arrays $T_i$ and $T_d$.
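The legacy `w` field from the table packs the signature parity and, optionally, the chain ID into one number. The two formulas can be sketched in Python as below; the function names `encode_w` and `decode_w` are illustrative, and using `None` to mean "no chain ID encoded" is an assumption of this sketch.

```python
def encode_w(y_parity, chain_id=None):
    """Build T_w from the Y parity and an optional chain ID."""
    if y_parity not in (0, 1):
        raise ValueError("y_parity must be 0 or 1")
    if chain_id is None:
        return 27 + y_parity              # T_w = 27 + T_y
    return 2 * chain_id + 35 + y_parity   # T_w = 2 * beta + 35 + T_y


def decode_w(w):
    """Return (y_parity, chain_id); chain_id is None when w carries no chain ID."""
    if w in (27, 28):
        return w - 27, None
    if w >= 35:
        return (w - 35) % 2, (w - 35) // 2
    raise ValueError("w is not a valid legacy signature value")
```

On mainnet ($\beta = 1$) the second form yields 37 or 38, so `decode_w(37)` recovers parity 0 and chain ID 1.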
Example of a transaction on the testnet:
Withdrawal (new in the Shapella upgrade)
A withdrawal $W$ is a tuple of data describing a consensus layer validator’s withdrawal of some amount of its staked Ether. A withdrawal is created and validated in the consensus layer, then pushed to the execution layer to be processed.
A withdrawal is composed of:
| Field | Meaning | Symbol | Boundaries |
|---|---|---|---|
| globalIndex | Increases from 0; unique identifier for the withdrawal | $W_g$ | $\mathbb{N}_{64}$ |
| validatorIndex | Index of the consensus layer validator | $W_v$ | $\mathbb{N}_{64}$ |
| recipient | 20-byte address that receives Ether from the withdrawal | $W_r$ | $\mathbb{B}_{20}$ |
| amount | Non-zero scalar = amount of Ether denominated in Gwei ($10^9$ Wei) | $W_a$ | $\mathbb{N}_{64}$ |
Withdrawal serialisation is defined as:
$$L_W(W) \equiv (W_g, W_v, W_r, W_a)$$
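On the execution layer, processing a withdrawal amounts to crediting the recipient with the amount converted from Gwei to Wei. A minimal Python sketch, assuming balances are kept in a plain dictionary keyed by address (the `Withdrawal` class and `apply_withdrawals` helper are illustrative names):

```python
from dataclasses import dataclass

WEI_PER_GWEI = 10**9


@dataclass(frozen=True)
class Withdrawal:
    global_index: int     # W_g
    validator_index: int  # W_v
    recipient: bytes      # W_r, 20-byte address
    amount_gwei: int      # W_a, denominated in Gwei


def apply_withdrawals(balances, withdrawals):
    """Return new balances (in Wei) after crediting every withdrawal in order."""
    updated = dict(balances)
    for w in withdrawals:
        updated[w.recipient] = updated.get(w.recipient, 0) + w.amount_gwei * WEI_PER_GWEI
    return updated
```

No transaction or gas is involved in this step: the balance increase is applied directly as part of block processing.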
Quick note: after The Merge, Ethereum changed from PoW to PoS and now has two different layers: the consensus layer and the execution layer. See Execution and Consensus Layers.
Block
A block is the collection of the block header $H$, the transactions $T$, the ommer headers $U$ (deprecated), and the validator withdrawals $W$ pushed from the consensus layer.
A block $B$ is formally defined as:
$$B \equiv (B_H, B_T, B_U, B_W)$$
A block header $B_H$ includes:
| Field | Meaning | Symbol | Boundaries |
|---|---|---|---|
| parentHash | $H_p =KEC(ParentBlockHeader)$ | $H_p$ | $\mathbb{B}_{32}$ |
| ommersHash | 256-bit hash field. Deprecated now = $KEC(RLP(()))$ | $H_o$ | $\mathbb{B}_{32}$ |
| beneficiary | 160-bit address to which priority fees from this block are transferred | $H_c$ | $\mathbb{B}_{20}$ |
| stateRoot | 256-bit hash of the root node of the state TRIE after all transactions are executed and all withdrawals are applied (this is all about the world state and account states) | $H_r$ | $\mathbb{B}_{32}$ |
| transactionsRoot | 256-bit hash of the root node of the TRIE populated with each transaction in the transactions list (this is all about the transactions) | $H_t$ | $\mathbb{B}_{32}$ |
| receiptsRoot | 256-bit hash of the root node of the TRIE populated with the receipts of each transaction in the transactions list | $H_e$ | $\mathbb{B}_{32}$ |
| logsBloom | The Bloom filter composed from indexable information - log entries from receipts of transactions | $H_b$ | $\mathbb{B}_{256}$ |
| difficulty | Scalar - deprecated - now equal 0 | $H_d$ | $\mathbb{N}$ |
| number | Scalar = number of ancestor blocks. Genesis block has number = 0 | $H_i$ | $\mathbb{N}$ |
| gasLimit | Scalar = limit of gas expenditure / block | $H_l$ | $\mathbb{N}$ |
| gasUsed | Scalar = total gas used in transactions for the block | $H_g$ | $\mathbb{N}$ |
| timestamp | Scalar = unix time at this block’s inception | $H_s$ | $\mathbb{N}_{256}$ |
| extraData | Byte array containing data relevant to the block; must be at most 32 bytes | $H_x$ | $\mathbb{B}$ |
| prevRandao | Latest RANDAO mix of the post beacon state of the previous block | $H_a$ | $\mathbb{B}_{32}$ |
| nonce | 64-bit value, deprecated. Current value = 0x0...0 | $H_n$ | $\mathbb{B}_{8}$ |
| baseFeePerGas | Scalar = base amount of Wei burned for each unit of gas consumed | $H_f$ | $\mathbb{N}$ |
| withdrawalsRoot | 256-bit hash of the root node of the TRIE populated with each withdrawal operation | $H_w$ | $\mathbb{B}_{32}$ |
Go struct for the block header:
Note: RANDAO is a pseudorandom value generated by validators on the Ethereum consensus layer. See Ethereum Consensus Specs.
Transaction Receipt:
Receipts carry information that is useful for forming a zk proof, indexing, and searching. The receipt of the $i$-th transaction (its index in the transactions array) is encoded as $B_R[i]$ and placed in an index-keyed TRIE whose root is $H_e$.
We can view it in Geth:
A transaction receipt $R$ has the form:
$$R \equiv (R_x, R_z, R_u, R_b, R_l)$$
Where:
- $R_x$ = type of the transaction.
- $R_z$ = status code: `0x00` for revert, `0x01` for success.
- $R_u$ = cumulative gas used in the block immediately after the specific transaction finishes executing.
- $R_l$ = set of logs created through execution.
- $R_b$ = Bloom filter composed from the information in those logs.
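Because $R_u$ is cumulative, the gas used by a single transaction is the difference between its receipt and the previous one. A minimal Python sketch of that arithmetic (the function name is illustrative, and the input is assumed to be the list of $R_u$ values in block order):

```python
def gas_used_per_transaction(cumulative_gas_used):
    """Recover each transaction's own gas usage from the cumulative receipt values."""
    per_transaction = []
    previous = 0
    for cumulative in cumulative_gas_used:
        if cumulative < previous:
            raise ValueError("cumulative gas used must not decrease")
        per_transaction.append(cumulative - previous)
        previous = cumulative
    return per_transaction
```

The last cumulative value in a block is the total gas used by the block's transactions, which is what the header field gasUsed records.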
Details:
$R_x$ is equal to the type of the corresponding transaction. The function $L_R$ prepares a receipt for transformation into an RLP-serialized byte array:
$$L_R(R) \equiv (R_z, R_u, R_b, R_l)$$
$R_z$ must be a non-negative integer.
$R_l$ is the series of log entries $(O_0, O_1, \dots)$.
A log entry is a tuple $O \equiv (O_a, (O_{t0}, O_{t1}, \dots), O_d)$:
- $O_a$ = logger address, $O_a \in \mathbb{B}_{20}$
- $O_t$ = log topics (can be empty), $\forall x \in O_t : x \in \mathbb{B}_{32}$
- $O_d$ = the log data, some number of bytes, $O_d \in \mathbb{B}$.
$R_b$ is the Bloom filter, which is a 256-byte hash value. Details are in the Appendix.
The receipts for the above transactions.
Checking the validity of a block:
A block must satisfy these constraints:
- The ommers field $B_U$ must be an empty array, and the block header $B_H$ must be consistent with the transactions $B_T$ and the withdrawals $B_W$.
- The state root $H_r$ must match the resultant state after executing all transactions, then all withdrawals, in order on the base state $\sigma$.
- $H_t$, $H_e$, $H_b$, $H_w$ (transactionsRoot, receiptsRoot, logsBloom, withdrawalsRoot) must be correctly derived from the transactions, the transaction receipts, the resulting logs, and the withdrawals respectively.
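In formal definition:
$$
\begin{aligned}
& B_U \equiv () \;\wedge \\
& H_r \equiv \text{TRIE}(L_s(\Pi(\sigma, B))) \;\wedge \\
& H_t \equiv \text{TRIE}(\{\forall i < \lVert B_T \rVert, i \in \mathbb{N} : p_T(i, B_T[i])\}) \;\wedge \\
& H_e \equiv \text{TRIE}(\{\forall i < \lVert B_R \rVert, i \in \mathbb{N} : p_R(i, B_R[i])\}) \;\wedge \\
& H_w \equiv \text{TRIE}(\{\forall i < \lVert B_W \rVert, i \in \mathbb{N} : p_W(i, B_W[i])\}) \;\wedge \\
& H_b \equiv \bigvee_{r \in B_R} (r_b)
\end{aligned}
$$
Here $\Pi$ is the block-level state transition function, and $p_T$, $p_R$, $p_W$ are pairwise RLP transformations of the index and the item, with special treatment for EIP-2718 typed transactions and receipts.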
Block header Validity:
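Define $P(H)$ as the parent block of the block with header $H$:
$$P(H) \equiv B' : KEC(RLP(B'_H)) = H_p$$
The block number $H_i$ is the parent's block number plus one:
$$H_i \equiv P(H)_{H_i} + 1$$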
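Starting with the London release, headers include baseFeePerGas $H_f$: the amount of Wei burned per unit of gas consumed while executing transactions within the block. The function that calculates it is defined as $F(H)$:
- If the current block is the first block of the London fork, the base fee is 1000000000 Wei.
- If the parent's gas used $P(H)_{H_g}$ equals the gas target $\tau$, the base fee is unchanged.
- If $P(H)_{H_g} > \tau$, the base fee increases by $v$.
- If $P(H)_{H_g} < \tau$, the base fee decreases by $v$.

Where $\tau$ and $v$ are defined as follows. The gas target is half of the parent's gas limit:
$$\tau \equiv \left\lfloor \frac{P(H)_{H_l}}{2} \right\rfloor$$
- If increase:
$$v \equiv \max\left(\left\lfloor \frac{P(H)_{H_f} \times (P(H)_{H_g} - \tau)}{8\tau} \right\rfloor, 1\right)$$
- If decrease:
$$v \equiv \left\lfloor \frac{P(H)_{H_f} \times (\tau - P(H)_{H_g})}{8\tau} \right\rfloor$$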
The concrete implementation is:
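As a language-neutral illustration of the base fee rule, here is a minimal Python sketch. It is not client code: the function name `next_base_fee` is illustrative, the constants follow EIP-1559 (elasticity multiplier 2, maximum change denominator 8), and the special case for the first London block is left out.

```python
ELASTICITY_MULTIPLIER = 2            # gas limit = 2 x gas target
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # at most a 1/8 change per block


def next_base_fee(parent_base_fee, parent_gas_used, parent_gas_limit):
    """Compute a block's base fee (Wei per gas) from its parent's header fields."""
    target = parent_gas_limit // ELASTICITY_MULTIPLIER
    if parent_gas_used == target:
        return parent_base_fee
    if parent_gas_used > target:
        delta = (
            parent_base_fee * (parent_gas_used - target)
            // target
            // BASE_FEE_MAX_CHANGE_DENOMINATOR
        )
        return parent_base_fee + max(delta, 1)  # an increase is at least 1 Wei
    delta = (
        parent_base_fee * (target - parent_gas_used)
        // target
        // BASE_FEE_MAX_CHANGE_DENOMINATOR
    )
    return parent_base_fee - delta
```

A completely full parent block raises the base fee by 12.5%, and a completely empty one lowers it by 12.5%, so sustained congestion makes the fee grow geometrically.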
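The canonical gas limit $H_l$ must satisfy these constraints:
$$H_l < P(H)_{H_l}' + \left\lfloor \frac{P(H)_{H_l}'}{1024} \right\rfloor \;\wedge\; H_l > P(H)_{H_l}' - \left\lfloor \frac{P(H)_{H_l}'}{1024} \right\rfloor \;\wedge\; H_l \geq 5000$$
Where $P(H)_{H_l}'$ is the parent gas limit, doubled only when the current block is the first block of the London fork:
$$P(H)_{H_l}' \equiv \begin{cases} P(H)_{H_l} \times 2 & \text{if } H_i = F_{\text{London}} \\ P(H)_{H_l} & \text{otherwise} \end{cases}$$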
$H_s$ is the timestamp of block $H$ and must be strictly greater than the parent's timestamp:
$$H_s > P(H)_{H_s}$$
prevRandao must be determined using information from the Beacon Chain (consensus layer). We denote this value as PREVRANDAO().
Validity Function:
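After defining all of this, how do we define the validity function $V(H)$? It must satisfy all of the constraints above at once.
Formally:
$$
\begin{aligned}
V(H) \equiv{} & H_g \leq H_l \;\wedge \\
& H_l < P(H)_{H_l}' + \left\lfloor \tfrac{P(H)_{H_l}'}{1024} \right\rfloor \;\wedge \\
& H_l > P(H)_{H_l}' - \left\lfloor \tfrac{P(H)_{H_l}'}{1024} \right\rfloor \;\wedge \\
& H_l \geq 5000 \;\wedge \\
& H_s > P(H)_{H_s} \;\wedge \\
& H_i = P(H)_{H_i} + 1 \;\wedge \\
& \lVert H_x \rVert \leq 32 \;\wedge \\
& H_f = F(H) \;\wedge \\
& H_o = KEC(RLP(())) \;\wedge \\
& H_d = 0 \;\wedge \\
& H_n = \texttt{0x0000000000000000} \;\wedge \\
& H_a = PREVRANDAO()
\end{aligned}
$$
Here $P(H)_{H_l}'$ is the parent gas limit, which differs from $P(H)_{H_l}$ only at the London fork block.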
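The header checks can be collected into one function. The following Python sketch is a simplified illustration under stated assumptions: the `Header` class and helper names are hypothetical, the London fork-block adjustment is ignored, and the ommersHash and prevRandao checks are omitted because they need Keccak-256 and beacon chain data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    number: int
    timestamp: int
    gas_limit: int
    gas_used: int
    base_fee_per_gas: int
    extra_data: bytes = b""
    difficulty: int = 0
    nonce: bytes = bytes(8)


def expected_base_fee(parent):
    """Base fee rule: target is half the parent's gas limit, change capped at 1/8."""
    target = parent.gas_limit // 2
    if parent.gas_used == target:
        return parent.base_fee_per_gas
    delta = parent.base_fee_per_gas * abs(parent.gas_used - target) // target // 8
    if parent.gas_used > target:
        return parent.base_fee_per_gas + max(delta, 1)
    return parent.base_fee_per_gas - delta


def header_errors(header, parent):
    """Return a list of violated constraints; an empty list means the checks pass."""
    errors = []
    bound = parent.gas_limit // 1024
    if header.gas_used > header.gas_limit:
        errors.append("gasUsed exceeds gasLimit")
    if not (parent.gas_limit - bound < header.gas_limit < parent.gas_limit + bound):
        errors.append("gasLimit moved too far from the parent's gasLimit")
    if header.gas_limit < 5000:
        errors.append("gasLimit is below 5000")
    if header.timestamp <= parent.timestamp:
        errors.append("timestamp is not later than the parent's")
    if header.number != parent.number + 1:
        errors.append("number is not parent number + 1")
    if len(header.extra_data) > 32:
        errors.append("extraData is longer than 32 bytes")
    if header.base_fee_per_gas != expected_base_fee(parent):
        errors.append("baseFeePerGas does not match the expected value")
    if header.difficulty != 0:
        errors.append("difficulty must be 0")
    if header.nonce != bytes(8):
        errors.append("nonce must be all zero bytes")
    return errors
```

Returning a list of messages instead of a single boolean is a design choice of this sketch: it shows which constraint failed, while `not header_errors(...)` still gives the yes/no answer.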
Example of a block on the Ethereum mainnet:
baseFeePerGas 148677693
difficulty 0
extraData 0x546974616e2028746974616e6275696c6465722e78797a29
gasLimit 60000000
gasUsed 11473415
hash 0xa8b200fb93f6b36a21e636ae6c33e043e7aeec7179f89c0d1d855508eab7962f
logsBloom 0x0c20c24bc9220d94372c004bc4700092210410010c01ec08c28910c824261c19002c1540a108284700125f418a2345a01e50316c1c016c230ae923102b20a2069802b900108a122a183e0d8c0145a2300840633694f3c92882180404a0a404a09483000a1a202a0015448c1801a6a99b9261482062829600a7c01690681b23000c208e184ca1e50901417100040e134245038e854300180e46b18744241c04324622142902b02059665311e818c85428008242840609c0008f68a2024ea838404a5cc8ae40825c103001434220e618c505be62675c2290700885140a430fb30382343c2100d0212080e210c8098092012d1fb0504d90006a040a2282b6003101
miner 0x4838B106FCe9647Bdf1E7877BF73cE8B0BAD5f97
mixHash 0x9c7e169d9ff3e26a1ef455cffe13929ff6bf9f227c4b34a0766d8fe8333fd38c
nonce 0x0000000000000000
number 24747612
parentHash 0xd4d99eca1069bf6282b5fe58de1f77e336c75ebf88fb2ef01dad4ceaad9d412b
parentBeaconRoot 0x6e646d848224b6cce1b3b84b3e6b7b3e3a87fcfa4e3f019b1d2bf589df088215
transactionsRoot 0x15c4eee221a497d9da0bb24ff391c5812b9886f95adf86813a719f40837ee969
receiptsRoot 0xc2cc226bc26433034643217303118667416c2509b3ff5e292c0c1f26bb912e99
sha3Uncles 0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347
size 61503
stateRoot 0x5b4abbb7977b06f950b4a0cc6061713af7be6eb3bcdc7a3592c70f2e49c856fb
timestamp 1774598999 (Fri, 27 Mar 2026 08:09:59 +0000)
withdrawalsRoot 0x587c1fa4c5c79ed7b852a2d1b1245e2b34b8a5d8016e139c308b862806838639
totalDifficulty 0
blobGasUsed 1048576
excessBlobGas 184989393
requestsHash 0xe3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
transactions: [
0x518337cfa421a43fd32bffbdb9050b20a86ce026a351d49e1b1fea45df1316d8
... truncated
0xceaf8ed0ea814dda246f6117c5fcda5a30e7ecf58fc843ccaeaf038f4998b41a
0xdb8e1be01d704a5452fda2f0a1d3e87f2ae16138c8b79e1a307d0690eb541f7a
0x8849e9c2f64b2943354421e2704149b3d2add962f621de21ac9409a78a7e71f7
]
Note: for backward compatibility, the mixHash field $H_m$ carries prevRandao $H_a$.
The transition to Proof of Stake removed miners, but the `miner` field in the block is now the fee recipient. The fees are handled as follows:
- The base fee is burned and disappears completely. The `miner` address gets none of it.
- The priority fee (tip), paid by users to get their transactions included faster, goes directly to this `miner` address.
- MEV (Maximal Extractable Value) payouts also go to this address. If a searcher or builder rearranged transactions to make a profit (like an arbitrage trade), they pay a cut to the validator, which lands here.
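The split between the burned base fee and the tip can be worked out per transaction. Here is a minimal Python sketch for a type 2 (EIP-1559) transaction; the function name `fee_breakdown` and the returned dictionary keys are illustrative, and the example values other than the base fee are made up.

```python
def fee_breakdown(base_fee_per_gas, gas_used, max_fee_per_gas, max_priority_fee_per_gas):
    """Split what a type 2 transaction pays into the burned part and the tip (all in Wei)."""
    if max_fee_per_gas < base_fee_per_gas:
        raise ValueError("maxFeePerGas is below the block's base fee")
    # The tip per gas is capped by whatever is left under maxFeePerGas.
    priority_fee_per_gas = min(max_priority_fee_per_gas, max_fee_per_gas - base_fee_per_gas)
    return {
        "burned": base_fee_per_gas * gas_used,
        "tip": priority_fee_per_gas * gas_used,
        "total": (base_fee_per_gas + priority_fee_per_gas) * gas_used,
    }


# Base fee taken from the example block; the transaction values are hypothetical.
example = fee_breakdown(
    base_fee_per_gas=148677693,
    gas_used=21000,
    max_fee_per_gas=2_000_000_000,
    max_priority_fee_per_gas=1_000_000_000,
)
```

Only the `tip` part reaches the fee recipient; the `burned` part is removed from circulation.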
Appendix:
Recursive Length Prefix:
Recursive Length Prefix (RLP) serialization is used extensively in Ethereum. RLP standardizes the transfer of data between nodes in a space-efficient format. It is a serialization method for encoding arbitrarily structured binary data.
RLP has these rules:
- A positive integer is converted to the shortest byte array whose big-endian interpretation is the integer, then encoded as a string.
- A single byte with a value in `0x00 - 0x7f` is its own encoding.
- A string `0-55` bytes long: RLP encoding = `(0x80 + len(string))` as a single byte, followed by `string`. The first byte is therefore in `[0x80, 0xb7]`.
- A string longer than 55 bytes: RLP encoding = `(0xb7 + len(to_binary(len(string))))` as a single byte, followed by `to_binary(len(string))`, followed by `string`.
- A string of $2^{64}$ bytes or longer is not encoded.
- A list whose payload (the concatenated RLP encodings of its elements) is at most 55 bytes long: RLP = `(0xc0 + len(payload))` as a single byte, followed by `payload`.
- A list whose payload is longer than 55 bytes: RLP = `(0xf7 + len(to_binary(len(payload))))` as a single byte, followed by `to_binary(len(payload))`, followed by `payload`.
RLP Encode:
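The encoding rules can be sketched in a few lines of Python. This is a minimal illustration rather than a production encoder: the function names are illustrative, Python `int`, `str`, `bytes` and `list` values are assumed as inputs, and the $2^{64}$ length limit is not checked.

```python
def int_to_bytes(value):
    """Shortest big-endian byte array for a non-negative integer (0 becomes b'')."""
    if value < 0:
        raise ValueError("RLP only encodes non-negative integers")
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def encode_length(length, offset):
    """Prefix for a payload of the given length; offset is 0x80 (string) or 0xc0 (list)."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = int_to_bytes(length)
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Encode an int, str, bytes, or (nested) list of those."""
    if isinstance(item, int) and not isinstance(item, bool):
        item = int_to_bytes(item)
    if isinstance(item, str):
        item = item.encode()
    if isinstance(item, (bytes, bytearray)):
        data = bytes(item)
        if len(data) == 1 and data[0] < 0x80:
            return data  # a single small byte is its own encoding
        return encode_length(len(data), 0x80) + data
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(element) for element in item)
        return encode_length(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")
```

For example, `rlp_encode([b"cat", b"dog"])` produces the list prefix `0xc8` followed by the two encoded strings.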
RLP decode:
Decoding reverses the encoding rules. For more detail, see Ethereum RLP Definition.
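Decoding reads the first byte to learn the item's kind and length, then recurses into list payloads. A minimal Python sketch follows; the function names are illustrative. It returns byte strings and nested lists, leaving integer interpretation to the caller, and it does not reject non-canonical encodings as a strict decoder would.

```python
def _decode_item(data, pos):
    """Decode one item starting at pos; return (item, position after the item)."""
    if pos >= len(data):
        raise ValueError("unexpected end of input")
    prefix = data[pos]
    if prefix < 0x80:  # single byte, its own encoding
        return data[pos:pos + 1], pos + 1
    if prefix <= 0xB7:  # short string
        start, length, is_list = pos + 1, prefix - 0x80, False
    elif prefix <= 0xBF:  # long string: prefix gives the size of the length field
        size = prefix - 0xB7
        start = pos + 1 + size
        length, is_list = int.from_bytes(data[pos + 1:start], "big"), False
    elif prefix <= 0xF7:  # short list
        start, length, is_list = pos + 1, prefix - 0xC0, True
    else:  # long list
        size = prefix - 0xF7
        start = pos + 1 + size
        length, is_list = int.from_bytes(data[pos + 1:start], "big"), True
    end = start + length
    if end > len(data):
        raise ValueError("declared length exceeds input")
    if not is_list:
        return data[start:end], end
    items, cursor = [], start
    while cursor < end:
        item, cursor = _decode_item(data, cursor)
        items.append(item)
    if cursor != end:
        raise ValueError("list payload length mismatch")
    return items, end


def rlp_decode(data):
    """Decode a complete RLP byte string into bytes or nested lists of bytes."""
    item, end = _decode_item(bytes(data), 0)
    if end != len(data):
        raise ValueError("trailing bytes after RLP item")
    return item
```

For example, `rlp_decode(b"\xc8\x83cat\x83dog")` gives back the list of the two byte strings `b"cat"` and `b"dog"`.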
Calculate Address:
How to derive the public key from the private key:
For Ethereum the curve is secp256k1. We denote sk = private key and pk = public key:
$$pk = sk \cdot G$$
G is the base point of the curve.
The wallet address is the last 20 bytes of keccak256(pk), where pk is the 64-byte uncompressed public key without the `0x04` prefix. In hex form this is keccak256(pk)[-40:], the last 40 hex characters.
The checksum encoding rule can be found in EIP-55.
Bloom filter:
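The Bloom filter function $M$ compresses a log entry into a single 256-byte hash.
Formal definition in Ethereum:
$$M(O) \equiv \bigvee_{x \in \{O_a\} \cup O_t} \big(M_{3:2048}(x)\big)$$
Here $M_{3:2048}$ is a specialised Bloom filter that sets three bits out of 2048 for a given byte sequence $x$. The bit positions come from the low-order 11 bits of each of the first three pairs of bytes of the Keccak-256 hash of $x$.
Details are in the Geth code.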