This blog will explain what is Merkle and Merkle Tree Proofs and how can we leverage it in verifying the correctness of our transactions in blockchain.
Merkle Tree
A Merkle Tree is a tree data structure (typically a binary tree) of nodes, where each leaf node is data block, mostly it contains the hash of the original data stored somewhere else (eg: the transaction hash in a blockchain), and each parent node contains the hash of both of its children’s combined hash.
This data structure is often used in distributed systems, like in Bitcoin and other blockchains, to verify if a transaction hash belongs to the Merkle root of a block header, in a lightweight, efficient and fast way.
The use of Merkle Tree and proof in Bitcoin allows creates a technique called Simple Payment Verification (SPV), which is a way to check if a transaction is actually part of a block, without having to download the whole block of data or entire blockchain.
Merkle Tree Example:
If we are given list of data: [L1, L2, L3, L4], below is the merkle tree which can be formed using it.
If the number of data points are odd, meaning leaf nodes are odd, you repeat the last node again to make the number of data points to be even.
Merkle Proof
A Merkle proof or Merkle Path is a structure that holds the minimum needed nodes of the merkle tree to be able to prove that a data belongs to the Merkle tree.
Lets take the above example with data points: [L1, L2, L3, L4]. Now lets say we need to verify that L3 exits in the Merkle tree, then we would only need [Hash(1–1), Hash(0)] data points and merkle root obviously and we can verify that L3 is inside of the merkle tree or not.
How?
You will hash the L3 data, which gives you Hash(1–0).
You will do Hash (Hash(1–0), Hash(1–1)), which gives you Hash(1).
You will do Hash(Hash(1), Hash(0)), which gives you merkle root.
Verify the merkle root you got by doing above calculation is same as the merkle root you already have, if its same it means the transaction is inside of the tree else transaction is not.
We only need Log(n), where n is the total number of nodes in merkle tree, data points to verify if a particular node data is inside of the merkle tree or not.
For example, if a blockchain block has about 1000 transactions, we would only need to retrieve the Merkle proof with about 10 transaction hashes, instead of the whole block with the 1000 transactions.
The catch here is, you need to maintain the order of data in which you are perofrming the hash. In above example, in step 2, you can not do Hash(Hash(1–1), Hash(1–0)), it will change the resulting hash.
Stay Tuned and I will write a code implementation of it in coming days. I will update this article.
Let me know if you have any questions.
Comments