Everything you need to know before learning Solana (part two)

Hey everyone ~ welcome back! You might be here after reading Part 1, or maybe this is your first time reading my blog. Either way, it’s totally fine. Today, we’re going to explore something super important: Hashing, Merkle Trees, and Cryptography, the real tech that keeps blockchains safe.

Honestly, a reality check: if you don’t understand hashing and Merkle trees, you’re just copy-pasting stuff in Web3. This is Part 2 of the series, where we truly understand what makes blockchain secure, tamper-resistant, and decentralized. You’ll learn why no one can quietly mess with your data, how blockchains verify massive amounts of transactions, and how Solana’s own magic (Proof of History) is built on these concepts.

Let’s dive into the brain and backbone of blockchain.

Hashes: The Digital Fingerprint

Before getting into hash functions, imagine this: you change just one comma in a 1000-word contract and suddenly the entire document becomes invalid. Safe and scary, right?

What is a Hash Function?

A hash function is a special kind of mathematical function that takes any data (a word, a file, a transaction list, anything) and converts it into a fixed-length string called a hash or digest.

Even if the input is a sentence or an entire book, the output is always the same length. No matter how big or small the data is, this function processes it and spits out a very specific, fixed-length string of characters. This string is what we call a hash.

Think of the hash as a unique digital fingerprint for that specific piece of data. Just like every person has a unique fingerprint, every piece of digital data gets a hash that is, for all practical purposes, unique.

One of the most widely used hashing algorithms in blockchain is SHA-256 (Secure Hash Algorithm 256-bit), which Bitcoin and many other blockchains rely on.

If you hash “Hello” using SHA-256, you get a long string like:
185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969

If you hash hello (small h), the hash becomes completely different:
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

This is the nature of hash functions: tiny change → totally different output. Hashing is also irreversible: you can’t work out the original input from the hash. So in blockchain, every transaction is hashed.
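If you want to try this yourself, here is a minimal sketch using Python’s built-in hashlib module. The helper name sha256_hex is my own, made up for this example; the two digests it prints should match the ones shown above.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same word, different capitalisation: two unrelated-looking digests.
print(sha256_hex("Hello"))
print(sha256_hex("hello"))

# The digest length does not depend on the input size.
print(len(sha256_hex("a")), len(sha256_hex("a" * 100_000)))
```

A one-letter input and a 100,000-letter input both give a 64-character hex string, because SHA-256 always outputs 256 bits.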

What Makes a Good Hash Function in Blockchain?

To keep blockchain secure, a hash function needs to have four superpowers:

Consistent Output (Deterministic):
Same input = same output every time. No randomness. This helps in verifying data later. If the hash changed between runs, we couldn’t trust it anymore.
Collision Resistant:
It should be extremely hard to find two different inputs that give the same hash. If someone could do that, they could fake data and fool the system. That’s a big security risk.
One-Way Function (Irreversible):
You can go from input → hash easily, but not the other way around. It’s computationally infeasible to reverse the hash and find the original data. This protects things like passwords and sensitive info.
Avalanche Effect:
Change one letter, and the entire hash output changes.
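Two of these properties are easy to check on your own machine. The sketch below (helper names are my own, not from any library) confirms that the same input always gives the same digest, and counts how many of the 256 output bits flip when "Hello" becomes "hello". For a good hash function that count should land somewhere around half of the bits.

```python
import hashlib


def sha256_bytes(text: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Deterministic: hashing the same input twice gives the same digest.
print(sha256_bytes("Hello") == sha256_bytes("Hello"))

# Avalanche effect: a one-letter change flips a large share of the bits.
changed = differing_bits(sha256_bytes("Hello"), sha256_bytes("hello"))
print(f"{changed} of 256 bits differ")
```

Collision resistance and one-wayness can’t be shown with a short script like this; they are claims about how hard an attack is, not something you can observe from one run.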

Done with the proper definitions. Now let’s see what all of this actually means for blockchain.

Why Hashing Matters in Blockchain

So, how does this "digital fingerprinting" secure a block in a blockchain?

Every single block in a blockchain contains a collection of information: a list of transactions, a timestamp, a reference to the previous block, and more. Before a block is officially added to the chain, all of this data within that block is fed into a hash function (like SHA-256). The unique hash that comes out becomes that block's own unique identifier – its immutable digital seal.

In short, the hash acts as the block’s ID, and because it also covers the previous block’s hash, every block is tied to the one before it. Now here’s the powerful part:

If even one small change is made to the block, like changing "User A sent $50" to "$5000", the entire hash changes completely. Not slightly, completely. It’s like replacing a fingerprint with a random one.

This drastic change makes it impossible to tamper with data secretly. If a hacker changes anything, the hash won't match, and the block will be rejected. These hashes are then used to build bigger structures like Merkle Trees.
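Here is a toy sketch of that idea. It is not how any real chain serialises blocks: the field names, the JSON encoding, and the helpers block_hash and chain_is_valid are all assumptions made up for this illustration. The only point is to show that editing an old block breaks the link stored in the next one.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents.

    Assumption: the block is serialised as sorted-key JSON for this demo;
    real blockchains use their own binary formats.
    """
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def chain_is_valid(chain: list) -> bool:
    """Check that every block stores the current hash of the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


genesis = {
    "index": 0,
    "timestamp": 1700000000,
    "transactions": ["User A sent $50 to User B"],
    "previous_hash": "0" * 64,
}
block_1 = {
    "index": 1,
    "timestamp": 1700000600,
    "transactions": ["User B sent $20 to User C"],
    "previous_hash": block_hash(genesis),
}
chain = [genesis, block_1]

original_id = block_hash(genesis)
print(chain_is_valid(chain))  # True: the link matches

# A "hacker" edits the amount in the first block.
genesis["transactions"][0] = "User A sent $5000 to User B"
print(block_hash(genesis) == original_id)  # False: the fingerprint changed
print(chain_is_valid(chain))  # False: block_1 no longer points at this block
```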

Understood? Hopefully yes. If not, read it again (and again). Now let’s move on to the next big idea:

Merkle Trees

So, we just learned how hashing gives us a unique digital fingerprint for any piece of data, and how that's vital for securing individual blocks. But what if a block contains thousands of transactions? How do we quickly and efficiently verify if a specific transaction is included in that massive list without having to download and re-hash every single transaction in the block?

That’s where Merkle Trees come into the picture.

A Merkle Tree is a smart way to organize and verify a large number of transactions in a block.

Instead of checking every single transaction one by one, Merkle Trees use hashing in layers to combine them into one final hash called the Merkle Root.

This root acts like a summary of all the transactions inside the block.
With just a few hashes (not the whole block), you can prove if a transaction exists.

At its core, a Merkle Tree is a binary tree structure. It's built from the bottom up, by repeatedly hashing pairs of hashes until you're left with just one single hash at the very top.

To understand Merkle Trees even better, imagine this. You’re a detective. Someone shows you a final code and says, “This proves I have these 10,000 transactions.”

You say: “Okay, prove one of them to me without showing me all 10,000.”

With just a few tiny pieces, Merkle Trees let you verify that a single transaction exists inside a massive block of data and they do this very fast and securely.

They are like magical trees that keep data safe and easy to check.

Understanding Merkle Root

Imagine you have 4 transactions:

T1 = Liya sends 1 SOL to Shika
T2 = Shika sends 2 SOL to Navya
T3 = Navya sends 3 SOL to Maha
T4 = Maha sends 4 SOL to Satya

Now we do this:

Hash each transaction:

H1 = hash(T1)
H2 = hash(T2)
H3 = hash(T3)
H4 = hash(T4)

Combine in pairs (here + means joining the two hashes end to end):

H12 = hash(H1 + H2)
H34 = hash(H3 + H4)

Combine those into one final root:

Merkle Root = hash(H12 + H34)

You now have a single final hash that represents all 4 transactions.
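The same steps can be sketched in Python. Two things here are my own assumptions, not something fixed by the steps above: I use a single round of SHA-256 as the hash, and when a level has an odd number of nodes I pair the last node with itself. The function name merkle_root is also just for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 of raw bytes (stand-in for hash(...) in the steps above)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Hash pairs level by level until a single hash remains.

    Assumption: an odd node out is paired with a copy of itself.
    """
    if not leaves:
        raise ValueError("need at least one leaf hash")
    level = leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


transactions = [
    "Liya sends 1 SOL to Shika",
    "Shika sends 2 SOL to Navya",
    "Navya sends 3 SOL to Maha",
    "Maha sends 4 SOL to Satya",
]

# Step by step, exactly as written above.
h1, h2, h3, h4 = (h(tx.encode("utf-8")) for tx in transactions)
h12 = h(h1 + h2)
h34 = h(h3 + h4)
root = h(h12 + h34)

# The general function gives the same answer for these four leaves.
print(root == merkle_root([h1, h2, h3, h4]))  # True
print(root.hex())
```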

That is your Merkle Root. Why is this helpful?

Let’s say your mobile wallet wants to verify that T1 really existed in a block.

Instead of downloading the entire block, you only need:

H2
H34
Merkle Root

With these 3 things (plus T1 itself), you can prove that T1 existed without seeing T2, T3, or T4.
This keeps things fast, lightweight, and secure.
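Here is a small sketch of what the wallet’s check could look like. The first half plays the role of a full node that has the whole block; the function verify_t1 (a name I made up) only ever sees T1, H2, H34, and the Merkle Root. Single SHA-256 is an assumption for the demo.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# --- Full node side: has all four transactions and builds the tree ---
txs = [
    "Liya sends 1 SOL to Shika",
    "Shika sends 2 SOL to Navya",
    "Navya sends 3 SOL to Maha",
    "Maha sends 4 SOL to Satya",
]
h1, h2, h3, h4 = (h(tx.encode("utf-8")) for tx in txs)
h12, h34 = h(h1 + h2), h(h3 + h4)
root = h(h12 + h34)


# --- Wallet side: only T1, H2, H34 and the root are available ---
def verify_t1(t1: str, h2: bytes, h34: bytes, root: bytes) -> bool:
    """Rebuild the path from T1 up to the root and compare."""
    rebuilt_h1 = h(t1.encode("utf-8"))
    rebuilt_h12 = h(rebuilt_h1 + h2)
    return h(rebuilt_h12 + h34) == root


print(verify_t1(txs[0], h2, h34, root))  # True
print(verify_t1("Liya sends 2 SOL to Shika", h2, h34, root))  # False
```

Notice that the wallet never touched T2, T3, or T4, only two 32-byte hashes.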

How does this help in real blockchain?

In real-world blockchains like Bitcoin, Merkle Trees are used to:

Prove inclusion of a transaction inside a block.
Protect data: if someone tries to tamper with a transaction, the root changes.
Allow light clients (like wallet apps) to operate without full blockchain data.

Even Git (used for version control) and IPFS (used in Web3 storage) rely on Merkle-style hash trees.

What happens if data is changed?

If T1 changes (say Liya sends 2 SOL instead of 1), then:

H1 changes → H12 changes → Merkle Root changes. This makes tampering instantly detectable.

That concludes the simple overview of Merkle Trees, but let’s learn how things work under the hood.
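To see exactly which hashes move, here is a quick sketch that builds the four-transaction tree twice, once with the original T1 and once with the edited one, and compares node by node. The build helper is made up for this example and assumes single SHA-256.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build(txs: list) -> tuple:
    """Return (H1, H12, H34, root) for exactly four transactions."""
    h1, h2, h3, h4 = (h(tx.encode("utf-8")) for tx in txs)
    h12, h34 = h(h1 + h2), h(h3 + h4)
    return h1, h12, h34, h(h12 + h34)


original = [
    "Liya sends 1 SOL to Shika",
    "Shika sends 2 SOL to Navya",
    "Navya sends 3 SOL to Maha",
    "Maha sends 4 SOL to Satya",
]
tampered = ["Liya sends 2 SOL to Shika"] + original[1:]

labels = ("H1", "H12", "H34", "Merkle Root")
for label, before, after in zip(labels, build(original), build(tampered)):
    print(label, "changed" if before != after else "unchanged")
```

Only the hashes on the path from T1 to the root change. H34 stays the same because T3 and T4 were not touched.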

Merkle tree and its components

→ Leaf Nodes (Individual Data Fingerprints):
Hashes of individual data (usually transactions).

→ Parent Nodes (Combined Fingerprints):
Hashes of combined child nodes.

→ Root Hash (The Master Fingerprint):
Final hash summarizing the entire data set.

So why are Merkle Trees so powerful? They make blockchain fast, secure, and efficient:

Space-Efficient:
You don’t need the whole block to prove a transaction exists, just the Merkle Root and a few hashes. This saves storage and is perfect for light wallets or devices with limited space.

Tamper-Proof:
If even a single letter in any transaction changes, the hash changes, and this change goes all the way up to the root. If the Merkle Root changes, we know the data was tampered with.

Fast Verification:
You can quickly prove a transaction is in the block without checking or re-hashing everything. The small set of hashes used for this is called a Merkle Proof.

So what are Merkle Proofs then?

(I have already explained this above in an easy way, but here’s another explanation.)
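How few is "a few hashes"? In a balanced binary tree you need one sibling hash per level, so the count grows with log2 of the number of transactions. Here is a rough back-of-the-envelope sketch, assuming 32-byte SHA-256 hashes (proof_size is a made-up helper name):

```python
HASH_SIZE_BYTES = 32  # a SHA-256 digest is 256 bits = 32 bytes


def proof_size(num_transactions: int) -> tuple:
    """Return (sibling hashes needed, total bytes) for a balanced binary tree."""
    if num_transactions < 1:
        raise ValueError("need at least one transaction")
    # Tree depth = ceil(log2(n)), computed with integers only.
    depth = (num_transactions - 1).bit_length()
    return depth, depth * HASH_SIZE_BYTES


for n in (4, 1_000, 10_000, 1_000_000):
    hashes, size = proof_size(n)
    print(f"{n:>9} transactions -> {hashes:>2} hashes, {size} bytes")
```

So for the detective’s 10,000 transactions, 14 sibling hashes (448 bytes) are enough, and even a million transactions only need 20.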

A Merkle Proof is a cryptographic way to demonstrate that a particular piece of data (like your transaction) was definitely included in the dataset that generated a specific Merkle Root, without needing to see the entire dataset itself.

Here's a simple way to think about how it works:

To prove your transaction (let's call it TxA) is in a block, you provide:

→ Your transaction's original data (so it can be hashed to get its leaf hash).

→ The final Merkle Root of the block (which is stored in the block's header).

→ A small, specific set of "sibling hashes" from the Merkle Tree. These are the hashes of the other branches that combine with your transaction's path to eventually form the Merkle Root.

With these few pieces of information, anyone can independently re-calculate the path from your transaction's hash up to the Merkle Root. If the Merkle Root they calculate matches the official Merkle Root of the block, then your transaction is cryptographically confirmed to be part of that block – without having to download or process the other thousands of transactions in that block!

Example with a Diagram: (ChatGPT help)

Let's visualize this. Imagine a block has four transactions: TxA, TxB, TxC, TxD.

                 Merkle Root (HashABCD)
                      /     \
                     /       \
              HashAB          HashCD
             /    \          /    \
            /      \        /      \
        HashA    HashB    HashC    HashD
        (TxA)    (TxB)    (TxC)    (TxD)

Here’s the step-by-step process:

→ Individual Hashes (Leaf Nodes): Each transaction (TxA, TxB, TxC, TxD) is hashed individually using SHA-256 to produce its unique fingerprint: HashA, HashB, HashC, HashD. These are our "leaf nodes" at the bottom.

→ First Level Up (Parent Hashes):

→ HashA and HashB are combined and then hashed together to create HashAB.

→ HashC and HashD are combined and then hashed together to create HashCD.

→ To the Top (The Merkle Root):

→ Finally, HashAB and HashCD are combined and hashed to produce the ultimate Merkle Root (let's call it HashABCD). This HashABCD is then stored in the block header.

If Alice wants to prove that her transaction, TxA, is genuinely included in this block, she provides:

→ The raw data of TxA (so its HashA can be re-calculated).

→ HashB (the sibling of HashA).

→ HashCD (the sibling of HashAB).

→ The overall Merkle Root (HashABCD) from the block header.

Anyone can then verify this by:

→ Hashing TxA to get HashA.

→ Combining HashA and HashB, then hashing them to get HashAB.

→ Combining HashAB and HashCD, then hashing them to get a final root hash.

If this calculated root hash matches the Merkle Root stored in the block header (HashABCD), then it is cryptographically proven that Alice's transaction (TxA) was indeed part of the original data set that formed that block. All without seeing TxB, TxC, or TxD! How cool is that for efficiency and security?
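For the curious, here is a more general sketch that builds a proof for any transaction in the block, not just the first one. Everything about the format is my own assumption for illustration: single SHA-256, duplicating the last node on odd levels, and a proof stored as a list of (sibling hash, sibling_is_left) pairs so the verifier knows which side to put each sibling on. Real blockchains each define their own exact rules.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def padded(level: list) -> list:
    """Assumption: an odd node out is paired with a copy of itself."""
    return level + [level[-1]] if len(level) % 2 == 1 else level


def build_levels(leaves: list) -> list:
    """Return every level of the tree, from leaf hashes up to [root]."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        level = padded(levels[-1])
        levels.append([h(level[i] + level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def make_proof(levels: list, index: int) -> list:
    """Collect the sibling hash at each level on the way up to the root."""
    proof = []
    for level in levels[:-1]:
        level = padded(level)
        sibling = index ^ 1  # neighbour in the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(tx: bytes, proof: list, root: bytes) -> bool:
    """Re-hash from the transaction up to the root using only the siblings."""
    current = h(tx)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root


txs = [b"TxA", b"TxB", b"TxC", b"TxD"]
levels = build_levels([h(tx) for tx in txs])
root = levels[-1][0]

proof_a = make_proof(levels, 0)  # contains HashB, then HashCD
print(verify_proof(b"TxA", proof_a, root))  # True
print(verify_proof(b"TxX", proof_a, root))  # False
```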

OK, that might feel like a lot, but if you look closely it’s not that hard to understand. As time goes on, you will get there.
So now we’ve learnt about hashing and Merkle Trees, but where are they even used? Let’s find out.

Where Are Merkle Trees Used?

Merkle Trees aren’t just theory. They’re used in many real-world systems, especially in blockchain:

→ Bitcoin: A Bitcoin block can contain thousands of transactions. To manage and verify them efficiently, Bitcoin creates a Merkle Root from all the transactions and stores it in the block’s header. This helps quickly check whether a transaction is included, without going through the whole block.

→ Ethereum: Ethereum also uses Merkle Trees, but in a slightly different form called a Merkle Patricia Trie. This helps manage not just transactions, but also the entire state of the network (like account balances, smart contract data, etc.).

→ IPFS (InterPlanetary File System): IPFS is a decentralized system for storing files across many computers. It uses a Merkle-tree-style structure (a Merkle DAG) to make sure files aren’t changed or corrupted. Each file’s data is hashed and tracked through that structure.

→ Solana: Solana uses its own version called Concurrent Merkle Trees for features like Account Compression and Compressed NFTs. This means Solana can store only the Merkle Root on-chain (which is very small), while the actual NFT data stays off-chain. This saves a lot of space and makes storing NFTs much cheaper, since you don’t need expensive on-chain accounts for every detail. (We will deep dive into this when we start the Solana architecture study, so no worries if you didn’t understand it yet.)

Conclusion - Now you finally understand what hashes and Merkle Trees really are: not just some random tech terms, but the core reason why no one can quietly tamper with your data on a blockchain. A hash is a unique digital fingerprint of your data. Change even one small letter, and the entire fingerprint changes. That’s what gives blockchain its power: you can’t hide even the tiniest change. It’s deterministic, it’s secure, and it’s one-way. Once hashed, you can’t reverse it. And Merkle Trees? They’re the reason blockchains can handle thousands of transactions and still prove things fast and securely. Instead of checking all the data, you only need the root and a few sibling hashes to prove your data exists. It’s super efficient, lightweight, and tamper-evident. If even one transaction changes, the root no longer matches, which helps blockchains detect fraud instantly.

With that said, yeah, I know this blog became long again (no sorries), but this was needed. Because if you don’t get hashing and Merkle Trees, you’re just roaming around in Web3 on vibes only.
Where the hell is Cryptography? Yeah yeah, I know I promised that too. I have already written a full, detailed piece on it, but I’m stopping here, so let’s do that in Part 3.

Some resources:
Blockchain Basics | Coursera
bitcoinbook/bitcoinbook: Mastering Bitcoin 3rd Edition - Programming the Open Blockchain

If it helped you understand things in a way that saves you from banging your head on the wall like I did, mission complete. Welcome to the world of Web3 and blockchain with me. The journey starts here. Keep learning, keep coding, keep reading.

It doesn’t stop here ~ Maha (Haha, just kidding. Any doubts? I’m just a DM away.)

Thank you ~ ciao, see you in the next part with cryptography.


Web3 Before Solana - 2