Blockchain technology has evolved from a niche cryptographic experiment into a foundational innovation reshaping finance, digital ownership, and decentralized systems. This guide distills core principles from Peking University (北京大学) Professor Xiao Zhen's open course on blockchain, updated with modern insights into Ethereum, smart contracts, and NFTs. Whether you're a developer, investor, or tech enthusiast, this article offers a structured understanding of how blockchains work under the hood.
Bitcoin: The Foundation of Decentralized Trust
Bitcoin (BTC) introduced the world to a trustless, peer-to-peer digital currency system secured by cryptography and consensus. At its core, BTC relies on two fundamental pillars: cryptography and data structure design.
Cryptographic Foundations
Three essential cryptographic properties enable Bitcoin’s security:
- Collision resistance: It's computationally infeasible to find two different inputs that produce the same hash output.
- Hiding: Given only a hash output, it is computationally infeasible to recover the original input, provided the input is drawn from a large, unpredictable space.
- Puzzle friendliness: Ensures no shortcuts exist in mining—only brute-force computation works, supporting Proof-of-Work (PoW).
Bitcoin obtains these properties from SHA-256, which produces a fixed 256-bit output. Collisions must exist in principle, but the output space is so large that finding one is computationally infeasible, even at global scale.
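As a small illustration of the fixed output size, the sketch below hashes two inputs that differ by one character and counts how many output bits change. The helper names `sha256_hex` and `differing_bits` are made up for this example; only Python's standard `hashlib` is used.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")

h1 = sha256_hex(b"blockchain")
h2 = sha256_hex(b"blockchaim")  # last character changed

print(len(h1) * 4)              # digest size in bits (4 bits per hex character)
print(differing_bits(h1, h2))   # usually close to half of the 256 bits
```

A tiny change in the input yields an unrelated-looking digest, which is why the only practical way to hit a particular output range is to keep trying inputs.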
Public-key cryptography secures transactions: users sign with their private key, while others verify using the corresponding public key.
Wallets, Private Keys, and Addresses
A Bitcoin wallet starts with a randomly generated private key, from which elliptic curve multiplication derives the public key. However, Bitcoin doesn't use the public key directly as an address.
Instead, it applies:
Address = RIPEMD160(SHA256(Public Key))
This double hashing shortens the key to 160 bits and adds a layer of protection. The result, prefixed with a version byte, is then encoded in Base58Check (or Bech32 for SegWit), which keeps it readable and adds a checksum. Importantly, the process is one-way: you can derive an address from a public key, but you cannot recover the public key from the address.
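The encoding step can be sketched as follows. This is a minimal illustration of Base58Check, assuming the 20-byte public key hash is already available (RIPEMD-160 support in `hashlib` depends on the local OpenSSL build, so it is not computed here). The function names and the sample hash value are chosen for this example only.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def base58check_encode(version: int, payload: bytes) -> str:
    """Prefix a version byte, append a 4-byte checksum, and encode in Base58."""
    raw = bytes([version]) + payload
    raw += double_sha256(raw)[:4]          # checksum: first 4 bytes of double SHA-256
    num = int.from_bytes(raw, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded
    # Each leading zero byte is written as the character '1'.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

def base58check_decode(address: str) -> tuple[int, bytes]:
    """Return (version, payload); raise ValueError if the checksum does not match."""
    num = 0
    for ch in address:
        num = num * 58 + BASE58_ALPHABET.index(ch)
    leading_ones = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading_ones + num.to_bytes((num.bit_length() + 7) // 8, "big")
    data, checksum = raw[:-4], raw[-4:]
    if double_sha256(data)[:4] != checksum:
        raise ValueError("bad checksum")
    return data[0], data[1:]

# Sample 20-byte public key hash, used purely as example input.
pubkey_hash = bytes.fromhex("010966776006953d5567439e5e39f86a0d273bee")
address = base58check_encode(0x00, pubkey_hash)  # version 0x00: legacy mainnet address
print(address)
```

Because of the checksum, a mistyped address is rejected by `base58check_decode` instead of silently sending funds to the wrong destination.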
Core Data Structures: Blockchain and Merkle Trees
Blockchain as a Linked List of Hash Pointers
Each block contains a header with metadata and a list of transactions. The block header includes:
- Version number
- Previous block hash (linking blocks)
- Merkle root (hash of all transactions)
- Timestamp
- Difficulty target
- Nonce
The "hash pointer" design ensures immutability: altering any transaction changes the Merkle root, which invalidates the block hash—and every subsequent block.
The genesis block (height 0) is hardcoded into the software. Every new block increases the chain height by one.
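The tamper-evidence of hash pointers can be shown with a toy chain. This is a simplified sketch: it hashes a JSON serialization of the whole block, whereas Bitcoin hashes an 80-byte binary header, and the names `block_hash`, `build_chain`, and `first_broken_link` are illustrative.

```python
import hashlib
import json
from typing import Optional

def block_hash(block: dict) -> str:
    """Hash a block's full contents (toy stand-in for hashing the block header)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def build_chain(tx_batches: list) -> list:
    """Link blocks so that each one stores the hash of its predecessor."""
    chain = []
    prev = "0" * 64  # the genesis block has no real predecessor
    for height, txs in enumerate(tx_batches):
        block = {"height": height, "prev_hash": prev, "transactions": list(txs)}
        chain.append(block)
        prev = block_hash(block)
    return chain

def first_broken_link(chain: list) -> Optional[int]:
    """Return the height of the first block whose prev_hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None

chain = build_chain([["alice->bob: 5"], ["bob->carol: 2"], ["carol->dave: 1"]])
print(first_broken_link(chain))                  # None: the chain is consistent
chain[0]["transactions"][0] = "alice->mallory: 5"
print(first_broken_link(chain))                  # 1: block 1 points at a stale hash
```

To hide the change, an attacker would have to recompute the hash pointer in every later block, which in Bitcoin also means redoing the Proof-of-Work for each of them.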
Merkle Trees for Efficient Verification
Transactions are stored in a Merkle tree—a binary tree where leaf nodes contain transaction hashes, and parent nodes store hashes of their children. Only the root hash (Merkle root) is stored in the block header.
This enables Merkle proofs, allowing lightweight clients (SPV nodes) to verify whether a transaction exists in a block without downloading all data—ideal for mobile wallets.
Verification complexity is just O(log N), making it highly scalable.
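A Merkle proof can be sketched in a few lines. The code below is an illustrative model rather than Bitcoin's exact serialization: it uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin does, but the function names and the proof format (a list of sibling hashes with a left/right flag) are assumptions of this example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_proof(leaves: list, index: int):
    """Return (root, proof) for leaves[index].

    proof is a list of (sibling_hash, sibling_is_left) pairs, one per tree level.
    """
    if not leaves or not 0 <= index < len(leaves):
        raise ValueError("need at least one leaf and a valid index")
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])          # duplicate the last hash on odd levels
        sibling = index ^ 1                  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_proof(leaf: bytes, proof: list, root: bytes) -> bool:
    """Recompute the path from a leaf to the root using only the sibling hashes."""
    current = h(leaf)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root

txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, proof = merkle_root_and_proof(txs, 3)
print(len(proof))                        # 3 sibling hashes for 5 transactions
print(verify_proof(b"tx3", proof, root)) # True
```

The verifier needs only the leaf, the block header's Merkle root, and one hash per tree level, which is where the O(log N) cost comes from.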
Consensus and Mining in Bitcoin
Proof-of-Work and Consensus Mechanism
Bitcoin uses PoW to achieve decentralized consensus. Miners compete to solve a cryptographic puzzle: find a nonce such that:
SHA256(SHA256(Block Header)) ≤ Target
The lower the target, the harder the puzzle. Difficulty is often visualized as the number of leading zeros the hash must have.
Once solved, the miner broadcasts the block. Other nodes instantly verify it without redoing the work—easy verification is key to network efficiency.
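The asymmetry between mining and verification can be seen in a toy miner. This sketch assumes a made-up header prefix and a deliberately easy target so it finishes quickly; it applies SHA-256 twice, as Bitcoin does for block hashes, but reads the digest as a big-endian integer for simplicity, which differs from Bitcoin's byte order.

```python
import hashlib
from typing import Optional, Tuple

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def pow_hash(header_prefix: bytes, nonce: int) -> int:
    """Hash the header with the nonce appended and return it as an integer."""
    digest = double_sha256(header_prefix + nonce.to_bytes(4, "little"))
    return int.from_bytes(digest, "big")

def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32) -> Optional[Tuple[int, int]]:
    """Brute-force search: try nonces until the hash is at or below the target."""
    for nonce in range(max_nonce):
        value = pow_hash(header_prefix, nonce)
        if value <= target:
            return nonce, value
    return None  # nonce space exhausted without a solution

def verify(header_prefix: bytes, nonce: int, target: int) -> bool:
    """Verification is a single hash evaluation, however long mining took."""
    return pow_hash(header_prefix, nonce) <= target

EASY_TARGET = 2**244          # about 1 in 4096 hashes qualifies (toy setting)
header = b"toy-block-header"  # hypothetical stand-in for the other header fields
result = mine(header, EASY_TARGET)
if result is not None:
    nonce, value = result
    print(nonce, verify(header, nonce, EASY_TARGET))
```

Halving the target doubles the expected number of attempts for the miner, while the verifier's cost stays at one hash.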
Difficulty Adjustment
To maintain a ~10-minute block interval, Bitcoin adjusts difficulty every 2016 blocks:
New Difficulty = Old Difficulty × (20160 minutes / Actual Time)
If blocks were mined faster than expected, difficulty increases; otherwise, it decreases. Each adjustment is capped at a factor of 4 in either direction to prevent instability.
All nodes compute this independently—non-compliant nodes risk creating orphaned blocks.
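The adjustment rule, including the cap, works out as below. This is a sketch of the formula as stated above, using floating-point difficulty for readability; real nodes operate on the compact integer target instead, and `retarget` is a name chosen for this example.

```python
EXPECTED_MINUTES = 2016 * 10  # 20160 minutes, i.e. two weeks

def retarget(old_difficulty: float, actual_minutes: float) -> float:
    """Scale difficulty by expected/actual time, capped at 4x up or down."""
    if actual_minutes <= 0:
        raise ValueError("actual_minutes must be positive")
    ratio = EXPECTED_MINUTES / actual_minutes
    ratio = max(0.25, min(4.0, ratio))   # cap the adjustment factor
    return old_difficulty * ratio

print(retarget(100.0, 10080))   # blocks came twice as fast: 200.0
print(retarget(100.0, 40320))   # blocks came half as fast: 50.0
print(retarget(100.0, 1000))    # extreme speed-up, capped: 400.0
```

For example, if the last 2016 blocks took one week instead of two, the ratio is 20160 / 10080 = 2 and difficulty doubles.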
Mining Hardware and Pools
From CPUs to GPUs and now ASICs (Application-Specific Integrated Circuits), mining has become increasingly specialized. ASICs dominate due to their efficiency.
Mining pools allow smaller miners to combine hash power and share rewards proportionally. They submit "almost valid" shares—blocks meeting a lower difficulty threshold—as proof of work.
However, pools raise centralization concerns:
- Lower barrier to 51% attacks
- Potential censorship of specific addresses
Forks: When Chains Split
Forks occur when multiple valid blocks exist at the same height.
- State forks: Temporary splits resolved when one chain becomes longer.
- Protocol forks: Permanent divergences due to rule changes.
Hard Forks vs Soft Forks
| Type | Backward Compatible? | Outcome |
|---|---|---|
| Hard Fork | No | Creates two chains (e.g., BTC/BCH) |
| Soft Fork | Yes | Tightens rules; old nodes accept new blocks |
Hard forks can lead to replay attacks, where identical transactions execute on both chains. Modern chains mitigate this using unique chain IDs embedded in signatures.
Ethereum: Beyond Currency to Programmable Money
While Bitcoin focuses on value transfer, Ethereum (ETH) introduces smart contracts—self-executing code on the blockchain.
Account-Based Model
Unlike Bitcoin’s UTXO model, Ethereum uses an account-based ledger:
- Externally Owned Accounts (EOAs): Controlled by private keys.
- Contract Accounts: Hold code and state, triggered by EOAs.
Each account has:
- A balance
- A nonce (transaction counter preventing replay attacks)
This model simplifies balance tracking and enables complex interactions.
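A toy version of the account ledger shows how the nonce blocks replays. The `Account` class and `apply_transfer` function are illustrative names, and signatures, gas, and contract code are left out of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Account:
    balance: int = 0
    nonce: int = 0   # number of transactions this account has sent so far

def apply_transfer(state: dict, sender: str, recipient: str,
                   amount: int, tx_nonce: int) -> None:
    """Apply a transfer if the nonce and balance checks pass; otherwise raise."""
    src = state.get(sender)
    if src is None:
        raise ValueError("unknown sender")
    if tx_nonce != src.nonce:
        raise ValueError("bad nonce: transaction is stale or out of order")
    if amount < 0 or src.balance < amount:
        raise ValueError("insufficient balance")
    dst = state.setdefault(recipient, Account())
    src.balance -= amount
    dst.balance += amount
    src.nonce += 1   # the same signed transaction can no longer be applied

state = {"alice": Account(balance=100)}
apply_transfer(state, "alice", "bob", 30, tx_nonce=0)
print(state["alice"], state["bob"])

try:
    apply_transfer(state, "alice", "bob", 30, tx_nonce=0)  # replay of the same transaction
except ValueError as err:
    print("rejected:", err)
```

Unlike UTXOs, nothing here is "spent"; the ledger simply mutates balances, and the per-account counter is what makes each transaction single-use.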
Data Structures: The Merkle Patricia Trie (MPT)
Ethereum uses MPTs for three main trees per block:
- State Tree: Maps addresses to account states.
- Transaction Tree: Records all transactions.
- Receipt Tree: Logs execution outcomes.
Crucially, only changed nodes are updated—previous versions remain accessible for auditing and rollback. This immutability supports trustless verification across time.
GHOST Protocol and Uncle Blocks
Ethereum's PoW chain produced blocks far faster than Bitcoin (roughly every 13 to 15 seconds), which increases the rate of stale blocks. To incentivize their inclusion, Ethereum rewards uncle blocks: valid blocks not on the main chain but within seven generations.
Benefits:
- Reduces centralization pressure
- Increases network security
- Rewards miners who would otherwise lose revenue
Uncles earn partial block rewards based on proximity to the main chain.
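The proximity rule can be made concrete with the commonly documented PoW-era formula, in which an uncle earns (8 - d) / 8 of the block reward, where d is the height difference between the including block and the uncle, and the including block earns an extra 1/32 per uncle (at most two). The sketch assumes d ranges from 1 to 6, which corresponds to counting the shared ancestor as one of the seven generations; the function names and the 2 ETH reward in the usage lines are example values.

```python
def uncle_reward(block_reward: float, including_height: int, uncle_height: int) -> float:
    """Reward paid to an uncle's miner, shrinking with distance from the main chain."""
    distance = including_height - uncle_height
    if not 1 <= distance <= 6:
        raise ValueError("uncle is too old (or not older than the including block)")
    return block_reward * (8 - distance) / 8

def nephew_bonus(block_reward: float, uncles_included: int) -> float:
    """Extra reward for the miner that references uncles in its block."""
    if not 0 <= uncles_included <= 2:
        raise ValueError("a block may reference at most two uncles")
    return block_reward * uncles_included / 32

print(uncle_reward(2.0, 105, 104))  # distance 1: 1.75
print(uncle_reward(2.0, 105, 99))   # distance 6: 0.5
print(nephew_bonus(2.0, 2))         # 0.125
```

The sliding scale encourages miners to reference stale blocks promptly instead of waiting.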
Mining Algorithm: Ethash
Ethash is designed to be ASIC-resistant and memory-hard, favoring GPUs over specialized hardware. It uses:
- A 16MB cache (for light client verification)
- A 1GB+ DAG (Directed Acyclic Graph) regenerated every 30,000 blocks (~5 days)
Miners perform 64 iterations over dataset elements derived from the nonce and header hash. High memory bandwidth requirements limit ASIC efficiency.
Note: Ethereum has since transitioned to Proof-of-Stake (PoS), but Ethash played a crucial role in its early decentralization.
From PoW to Proof-of-Stake (PoS)
Ethereum’s shift to PoS replaces mining with staking:
- Validators lock ETH as collateral.
- They propose and vote on blocks.
- Finality requires votes from validators holding at least 2/3 of the staked ETH, tallied per epoch (32 slots of 12 seconds, ~6.4 minutes).
- Misbehavior results in slashing—loss of staked funds.
This improves energy efficiency and scalability while maintaining security through economic incentives.
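The epoch length and the two-thirds threshold reduce to simple arithmetic. The sketch below assumes 32 slots of 12 seconds per epoch and weighs votes by stake; the function names are illustrative, and integer arithmetic is used so the 2/3 comparison is exact.

```python
SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12

def epoch_minutes() -> float:
    """Length of one epoch in minutes: 32 * 12 s = 384 s."""
    return SLOTS_PER_EPOCH * SECONDS_PER_SLOT / 60

def has_supermajority(attesting_stake: int, total_stake: int) -> bool:
    """True if the attesting stake is at least two-thirds of the total stake."""
    if total_stake <= 0:
        raise ValueError("total_stake must be positive")
    return 3 * attesting_stake >= 2 * total_stake

print(epoch_minutes())                 # 6.4
print(has_supermajority(66, 100))      # False: just under two-thirds
print(has_supermajority(67, 100))      # True
```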
Smart Contracts and NFTs on Ethereum
Writing and Executing Smart Contracts
Smart contracts are written in Solidity—a statically typed language resembling JavaScript. After compilation to bytecode, they run on the Ethereum Virtual Machine (EVM).
To interact:
- Send ETH or trigger functions via the data field of a transaction
- Specify a gas limit to cap execution cost
Fallback functions handle unspecified calls—but must be carefully designed to avoid vulnerabilities like reentrancy attacks.
Gas Fees and Execution Limits
Gas prevents infinite loops:
- Each operation consumes gas
- Total gas used × gas price = fee
- Block-wide gas limit prevents bloat
The maximum fee (gas limit × gas price) is deducted upfront, before execution; unused gas is refunded after completion.
All full nodes execute every contract call—ensuring deterministic state transitions across the network.
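The fee accounting can be walked through with a small model. This sketch uses the simple "gas used × gas price" rule stated above and ignores later fee-market changes; `settle_gas` and its parameters are hypothetical names, and `gas_needed` stands in for whatever the execution would actually consume.

```python
def settle_gas(balance: int, gas_limit: int, gas_price: int, gas_needed: int):
    """Return (new_balance, succeeded) after charging for one transaction.

    The sender prepays gas_limit * gas_price. If execution needs more gas than
    the limit, it halts out of gas: the whole prepayment is kept and the
    transaction's state changes are discarded. Otherwise unused gas is refunded.
    """
    upfront = gas_limit * gas_price
    if balance < upfront:
        raise ValueError("balance cannot cover gas_limit * gas_price")
    balance -= upfront
    if gas_needed > gas_limit:
        return balance, False                      # out of gas, no refund
    refund = (gas_limit - gas_needed) * gas_price  # unused gas goes back to the sender
    return balance + refund, True

print(settle_gas(1_000_000, 50_000, 10, 21_000))   # (790000, True)
print(settle_gas(1_000_000, 50_000, 10, 60_000))   # (500000, False)
```

In the second call, a loop that would need 60,000 gas is cut off at the 50,000 limit, so a runaway contract costs its caller money but cannot stall the network.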
Common Smart Contract Pitfalls
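- Reentrancy Attack: Sending funds before updating the balance lets a malicious contract call back into the withdrawal and drain funds recursively.
- Fallback Failures: A contract without a payable fallback function rejects plain ETH transfers, which can break callers that do not expect the failure.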
- Immutable Code: Bugs can't be patched post-deployment.
- Public Visibility: All code and storage are transparent—audit thoroughly.
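The reentrancy pitfall from the list above can be simulated outside the EVM. The following is a Python model, not Solidity: the `on_receive` callback stands in for the recipient contract's fallback function, and all class and function names are made up for this illustration.

```python
class VulnerableBank:
    def __init__(self):
        self.balances = {}
        self.vault = 0   # total funds held by the contract

    def deposit(self, who: str, amount: int) -> None:
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who: str, on_receive) -> None:
        amount = self.balances.get(who, 0)
        if amount > 0 and self.vault >= amount:
            self.vault -= amount
            on_receive(amount)         # external call happens BEFORE the update
            self.balances[who] = 0     # too late: the callback already re-entered

class SafeBank(VulnerableBank):
    def withdraw(self, who: str, on_receive) -> None:
        amount = self.balances.get(who, 0)
        if amount > 0 and self.vault >= amount:
            self.balances[who] = 0     # update state first
            self.vault -= amount
            on_receive(amount)         # external call last

def attack(bank, attacker: str = "mallory", stake: int = 10) -> int:
    """Deposit a small stake, then re-enter withdraw from inside the callback."""
    bank.deposit(attacker, stake)
    stolen = 0

    def on_receive(amount: int) -> None:
        nonlocal stolen
        stolen += amount
        bank.withdraw(attacker, on_receive)   # re-enter while still mid-withdrawal

    bank.withdraw(attacker, on_receive)
    return stolen

vulnerable = VulnerableBank()
vulnerable.deposit("victim", 90)
print(attack(vulnerable), vulnerable.vault)   # 100 0: the whole vault is drained

safe = SafeBank()
safe.deposit("victim", 90)
print(attack(safe), safe.vault)               # 10 90: only the attacker's own stake
```

The only difference between the two classes is the order of the balance update and the external call, which is why updating state before transferring funds is the usual defence.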
Best practices:
- Test extensively on testnets
- Use formal verification tools
- Implement circuit breakers
- Support upgradeable patterns (e.g., proxy contracts)
Creating Tokens: ERC-20 and NFTs
ERC-20 for Fungible Tokens
Standardizes fungible tokens with functions like:
function totalSupply() external view returns (uint256);
function balanceOf(address account) external view returns (uint256);
function transfer(address recipient, uint256 amount) external returns (bool);
Under the hood, balances are typically tracked in a mapping(address => uint256) balances.
Used for stablecoins, utility tokens, governance tokens.
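The bookkeeping behind those three functions is small enough to model directly. The sketch below is a Python stand-in for the balances mapping, not Solidity and not a full ERC-20 (no allowances or events); `SimpleToken` is a hypothetical name, and the sender is passed explicitly because Python has no equivalent of msg.sender.

```python
class SimpleToken:
    def __init__(self, initial_holder: str, supply: int):
        self._balances = {initial_holder: supply}   # plays the role of the mapping
        self._total_supply = supply

    def total_supply(self) -> int:
        return self._total_supply

    def balance_of(self, account: str) -> int:
        return self._balances.get(account, 0)       # unknown accounts hold zero

    def transfer(self, sender: str, recipient: str, amount: int) -> bool:
        if amount < 0 or self.balance_of(sender) < amount:
            raise ValueError("insufficient balance")
        self._balances[sender] = self.balance_of(sender) - amount
        self._balances[recipient] = self.balance_of(recipient) + amount
        return True

token = SimpleToken("alice", 1_000)
token.transfer("alice", "bob", 250)
print(token.balance_of("alice"), token.balance_of("bob"), token.total_supply())
```

Transfers only move entries within the mapping, so the sum of all balances always equals the total supply.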
NFTs via ERC-721 and ERC-1155
ERC-721: One-of-a-kind tokens (e.g., digital art).
ERC-1155: Multi-token standard supporting both fungible and non-fungible types in one contract.
Metadata structure example:
{
"name": "Gymbo Collection 7",
"description": "A rare digital collectible",
"image": "https://example.com/gymbo7.png"
}
The uri(uint256 _id) function (from ERC-1155; ERC-721 uses tokenURI instead) returns the metadata location. For true decentralization, store assets on IPFS or Arweave rather than on centralized servers.
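With ERC-1155, the returned URI may contain the placeholder {id}, which clients replace with the token ID as 64 lowercase hexadecimal characters without a 0x prefix. A client-side sketch of that substitution follows; `resolve_uri` is a hypothetical helper and the URL is an example value.

```python
def resolve_uri(template: str, token_id: int) -> str:
    """Replace {id} with the token ID, zero-padded to 64 lowercase hex characters."""
    if not 0 <= token_id < 2**256:
        raise ValueError("token_id must fit in a uint256")
    return template.replace("{id}", format(token_id, "064x"))

print(resolve_uri("https://example.com/metadata/{id}.json", 314592))
```

One URI template can therefore serve metadata for every token ID in the contract, which keeps on-chain storage small.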
Frequently Asked Questions
Q: What happens if I lose my private key?
A: Access to your funds is permanently lost. There's no recovery mechanism in decentralized systems. Always back up your seed phrase securely—preferably offline on paper or hardware.
Q: Can I split my private key for shared custody?
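A: No. Naively cutting a key into pieces is unsafe: each holder already knows part of the key, so the remaining brute-force search shrinks exponentially with every known bit. Instead, use multi-signature wallets requiring multiple approvals for transactions.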
Q: How do I protect against smart contract exploits?
A: Audit code rigorously, use established libraries like OpenZeppelin, test on multiple environments, and consider bug bounty programs before launch.
Q: Why do NFTs rely on off-chain metadata?
A: On-chain storage is prohibitively expensive. However, linking to centralized servers creates risks—if the server goes down, metadata disappears. Decentralized storage solutions like IPFS offer better long-term reliability.
Q: Is blockchain truly immutable?
A: Yes—within practical limits. Altering historical data would require rewriting all subsequent blocks and outpacing the network’s hash power (in PoW) or controlling >66% of staked tokens (in PoS), making tampering economically unfeasible.
Q: How does Ethereum prevent infinite loops in smart contracts?
A: Through gas limits. Every computational step consumes gas; when gas runs out, execution halts immediately, even mid-loop, and the transaction's state changes are reverted while the fee is still charged. This prevents denial-of-service attacks.
Final Thoughts
Blockchain technology continues to mature—from Bitcoin’s robust monetary network to Ethereum’s programmable economy. Understanding these foundations empowers developers to build secure dApps, investors to evaluate projects critically, and users to navigate Web3 safely. As innovation accelerates, staying grounded in first principles remains essential.