System Design Essentials
Fundamental concepts and patterns for designing scalable, reliable systems.
Core Principles
Scalability
Vertical Scaling (Scale Up)
Horizontal Scaling (Scale Out)
Reliability
Availability = Uptime / (Uptime + Downtime)
| Availability | Downtime/Year | Use Case |
|---|---|---|
| 99% (2 nines) | 3.65 days | Internal tools |
| 99.9% (3 nines) | 8.76 hours | Standard services |
| 99.99% (4 nines) | 52.56 minutes | Critical services |
| 99.999% (5 nines) | 5.26 minutes | Mission critical |
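The downtime figures in the table follow directly from the availability formula. A minimal sketch of the arithmetic, assuming a 365-day year (the function name `downtime_per_year` is illustrative, not from the original):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year, no leap days


def downtime_per_year(availability_percent: float) -> float:
    """Return the downtime budget in seconds per year for a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    seconds = downtime_per_year(target)
    print(f"{target}% -> {seconds / 60:.2f} minutes/year")
```

Each additional nine divides the downtime budget by ten, which is why moving from three to four nines is usually far more expensive than moving from two to three.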
Performance
Key Metrics:
System Components
Load Balancer
Distributes incoming traffic across multiple servers.
Algorithms:
Types:
Caching
Store frequently accessed data in fast storage.
Cache Levels:
Client Cache → CDN → Application Cache → Database Cache
Strategies:
Cache-Aside (Lazy Loading)
1. Check cache
2. If miss, query database
3. Store in cache
4. Return data
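The four cache-aside steps can be sketched as follows. This is a minimal illustration that uses an in-memory dict as the cache and a caller-supplied loader function in place of a real database; the names `CacheAside` and `load_from_db` are assumptions made for the example.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Cache-aside read path backed by a plain dict (illustrative only)."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]):
        self._cache: Dict[str, str] = {}
        self._load_from_db = load_from_db  # hypothetical database lookup

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:            # 1. check cache
            return self._cache[key]
        value = self._load_from_db(key)   # 2. on a miss, query the database
        if value is not None:
            self._cache[key] = value      # 3. store in cache
        return value                      # 4. return data


database = {"user:1": "Ada"}
cache = CacheAside(database.get)
print(cache.get("user:1"))  # first call loads from the database
print(cache.get("user:1"))  # second call is served from the cache
```

The sketch omits expiry and invalidation, which a real deployment would need so that cached values do not go stale after the database changes.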
Write-Through
1. Write to cache
2. Write to database synchronously
3. Return success
Write-Behind (Write-Back)
1. Write to cache
2. Queue database write
3. Write to database asynchronously
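The difference between write-through and write-behind is when the database sees the write. A minimal sketch, assuming dicts stand in for both the cache and the database and an explicit `flush()` call stands in for the background worker (all class and method names here are illustrative):

```python
from collections import deque


class WriteThroughCache:
    """Writes go to the cache and the database before returning."""

    def __init__(self, db: dict):
        self.cache = {}
        self.db = db

    def put(self, key, value) -> bool:
        self.cache[key] = value   # 1. write to cache
        self.db[key] = value      # 2. write to database synchronously
        return True               # 3. return success


class WriteBehindCache:
    """Writes go to the cache immediately and to the database later."""

    def __init__(self, db: dict):
        self.cache = {}
        self.db = db
        self.pending = deque()

    def put(self, key, value) -> bool:
        self.cache[key] = value             # 1. write to cache
        self.pending.append((key, value))   # 2. queue database write
        return True

    def flush(self) -> int:
        """3. Drain queued writes; a real system would do this in the background."""
        written = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value
            written += 1
        return written
```

Write-behind returns faster, but any writes still in `pending` are lost if the process crashes before `flush()` runs, which is the main trade-off against write-through.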
Eviction Policies:
Database
SQL (Relational)
NoSQL
Types:
Message Queue
Asynchronous communication between services.
Benefits:
Patterns:
Examples:
Design Patterns
Microservices vs Monolith
Monolith:
Microservices:
API Gateway
Single entry point for all client requests.
Responsibilities:
Service Discovery
Client-Side Discovery:
Client → Service Registry → Service Instance
Server-Side Discovery:
Client → Load Balancer → Service Registry → Service Instance
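In client-side discovery the client itself queries the registry and picks an instance. A minimal in-memory sketch of that flow, with a hypothetical `ServiceRegistry` class and random instance selection chosen purely for illustration (real registries also handle health checks and expiry):

```python
import random
from collections import defaultdict
from typing import List


class ServiceRegistry:
    """In-memory registry mapping service names to instance addresses."""

    def __init__(self):
        self._instances = defaultdict(set)

    def register(self, service: str, address: str) -> None:
        self._instances[service].add(address)

    def deregister(self, service: str, address: str) -> None:
        self._instances[service].discard(address)

    def lookup(self, service: str) -> List[str]:
        return sorted(self._instances[service])


def pick_instance(registry: ServiceRegistry, service: str) -> str:
    """Client-side discovery: ask the registry, then choose an instance locally."""
    instances = registry.lookup(service)
    if not instances:
        raise LookupError(f"no instances registered for {service!r}")
    return random.choice(instances)


registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")  # example addresses
registry.register("orders", "10.0.0.2:8080")
print(pick_instance(registry, "orders"))
```

With server-side discovery, the `pick_instance` logic would live in the load balancer instead, so clients would need no registry-aware code.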
Tools:
Circuit Breaker
Prevent cascading failures.
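States: CLOSED (calls pass through), OPEN (calls fail fast), HALF_OPEN (a trial call is allowed after the timeout)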
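import time

class CircuitBreaker:
    def __init__(self, threshold=5, timeout=60):
        self.failure_count = 0
        self.threshold = threshold  # failures allowed before opening
        self.timeout = timeout      # seconds to wait before a trial call
        self.state = "CLOSED"
        self.last_failure_time = None

    def call(self, func):
        if self.state == "OPEN":
            # After the timeout, let one trial call through.
            if time.time() - self.last_failure_time > self.timeout:
                self.state = "HALF_OPEN"
            else:
                raise Exception("Circuit breaker is OPEN")
        try:
            result = func()
            self.on_success()
            return result
        except Exception:
            self.on_failure()
            raise

    # Assumed behavior: any success closes the circuit and resets the count.
    def on_success(self):
        self.failure_count = 0
        self.state = "CLOSED"

    # Assumed behavior: a failed trial call, or reaching the threshold,
    # opens the circuit.
    def on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()
        if self.state == "HALF_OPEN" or self.failure_count >= self.threshold:
            self.state = "OPEN"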
Rate Limiting
Control request rate to prevent abuse.
Algorithms:
Token Bucket
- Bucket holds tokens
- Tokens added at fixed rate
- Request consumes token
- Reject if no tokens available
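A token bucket can be sketched in a few lines. This version refills lazily on each request instead of using a timer, and takes an injectable clock so it can be tested; the class name, parameters, and lazy-refill approach are assumptions for illustration.

```python
import time


class TokenBucket:
    """Token bucket limiter with lazy refill (illustrative sketch)."""

    def __init__(self, capacity: float, refill_rate: float, now=time.monotonic):
        self.capacity = capacity        # maximum tokens the bucket holds
        self.refill_rate = refill_rate  # tokens added per second
        self._now = now                 # injectable clock, handy for tests
        self.tokens = capacity          # start with a full bucket
        self.last_refill = now()

    def allow(self, cost: float = 1.0) -> bool:
        current = self._now()
        elapsed = current - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = current
        if self.tokens >= cost:
            self.tokens -= cost  # the request consumes a token
            return True
        return False             # no tokens available: reject
```

Because the bucket starts full, a client can burst up to `capacity` requests at once and is then held to `refill_rate` requests per second on average.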
Leaky Bucket
- Requests enter bucket
- Process at fixed rate
- Overflow requests rejected
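The leaky bucket can be modeled as a bounded queue that drains at a fixed rate. The sketch below is one possible reading of the three bullets, with an explicit `drain()` call standing in for the worker that processes requests; names and the injectable clock are assumptions.

```python
import time
from collections import deque


class LeakyBucketQueue:
    """Bounded queue drained at a fixed rate (illustrative sketch)."""

    def __init__(self, capacity: int, leak_rate: float, now=time.monotonic):
        self.capacity = capacity    # maximum queued requests
        self.leak_rate = leak_rate  # requests processed per second
        self._now = now
        self.queue = deque()
        self.last_drain = now()

    def offer(self, request) -> bool:
        """Requests enter the bucket; overflow is rejected."""
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(request)
        return True

    def drain(self) -> list:
        """Process as many queued requests as the fixed rate allows."""
        current = self._now()
        budget = int((current - self.last_drain) * self.leak_rate)
        if budget > 0:
            # Advance the clock only by the time that the budget accounts for.
            self.last_drain += budget / self.leak_rate
        processed = []
        while budget > 0 and self.queue:
            processed.append(self.queue.popleft())
            budget -= 1
        return processed
```

Unlike the token bucket, output here is smoothed: however bursty the arrivals, requests leave the queue at no more than `leak_rate` per second on average.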
Fixed Window
- Count requests per time window
- Reset counter at window end
- Simple, but a burst that straddles a window boundary can reach up to twice the limit
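A fixed-window counter, and the boundary issue it suffers from, can be shown with a short sketch. The class name and injectable clock are assumptions for illustration.

```python
import time


class FixedWindowLimiter:
    """Counts requests per fixed time window (illustrative sketch)."""

    def __init__(self, limit: int, window_seconds: float, now=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self._now = now
        self.window_id = None
        self.count = 0

    def allow(self) -> bool:
        window_id = int(self._now() // self.window_seconds)
        if window_id != self.window_id:
            # A new window has started: reset the counter.
            self.window_id = window_id
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# Boundary issue: with a limit of 2 per 10 s, two requests at t=9 s and two
# more at t=10 s are all accepted, i.e. 4 requests within about one second.
clock = [9.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=10, now=lambda: clock[0])
accepted = [limiter.allow(), limiter.allow()]
clock[0] = 10.0
accepted += [limiter.allow(), limiter.allow()]
print(accepted)
```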
Sliding Window
- Track requests with timestamps
- Count in sliding time window
- More accurate, higher memory
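A sliding-window log keeps one timestamp per accepted request, which is where the extra memory goes. A minimal sketch, assuming a single-process limiter with an injectable clock (names are illustrative):

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Sliding-window log limiter (illustrative sketch)."""

    def __init__(self, limit: int, window_seconds: float, now=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._now = now
        self.timestamps = deque()  # one entry per accepted request

    def allow(self) -> bool:
        current = self._now()
        cutoff = current - self.window_seconds
        # Drop timestamps that have slid out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(current)
            return True
        return False
```

Memory grows with the limit (one timestamp per allowed request per client), whereas a fixed-window counter needs only a single integer per client.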
Data Management
Database Sharding
Split data across multiple databases.
Strategies:
Horizontal Sharding (Range-Based)
User ID 1-1000 → Shard 1
User ID 1001-2000 → Shard 2
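Range-based routing reduces to finding which interval a key falls into. A minimal sketch using the two ranges above, with hypothetical shard names and a binary search over the inclusive upper bounds:

```python
import bisect

# Inclusive upper bound of each shard's user ID range (from the example above).
RANGE_UPPER_BOUNDS = [1000, 2000]
SHARD_NAMES = ["shard-1", "shard-2"]  # hypothetical shard identifiers


def shard_for_user(user_id: int) -> str:
    """Return the shard whose ID range contains user_id."""
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, user_id)
    if user_id < 1 or index == len(SHARD_NAMES):
        raise ValueError(f"user_id {user_id} is outside every shard range")
    return SHARD_NAMES[index]


print(shard_for_user(1000))
print(shard_for_user(1001))
```

Adding a shard only requires appending a new bound, but sequential IDs tend to concentrate new writes on the last shard.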
Hash-Based Sharding
Shard = hash(user_id) % num_shards
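A sketch of the formula above. It uses SHA-256 rather than Python's built-in `hash()`, because `hash()` on strings is randomized per process and would route the same key differently after a restart; the choice of SHA-256 and the key format are assumptions for illustration.

```python
import hashlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Changing the shard count remaps most keys, which makes resharding costly.
keys = [f"user-{i}" for i in range(1000)]
moved = sum(1 for k in keys if shard_for_key(k, 4) != shard_for_key(k, 5))
print(f"{moved} of {len(keys)} keys change shard when going from 4 to 5 shards")
```

Consistent hashing is a common way to reduce this remapping, at the cost of a more involved lookup.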
Geographic Sharding
US users → US Shard
EU users → EU Shard
Challenges:
Database Replication
Master-Slave (Primary-Replica)
Write → Master → Replicate → Slaves
Read → Slaves (load balanced)
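The read/write split can be sketched as a small router. Here dicts stand in for the database nodes, replication is performed synchronously for simplicity, and reads rotate round-robin across replicas; the class name and these simplifications are assumptions.

```python
import itertools


class ReplicatedStore:
    """Routes writes to the primary and reads to replicas (illustrative sketch)."""

    def __init__(self, replica_count: int):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value) -> None:
        self.primary[key] = value
        # Simplification: replicate immediately. Real replication is usually
        # asynchronous, so replicas can briefly lag behind the primary.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin load balancing across replicas.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


store = ReplicatedStore(replica_count=2)
store.write("user:1", "Ada")
print(store.read("user:1"))
```

With asynchronous replication, a read issued right after a write may return stale data, so reads that must see the latest value are often sent to the primary.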
Master-Master (Multi-Master)
Write → Master 1 ↔ Master 2
- Active-active setup
- Conflict resolution needed
Benefits:
CAP Theorem
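A distributed system can guarantee at most two of the following three properties at once. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability: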
C (Consistency)
A (Availability)
P (Partition Tolerance)
Real-world choices:
System Design Process
1. Requirements
Functional:
Non-Functional:
2. Capacity Estimation
Example: URL Shortener
Traffic:
- 100M new URLs per month
- Read:Write = 100:1
- Write: 100M / (30 days × 86400 sec) ≈ 39 URLs/sec, rounded to 40 URLs/sec
- Read: 40 × 100 = 4000 URLs/sec
Storage:
- 100M URLs × 12 months × 5 years = 6B URLs
- Average URL size: 500 bytes
- Total: 6B × 500 bytes = 3 TB
Bandwidth:
- Write: 40 URLs/sec × 500 bytes = 20 KB/sec
- Read: 4000 URLs/sec × 500 bytes = 2 MB/sec
Cache:
- 80-20 rule: 20% of URLs generate 80% of traffic
- Daily reads: 4000 req/sec × 86400 sec ≈ 345M requests/day
- Cache 20% of daily requests: 69M × 500 bytes ≈ 35 GB
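The estimates above can be reproduced with a short script. This sketch uses the unrounded write rate (about 38.6/sec, where the text rounds up to 40), so its results come out slightly lower than the figures listed; the function and parameter names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def estimate_capacity(new_urls_per_month=100_000_000, read_write_ratio=100,
                      url_size_bytes=500, retention_years=5, hot_fraction=0.2):
    """Back-of-the-envelope capacity numbers for the URL shortener example."""
    writes_per_sec = new_urls_per_month / (30 * SECONDS_PER_DAY)
    reads_per_sec = writes_per_sec * read_write_ratio
    total_urls = new_urls_per_month * 12 * retention_years
    reads_per_day = reads_per_sec * SECONDS_PER_DAY
    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "total_urls": total_urls,
        "storage_tb": total_urls * url_size_bytes / 1e12,
        "write_kb_per_sec": writes_per_sec * url_size_bytes / 1e3,
        "read_mb_per_sec": reads_per_sec * url_size_bytes / 1e6,
        "cache_gb": reads_per_day * hot_fraction * url_size_bytes / 1e9,
    }


for name, value in estimate_capacity().items():
    print(f"{name}: {value:,.1f}")
```

Keeping the inputs as parameters makes it easy to see how sensitive each figure is, for example how the cache size scales linearly with the read:write ratio.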
3. API Design
POST /api/v1/urls
Body: { "long_url": "https://example.com/very/long/url" }
Response: { "short_url": "https://short.ly/abc123" }
GET /api/v1/urls/{short_code}
Response: 302 Redirect to long_url
DELETE /api/v1/urls/{short_code}
Response: 204 No Content
4. High-Level Design
Client
↓
CDN / Load Balancer
↓
API Gateway
↓
Application Servers (Stateless)
↓
Cache (Redis)
↓
Database (Primary + Replicas)
↓
Object Storage (S3)
5. Detailed Design
Focus on:
6. Bottlenecks & Trade-offs
Identify:
Solutions:
Common System Designs
URL Shortener
Key Components:
1. Generate unique short code (hash or counter)
2. Store mapping in database
3. Cache popular URLs
4. Redirect with 301 (permanent, may be cached by browsers) or 302 (temporary, so every visit reaches the service)
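For the counter-based option in step 1, a common approach is to encode an incrementing integer in base 62. The sketch below assumes that approach, an in-memory dict as the store, and a particular alphabet ordering; none of these details come from the original.

```python
import string
from typing import Optional

# Assumed alphabet order: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class UrlShortener:
    """Counter-based shortener backed by a dict (illustrative sketch)."""

    def __init__(self):
        self._counter = 0
        self._urls = {}

    def shorten(self, long_url: str) -> str:
        self._counter += 1                    # 1. generate a unique short code
        code = encode_base62(self._counter)
        self._urls[code] = long_url           # 2. store the mapping
        return code

    def resolve(self, code: str) -> Optional[str]:
        return self._urls.get(code)           # target for the redirect


shortener = UrlShortener()
code = shortener.shorten("https://example.com/very/long/url")
print(code, "->", shortener.resolve(code))
```

Seven base-62 characters cover 62^7 (about 3.5 trillion) codes, comfortably more than the 6B URLs estimated earlier. Sequential codes are guessable, though, which is one reason some designs prefer hashing or randomization.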
Notification System
Types:
Event → Message Queue → Notification Service → Provider API
                               ↓
                      User Preferences DB
Rate Limiter
Requirements:
Client → Rate Limiter (Redis) → API Server
- Store counters per user/IP
- Sliding window algorithm
- Return 429 if exceeded
News Feed
Components:
Fanout Approaches:
Fanout on Write (Push)
Post created → Write to all followers' feeds
+ Fast reads
- Slow writes for popular users
Fanout on Read (Pull)
User requests feed → Fetch from followed users
+ Fast writes
- Slow reads
Hybrid
- Push for regular users
- Pull for celebrities
- Best of both worlds
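The hybrid approach can be sketched with in-memory structures. Here authors with at least `CELEBRITY_THRESHOLD` followers are treated as celebrities; that cutoff, the class name, and the data layout are all assumptions for illustration.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 2  # hypothetical cutoff; real systems use far larger values


class FeedService:
    """Hybrid fanout: push for regular authors, pull for celebrities."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> users following them
        self.following = defaultdict(set)  # user -> authors they follow
        self.posts = defaultdict(list)     # author -> [(seq, author, text)]
        self.feeds = defaultdict(list)     # user -> precomputed feed entries
        self._seq = 0                      # global ordering of posts

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def publish(self, author: str, text: str) -> None:
        self._seq += 1
        entry = (self._seq, author, text)
        self.posts[author].append(entry)
        if not self.is_celebrity(author):
            # Fanout on write: copy the post into every follower's feed.
            for follower in self.followers[author]:
                self.feeds[follower].append(entry)

    def read_feed(self, user: str) -> list:
        entries = list(self.feeds[user])
        # Fanout on read: fetch celebrity posts at request time.
        for author in self.following[user]:
            if self.is_celebrity(author):
                entries.extend(self.posts[author])
        # Deduplicate (an author may have become a celebrity after earlier
        # posts were pushed), then show newest first.
        return sorted(set(entries), reverse=True)
```

Regular authors pay a small write cost proportional to their follower count, while celebrity posts are never copied and cost one extra lookup per reader.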
Chat System
Features:
Technologies:
Best Practices
1. Start Simple
Begin with a monolith and move to microservices only when needed.
2. Design for Failure
3. Monitor Everything
Metrics:
Tools: