Imagine thousands of users trying to access your website simultaneously. How do you ensure no single server gets overwhelmed whilst others sit idle? That’s where load balancers come in—they’re the traffic controllers of the internet.
Load balancers are crucial for building scalable, highly available systems. They’re one of the first components you’ll add when scaling from one server to many, and understanding them is essential for any system design interview.
What You’ll Learn
In this section, we’ll explore:
- What a load balancer is: The basics of distributing traffic across multiple servers
- Load balancing algorithms: From simple round-robin to sophisticated least-connections strategies (sketched in code after this list)
- Global vs local load balancing: Distributing traffic across data centres vs within them
- Health checks: Ensuring traffic only goes to healthy servers
- Session persistence: Handling stateful applications
- Layer 4 vs Layer 7 load balancing: Understanding network layers and their trade-offs
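To make the first two algorithms concrete before we cover them properly, here is a minimal sketch of round-robin and least-connections selection. The class names and backend addresses are invented for illustration; real load balancers implement these strategies in heavily optimised native code, but the core logic really is this simple.

```python
import itertools

class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, one per request."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Send each request to the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def pick(self):
        # Choose the server with the lowest active-request count.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the counts stay accurate.
        self._active[server] -= 1

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backends
rr = RoundRobinBalancer(servers)
print([rr.pick() for _ in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

Round-robin is stateless and fair when requests are roughly uniform; least-connections adapts better when some requests take much longer than others.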
Why Load Balancers Matter
Load balancers are everywhere in modern infrastructure:
- They enable horizontal scaling: add more servers instead of buying bigger ones
- They provide high availability: if one server fails, the others keep serving requests (see the health-check sketch after this list)
- They improve performance: distribute load to prevent any single server from becoming a bottleneck
- They enable zero-downtime deployments: route traffic away from servers being updated
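High availability and zero-downtime deployments both rest on the same mechanism: the load balancer only routes to servers that pass a health check. Here is a minimal sketch of that idea; the `/healthz` path and the backend URLs are assumptions for illustration, not a standard.

```python
import urllib.request

def healthy_servers(servers, path="/healthz", timeout=1.0):
    """Return only the backends that answer the health endpoint with HTTP 200."""
    alive = []
    for base_url in servers:
        try:
            with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
                if resp.status == 200:
                    alive.append(base_url)
        except OSError:
            # Refused connection, timeout, or DNS failure: treat the server as down.
            pass
    return alive

# Hypothetical backends. A crashed server, or one deliberately failing its
# health check while being updated, simply drops out of the rotation.
servers = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]
print(healthy_servers(servers))
```

A real load balancer runs checks like this on a timer and feeds the resulting live list into whichever routing algorithm it uses.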
In system design interviews, you’ll almost always need to include load balancers when designing systems for millions of users.
Real-World Usage
Every major web service uses load balancers:
- Google uses global load balancers to route you to the nearest data centre
- Netflix uses load balancers to distribute streaming requests across thousands of servers
- Your favourite online shop uses load balancers to handle traffic spikes during sales
Let’s explore how to design and use load balancers effectively!