The Only Guide You'd Ever Need for Load Balancers - 3
Dedicated Load Balancer - Proxy
Alright, if you’re coming from part 2, you just watched DNS Round Robin absolutely destroy our servers. We learned that rotating IP addresses in DNS responses is about as effective as you are (not at all). It works in theory, but in practice? Big fat L. So we ended part 2 with a realization: we need something smarter that sits between clients and servers. Something that makes intelligent decisions in real time, not something that just rotates IPs and hopes for the best.
The Proxy Pattern
Before we finally dive into load balancers specifically, let’s talk about proxies in general. The concept is dead simple, but it’s the foundation of everything we’re about to build.
A proxy is just something that acts as a middleman between two parties.
Think of it like this: You want to ask your crush out, but you’re a pus-…coward (same). So you get your friend to ask for you. Your friend is the proxy. They take your request, forward it to your crush, get the response (“ew no”), and bring it back to you.
In networking, the same concept applies:
Without Proxy:
Client ----directly----> Server
With Proxy:
Client ----> Proxy ----> Server
Client <---- Proxy <---- Server
The client talks to the proxy, the proxy talks to the server. The client might not even know the server exists. The server might not know the real client exists. The proxy is the go-to drug dealer for both.
Forward Proxy vs Reverse Proxy
Now here’s where people get confused, because there are two types of proxies and they do opposite things.
Forward Proxy (Client’s Dealer)
A forward proxy sits in front of clients and makes requests on their behalf.
[Client] ----> [Forward Proxy] ----> [Internet/Server]
The server sees the proxy’s IP, not the client’s IP. The proxy is hiding/representing the client.
For example, VPNs are forward proxies. When you use a VPN:
- You send your request to the VPN server (proxy)
- The VPN server forwards your request to the actual destination
- The destination sees the VPN’s IP, not yours
- The response comes back through the VPN to you
Your company’s web filter? Forward proxy. Your school’s firewall? Forward proxy. That thing blocking you from watching porno at work? Forward proxy.
Reverse Proxy (Server’s Dealer)
A reverse proxy sits in front of servers and receives requests on their behalf.
[Client] ----> [Reverse Proxy] ----> [Actual Servers]
The client sees the proxy’s IP, not the server’s IP. The proxy is hiding/representing the servers.
For example, when you visit www.google.com, you’re not actually talking directly to Google’s search servers. You’re talking to a reverse proxy that then forwards your request to one of thousands of backend servers. You have no idea which server actually processed your search.

Load balancers are reverse proxies. They sit in front of your backend servers and distribute incoming requests across them.
Forward Proxy: Hides the CLIENT from the server
Reverse Proxy: Hides the SERVER from the client
Forward Proxy: Client says "get this for me"
Reverse Proxy: Server says "handle this for me"
The Load Balancer Architecture
Remember our three servers from part 1?
Server 1: 192.168.1.10
Server 2: 192.168.1.11
Server 3: 192.168.1.12
And remember the question marks in the diagram? Now we’re replacing them with an actual load balancer:

Client ----> [Load Balancer] ----> Server 1: 192.168.1.10
             203.0.113.50    ----> Server 2: 192.168.1.11
                             ----> Server 3: 192.168.1.12
Here’s what changes from the client’s perspective:
Before (DNS Round Robin):
Client: "Hey DNS, what's the IP for www.forwingmen.com?"
DNS: "Here's three IPs, pick one lol"
Client: "Uh okay, I'll use the first one I guess?"
Client connects to 192.168.1.10 directly
After (Load Balancer):
Client: "Hey DNS, what's the IP for www.forwingmen.com?"
DNS: "203.0.113.50"
Client: "Cool, just one IP, nice"
Client connects to 203.0.113.50 (the load balancer)
Load Balancer: *intelligently picks a server*
Load Balancer connects to 192.168.1.11

From the client’s point of view, they’re talking to one server at 203.0.113.50. They have no idea there are three backend servers. They don’t need to know. That’s the whole point.
The Load Balancer’s Job Description
Let’s break down exactly what the load balancer needs to do. If the load balancer was applying for a job, here’s what the job posting would look like:
Wanted: Load Balancer
Responsibilities:
- Listen for incoming client connections on port 80 (HTTP) or 443 (HTTPS)
- Accept the connection from the client
- Choose which backend server should handle this request (using some algorithm)
- Establish a connection to the chosen backend server
- Forward the client’s request to the backend server
- Receive the response from the backend server
- Forward the response back to the client
- Close connections properly when done
- Monitor backend server health continuously
- Exclude unhealthy servers from the pool
Required Skills:
- Network programming
- Concurrent connection handling
- Basic health checking
- Patience dealing with client timeouts
Nice to Have:
- Session persistence
- SSL/TLS termination
- Request logging
- Metrics collection
The basic flow:
Step 1: Client Connection
[Client] --SYN--> [Load Balancer]
[Client] <--SYN-ACK-- [Load Balancer]
[Client] --ACK--> [Load Balancer]
(TCP handshake complete)
Step 2: Client sends request
[Client] --HTTP GET /wingman--> [Load Balancer]
Step 3: Load balancer picks a server
[Load Balancer] internally: "Hmm, round robin says Server 2 is next"
Step 4: Load balancer connects to backend
[Load Balancer] --SYN--> [Server 2]
[Load Balancer] <--SYN-ACK-- [Server 2]
[Load Balancer] --ACK--> [Server 2]
Step 5: Load balancer forwards request
[Load Balancer] --HTTP GET /wingman--> [Server 2]
Step 6: Server processes and responds
[Server 2] --HTTP 200 + HTML--> [Load Balancer]
Step 7: Load balancer forwards response to client
[Load Balancer] --HTTP 200 + HTML--> [Client]
Step 8: Connections close
[Client] --FIN--> [Load Balancer]
[Load Balancer] --FIN--> [Server 2]
The load balancer is literally in the middle of everything. Every byte from the client goes through it, and every byte from the server goes through it.
Sockets 101
To build a load balancer, we need to understand how network programming actually works at a basic level. Don’t worry, I’m not going to make this a computer networks blog. Just the main stuff.
What’s a Socket?
A socket is basically an endpoint for network communication. Think of it like a phone number. If you want to talk to someone, you need their phone number. If you want to send data over the network, you need a socket.
A socket is identified by:
- IP Address: Which machine (e.g., 192.168.1.10)
- Port: Which application on that machine (e.g., 80 for HTTP, 443 for HTTPS)
So 192.168.1.10:80 is a socket address that means “the HTTP server running on the machine at 192.168.1.10”.
A Socket’s Lifecycle
When you’re writing network code, you work with sockets in a specific sequence:
Server Side (what our load balancer does when listening for clients):
1. socket() - Create a socket
"Hey OS, give me a socket to work with"
2. bind() - Bind to an address and port
"I want to listen on 203.0.113.50:80"
3. listen() - Start listening for connections
"I'm ready to accept incoming connections"
4. accept() - Accept a client connection
"A client wants to connect, let them in"
(This blocks until a client connects)
5. recv()/send() - Receive/send data
"Read what the client sent"
"Send response back to client"
6. close() - Close the connection
"We're done, close the connection"
Client Side (what our load balancer does when connecting to backend servers):
1. socket() - Create a socket
"Hey OS, give me a socket"
2. connect() - Connect to the server
"I want to connect to 192.168.1.10:80"
3. send()/recv() - Send/receive data
"Send my request"
"Read the response"
4. close() - Close the connection
"We're done"
Our load balancer does both. It acts as a server to clients, and as a client to backend servers. It’s playing both roles.
The Simplest TCP Load Balancer (Pseudo-code)
Let’s write pseudo-code for the absolute simplest load balancer possible. This will just forward TCP connections.
// config
LOAD_BALANCER_IP = "203.0.113.50"
LOAD_BALANCER_PORT = 80
BACKEND_SERVERS = [
    ("192.168.1.10", 80),
    ("192.168.1.11", 80),
    ("192.168.1.12", 80)
]
current_server_index = 0  // for round robin

// step 1: create and configure the listening socket
listen_socket = socket()
bind(listen_socket, LOAD_BALANCER_IP, LOAD_BALANCER_PORT)
listen(listen_socket)

// step 2: main loop...accept connections forever
while true:
    // accept a client connection (blocks until one arrives)
    client_socket, client_address = accept(listen_socket)

    // step 3: pick a backend server (round robin)
    backend_server = BACKEND_SERVERS[current_server_index]
    current_server_index = (current_server_index + 1) % len(BACKEND_SERVERS)

    // step 4: create a socket and connect to the chosen backend server
    backend_socket = socket()
    connect(backend_socket, backend_server)

    // step 5: forward data in both directions
    // (very simplified...in reality you'd do this concurrently)
    while connection is open:
        request_data = recv(client_socket)
        send(backend_socket, request_data)
        response_data = recv(backend_socket)
        send(client_socket, response_data)

    // step 6: close connections
    close(client_socket)
    close(backend_socket)
This is obviously oversimplified (we’re not handling errors, we’re blocking on reads, we’re only handling one connection at a time, etc.), but it captures the essence:
- Listen for clients
- Pick a backend server
- Forward request to backend
- Forward response to client
- Close everything
That’s literally it. That’s the core of what a load balancer does.
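If you want to squint at the pseudo-code as something runnable, here’s roughly the same flow as a Python sketch. It’s still single-shot and single-threaded, and `pick_backend` / `handle_one` are names I made up for this post, not a real API (the proper implementation is part 4’s job):

```python
import socket

def pick_backend(backends, state):
    """Round robin: return the next backend and advance the shared index."""
    server = backends[state["index"]]
    state["index"] = (state["index"] + 1) % len(backends)
    return server

def handle_one(listen_sock, backends, state):
    """Accept one client, forward one request/response pair, close both ends."""
    client_sock, _addr = listen_sock.accept()
    backend_sock = socket.create_connection(pick_backend(backends, state))
    request = client_sock.recv(4096)     # simplified: one recv == one request
    backend_sock.sendall(request)        # forward the client's request
    response = backend_sock.recv(4096)   # simplified: one recv == one response
    client_sock.sendall(response)        # forward backend's response to client
    backend_sock.close()
    client_sock.close()
```

It handles exactly one client, one request, one response, and then stops. Everything else a real load balancer does is layered on top of this loop.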
Request-Response Cycle
Let’s walk through what happens when Sydney visits www.forwingmen.com with our new load balancer in place.
Step 1: DNS Resolution
Sydney's browser: "What's the IP for www.forwingmen.com?"
DNS server: "203.0.113.50"
Sydney's browser: "ty!"
Just one IP this time, not three like with DNS Round Robin. Much cleaner.
Step 2: Sydney’s Browser Connects
Time: 10:00:00.000
Sydney's browser creates a socket
Sydney's browser connects to 203.0.113.50:80
Step 3: Load Balancer Picks a Server
Time: 10:00:00.001
Load balancer: "Round robin says Server 1 is next"
Current server index: 0 → 1 (for next connection)
Chosen server: 192.168.1.10:80
Step 4: Load Balancer Connects to Backend
Time: 10:00:00.002
Load balancer creates a new socket
Load balancer connects to 192.168.1.10:80
Connection established
Now the load balancer has TWO sockets:
- One connected to Sydney (client_socket)
- One connected to Server 1 (backend_socket)
Step 5: Sydney Sends HTTP Request
Time: 10:00:00.010
Sydney's browser sends:
---
GET /wingman HTTP/1.1
Host: www.forwingmen.com
User-Agent: Mozilla/5.0
---
This data arrives at the load balancer’s client_socket.
Step 6: Load Balancer Forwards Request
Time: 10:00:00.011
Load balancer reads from client_socket
Load balancer writes to backend_socket
---
GET /wingman HTTP/1.1
Host: www.forwingmen.com
User-Agent: Mozilla/5.0
---
Server 1 receives the exact same request Sydney sent.
Step 7: Server 1 Processes Request
Time: 10:00:00.050
Server 1 processes the request
Server 1 queries database
Server 1 generates HTML
Server 1 sends response:
---
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
<html>
<h1>Welcome to Wingman Dating Losers!</h1>
...
</html>
---
This data arrives at the load balancer’s backend_socket.
Step 8: Load Balancer Forwards Response
Time: 10:00:00.051
Load balancer reads from backend_socket
Load balancer writes to client_socket
---
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
<html>
<h1>Welcome to Wingman Dating Losers!</h1>
...
</html>
---
Sydney’s browser receives the exact same response Server 1 sent.
Step 9: Connections Close
Time: 10:00:00.100
Sydney's browser closes its connection
Load balancer detects connection close
Load balancer closes connection to Server 1
All sockets cleaned up
The Full Timeline
0ms:   Sydney connects to load balancer
1ms:   Load balancer picks Server 1
2ms:   Load balancer connects to Server 1
10ms:  Sydney sends HTTP request
11ms:  Load balancer forwards to Server 1
50ms:  Server 1 sends response
51ms:  Load balancer forwards to Sydney
100ms: Connections close
Total time: 100ms
Load balancer overhead: ~1-2ms
The load balancer added barely any overhead. Sydney got her page in 100ms total, and she has no idea there are three servers behind the scenes.

Stateful vs Stateless Load Balancing (an Intro)
Now here’s an important concept we need to touch on before moving forward: state.
Stateless Load Balancing
Our simple load balancer above is stateless. What does that mean?
It means the load balancer doesn’t remember anything about previous requests. Every connection is independent:
Request 1 from Sydney → Load balancer picks Server 1 → Done
Request 2 from Sydney → Load balancer picks Server 2 → Done
Request 3 from Sydney → Load balancer picks Server 3 → Done
The load balancer has no memory. It doesn’t know that all three requests came from Sydney. It doesn’t care. Each request is treated fresh.
Pros:
- Simple to implement
- Easy to scale (can add more load balancers)
- No memory overhead
- Fast
Cons:
- Can break sessions (like we saw with DNS Round Robin)
- No awareness of user behavior
- Can’t make intelligent decisions based on history

Stateful Load Balancing
A stateful load balancer remembers things about previous requests:
Request 1 from Sydney → Load balancer picks Server 1 → Remembers: "Sydney → Server 1"
Request 2 from Sydney → Load balancer sees Sydney → Routes to Server 1 again
Request 3 from Sydney → Load balancer sees Sydney → Routes to Server 1 again
The load balancer maintains state. It knows Sydney should stick to Server 1.
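A toy version of that “remember Sydney → Server 1” table is just a dict keyed by client identity. This is my own sketch (`pick_sticky` is a made-up name), not how any particular load balancer implements it:

```python
def pick_sticky(client_ip, backends, sessions, state):
    """Stateful pick: a client we've seen before goes back to the same server."""
    if client_ip not in sessions:
        # first time we see this client: assign via round robin and remember it
        sessions[client_ip] = backends[state["index"]]
        state["index"] = (state["index"] + 1) % len(backends)
    return sessions[client_ip]

# Example:
sessions, state = {}, {"index": 0}
servers = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
pick_sticky("sydney_ip", servers, sessions, state)   # -> "192.168.1.10"
pick_sticky("sweeney_ip", servers, sessions, state)  # -> "192.168.1.11"
pick_sticky("sydney_ip", servers, sessions, state)   # -> "192.168.1.10" again
```

The catch: that `sessions` dict lives inside one load balancer process, which is exactly why scaling stateful load balancers gets painful.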

Which should you use?
Most modern applications are moving towards stateless load balancing with stateless applications. Instead of having the load balancer or individual servers remember session state, you store it in a shared place like Redis or a database.
But we’ll cover session persistence in detail later. For now, just know the difference exists.
The Problems We Haven’t Solved Yet
Our simple TCP load balancer works, but it’s missing a ton of stuff:
1. Concurrency
Right now, we can only handle one connection at a time. If Sydney is downloading a large file and Sweeney tries to connect, Sweeney has to wait until Sydney is done. That’s trash.
We need: Multi-threading or async I/O to handle thousands of concurrent connections.
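One common fix is thread-per-connection (async I/O is the other big option, we’ll get there). A hedged sketch, with `serve_forever` and `handler` as illustrative names:

```python
import socket
import threading

def serve_forever(listen_sock, handler):
    """Accept clients in a loop and hand each connection to its own thread,
    so a slow Sydney doesn't make Sweeney wait."""
    while True:
        conn, _addr = listen_sock.accept()
        threading.Thread(target=handler, args=(conn,), daemon=True).start()
```

Production load balancers usually prefer event loops (epoll/kqueue) over one thread per connection, but the goal is identical: never let one slow connection block the accept loop.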
2. Health Checking
What if Server 2 crashes? Our load balancer will happily keep sending requests to it, and users will get errors.
We need: Active health checking to detect and exclude dead servers.
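The simplest possible active check is “can I open a TCP connection to you right now?”. Here’s a sketch of that idea (real health checkers run on a timer and usually hit an HTTP endpoint like /health rather than just connecting; the function names here are mine):

```python
import socket

def is_healthy(host, port, timeout=1.0):
    """Active TCP health check: healthy == accepts a connection within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
        conn.close()
        return True
    except OSError:
        return False

def healthy_backends(backends):
    """Shrink the pool to servers that are currently accepting connections."""
    return [(host, port) for (host, port) in backends if is_healthy(host, port)]
```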
3. Algorithm Sophistication
Round robin is simple, but it treats all servers equally and doesn’t account for:
- Different server capacities
- Current server load
- Server response times
We need: Better algorithms (I’ll cover them later, dw).
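We’ll dig into algorithms properly later, but just to show how small the step up from round robin can be, here’s a least-connections pick. The `active_counts` dict is hypothetical bookkeeping the load balancer would maintain itself:

```python
def pick_least_connections(backends, active_counts):
    """Pick the backend currently handling the fewest active connections."""
    return min(backends, key=lambda b: active_counts.get(b, 0))

# Example: Server 2 is swamped, Server 3 is nearly idle
counts = {"192.168.1.10": 4, "192.168.1.11": 9, "192.168.1.12": 1}
pick_least_connections(list(counts), counts)   # -> "192.168.1.12"
```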
4. Protocol Awareness
Our TCP load balancer is just forwarding bytes. It doesn’t understand HTTP. That means it can’t:
- Route based on URL path
- Add headers
- Terminate SSL
- Parse cookies
We need: Layer 7 (HTTP) load balancing.
5. Error Handling
What happens when:
- A backend server times out?
- A connection is interrupted mid-request?
- A server starts responding slowly?
We need: Proper timeout handling and retry logic.
6. Connection Management
Creating a new backend connection for every client connection is expensive. TCP handshake takes time. TLS handshake takes even more time.
We need: Connection pooling.
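A connection pool is conceptually tiny: keep finished backend connections around and hand them out again instead of paying for a fresh handshake every time. A bare-bones sketch (no size limits, no staleness checks, my own naming):

```python
import socket
from collections import deque

class ConnectionPool:
    """Minimal per-backend connection pool sketch (illustrative only)."""
    def __init__(self, host, port):
        self.host, self.port = host, port
        self.idle = deque()   # connections waiting to be reused

    def get(self):
        # reuse an idle connection if we have one, else pay for a new handshake
        if self.idle:
            return self.idle.popleft()
        return socket.create_connection((self.host, self.port))

    def put(self, conn):
        # hand the connection back instead of closing it
        self.idle.append(conn)
```

A real pool also has to notice when an idle connection has died and enforce a maximum size, but the reuse idea is the whole trick.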
But here’s the thing: we’ve built the foundation. Everything else is just adding features on top of this basic pattern:
1. Accept client connection
2. Pick backend server
3. Connect to backend server
4. Forward request
5. Forward response
6. Close connections
This core flow never changes. We just make each step smarter, and we will later.
What Now?
In part 4, we’re going to actually implement this. Like, real code. Not pseudo stuff. We’ll build a working load balancer that can:
- Accept client connections
- Distribute traffic using round robin
- Handle multiple concurrent connections
- Forward HTTP requests and responses
- Be actually usable (not useful, just usable. There’s a difference.)
We’ll write code, we’ll test it, we’ll see it work, and we’ll identify its limitations. Then in the parts after that, we’ll make it better piece by piece. But the core pattern you learned today stays the same. Accept -> pick -> forward -> respond -> close. That’s the heart of every load balancer ever.
Feel free to DM me on X / Twitter if you have questions or feedback. I’d love to hear your opinions; I take them really seriously.
See you in part 4, where we actually write some code :)