The Only Guide You'd Ever Need for Load Balancers - 3
Dedicated Load Balancer - Proxy
Alright, if you’re coming from part 2, you just watched DNS Round Robin absolutely destroy our servers. We learned that rotating IP addresses in DNS responses is about as effective as you are (not at all). It works in theory, but in practice? Big fat L. So we ended part 2 with a realization: we need something smarter that sits between clients and servers. Something that makes intelligent decisions in real time, not something that just rotates IPs and hopes for the best.
The Proxy Pattern
Before we finally dive into load balancers specifically, let’s talk about proxies in general. The concept is dead simple, but it’s the foundation of everything we’re about to build.
A proxy is just something that acts as a middleman between two parties.
Think of it like this: You want to ask your crush out, but you’re a pus-…coward (same). So you get your friend to ask for you. Your friend is the proxy. They take your request, forward it to your crush, get the response (“ew no”), and bring it back to you.
In networking, the same concept applies:
Without Proxy:
Client ----directly----> Server
With Proxy:
Client ----> Proxy ----> Server
Client <---- Proxy <---- Server
The client talks to the proxy, the proxy talks to the server. The client might not even know the server exists. The server might not know the real client exists. The proxy is the go-to drug dealer for both.
Forward Proxy vs Reverse Proxy
Now here’s where people get confused, because there are two types of proxies and they do opposite things.
Forward Proxy (Client’s Dealer)
A forward proxy sits in front of clients and makes requests on their behalf.
[Client] ----> [Forward Proxy] ----> [Internet/Server]
The server sees the proxy’s IP, not the client’s IP. The proxy is hiding/representing the client.
For example, VPNs are forward proxies. When you use a VPN:
- You send your request to the VPN server (proxy)
- The VPN server forwards your request to the actual destination
- The destination sees the VPN’s IP, not yours
- The response comes back through the VPN to you
Your company’s web filter? Forward proxy. Your school’s firewall? Forward proxy. That thing blocking you from watching porno at work? Forward proxy.
Reverse Proxy (Server’s Dealer)
A reverse proxy sits in front of servers and receives requests on their behalf.
[Client] ----> [Reverse Proxy] ----> [Actual Servers]
The client sees the proxy’s IP, not the server’s IP. The proxy is hiding/representing the servers.
For example, when you visit www.google.com, you’re not actually talking directly to Google’s search servers. You’re talking to a reverse proxy that then forwards your request to one of thousands of backend servers. You have no idea which server actually processed your search.

Load balancers are reverse proxies. They sit in front of your backend servers and distribute incoming requests across them.
Forward Proxy: Hides the CLIENT from the server
Reverse Proxy: Hides the SERVER from the client
Forward Proxy: Client says "get this for me"
Reverse Proxy: Server says "handle this for me"
The Load Balancer Architecture
Remember our three servers from part 1?
Server 1: 192.168.1.10
Server 2: 192.168.1.11
Server 3: 192.168.1.12
And remember the question marks in the diagram? Now we’re replacing them with an actual load balancer:

Client ----> [Load Balancer] ----> Server 1: 192.168.1.10
             203.0.113.50    ----> Server 2: 192.168.1.11
                             ----> Server 3: 192.168.1.12
Here’s what changes from the client’s perspective:
Before (DNS Round Robin):
Client: "Hey DNS, what's the IP for www.forwingmen.com?"
DNS: "Here's three IPs, pick one lol"
Client: "Uh okay, I'll use the first one I guess?"
Client connects to 192.168.1.10 directly
After (Load Balancer):
Client: "Hey DNS, what's the IP for www.forwingmen.com?"
DNS: "203.0.113.50"
Client: "Cool, just one IP, nice"
Client connects to 203.0.113.50 (the load balancer)
Load Balancer: *intelligently picks a server*
Load Balancer connects to 192.168.1.11

From the client’s point of view, they’re talking to one server at 203.0.113.50. They have no idea there are three backend servers. They don’t need to know. That’s the whole point.
The Load Balancer’s Job Description
Let’s break down exactly what the load balancer needs to do. If the load balancer was applying for a job, here’s what the job posting would look like:
Wanted: Load Balancer
Responsibilities:
- Listen for incoming client connections on port 80 (HTTP) or 443 (HTTPS)
- Accept the connection from the client
- Choose which backend server should handle this request (using some algorithm)
- Establish a connection to the chosen backend server
- Forward the client’s request to the backend server
- Receive the response from the backend server
- Forward the response back to the client
- Close connections properly when done
- Monitor backend server health continuously
- Exclude unhealthy servers from the pool
Required Skills:
- Network programming
- Concurrent connection handling
- Basic health checking
- Patience dealing with client timeouts
Nice to Have:
- Session persistence
- SSL/TLS termination
- Request logging
- Metrics collection
The basic flow:
Step 1: Client Connection
[Client] --SYN--> [Load Balancer]
[Client] <--SYN-ACK-- [Load Balancer]
[Client] --ACK--> [Load Balancer]
(TCP handshake complete)
Step 2: Client sends request
[Client] --HTTP GET /wingman--> [Load Balancer]
Step 3: Load balancer picks a server
[Load Balancer] internally: "Hmm, round robin says Server 2 is next"
Step 4: Load balancer connects to backend
[Load Balancer] --SYN--> [Server 2]
[Load Balancer] <--SYN-ACK-- [Server 2]
[Load Balancer] --ACK--> [Server 2]
Step 5: Load balancer forwards request
[Load Balancer] --HTTP GET /wingman--> [Server 2]
Step 6: Server processes and responds
[Server 2] --HTTP 200 + HTML--> [Load Balancer]
Step 7: Load balancer forwards response to client
[Load Balancer] --HTTP 200 + HTML--> [Client]
Step 8: Connections close
[Client] --FIN--> [Load Balancer]
[Load Balancer] --FIN--> [Server 2]
The load balancer is literally in the middle of everything. Every byte from the client goes through it, and every byte from the server goes through it.
Sockets 101
To build a load balancer, we need to understand how network programming actually works at a basic level. Don’t worry, I’m not going to make this a computer networks blog. Just the main stuff.
What’s a Socket?
A socket is basically an endpoint for network communication. Think of it like a phone number. If you want to talk to someone, you need their phone number. If you want to send data over the network, you need a socket.
A socket is identified by:
- IP Address: Which machine (e.g., 192.168.1.10)
- Port: Which application on that machine (e.g., 80 for HTTP, 443 for HTTPS)
So 192.168.1.10:80 is a socket address that means “the HTTP server running on the machine at 192.168.1.10”.
A Socket’s Lifecycle
When you’re writing network code, you work with sockets in a specific sequence:
Server Side (what our load balancer does when listening for clients):
1. socket() - Create a socket
"Hey OS, give me a socket to work with"
2. bind() - Bind to an address and port
"I want to listen on 203.0.113.50:80"
3. listen() - Start listening for connections
"I'm ready to accept incoming connections"
4. accept() - Accept a client connection
"A client wants to connect, let them in"
(This blocks until a client connects)
5. recv()/send() - Receive/send data
"Read what the client sent"
"Send response back to client"
6. close() - Close the connection
"We're done, close the connection"
Client Side (what our load balancer does when connecting to backend servers):
1. socket() - Create a socket
"Hey OS, give me a socket"
2. connect() - Connect to the server
"I want to connect to 192.168.1.10:80"
3. send()/recv() - Send/receive data
"Send my request"
"Read the response"
4. close() - Close the connection
"We're done"
Our load balancer does both. It acts as a server to clients, and as a client to backend servers. It’s playing both roles.
The Simplest TCP Load Balancer (Pseudo-code)
Let’s write pseudo-code for the absolute simplest load balancer possible. This will just forward TCP connections.
// config
LOAD_BALANCER_IP = "203.0.113.50"
LOAD_BALANCER_PORT = 80
BACKEND_SERVERS = [
    ("192.168.1.10", 80),
    ("192.168.1.11", 80),
    ("192.168.1.12", 80)
]
current_server_index = 0  // for round robin

// step 1: create and configure the listening socket
listen_socket = socket()
bind(listen_socket, LOAD_BALANCER_IP, LOAD_BALANCER_PORT)
listen(listen_socket)

// step 2: main loop...accept connections forever
while true:
    // accept a client connection (blocks until one arrives)
    client_socket, client_address = accept(listen_socket)

    // step 3: pick a backend server (round robin)
    backend_server = BACKEND_SERVERS[current_server_index]
    current_server_index = (current_server_index + 1) % len(BACKEND_SERVERS)

    // step 4: create a socket and connect to the chosen backend server
    backend_socket = socket()
    connect(backend_socket, backend_server)

    // step 5: forward data in both directions
    // (very simplified...in reality you'd do this concurrently)
    while connection is open:
        request_data = recv(client_socket)
        send(backend_socket, request_data)
        response_data = recv(backend_socket)
        send(client_socket, response_data)

    // step 6: close connections
    close(client_socket)
    close(backend_socket)
This is obviously oversimplified (we’re not handling errors, we’re blocking on reads, we’re only handling one connection at a time, etc.), but it captures the essence:
- Listen for clients
- Pick a backend server
- Forward request to backend
- Forward response to client
- Close everything
That’s literally it. That’s the core of what a load balancer does.
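If you want to squint at the pseudo-code as something runnable, here’s roughly the same flow as a Python sketch. It’s still single-shot and single-threaded, and `pick_backend` / `handle_one` are names I made up for this post, not a real API (the proper implementation is part 4’s job):

```python
import socket

def pick_backend(backends, state):
    """Round robin: return the next backend and advance the shared index."""
    server = backends[state["index"]]
    state["index"] = (state["index"] + 1) % len(backends)
    return server

def handle_one(listen_sock, backends, state):
    """Accept one client, forward one request/response pair, close both ends."""
    client_sock, _addr = listen_sock.accept()
    backend_sock = socket.create_connection(pick_backend(backends, state))
    request = client_sock.recv(4096)     # simplified: one recv == one request
    backend_sock.sendall(request)        # forward the client's request
    response = backend_sock.recv(4096)   # simplified: one recv == one response
    client_sock.sendall(response)        # forward backend's response to client
    backend_sock.close()
    client_sock.close()
```

It handles exactly one client, one request, one response, and then stops. Everything else a real load balancer does is layered on top of this loop.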
Request-Response Cycle
Let’s walk through what happens when Sydney visits www.forwingmen.com with our new load balancer in place.
Step 1: DNS Resolution
Sydney's browser: "What's the IP for www.forwingmen.com?"
DNS server: "203.0.113.50"
Sydney's browser: "ty!"
Just one IP this time, not three like with DNS Round Robin. Much cleaner.
Step 2: Sydney’s Browser Connects
Time: 10:00:00.000
Sydney's browser creates a socket
Sydney's browser connects to 203.0.113.50:80
Step 3: Load Balancer Picks a Server
Time: 10:00:00.001
Load balancer: "Round robin says Server 1 is next"
Current server index: 0 → 1 (for next connection)
Chosen server: 192.168.1.10:80
Step 4: Load Balancer Connects to Backend
Time: 10:00:00.002
Load balancer creates a new socket
Load balancer connects to 192.168.1.10:80
Connection established
Now the load balancer has TWO sockets:
- One connected to Sydney (client_socket)
- One connected to Server 1 (backend_socket)
Step 5: Sydney Sends HTTP Request
Time: 10:00:00.010
Sydney's browser sends:
---
GET /wingman HTTP/1.1
Host: www.forwingmen.com
User-Agent: Mozilla/5.0
---
This data arrives at the load balancer’s client_socket.
Step 6: Load Balancer Forwards Request
Time: 10:00:00.011
Load balancer reads from client_socket
Load balancer writes to backend_socket
---
GET /wingman HTTP/1.1
Host: www.forwingmen.com
User-Agent: Mozilla/5.0
---
Server 1 receives the exact same request Sydney sent.
Step 7: Server 1 Processes Request
Time: 10:00:00.050
Server 1 processes the request
Server 1 queries database
Server 1 generates HTML
Server 1 sends response:
---
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
<html>
<h1>Welcome to Wingman Dating Losers!</h1>
...
</html>
---
This data arrives at the load balancer’s backend_socket.
Step 8: Load Balancer Forwards Response
Time: 10:00:00.051
Load balancer reads from backend_socket
Load balancer writes to client_socket
---
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1234
<html>
<h1>Welcome to Wingman Dating Losers!</h1>
...
</html>
---
Sydney’s browser receives the exact same response Server 1 sent.
Step 9: Connections Close
Time: 10:00:00.100
Sydney's browser closes its connection
Load balancer detects connection close
Load balancer closes connection to Server 1
All sockets cleaned up
The Full Timeline
0ms:   Sydney connects to load balancer
1ms:   Load balancer picks Server 1
2ms:   Load balancer connects to Server 1
10ms:  Sydney sends HTTP request
11ms:  Load balancer forwards to Server 1
50ms:  Server 1 sends response
51ms:  Load balancer forwards to Sydney
100ms: Connections close
Total time: 100ms
Load balancer overhead: ~1-2ms
The load balancer added barely any overhead. Sydney got her page in 100ms total, and she has no idea there are three servers behind the scenes.

Stateful vs Stateless Load Balancing (an Intro)
Now here’s an important concept we need to touch on before moving forward: state.
Stateless Load Balancing
Our simple load balancer above is stateless. What does that mean?
It means the load balancer doesn’t remember anything about previous requests. Every connection is independent:
Request 1 from Sydney → Load balancer picks Server 1 → Done
Request 2 from Sydney → Load balancer picks Server 2 → Done
Request 3 from Sydney → Load balancer picks Server 3 → Done
The load balancer has no memory. It doesn’t know that all three requests came from Sydney. It doesn’t care. Each request is treated fresh.
Pros:
- Simple to implement
- Easy to scale (can add more load balancers)
- No memory overhead
- Fast
Cons:
- Can break sessions (like we saw with DNS Round Robin)
- No awareness of user behavior
- Can’t make intelligent decisions based on history

Stateful Load Balancing
A stateful load balancer remembers things about previous requests:
Request 1 from Sydney → Load balancer picks Server 1 → Remembers: "Sydney → Server 1"
Request 2 from Sydney → Load balancer sees Sydney → Routes to Server 1 again
Request 3 from Sydney → Load balancer sees Sydney → Routes to Server 1 again
The load balancer maintains state. It knows Sydney should stick to Server 1.
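A toy version of that “remember Sydney → Server 1” table is just a dict keyed by client identity. This is my own sketch (`pick_sticky` is a made-up name), not how any particular load balancer implements it:

```python
def pick_sticky(client_ip, backends, sessions, state):
    """Stateful pick: a client we've seen before goes back to the same server."""
    if client_ip not in sessions:
        # first time we see this client: assign via round robin and remember it
        sessions[client_ip] = backends[state["index"]]
        state["index"] = (state["index"] + 1) % len(backends)
    return sessions[client_ip]

# Example:
sessions, state = {}, {"index": 0}
servers = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
pick_sticky("sydney_ip", servers, sessions, state)   # -> "192.168.1.10"
pick_sticky("sweeney_ip", servers, sessions, state)  # -> "192.168.1.11"
pick_sticky("sydney_ip", servers, sessions, state)   # -> "192.168.1.10" again
```

The catch: that `sessions` dict lives inside one load balancer process, which is exactly why scaling stateful load balancers gets painful.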

Which should you use?
Most modern applications are moving towards stateless load balancing with stateless applications. Instead of having the load balancer or individual servers remember session state, you store it in a shared place like Redis or a database.
But we’ll cover session persistence in detail later. For now, just know the difference exists.
The Problems We Haven’t Solved Yet
Our simple TCP load balancer works, but it’s missing a ton of stuff:
1. Concurrency
Right now, we can only handle one connection at a time. If Sydney is downloading a large file and Sweeney tries to connect, Sweeney has to wait until Sydney is done. That’s trash.
We need: Multi-threading or async I/O to handle thousands of concurrent connections.
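One common fix is thread-per-connection (async I/O is the other big option, we’ll get there). A hedged sketch, with `serve_forever` and `handler` as illustrative names:

```python
import socket
import threading

def serve_forever(listen_sock, handler):
    """Accept clients in a loop and hand each connection to its own thread,
    so a slow Sydney doesn't make Sweeney wait."""
    while True:
        conn, _addr = listen_sock.accept()
        threading.Thread(target=handler, args=(conn,), daemon=True).start()
```

Production load balancers usually prefer event loops (epoll/kqueue) over one thread per connection, but the goal is identical: never let one slow connection block the accept loop.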
2. Health Checking
What if Server 2 crashes? Our load balancer will happily keep sending requests to it, and users will get errors.
We need: Active health checking to detect and exclude dead servers.
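The simplest possible active check is “can I open a TCP connection to you right now?”. Here’s a sketch of that idea (real health checkers run on a timer and usually hit an HTTP endpoint like /health rather than just connecting; the function names here are mine):

```python
import socket

def is_healthy(host, port, timeout=1.0):
    """Active TCP health check: healthy == accepts a connection within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
        conn.close()
        return True
    except OSError:
        return False

def healthy_backends(backends):
    """Shrink the pool to servers that are currently accepting connections."""
    return [(host, port) for (host, port) in backends if is_healthy(host, port)]
```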
3. Algorithm Sophistication
Round robin is simple, but it treats all servers equally and doesn’t account for:
- Different server capacities
- Current server load
- Server response times
We need: Better algorithms (I’ll cover them later, dw).
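We’ll dig into algorithms properly later, but just to show how small the step up from round robin can be, here’s a least-connections pick. The `active_counts` dict is hypothetical bookkeeping the load balancer would maintain itself:

```python
def pick_least_connections(backends, active_counts):
    """Pick the backend currently handling the fewest active connections."""
    return min(backends, key=lambda b: active_counts.get(b, 0))

# Example: Server 2 is swamped, Server 3 is nearly idle
counts = {"192.168.1.10": 4, "192.168.1.11": 9, "192.168.1.12": 1}
pick_least_connections(list(counts), counts)   # -> "192.168.1.12"
```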
4. Protocol Awareness
Our TCP load balancer is just forwarding bytes. It doesn’t understand HTTP. That means it can’t:
- Route based on URL path
- Add headers
- Terminate SSL
- Parse cookies
We need: Layer 7 (HTTP) load balancing.
5. Error Handling
What happens when:
- A backend server times out?
- A connection is interrupted mid-request?
- A server starts responding slowly?
We need: Proper timeout handling and retry logic.
6. Connection Management
Creating a new backend connection for every client connection is expensive. TCP handshake takes time. TLS handshake takes even more time.
We need: Connection pooling.
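A connection pool is conceptually tiny: keep finished backend connections around and hand them out again instead of paying for a fresh handshake every time. A bare-bones sketch (no size limits, no staleness checks, my own naming):

```python
import socket
from collections import deque

class ConnectionPool:
    """Minimal per-backend connection pool sketch (illustrative only)."""
    def __init__(self, host, port):
        self.host, self.port = host, port
        self.idle = deque()   # connections waiting to be reused

    def get(self):
        # reuse an idle connection if we have one, else pay for a new handshake
        if self.idle:
            return self.idle.popleft()
        return socket.create_connection((self.host, self.port))

    def put(self, conn):
        # hand the connection back instead of closing it
        self.idle.append(conn)
```

A real pool also has to notice when an idle connection has died and enforce a maximum size, but the reuse idea is the whole trick.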
But here’s the thing: we’ve built the foundation. Everything else is just adding features on top of this basic pattern:
1. Accept client connection
2. Pick backend server
3. Connect to backend server
4. Forward request
5. Forward response
6. Close connections
This core flow never changes. We just make each step smarter, and we will later.
What Now?
In part 4, we’re going to actually implement this. Like, real code. Not pseudo stuff. We’ll build a working load balancer that can:
- Accept client connections
- Distribute traffic using round robin
- Handle multiple concurrent connections
- Forward HTTP requests and responses
- Be actually usable (not useful, just usable. There’s a difference.)
We’ll write code, we’ll test it, we’ll see it work, and we’ll identify its limitations. Then in the parts after that, we’ll make it better piece by piece. But the core pattern you learned today stays the same. Accept -> pick -> forward -> respond -> close. That’s the heart of every load balancer ever.
Feel free to DM me on X / Twitter if you have questions or feedback. I’d love to hear your opinions; I take them really seriously.
See you in part 4, where we actually write some code :)