The Only Guide You'd Ever Need for Load Balancers - 4
Building Our First Load Balancer - Round Robin Implementation
Alright, if you made it through part 3, you now understand the proxy pattern, you know what sockets are, yada yada. We’ve been talking theory for three parts now. Things are easier said than done.
It’s time to actually write some mf code.
In this part, we’re going to build a real, working load balancer. Not pseudo shit. Actual code that you can run on your machine right now. We’ll start simple and gradually add complexity, because that’s how you actually learn this stuff IMO.
By the end of this post, you’ll have a load balancer that:
- Accepts real HTTP connections
- Distributes traffic across multiple backend servers using Round Robin
- Forwards requests and responses correctly
- Handles multiple concurrent connections
- Actually works (I promise)
Let’s go.
Setup
Why Go?
I’m using Go for this series because:
- It’s built for this - Go was literally designed for network programming and concurrent systems
- Goroutines are magic - Lightweight concurrency that makes handling thousands of connections trivial
- Shit is fast - Do I need to say more?
- And because it’s cool :)
But Sanchit, I don’t know Go. And that’s fine. I’ll explain everything as we go along with Go. Go is stupid simple to learn TBH. And MB for the amount of go’s in this sentence
Anyway, that being said, just have Go installed and set up. This is not a series where I teach you how to set up a programming language.
What You Need
Nothing, but if you’re following along, just ensure you have Go installed and set up. Oh, and you’ve read the previous parts.
Our Pilot :)
We’ll start off simple. Just create a directory and initialize the Go module:
mkdir load-balancer && cd load-balancer
go mod init loadbalancer
Server Pool
Before we write any network code, we need to figure out how to store our backend servers. This is called the “server pool.”
What Do We Need to Track?
For each backend server, we need:
- Host (IP address or hostname)
- Port
- Status (is it up or down? We’ll add this later though)
For the pool as a whole, we need:
- List of servers
- Current index (for Round Robin)
- Thread safety (so multiple goroutines don’t mess up the index)
The Basic Structure
type Backend struct {
Host string
Port int
}
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
Simple, right? We have a list of backends and an index pointing to the current one. The sync.Mutex will handle thread safety (or “goroutine safety”).
Creating a New Pool
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
Adding Servers
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
Getting the Next Server (Round Robin)
This is where the magic happens:
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
// current backend
backend := &p.backends[p.current]
// next (with wraparound)
p.current = (p.current + 1) % len(p.backends)
return backend
}
Let’s visualize what’s happening:

The % len(p.backends) is what makes it wrap around. When we hit the end, we go back to the beginning.
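If the math feels abstract, here’s a tiny throwaway snippet (hypothetical, not part of our load balancer) that shows which backend each of six requests would land on:

package main

import "fmt"

func main() {
    backends := []string{"127.0.0.1:8081", "127.0.0.1:8082", "127.0.0.1:8083"}
    current := 0

    for request := 1; request <= 6; request++ {
        // pick the current backend, then advance the index with wraparound
        picked := backends[current]
        current = (current + 1) % len(backends)
        fmt.Printf("request %d -> %s\n", request, picked)
    }
    // prints 8081, 8082, 8083, 8081, 8082, 8083
}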
Thread Safety Problem
But wait. What if two goroutines call GetNextBackend() at the exact same time?

This is a race condition. Multiple goroutines accessing shared data simultaneously is a mess.
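Don’t take my word for it. If you temporarily strip the Lock/Unlock calls out of GetNextBackend, a quick test like this (hypothetical test file) run with Go’s race detector will light up:

// pool_race_test.go (hypothetical) -- run with: go test -race
package main

import (
    "sync"
    "testing"
)

func TestGetNextBackendRace(t *testing.T) {
    pool := NewServerPool()
    pool.AddBackend("127.0.0.1", 8081)
    pool.AddBackend("127.0.0.1", 8082)

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            pool.GetNextBackend() // many goroutines touching the same index
        }()
    }
    wg.Wait()
}

With the mutex in place the detector stays quiet; without it you get a DATA RACE warning pointing straight at p.current.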
The Fix: Mutex
Go has a simple solution, a mutex:
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock() // lock it as soon as one goroutine's here
defer p.mux.Unlock() // unlock when function returns
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
The p.mux.Lock() ensures only one goroutine can execute this code at a time. The defer p.mux.Unlock() ensures the mutex gets unlocked even if we return early or panic. Problem solved.
The Complete Server Pool
Here’s our full server pool implementation:
package main
import (
"log"
"sync"
)
type Backend struct {
Host string
Port int
}
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
func (p *ServerPool) Size() int {
return len(p.backends)
}
That’s it. Simple & clean.
Building the Load Balancer
Now for the main event. Let’s build the actual load balancer.
The Architecture (Reminder)

The load balancer needs to:
- Listen for client connections
- Accept client connections
- Pick a backend server
- Connect to backend server
- Forward client request to backend
- Forward backend response to client
- Close both connections
The Main Structure
type LoadBalancer struct {
host string
port int
serverPool *ServerPool
}
func NewLoadBalancer(host string, port int, pool *ServerPool) *LoadBalancer {
return &LoadBalancer{
host: host,
port: port,
serverPool: pool,
}
}
Starting the Load Balancer
func (lb *LoadBalancer) Start() error {
address := fmt.Sprintf("%s:%d", lb.host, lb.port)
// tcp listener
listener, err := net.Listen("tcp", address)
if err != nil {
return fmt.Errorf("failed to start listener: %v", err)
}
defer listener.Close()
log.Printf("[LB] Load Balancer started on %s", address)
log.Printf("[LB] Backend servers: %d", lb.serverPool.Size())
// we accept connections here...forever
for {
conn, err := listener.Accept()
if err != nil {
log.Printf("[LB] Failed to accept connection: %v", err)
continue
}
// handlin this connection in a new goroutine
go lb.handleConnection(conn)
}
}
Let’s break this down:
net.Listen("tcp", address)creates a TCP listener on the specified address. This is like creating and binding a socket.listener.Accept()blocks until a client connects. When one does, we get a net.Conn for that specific client.go lb.handleConnection(conn)handle each client in a separate goroutine. Goroutines are so lightweight. You can have thousands of them running simultaneously with very little overhead.
Handling a Client Connection
Now the real work…forwarding traffic:
func (lb *LoadBalancer) handleConnection(clientConn net.Conn) {
defer clientConn.Close()
backend := lb.serverPool.GetNextBackend()
if backend == nil {
log.Printf("[LB] No backend servers available!")
return
}
backendAddress := fmt.Sprintf("%s:%d", backend.Host, backend.Port)
log.Printf("[LB] Forwarding %s → %s", clientConn.RemoteAddr(), backendAddress)
// connect to the backend server
backendConn, err := net.Dial("tcp", backendAddress)
if err != nil {
log.Printf("[LB] Failed to connect to backend %s: %v", backendAddress, err)
return
}
defer backendConn.Close()
// now we have two connections:
// clientConn -> connected to the client
// backendConn -> connected to the backend server
// forward data in both directions
lb.forwardTraffic(clientConn, backendConn)
log.Printf("[LB] Closed connection from %s", clientConn.RemoteAddr())
}
The flow:
- Get next backend server from the pool (Round Robin)
- Connect to that backend server using net.Dial
- Forward data between client and backend
- Defer ensures both connections close when done
Forwarding Data Bidirectionally
This is the trickiest part. We need to forward data in BOTH directions:
- Client → Load Balancer → Backend
- Backend → Load Balancer → Client
We need both directions simultaneously. Solution? More goroutines!

func (lb *LoadBalancer) forwardTraffic(client, backend net.Conn) {
// WaitGroup to wait for both goroutines
var wg sync.WaitGroup
wg.Add(2)
// copy (client to backend)
go func() {
defer wg.Done()
io.Copy(backend, client)
}()
// copy (backend to client)
go func() {
defer wg.Done()
io.Copy(client, backend)
}()
wg.Wait()
}
There’s actually a much better approach for this (TCP half-close, if anyone’s interested), but I won’t be implementing that yet. We need to get our logic right first; optimization can come later when needed.
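For the curious, here’s roughly what the half-close version could look like. This is a sketch only, assuming both connections are *net.TCPConn so CloseWrite is available; we’re sticking with the simpler version for now.

// sketch: after copying one direction, half-close the destination's write side
func forwardWithHalfClose(client, backend net.Conn) {
    var wg sync.WaitGroup
    wg.Add(2)

    copyAndCloseWrite := func(dst, src net.Conn) {
        defer wg.Done()
        io.Copy(dst, src)
        // signal "I'm done writing" without tearing down the read side
        if tcp, ok := dst.(*net.TCPConn); ok {
            tcp.CloseWrite()
        }
    }

    go copyAndCloseWrite(backend, client) // client -> backend
    go copyAndCloseWrite(client, backend) // backend -> client
    wg.Wait()
}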
Now continuing with our actual code, Go’s io.Copy does all the heavy lifting:
- Reads from source
- Writes to destination
- Handles buffering
- Returns when connection closes
And sync.WaitGroup ensures we wait for both goroutines to finish before closing connections.
So clean, I love Go <3
The Complete Load Balancer Code
Here’s the full main.go (explained a bit as well in the comments):
package main
import (
"fmt"
"io"
"log"
"net"
"sync"
)
// this is ONE backend server
type Backend struct {
Host string
Port int
}
// manages the POOL of backend servers
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
// self explanatory...creates a new server pool
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
// adds a backend server to the pool
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
// returns the next backend (using RR)
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
// total no. of backends
func (p *ServerPool) Size() int {
return len(p.backends)
}
// main LB
type LoadBalancer struct {
host string
port int
serverPool *ServerPool
}
// creates a new LB
func NewLoadBalancer(host string, port int, pool *ServerPool) *LoadBalancer {
return &LoadBalancer{
host: host,
port: port,
serverPool: pool,
}
}
func (lb *LoadBalancer) Start() error {
address := fmt.Sprintf("%s:%d", lb.host, lb.port)
listener, err := net.Listen("tcp", address)
if err != nil {
return fmt.Errorf("failed to start listener: %v", err)
}
defer listener.Close()
log.Printf("[LB] Load Balancer started on %s", address)
log.Printf("[LB] Backend servers: %d", lb.serverPool.Size())
for {
conn, err := listener.Accept()
if err != nil {
log.Printf("[LB] Failed to accept connection: %v", err)
continue
}
log.Printf("[LB] New connection from %s", conn.RemoteAddr())
go lb.handleConnection(conn)
}
}
// handles a SINGLE client conn
func (lb *LoadBalancer) handleConnection(clientConn net.Conn) {
defer clientConn.Close()
backend := lb.serverPool.GetNextBackend()
if backend == nil {
log.Printf("[LB] No backend servers available!")
return
}
backendAddress := fmt.Sprintf("%s:%d", backend.Host, backend.Port)
log.Printf("[LB] Forwarding %s → %s", clientConn.RemoteAddr(), backendAddress)
backendConn, err := net.Dial("tcp", backendAddress)
if err != nil {
log.Printf("[LB] Failed to connect to backend %s: %v", backendAddress, err)
return
}
defer backendConn.Close()
lb.forwardTraffic(clientConn, backendConn)
log.Printf("[LB] Closed connection from %s", clientConn.RemoteAddr())
}
// forwards traffic b/w client & backend
func (lb *LoadBalancer) forwardTraffic(client, backend net.Conn) {
var wg sync.WaitGroup
wg.Add(2)
// client to backend
go func() {
defer wg.Done()
io.Copy(backend, client)
}()
// backend to client
go func() {
defer wg.Done()
io.Copy(client, backend)
}()
wg.Wait()
}
func main() {
pool := NewServerPool()
pool.AddBackend("127.0.0.1", 8081)
pool.AddBackend("127.0.0.1", 8082)
pool.AddBackend("127.0.0.1", 8083)
lb := NewLoadBalancer("0.0.0.0", 8080, pool)
if err := lb.Start(); err != nil {
log.Fatalf("Load balancer failed: %v", err)
}
}
Less than 150 lines (if you remove my comments :p). Simple and clean, and it works.
Creating Test Backend Servers
We need actual servers to test our load balancer. Let’s create simple HTTP servers that tell us which server handled the request.
Create backend/server.go:
package main
import (
"fmt"
"log"
"net/http"
"os"
"strconv"
)
func main() {
if len(os.Args) != 2 {
fmt.Println("Usage: go run server.go <port>")
os.Exit(1)
}
port, err := strconv.Atoi(os.Args[1])
if err != nil {
log.Fatalf("Invalid port: %v", err)
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
log.Printf("[SERVER %d] Handled request for %s", port, r.URL.Path)
response := fmt.Sprintf(`
<!DOCTYPE html>
<html>
<head>
<title>Backend Server %d</title>
</head>
<body>
<h1>Backend Server %d</h1>
<p>Request was handled by server on port %d</p>
<p>Path: %s</p>
<p>Method: %s</p>
</body>
</html>
`, port, port, port, r.URL.Path, r.Method)
fmt.Fprint(w, response)
})
address := fmt.Sprintf(":%d", port)
log.Printf("[SERVER] Backend server started on port %d", port)
if err := http.ListenAndServe(address, nil); err != nil {
log.Fatalf("Server failed: %v", err)
}
}
This creates a simple HTTP server that tells you which port handled the request.
Testing Our Load Balancer
Step 1: Start the Backend Servers
Open 3 terminals, cd into the backend/ directory, and run:
# Terminal 1
go run server.go 8081
# Terminal 2
go run server.go 8082
# Terminal 3
go run server.go 8083
You should see:
[SERVER] Backend server started on port 8081
[SERVER] Backend server started on port 8082
[SERVER] Backend server started on port 8083
Step 2: Start the Load Balancer
In a 4th terminal:
go run main.go
You should see:
[POOL] Added server: 127.0.0.1:8081
[POOL] Added server: 127.0.0.1:8082
[POOL] Added server: 127.0.0.1:8083
[LB] Load Balancer started on 0.0.0.0:8080
[LB] Backend servers: 3
Step 3: Make Requests
Now open your browser and go to http://localhost:8080
First request:
- Browser shows: “Backend Server 8081”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8081
- Server 8081 terminal: [SERVER 8081] Handled request for /
Refresh the page (second request):
- Browser shows: “Backend Server 8082”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8082
- Server 8082 terminal: [SERVER 8082] Handled request for /
Refresh again (third request):
- Browser shows: “Backend Server 8083”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8083
- Server 8083 terminal: [SERVER 8083] Handled request for /
Refresh again (fourth request):
- Browser shows: “Backend Server 8081” (back to the first server!)

Ez, and it works. W for us.
Testing with curl
You can also test with curl to see it more clearly:
for i in {1..6}; do
echo "Request $i:"
curl -s http://localhost:8080 | grep "Backend Server"
done
Output:
Request 1:
<h1>Backend Server 8081</h1>
Request 2:
<h1>Backend Server 8082</h1>
Request 3:
<h1>Backend Server 8083</h1>
Request 4:
<h1>Backend Server 8081</h1>
Request 5:
<h1>Backend Server 8082</h1>
Request 6:
<h1>Backend Server 8083</h1>
The Architecture We Built

The Limitations (What’s Still Broken)
Our load balancer works, but it’s far from done…
1. No Health Checking
Kill one of your backend servers:
# kill server 8082
^C
[SERVER] Server shutting down...
Now make requests to the load balancer. Every third request fails.

33% of requests are failing…
The load balancer has no idea Server 8082 is down. It keeps trying to send requests to it, and those requests fail. We need active health checking to detect dead servers and remove them from rotation.
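Just to give you a feel, a bare-bones health check can be as simple as this sketch (hypothetical helpers, not the implementation we’ll actually build in the next part): try a TCP dial with a timeout and treat a failed dial as “down.”

// sketch: naive TCP health check for a single backend
func isBackendAlive(b *Backend) bool {
    address := fmt.Sprintf("%s:%d", b.Host, b.Port)
    conn, err := net.DialTimeout("tcp", address, 2*time.Second)
    if err != nil {
        return false // couldn't connect, consider it down
    }
    conn.Close()
    return true
}

// sketch: run the check for every backend on a timer
func (p *ServerPool) healthCheckLoop() {
    for range time.Tick(10 * time.Second) {
        for i := range p.backends {
            alive := isBackendAlive(&p.backends[i])
            log.Printf("[HEALTH] %s:%d alive=%v", p.backends[i].Host, p.backends[i].Port, alive)
        }
    }
}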
2. No Smart Algorithm Selection
All servers are treated equally. If you had:
- Server 1: Powerful machine (16 cores, 32GB RAM)
- Server 2: Medium machine (4 cores, 8GB RAM)
- Server 3: Weak machine (2 cores, 4GB RAM)
They’d still get equal traffic. Server 3 would be overwhelmed while Server 1 is yawning (like you). We need weighted algorithms that account for server capacity.
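To make the idea concrete, here’s a sketch of what a weighted Round Robin could look like (it assumes a Weight field we haven’t added; a weight of 3 means “picked 3 times per cycle”):

// sketch: weighted round robin by expanding weights into "slots"
type WeightedBackend struct {
    Backend
    Weight int
}

func nextWeighted(backends []WeightedBackend, counter int) *WeightedBackend {
    total := 0
    for _, b := range backends {
        total += b.Weight
    }
    if total == 0 {
        return nil
    }
    slot := counter % total // which slot of the cycle this request falls into
    for i := range backends {
        if slot < backends[i].Weight {
            return &backends[i]
        }
        slot -= backends[i].Weight
    }
    return nil // unreachable when weights are positive
}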
3. No Connection Pooling
Every request creates a new connection to the backend. There’s always a 3-way TCP handshake; no connection is reused. Creating connections is expensive (TCP handshake, plus a TLS handshake if HTTPS). We’re wasting time.
What we need: Connection pooling to reuse backend connections.
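Rough shape of the fix, as a sketch (a tiny channel-based pool; a real one also needs idle timeouts, broken-connection detection, size limits, and so on):

// sketch: naive connection pool for one backend
type ConnPool struct {
    address string
    idle    chan net.Conn // buffered channel of reusable connections
}

func (cp *ConnPool) Get() (net.Conn, error) {
    select {
    case conn := <-cp.idle:
        return conn, nil // reuse an idle connection, no new handshake
    default:
        return net.Dial("tcp", cp.address) // nothing idle, dial a fresh one
    }
}

func (cp *ConnPool) Put(conn net.Conn) {
    select {
    case cp.idle <- conn:
        // parked for reuse
    default:
        conn.Close() // pool is full, just close it
    }
}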
4. No Protocol Awareness
Our load balancer is just forwarding bytes. It doesn’t understand HTTP. That means it can’t:
- Route based on URL path (/api vs /static)
- Add custom headers
- Terminate SSL/TLS
- Parse cookies for session persistence
- Compress responses
We need layer 7 (HTTP) awareness.
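Just for a taste of Layer 7, Go’s standard library ships a reverse proxy in net/http/httputil. A sketch (not our implementation, ours stays Layer 4 for now):

package main

import (
    "net/http"
    "net/http/httputil"
    "net/url"
    "strings"
)

func main() {
    target, _ := url.Parse("http://127.0.0.1:8081")
    proxy := httputil.NewSingleHostReverseProxy(target)

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // because we can see the HTTP request, we could route /api elsewhere,
        // add headers, read cookies, and so on
        if strings.HasPrefix(r.URL.Path, "/api") {
            w.Header().Set("X-Handled-By", "our-lb")
        }
        proxy.ServeHTTP(w, r)
    })
    http.ListenAndServe(":8080", nil)
}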
5. No Timeout Handling
If a backend server hangs and never responds, our load balancer will wait forever. The client connection stays open indefinitely.
Client → Load Balancer → Backend (no response)
↓
(waiting forever...)
What we need: Proper timeout configuration with context.Context.
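The rough shape of that fix, sketched as a helper we could drop into handleConnection (the 5s/30s numbers are placeholders, and a fuller version would thread a context.Context through):

// sketch: dial with a timeout and put a deadline on both connections
func dialBackendWithTimeout(clientConn net.Conn, address string) (net.Conn, error) {
    backendConn, err := net.DialTimeout("tcp", address, 5*time.Second)
    if err != nil {
        return nil, fmt.Errorf("backend %s didn't answer in time: %w", address, err)
    }
    // if nothing moves for 30 seconds in either direction, reads/writes start failing
    deadline := time.Now().Add(30 * time.Second)
    clientConn.SetDeadline(deadline)
    backendConn.SetDeadline(deadline)
    return backendConn, nil
}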
6. No Metrics/Monitoring
We have no idea:
- How many requests per second we’re handling
- What the response times are
- Which servers are getting the most traffic
- When errors happen
7. No Graceful Shutdown
Try stopping the load balancer. It just dies. Any in-flight requests get dropped.
We need to implement graceful shutdown that finishes current requests before dying.
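The usual pattern, sketched below (it assumes a sync.WaitGroup that handleConnection increments and decrements, which our current code doesn’t have): catch SIGINT/SIGTERM, stop accepting, then drain in-flight connections.

// sketch: stop accepting new connections, then drain in-flight ones
func waitForShutdown(listener net.Listener, active *sync.WaitGroup) {
    stop := make(chan os.Signal, 1)
    signal.Notify(stop, os.Interrupt, syscall.SIGTERM)

    <-stop // blocks until Ctrl+C or SIGTERM
    log.Printf("[LB] Shutting down, not accepting new connections")
    listener.Close() // makes Accept() return an error and end the loop

    active.Wait() // wait for every handleConnection goroutine to finish
    log.Printf("[LB] All in-flight connections finished, bye")
}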
But Still…
Despite all these limitations, what we built is actually quite good. It’s the foundation. Everything else is just making it better.
Testing Edge Cases
I want you guys to play around with this and test different cases to see what works and where our load balancer currently falls short. For example:
Testing High Concurrent Load
Let’s slam the load balancer with many simultaneous requests. First, install a load testing tool:
go install github.com/rakyll/hey@latest
Now test:
# send 1000 requests with 100 concurrent connections
hey -n 1000 -c 100 http://localhost:8080/
Output:
Summary:
Total: 0.5234 secs
Requests/sec: 1910.56
Status code distribution:
[200] 1000 responses
Watch your load balancer terminal. You’ll see it handle all of these connections, distributing them across all three servers. Neat, right? :)
Goroutine Model Visualization
I couldn’t get it quite the way I wanted, but here’s a decent try (IMO) at visualizing how goroutines handle concurrent connections:

Each goroutine starts with ~2KB of stack and grows it as needed, so the load balancer can handle 10,000+ concurrent connections easily.
What Now?
Pat yourself on the back. Seriously. Most developers never do this. You’re getting ahead in the rat race, yay go you
Anyways, in the next part, we’re tackling our next big problem: health checking. Our load balancer will finally stop sending traffic to dead servers.
After that, we’ll improve our algorithms (Part 6), add session persistence (Part 7), implement Layer 7 features (Part 8), and keep building until we have something actually worth our time. At least that’s the high-level plan I have for now; it might change, who knows?
But for now, enjoy your working RR load balancer.
Feel free to hit me up on X / Twitter if you have questions, found bugs, or just want to show off your implementation. I read everything and I’d genuinely love to see what you build.
See you in the next part, where we focus more on making this shit reliable :)