The Only Guide You'd Ever Need for Load Balancers - 4
Building Our First Load Balancer - Round Robin Implementation
Alright, if you made it through part 3, you now understand the proxy pattern, you know what sockets are, yada yada. We’ve been talking theory for three parts now. Things are easier said than done.
It’s time to actually write some mf code.
In this part, we’re going to build a real, working load balancer. Not pseudo shit. Actual code that you can run on your machine right now. We’ll start simple and gradually add complexity, because that’s how you actually learn this stuff IMO.
By the end of this post, you’ll have a load balancer that:
- Accepts real HTTP connections
- Distributes traffic across multiple backend servers using Round Robin
- Forwards requests and responses correctly
- Handles multiple concurrent connections
- Actually works (I promise)
Let’s go.
Setup
Why Go?
I’m using Go for this series because:
- It’s built for this - Go was literally designed for network programming and concurrent systems
- Goroutines are magic - Lightweight concurrency that makes handling thousands of connections trivial
- Shit is fast - Do I need to say more?
- And because it’s cool :)
But Sanchit, I don’t know Go. And that’s fine. I’ll explain everything as we go along with Go. Go is stupid simple to learn TBH. And MB for the amount of go’s in this sentence
Anyway, that being said, just have Go installed and set up. This is not a series where I teach you how to set up a programming language.
What You Need
Nothing, but if you’re following along, just ensure you have Go installed and set up. Oh, and you’ve read the previous parts.
Our Pilot :)
We’ll start off simple. Just create a directory and initialize the Go module:
mkdir load-balancer && cd load-balancer
go mod init loadbalancer
Server Pool
Before we write any network code, we need to figure out how to store our backend servers. This is called the “server pool.”
What Do We Need to Track?
For each backend server, we need:
- Host (IP address or hostname)
- Port
- Status (is it up or down? We’ll add this later though)
For the pool as a whole, we need:
- List of servers
- Current index (for Round Robin)
- Thread safety (so multiple goroutines don’t mess up the index)
The Basic Structure
type Backend struct {
Host string
Port int
}
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
Simple, right? We have a list of backends and an index pointing to the current one. The sync.Mutex will handle thread safety (or “goroutine safety”).
Creating a New Pool
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
Adding Servers
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
Getting the Next Server (Round Robin)
This is where the magic happens:
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
// current backend
backend := &p.backends[p.current]
// next (with wraparound)
p.current = (p.current + 1) % len(p.backends)
return backend
}
Let’s visualize what’s happening:

The % len(p.backends) is what makes it wrap around. When we hit the end, we go back to the beginning.
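If the math feels abstract, here’s a tiny throwaway snippet (hypothetical, not part of our load balancer) that shows which backend each of six requests would land on:

package main

import "fmt"

func main() {
    backends := []string{"127.0.0.1:8081", "127.0.0.1:8082", "127.0.0.1:8083"}
    current := 0

    for request := 1; request <= 6; request++ {
        // pick the current backend, then advance the index with wraparound
        picked := backends[current]
        current = (current + 1) % len(backends)
        fmt.Printf("request %d -> %s\n", request, picked)
    }
    // prints 8081, 8082, 8083, 8081, 8082, 8083
}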
Thread Safety Problem
But wait. What if two goroutines call GetNextBackend() at the exact same time?

This is a race condition. Multiple goroutines accessing shared data simultaneously is a mess.
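Don’t take my word for it. If you temporarily strip the Lock/Unlock calls out of GetNextBackend, a quick test like this (hypothetical test file) run with Go’s race detector will light up:

// pool_race_test.go (hypothetical) -- run with: go test -race
package main

import (
    "sync"
    "testing"
)

func TestGetNextBackendRace(t *testing.T) {
    pool := NewServerPool()
    pool.AddBackend("127.0.0.1", 8081)
    pool.AddBackend("127.0.0.1", 8082)

    var wg sync.WaitGroup
    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            pool.GetNextBackend() // many goroutines touching the same index
        }()
    }
    wg.Wait()
}

With the mutex in place the detector stays quiet; without it you get a DATA RACE warning pointing straight at p.current.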
The Fix: Mutex
Go has a simple solution, a mutex:
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock() // lock it as soon as one goroutine's here
defer p.mux.Unlock() // unlock when function returns
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
The p.mux.Lock() ensures only one goroutine can execute this code at a time. The defer p.mux.Unlock() ensures the mutex gets unlocked even if we return early or panic. Problem solved.
The Complete Server Pool
Here’s our full server pool implementation:
package main
import (
"log"
"sync"
)
type Backend struct {
Host string
Port int
}
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
func (p *ServerPool) Size() int {
return len(p.backends)
}
That’s it. Simple & clean.
Building the Load Balancer
Now for the main event. Let’s build the actual load balancer.
The Architecture (Reminder)

The load balancer needs to:
- Listen for client connections
- Accept client connections
- Pick a backend server
- Connect to backend server
- Forward client request to backend
- Forward backend response to client
- Close both connections
The Main Structure
type LoadBalancer struct {
host string
port int
serverPool *ServerPool
}
func NewLoadBalancer(host string, port int, pool *ServerPool) *LoadBalancer {
return &LoadBalancer{
host: host,
port: port,
serverPool: pool,
}
}
Starting the Load Balancer
func (lb *LoadBalancer) Start() error {
address := fmt.Sprintf("%s:%d", lb.host, lb.port)
// tcp listener
listener, err := net.Listen("tcp", address)
if err != nil {
return fmt.Errorf("failed to start listener: %v", err)
}
defer listener.Close()
log.Printf("[LB] Load Balancer started on %s", address)
log.Printf("[LB] Backend servers: %d", lb.serverPool.Size())
// we accept connections here...forever
for {
conn, err := listener.Accept()
if err != nil {
log.Printf("[LB] Failed to accept connection: %v", err)
continue
}
// handlin this connection in a new goroutine
go lb.handleConnection(conn)
}
}
Let’s break this down:
net.Listen("tcp", address)creates a TCP listener on the specified address. This is like creating and binding a socket.listener.Accept()blocks until a client connects. When one does, we get a net.Conn for that specific client.go lb.handleConnection(conn)handle each client in a separate goroutine. Goroutines are so lightweight. You can have thousands of them running simultaneously with very little overhead.
Handling a Client Connection
Now the real work…forwarding traffic:
func (lb *LoadBalancer) handleConnection(clientConn net.Conn) {
defer clientConn.Close()
backend := lb.serverPool.GetNextBackend()
if backend == nil {
log.Printf("[LB] No backend servers available!")
return
}
backendAddress := fmt.Sprintf("%s:%d", backend.Host, backend.Port)
log.Printf("[LB] Forwarding %s → %s", clientConn.RemoteAddr(), backendAddress)
// connect to the backend server
backendConn, err := net.Dial("tcp", backendAddress)
if err != nil {
log.Printf("[LB] Failed to connect to backend %s: %v", backendAddress, err)
return
}
defer backendConn.Close()
// now we have two connections:
// clientConn -> connected to the client
// backendConn -> connected to the backend server
// forward data in both directions
lb.forwardTraffic(clientConn, backendConn)
log.Printf("[LB] Closed connection from %s", clientConn.RemoteAddr())
}
The flow:
- Get next backend server from the pool (Round Robin)
- Connect to that backend server using net.Dial
- Forward data between client and backend
- Defer ensures both connections close when done
Forwarding Data Bidirectionally
This is the trickiest part. We need to forward data in BOTH directions:
- Client → Load Balancer → Backend
- Backend → Load Balancer → Client
We need both directions simultaneously. Solution? More goroutines!

func (lb *LoadBalancer) forwardTraffic(client, backend net.Conn) {
// WaitGroup to wait for both goroutines
var wg sync.WaitGroup
wg.Add(2)
// copy (client to backend)
go func() {
defer wg.Done()
io.Copy(backend, client)
}()
// copy (backend to client)
go func() {
defer wg.Done()
io.Copy(client, backend)
}()
wg.Wait()
}
There’s actually a much better approach for this (TCP half-close, if anyone’s interested), but I won’t be implementing that yet. We need to get our logic right first; optimization can come later when needed.
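For the curious, here’s roughly what the half-close version could look like. This is a sketch only, assuming both connections are *net.TCPConn so CloseWrite is available; we’re sticking with the simpler version for now.

// sketch: after copying one direction, half-close the destination's write side
func forwardWithHalfClose(client, backend net.Conn) {
    var wg sync.WaitGroup
    wg.Add(2)

    copyAndCloseWrite := func(dst, src net.Conn) {
        defer wg.Done()
        io.Copy(dst, src)
        // signal "I'm done writing" without tearing down the read side
        if tcp, ok := dst.(*net.TCPConn); ok {
            tcp.CloseWrite()
        }
    }

    go copyAndCloseWrite(backend, client) // client -> backend
    go copyAndCloseWrite(client, backend) // backend -> client
    wg.Wait()
}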
Now continuing with our actual code, Go’s io.Copy does all the heavy lifting:
- Reads from source
- Writes to destination
- Handles buffering
- Returns when connection closes
And sync.WaitGroup ensures we wait for both goroutines to finish before closing connections.
So clean, I love Go <3
The Complete Load Balancer Code
Here’s the full main.go (explained a bit as well in the comments):
package main
import (
"fmt"
"io"
"log"
"net"
"sync"
)
// this is ONE backend server
type Backend struct {
Host string
Port int
}
// manages the POOL of backend servers
type ServerPool struct {
backends []Backend
current int
mux sync.Mutex
}
// self explanatory...creates a new server pool
func NewServerPool() *ServerPool {
return &ServerPool{
backends: make([]Backend, 0),
current: 0,
}
}
// adds a backend server to the pool
func (p *ServerPool) AddBackend(host string, port int) {
backend := Backend{
Host: host,
Port: port,
}
p.backends = append(p.backends, backend)
log.Printf("[POOL] Added server: %s:%d", host, port)
}
// returns the next backend (using RR)
func (p *ServerPool) GetNextBackend() *Backend {
p.mux.Lock()
defer p.mux.Unlock()
if len(p.backends) == 0 {
return nil
}
backend := &p.backends[p.current]
p.current = (p.current + 1) % len(p.backends)
return backend
}
// total no. of backends
func (p *ServerPool) Size() int {
return len(p.backends)
}
// main LB
type LoadBalancer struct {
host string
port int
serverPool *ServerPool
}
// creates a new LB
func NewLoadBalancer(host string, port int, pool *ServerPool) *LoadBalancer {
return &LoadBalancer{
host: host,
port: port,
serverPool: pool,
}
}
func (lb *LoadBalancer) Start() error {
address := fmt.Sprintf("%s:%d", lb.host, lb.port)
listener, err := net.Listen("tcp", address)
if err != nil {
return fmt.Errorf("failed to start listener: %v", err)
}
defer listener.Close()
log.Printf("[LB] Load Balancer started on %s", address)
log.Printf("[LB] Backend servers: %d", lb.serverPool.Size())
for {
conn, err := listener.Accept()
if err != nil {
log.Printf("[LB] Failed to accept connection: %v", err)
continue
}
log.Printf("[LB] New connection from %s", conn.RemoteAddr())
go lb.handleConnection(conn)
}
}
// handles a SINGLE client conn
func (lb *LoadBalancer) handleConnection(clientConn net.Conn) {
defer clientConn.Close()
backend := lb.serverPool.GetNextBackend()
if backend == nil {
log.Printf("[LB] No backend servers available!")
return
}
backendAddress := fmt.Sprintf("%s:%d", backend.Host, backend.Port)
log.Printf("[LB] Forwarding %s → %s", clientConn.RemoteAddr(), backendAddress)
backendConn, err := net.Dial("tcp", backendAddress)
if err != nil {
log.Printf("[LB] Failed to connect to backend %s: %v", backendAddress, err)
return
}
defer backendConn.Close()
lb.forwardTraffic(clientConn, backendConn)
log.Printf("[LB] Closed connection from %s", clientConn.RemoteAddr())
}
// forwards traffic b/w client & backend
func (lb *LoadBalancer) forwardTraffic(client, backend net.Conn) {
var wg sync.WaitGroup
wg.Add(2)
// client to backend
go func() {
defer wg.Done()
io.Copy(backend, client)
}()
// backend to client
go func() {
defer wg.Done()
io.Copy(client, backend)
}()
wg.Wait()
}
func main() {
pool := NewServerPool()
pool.AddBackend("127.0.0.1", 8081)
pool.AddBackend("127.0.0.1", 8082)
pool.AddBackend("127.0.0.1", 8083)
lb := NewLoadBalancer("0.0.0.0", 8080, pool)
if err := lb.Start(); err != nil {
log.Fatalf("Load balancer failed: %v", err)
}
}
Less than 150 lines (if you remove my comments :p). Simple and clean, and it works.
Creating Test Backend Servers
We need actual servers to test our load balancer. Let’s create simple HTTP servers that tell us which server handled the request.
Create backend/server.go:
package main
import (
"fmt"
"log"
"net/http"
"os"
"strconv"
)
func main() {
if len(os.Args) != 2 {
fmt.Println("Usage: go run server.go <port>")
os.Exit(1)
}
port, err := strconv.Atoi(os.Args[1])
if err != nil {
log.Fatalf("Invalid port: %v", err)
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
log.Printf("[SERVER %d] Handled request for %s", port, r.URL.Path)
response := fmt.Sprintf(`
<!DOCTYPE html>
<html>
<head>
<title>Backend Server %d</title>
</head>
<body>
<h1>Backend Server %d</h1>
<p>Request was handled by server on port %d</p>
<p>Path: %s</p>
<p>Method: %s</p>
</body>
</html>
`, port, port, port, r.URL.Path, r.Method)
fmt.Fprint(w, response)
})
address := fmt.Sprintf(":%d", port)
log.Printf("[SERVER] Backend server started on port %d", port)
if err := http.ListenAndServe(address, nil); err != nil {
log.Fatalf("Server failed: %v", err)
}
}
This creates a simple HTTP server that tells you which port handled the request.
Testing Our Load Balancer
Step 1: Start the Backend Servers
Open 3 terminals, cd into the backend/ directory, and run:
# Terminal 1
go run server.go 8081
# Terminal 2
go run server.go 8082
# Terminal 3
go run server.go 8083
You should see:
[SERVER] Backend server started on port 8081
[SERVER] Backend server started on port 8082
[SERVER] Backend server started on port 8083
Step 2: Start the Load Balancer
In a 4th terminal:
go run main.go
You should see:
[POOL] Added server: 127.0.0.1:8081
[POOL] Added server: 127.0.0.1:8082
[POOL] Added server: 127.0.0.1:8083
[LB] Load Balancer started on 0.0.0.0:8080
[LB] Backend servers: 3
Step 3: Make Requests
Now open your browser and go to http://localhost:8080
First request:
- Browser shows: “Backend Server 8081”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8081
- Server 8081 terminal: [SERVER 8081] Handled request for /
Refresh the page (second request):
- Browser shows: “Backend Server 8082”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8082
- Server 8082 terminal: [SERVER 8082] Handled request for /
Refresh again (third request):
- Browser shows: “Backend Server 8083”
- Load balancer terminal: [LB] Forwarding ... → 127.0.0.1:8083
- Server 8083 terminal: [SERVER 8083] Handled request for /
Refresh again (fourth request):
- Browser shows: “Backend Server 8081” (back to the first server!)

Ez, and it works. W for us.
Testing with curl
You can also test with curl to see it more clearly:
for i in {1..6}; do
echo "Request $i:"
curl -s http://localhost:8080 | grep "Backend Server"
done
Output:
Request 1:
<h1>Backend Server 8081</h1>
Request 2:
<h1>Backend Server 8082</h1>
Request 3:
<h1>Backend Server 8083</h1>
Request 4:
<h1>Backend Server 8081</h1>
Request 5:
<h1>Backend Server 8082</h1>
Request 6:
<h1>Backend Server 8083</h1>
The Architecture We Built

The Limitations (What’s Still Broken)
Our load balancer works, but it’s far from done…
1. No Health Checking
Kill one of your backend servers:
# kill server 8082
^C
[SERVER] Server shutting down...
Now make requests to the load balancer. Every third request fails.

33% of requests are failing…
The load balancer has no idea Server 8082 is down. It keeps trying to send requests to it, and those requests fail. We need active health checking to detect dead servers and remove them from rotation.
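Just to give you a feel, a bare-bones health check can be as simple as this sketch (hypothetical helpers, not the implementation we’ll actually build in the next part): try a TCP dial with a timeout and treat a failed dial as “down.”

// sketch: naive TCP health check for a single backend
func isBackendAlive(b *Backend) bool {
    address := fmt.Sprintf("%s:%d", b.Host, b.Port)
    conn, err := net.DialTimeout("tcp", address, 2*time.Second)
    if err != nil {
        return false // couldn't connect, consider it down
    }
    conn.Close()
    return true
}

// sketch: run the check for every backend on a timer
func (p *ServerPool) healthCheckLoop() {
    for range time.Tick(10 * time.Second) {
        for i := range p.backends {
            alive := isBackendAlive(&p.backends[i])
            log.Printf("[HEALTH] %s:%d alive=%v", p.backends[i].Host, p.backends[i].Port, alive)
        }
    }
}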
2. No Smart Algorithm Selection
All servers are treated equally. If you had:
- Server 1: Powerful machine (16 cores, 32GB RAM)
- Server 2: Medium machine (4 cores, 8GB RAM)
- Server 3: Weak machine (2 cores, 4GB RAM)
They’d still get equal traffic. Server 3 would be overwhelmed while Server 1 is yawning (like you). We need weighted algorithms that account for server capacity.
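To make the idea concrete, here’s a sketch of what a weighted Round Robin could look like (it assumes a Weight field we haven’t added; a weight of 3 means “picked 3 times per cycle”):

// sketch: weighted round robin by expanding weights into "slots"
type WeightedBackend struct {
    Backend
    Weight int
}

func nextWeighted(backends []WeightedBackend, counter int) *WeightedBackend {
    total := 0
    for _, b := range backends {
        total += b.Weight
    }
    if total == 0 {
        return nil
    }
    slot := counter % total // which slot of the cycle this request falls into
    for i := range backends {
        if slot < backends[i].Weight {
            return &backends[i]
        }
        slot -= backends[i].Weight
    }
    return nil // unreachable when weights are positive
}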
3. No Connection Pooling
Every request creates a new connection to the backend. There’s always a 3-way TCP handshake; no connection is reused. Creating connections is expensive (TCP handshake, plus a TLS handshake if HTTPS). We’re wasting time.
What we need: Connection pooling to reuse backend connections.
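Rough shape of the fix, as a sketch (a tiny channel-based pool; a real one also needs idle timeouts, broken-connection detection, size limits, and so on):

// sketch: naive connection pool for one backend
type ConnPool struct {
    address string
    idle    chan net.Conn // buffered channel of reusable connections
}

func (cp *ConnPool) Get() (net.Conn, error) {
    select {
    case conn := <-cp.idle:
        return conn, nil // reuse an idle connection, no new handshake
    default:
        return net.Dial("tcp", cp.address) // nothing idle, dial a fresh one
    }
}

func (cp *ConnPool) Put(conn net.Conn) {
    select {
    case cp.idle <- conn:
        // parked for reuse
    default:
        conn.Close() // pool is full, just close it
    }
}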
4. No Protocol Awareness
Our load balancer is just forwarding bytes. It doesn’t understand HTTP. That means it can’t:
- Route based on URL path (/api vs /static)
- Add custom headers
- Terminate SSL/TLS
- Parse cookies for session persistence
- Compress responses
We need layer 7 (HTTP) awareness.
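Just for a taste of Layer 7, Go’s standard library ships a reverse proxy in net/http/httputil. A sketch (not our implementation, ours stays Layer 4 for now):

package main

import (
    "net/http"
    "net/http/httputil"
    "net/url"
    "strings"
)

func main() {
    target, _ := url.Parse("http://127.0.0.1:8081")
    proxy := httputil.NewSingleHostReverseProxy(target)

    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // because we can see the HTTP request, we could route /api elsewhere,
        // add headers, read cookies, and so on
        if strings.HasPrefix(r.URL.Path, "/api") {
            w.Header().Set("X-Handled-By", "our-lb")
        }
        proxy.ServeHTTP(w, r)
    })
    http.ListenAndServe(":8080", nil)
}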
5. No Timeout Handling
If a backend server hangs and never responds, our load balancer will wait forever. The client connection stays open indefinitely.
Client → Load Balancer → Backend (no response)
↓
(waiting forever...)
What we need: Proper timeout configuration with context.Context.
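The rough shape of that fix, sketched as a helper we could drop into handleConnection (the 5s/30s numbers are placeholders, and a fuller version would thread a context.Context through):

// sketch: dial with a timeout and put a deadline on both connections
func dialBackendWithTimeout(clientConn net.Conn, address string) (net.Conn, error) {
    backendConn, err := net.DialTimeout("tcp", address, 5*time.Second)
    if err != nil {
        return nil, fmt.Errorf("backend %s didn't answer in time: %w", address, err)
    }
    // if nothing moves for 30 seconds in either direction, reads/writes start failing
    deadline := time.Now().Add(30 * time.Second)
    clientConn.SetDeadline(deadline)
    backendConn.SetDeadline(deadline)
    return backendConn, nil
}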
6. No Metrics/Monitoring
We have no idea:
- How many requests per second we’re handling
- What the response times are
- Which servers are getting the most traffic
- When errors happen
7. No Graceful Shutdown
Try stopping the load balancer. It just dies. Any in-flight requests get dropped.
We need to implement graceful shutdown that finishes current requests before dying.
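The usual pattern, sketched below (it assumes a sync.WaitGroup that handleConnection increments and decrements, which our current code doesn’t have): catch SIGINT/SIGTERM, stop accepting, then drain in-flight connections.

// sketch: stop accepting new connections, then drain in-flight ones
func waitForShutdown(listener net.Listener, active *sync.WaitGroup) {
    stop := make(chan os.Signal, 1)
    signal.Notify(stop, os.Interrupt, syscall.SIGTERM)

    <-stop // blocks until Ctrl+C or SIGTERM
    log.Printf("[LB] Shutting down, not accepting new connections")
    listener.Close() // makes Accept() return an error and end the loop

    active.Wait() // wait for every handleConnection goroutine to finish
    log.Printf("[LB] All in-flight connections finished, bye")
}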
But Still…
Despite all these limitations, what we built is actually quite good. It’s the foundation. Everything else is just making it better.
Testing Edge Cases
I want you guys to play around with this and test different cases to see what works and where our load balancer currently falls short. For example:
Testing High Concurrent Load
Let’s slam the load balancer with many simultaneous requests. First, install a load testing tool:
go install github.com/rakyll/hey@latest
Now test:
# send 1000 requests with 100 concurrent connections
hey -n 1000 -c 100 http://localhost:8080/
Output:
Summary:
Total: 0.5234 secs
Requests/sec: 1910.56
Status code distribution:
[200] 1000 responses
Watch your load balancer terminal. You’ll see it handle all of these connections, distributing them across all three servers. Neat, right? :)
Goroutine Model Visualization
I couldn’t get it quite the way I wanted, but here’s a decent try (IMO) at visualizing how goroutines handle concurrent connections:

Each goroutine starts with ~2KB of stack and grows it as needed, so the load balancer can handle 10,000+ concurrent connections easily.
What Now?
Pat yourself on the back. Seriously. Most developers never do this. You’re getting ahead in the rat race, yay go you
Anyways, in the next part, we’re tackling our next big problem: health checking. Our load balancer will finally stop sending traffic to dead servers.
After that, we’ll improve our algorithms (Part 6), add session persistence (Part 7), implement Layer 7 features (Part 8), and keep building until we have something actually worth our time. At least that’s the high-level plan I have for now; it might change, who knows?
But for now, enjoy your working RR load balancer.
Feel free to hit me up on X / Twitter if you have questions, found bugs, or just want to show off your implementation. I read everything and I’d genuinely love to see what you build.
See you in the next part, where we focus more on making this shit reliable :)