How I Built a Load Balancer in Go From Scratch & What I Learned


Introduction
Hi, folks. Have you ever wondered how massive platforms like Netflix, Amazon, or Instagram handle millions of requests every second without crashing?
How do they make sure your requests reach the right server at the right time, even when some servers are down?
The secret is in a crucial piece of infrastructure called the load balancer: a system so essential, yet so invisible, that most users never even know it exists.
Now, what if I told you…
You don’t need a multi-million-dollar data center to understand it.
You don’t need fancy tools or certifications.
You just need Go, a few lines of code, and a desire to learn.
In this article, I’ll guide you through:
✅ How I built a working HTTP load balancer in Go
✅ The types of load balancers: hardware vs. software
✅ How tools like NGINX and Docker function as load balancers
✅ The difference between load balancers and reverse proxies
✅ And a detailed look at my actual Go code – section by section
By the end of this, you’ll not only understand what a load balancer does—you’ll know how to build one yourself.
“The best way to understand how things work is to build them yourself”
Let’s dive in. 🚀
What is a Load Balancer?
Imagine a busy intersection in a city during rush hour. Cars are lining up from every direction, and chaos is about to happen unless there’s a traffic cop directing vehicles to different lanes and side roads.
That’s exactly what a load balancer does—but instead of cars, it manages incoming HTTP requests, and instead of lanes, it directs them to a pool of backend servers.
At its core, a Load Balancer:
Distributes traffic across multiple servers (also called nodes or replicas)
Prevents overload on any single server
Increases availability by skipping over unhealthy or down servers
Improves response time and system reliability
Types of Load Balancers
Load balancers come in different shapes and sizes—some live in physical hardware racks, while others run on the same machines as your app. But all share one goal: efficiently distribute traffic to multiple backend servers.
Let’s break down the two main categories:
Hardware Load Balancer
These are dedicated physical devices built specifically for handling network traffic at massive scale. You’ll often find them in large enterprise environments, banks, telecoms, and data centers.
Examples: F5 BIG-IP, Citrix NetScaler, Cisco ACE
Key features:
🚀 High throughput and ultra-low latency
🔐 Built-in security, SSL offloading, DDoS protection
🧠 Handles both Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) traffic
🏢 Often installed at the edge of the network
Software Load Balancer
These are programs you can install and configure on general-purpose servers or cloud instances. They are much more flexible and accessible, especially for startups, SaaS apps, and DevOps teams.
Examples: NGINX, HAProxy, Envoy, Traefik, etc.
Cloud-native options: AWS Elastic Load Balancer (ELB), Google Cloud Load Balancer.
Key features:
💻 Runs on regular servers (virtual or bare metal)
🔧 Highly configurable with scripting and APIs
📦 Easily integrates with container platforms like Docker and Kubernetes
💡 Ideal for microservices, CI/CD, and autoscaling workloads.
Today’s Focus: Software Load Balancers
Now that you know the different types of load balancers, let’s zoom in on the one that’s closest to every developer’s heart—software load balancers.
Why?
Because they’re easy to experiment with, cheap to run, and incredibly powerful when you combine them with modern tools like Docker, Kubernetes, and cloud platforms like AWS.
In this article, we’re going to build our own software load balancer using Go—from scratch.
No black boxes. No magic. Just pure code and concepts.
Along the way, we’ll also explore:
🔄 How NGINX acts as a smart load balancer
🐳 How Docker replicas distribute traffic using service discovery
🔁 The difference between load balancers and reverse proxies
🧵 And how to manage connection pools, handle health checks, and route traffic like a pro.
So roll up your sleeves—we're not just reading docs today. We're writing the kind of code that keeps real-world systems alive.
How NGINX and Docker Replicas Work as Load Balancers
Before we dive deeper into our Go implementation, let’s understand how two of the most popular developer tools—NGINX and Docker—can function as load balancers in real-world scenarios.
NGINX – More Than Just a Web Server
Most developers know NGINX as a blazing-fast web server. But it’s also a powerful software load balancer that can handle massive traffic at scale with very little configuration.
With just a few lines in a config file, you can turn NGINX into:
A Reverse Proxy: Passes client requests to backend services (e.g., Node.js, PHP, Python)
A Load Balancer: Distributes requests across multiple backend servers.
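For example, a few lines like these turn NGINX into a round-robin load balancer that also reverse-proxies to your backends (a hypothetical snippet; your upstream names and ports will differ):

http {
    # Pool of backend servers; NGINX round-robins across them by default
    upstream backend_pool {
        server localhost:8000;
        server localhost:8001;
    }

    server {
        listen 8080;

        location / {
            # Reverse-proxy every request to one of the upstream servers
            proxy_pass http://backend_pool;
        }
    }
}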
Docker Replicas – Load Balancing With Minimal Effort
Did you know Docker itself can act as a load balancer? You don’t need any external tools: just a docker service with multiple replicas, and Docker takes care of routing for you.
Here’s how it works:
✅ Docker Swarm Mode
When you deploy a service with multiple replicas in Docker Swarm:
docker service create --name myapp --replicas 3 -p 8080:80 myapp-image
Docker automatically:
Registers each replica in a built-in service registry
Routes incoming traffic on port 8080 to any of the healthy replicas
Balances load using round-robin by default
✅ Built-in DNS and Health Checking
Docker uses an internal DNS system and can perform basic health checks to ensure traffic doesn’t go to failed containers.
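Need more capacity later? Scaling is a one-liner, and Docker keeps spreading traffic across whatever replicas exist (assuming the myapp service from above):

docker service scale myapp=5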
Load Balancer vs Reverse Proxy — What’s the Difference?
A load balancer and a reverse proxy both manage network traffic, but they have distinct roles.
Reverse Proxy:
Function: Acts as an intermediary between clients and servers, forwarding client requests to the appropriate backend server.
Load Balancer:
Function: Distributes incoming traffic across multiple servers to ensure no single server is overwhelmed.
Let’s Start Implementing the Load Balancer
Understanding Your Load Balancer Components
Core Components
Server: Represents a backend server with health check capabilities and its own connection pool.
ConnectionPool: Manages HTTP clients for reusability and timeout management.
LoadBalancer: The core manager that routes requests and checks server health.
Server Struct
The Server struct represents an individual backend server. It includes:
type Server struct {
    name                string
    url                 string
    healthy             bool
    healthCheckEndPoint string
    cp                  ConnectionPool
}
name: A unique identifier for the server.
url: The server's URL.
healthy: A boolean indicating the server's health status.
healthCheckEndPoint: The endpoint used to check the server's health.
cp: A ConnectionPool instance for managing HTTP connections.
LoadBalancer Struct
The LoadBalancer struct is responsible for managing multiple servers and routing requests. It includes:
type LoadBalancer struct {
    servers []*Server
    idx     int
    mu      sync.Mutex
}
servers: A slice of pointers to Server instances.
idx: An index to track the next server for request forwarding.
mu: A mutex for synchronizing access to shared resources.
ConnectionPool Struct
The ConnectionPool struct manages reusable HTTP clients for each server. It includes:
type ConnectionPool struct {
    connections map[string][]*http.Client
    *Opts
}
connections: A map of server names to slices of HTTP clients.
Opts: A pointer to an embedded Opts struct containing configuration options like maxConnection and timeout.
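The Opts struct itself isn't listed above, but from how it's used in main later, it looks something like this (a minimal sketch):

type Opts struct {
    maxConnection int           // maximum clients kept per server
    timeout       time.Duration // per-request timeout for each http.Client
}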
Implementing the Load Balancer
Connection Management
The ConnectionPool struct provides methods to get and push HTTP clients:
func (cp *ConnectionPool) Get(server string) *http.Client {
    // Reuse a pooled client if one is available (pop from the end).
    // Note: the map itself is not synchronized here.
    if clients, ok := cp.connections[server]; ok && len(clients) > 0 {
        client := clients[len(clients)-1]
        clients = clients[:len(clients)-1]
        cp.connections[server] = clients
        return client
    }
    // Otherwise create a fresh client with the configured timeout
    return &http.Client{
        Timeout: cp.timeout,
    }
}

func (cp *ConnectionPool) Push(server string, client *http.Client) error {
    // Refuse to grow the pool past its limit; the client is simply dropped
    if len(cp.connections[server]) > cp.maxConnection {
        return fmt.Errorf("connection pool limit exceeded for server '%s'", server)
    }
    cp.connections[server] = append(cp.connections[server], client)
    return nil
}
Get(server string) *http.Client: Retrieves an available client from the pool or creates a new one if none are available.
Push(server string, client *http.Client) error: Returns a client to the pool, ensuring the pool size does not exceed maxConnection.
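To make the checkout/return pattern concrete, here's a hypothetical fragment (assuming cp is the ConnectionPool from main and a backend is listening on port 8000):

// Check a client out of the pool, make a request, then return the client
client := cp.Get("server-1")
res, err := client.Get("http://localhost:8000/healthcheck")
if err == nil {
    res.Body.Close()
}
_ = cp.Push("server-1", client) // a non-nil error just means the pool is full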
Server Health Check
The LoadBalancer struct includes methods to monitor server health:
func (lb *LoadBalancer) hasUnhealthy() bool {
    for _, server := range lb.servers {
        if !server.healthy {
            return true
        }
    }
    return false
}

func (lb *LoadBalancer) HealthCheck() {
    for _, server := range lb.servers {
        res, err := http.Get(server.url + server.healthCheckEndPoint)
        if err != nil || res.StatusCode != http.StatusOK {
            server.healthy = false
            fmt.Printf("Server [%s] is down\n", server.url)
        } else {
            server.healthy = true
            fmt.Printf("Server [%s] is up\n", server.url)
        }
        if res != nil {
            res.Body.Close() // avoid leaking connections from the probe
        }
    }
}

func (lb *LoadBalancer) RunHealthCheck() {
    // Probe every server once every 10 seconds in a background goroutine
    ticker := time.NewTicker(10 * time.Second)
    go func() {
        for {
            <-ticker.C
            lb.HealthCheck()
        }
    }()
}
hasUnhealthy() bool: Checks if there are any unhealthy servers.
HealthCheck(): Performs health checks on all servers and updates their status.
RunHealthCheck(): Periodically runs health checks using a ticker.
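The backends themselves just need to answer the /healthcheck probe. Here's a minimal hypothetical backend you could run twice (once on port 8000, once on 8001) to test against:

package main

import (
    "fmt"
    "net/http"
    "os"
)

func main() {
    port := os.Args[1] // e.g. go run backend.go 8000

    // The endpoint the load balancer probes every 10 seconds
    http.HandleFunc("/healthcheck", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
    })

    // A regular route so we can see which backend served the request
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "hello from backend on port %s\n", port)
    })

    http.ListenAndServe(":"+port, nil)
}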
Request Routing
The LoadBalancer struct routes requests to healthy servers:
func (lb *LoadBalancer) NextServer() (*Server, error) {
    lb.mu.Lock()
    defer lb.mu.Unlock()
    // If any server is down, skip ahead to the first healthy one
    if lb.hasUnhealthy() {
        idx := 0
        for ; idx < len(lb.servers); idx++ {
            if lb.servers[idx].healthy {
                break
            }
        }
        lb.idx = idx
    }
    // The scan ran off the end: no healthy server was found
    if lb.idx == len(lb.servers) {
        lb.idx = 0
        return nil, errors.New("no healthy servers")
    }
    server := lb.servers[lb.idx]
    lb.idx = (lb.idx + 1) % len(lb.servers) // round-robin advance
    return server, nil
}

func (lb *LoadBalancer) ForwardRequest(client *http.Client, serverUrl string, uri string) (*http.Response, error) {
    u, err := url.Parse(serverUrl)
    if err != nil {
        return nil, err
    }
    // Note: resolving via Path only preserves the path, not query strings
    fullUrl := u.ResolveReference(&url.URL{Path: uri})
    res, err := client.Get(fullUrl.String())
    if err != nil {
        return nil, err
    }
    return res, nil
}
func (lb *LoadBalancer) ServeHTTP(response http.ResponseWriter, request *http.Request) {
    nextServer, err := lb.NextServer()
    if err != nil {
        http.Error(response, err.Error(), http.StatusServiceUnavailable)
        return
    }
    client := nextServer.cp.Get(nextServer.name)
    res, err := lb.ForwardRequest(client, nextServer.url, request.RequestURI)
    nextServer.cp.Push(nextServer.name, client) // return the client to the pool
    if err != nil {
        http.Error(response, "backend request failed", http.StatusBadGateway)
        return
    }
    defer res.Body.Close()
    fmt.Println("Request forwarded to " + nextServer.name)
    body, err := io.ReadAll(res.Body)
    if err != nil {
        http.Error(response, "failed to read backend response", http.StatusBadGateway)
        return
    }
    response.WriteHeader(res.StatusCode) // propagate the backend's status code
    if _, err = response.Write(body); err != nil {
        fmt.Println("failed to write response:", err)
    }
}
NextServer() (*Server, error): Selects the next healthy server for request forwarding.
ForwardRequest(client *http.Client, serverUrl string, uri string) (*http.Response, error): Forwards a request to the specified server and returns the response.
ServeHTTP(response http.ResponseWriter, request *http.Request): Handles incoming HTTP requests, forwards them to the appropriate server, and writes the response back to the client.
Main Function
The main function sets up the load balancer and starts the HTTP server:
func main() {
    connectionOpts := Opts{
        maxConnection: 10,
        timeout:       60 * time.Second,
    }
    connection := ConnectionPool{
        connections: make(map[string][]*http.Client),
        Opts:        &connectionOpts,
    }
    loadBalancer := LoadBalancer{
        servers: []*Server{
            {
                name:                "server-1 port : 8000",
                url:                 "http://localhost:8000",
                healthCheckEndPoint: "/healthcheck",
                cp:                  connection,
            },
            {
                name:                "server-2 port : 8001",
                url:                 "http://localhost:8001",
                healthCheckEndPoint: "/healthcheck",
                cp:                  connection,
            },
        },
    }
    loadBalancer.RunHealthCheck()
    fmt.Println("Load balancer started @ port 8080")
    if err := http.ListenAndServe(":8080", &loadBalancer); err != nil {
        fmt.Println("Load balancer failed:", err)
    }
}
1. Define connection options with maxConnection and timeout.
2. Initialize a ConnectionPool with the defined options.
3. Create a LoadBalancer with a list of Server instances.
4. Start periodic health checks with RunHealthCheck().
5. Start the HTTP server on port 8080 using http.ListenAndServe.
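Assuming the two hypothetical backends from earlier are running on ports 8000 and 8001, you can watch the round-robin in action:

curl http://localhost:8080/   # hello from backend on port 8000
curl http://localhost:8080/   # hello from backend on port 8001
curl http://localhost:8080/   # hello from backend on port 8000 again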
Conclusion
Building a load balancer from scratch in Go is a rewarding experience that helps you understand how traffic is managed across servers. By learning about key parts like server health checks, connection pools, and request routing, you get practical insights into how load balancers work. This hands-on project not only makes the technology clearer but also gives you the skills to create scalable and reliable systems. Whether you're a developer wanting to improve your backend infrastructure or just curious about load balancing, this project teaches valuable lessons in software architecture and system design. As you keep experimenting and improving your setup, you'll be better equipped to handle real-world challenges in distributed computing.
For the complete code of the load balancer project, you can visit the GitHub repository at https://github.com/saravanasai/loadbalancer. Feel free to explore the code and share any updates or improvements you make.