How I Built a Load Balancer in Go From Scratch & What I Learned


Introduction
Hi, folks. Have you ever wondered how massive platforms like Netflix, Amazon, or Instagram handle millions of requests every second without crashing?
How do they make sure your requests reach the right server at the right time, even when some servers are down?
The secret is in a crucial piece of infrastructure called the load balancer: a system so essential, yet so invisible, that most users never even know it exists.
Now, what if I told you…
You don’t need a multi-million-dollar data center to understand it.
You don’t need fancy tools or certifications.
You just need Go, a few lines of code, and a desire to learn.
In this article, I’ll guide you through:
✅ How I built a working HTTP load balancer in Go
✅ The types of load balancers: hardware vs. software
✅ How tools like NGINX and Docker function as load balancers
✅ The difference between load balancers and reverse proxies
✅ And a detailed look at my actual Go code – section by section
By the end of this, you’ll not only understand what a load balancer does—you’ll know how to build one yourself.
“The best way to understand how things work is to build them yourself”
Let’s dive in. 🚀
What is a Load Balancer?
Imagine a busy intersection in a city during rush hour. Cars are lining up from every direction, and chaos is about to happen unless there’s a traffic cop directing vehicles to different lanes and side roads.
That’s exactly what a load balancer does—but instead of cars, it manages incoming HTTP requests, and instead of lanes, it directs them to a pool of backend servers.
At its core, a Load Balancer:
Distributes traffic across multiple servers (also called nodes or replicas)
Prevents overload on any single server
Increases availability by skipping over unhealthy or down servers
Improves response time and system reliability
Types of Load Balancers
Load balancers come in different shapes and sizes—some live in physical hardware racks, while others run on the same machines as your app. But all share one goal: efficiently distribute traffic to multiple backend servers.
Let’s break down the two main categories:
Hardware Load Balancer
These are dedicated physical devices built specifically for handling network traffic at massive scale. You’ll often find them in large enterprise environments, banks, telecoms, and data centers.
Examples: F5 BIG-IP, Citrix NetScaler, Cisco ACE
Key features:
🚀 High throughput and ultra-low latency
🔐 Built-in security, SSL offloading, DDoS protection
🧠 Handles both Layer 4 (TCP/UDP) and Layer 7 (HTTP/HTTPS) traffic
🏢 Often installed at the edge of the network
Software Load Balancer
These are programs you can install and configure on general-purpose servers or cloud instances. They are much more flexible and accessible, especially for startups, SaaS apps, and DevOps teams.
Examples: NGINX, HAProxy, Envoy, Traefik, etc.
Cloud-native options: AWS Elastic Load Balancer (ELB), Google Cloud Load Balancer.
Key features:
💻 Runs on regular servers (virtual or bare metal)
🔧 Highly configurable with scripting and APIs
📦 Easily integrates with container platforms like Docker and Kubernetes
💡 Ideal for microservices, CI/CD, and autoscaling workloads.
Today’s Focus: Software Load Balancers
Now that you know the different types of load balancers, let’s zoom in on the one that’s closest to every developer’s heart—software load balancers.
Why?
Because they’re easy to experiment with, cheap to run, and incredibly powerful when you combine them with modern tools like Docker, Kubernetes, and cloud platforms like AWS.
In this article, we’re going to build our own software load balancer using Go—from scratch.
No black boxes. No magic. Just pure code and concepts.
Along the way, we’ll also explore:
🔄 How NGINX acts as a smart load balancer
🐳 How Docker replicas distribute traffic using service discovery
🔁 The difference between load balancers and reverse proxies
🧵 And how to manage connection pools, handle health checks, and route traffic like a pro.
So roll up your sleeves—we're not just reading docs today. We're writing the kind of code that keeps real-world systems alive.
How NGINX and Docker Replicas Work as Load Balancers
Before we dive deeper into our Go implementation, let’s understand how two of the most popular developer tools—NGINX and Docker—can function as load balancers in real-world scenarios.
NGINX – More Than Just a Web Server
Most developers know NGINX as a blazing-fast web server. But it’s also a powerful software load balancer that can handle massive traffic at scale with very little configuration.
With just a few lines in a config file, you can turn NGINX into:
A Reverse Proxy: Passes client requests to backend services (e.g., Node.js, PHP, Python)
A Load Balancer: Distributes requests across multiple backend servers.
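For example, a few lines like these turn NGINX into a round-robin load balancer that also reverse-proxies to your backends (a hypothetical snippet; your upstream names and ports will differ):

http {
    # Pool of backend servers; NGINX round-robins across them by default
    upstream backend_pool {
        server localhost:8000;
        server localhost:8001;
    }

    server {
        listen 8080;

        location / {
            # Reverse-proxy every request to one of the upstream servers
            proxy_pass http://backend_pool;
        }
    }
}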
Docker Replicas – Load Balancing With Minimal Effort
Did you know Docker itself can act as a load balancer? You don’t need any external tools: just a docker service with multiple replicas, and Docker takes care of routing for you.
Here’s how it works:
✅ Docker Swarm Mode
When you deploy a service with multiple replicas in Docker Swarm:
docker service create --name myapp --replicas 3 -p 8080:80 myapp-image
Docker automatically:
Registers each replica in a built-in service registry
Routes incoming traffic on port 8080 to any of the healthy replicas
Balances load using round-robin by default
✅ Built-in DNS and Health Checking
Docker uses an internal DNS system and can perform basic health checks to ensure traffic doesn’t go to failed containers.
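Need more capacity later? Scaling is a one-liner, and Docker keeps spreading traffic across whatever replicas exist (assuming the myapp service from above):

docker service scale myapp=5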
Load Balancer vs Reverse Proxy — What’s the Difference?
A load balancer and a reverse proxy both manage network traffic, but they have distinct roles.
Reverse Proxy:
Function: Acts as an intermediary between clients and servers, forwarding client requests to the appropriate backend server.
Load Balancer:
Function: Distributes incoming traffic across multiple servers to ensure no single server is overwhelmed.
Let’s Start Implementing the Load Balancer
Understanding Your Load Balancer Components
Core Components
Server: Represents a backend server with health check capabilities and its own connection pool.
ConnectionPool: Manages HTTP clients for reusability and timeout management.
LoadBalancer: The core manager that routes requests and checks server health.
Server Struct
The Server struct represents an individual backend server. It includes:
type Server struct {
    name                string
    url                 string
    healthy             bool
    healthCheckEndPoint string
    cp                  ConnectionPool
}
name: A unique identifier for the server.
url: The server's URL.
healthy: A boolean indicating the server's health status.
healthCheckEndPoint: The endpoint used to check the server's health.
cp: A ConnectionPool instance for managing HTTP connections.
LoadBalancer Struct
The LoadBalancer struct is responsible for managing multiple servers and routing requests. It includes:
type LoadBalancer struct {
    servers []*Server
    idx     int
    mu      sync.Mutex
}
servers: A slice of pointers to Server instances.
idx: An index to track the next server for request forwarding.
mu: A mutex for synchronizing access to shared resources.
ConnectionPool Struct
The ConnectionPool struct manages reusable HTTP clients for each server. It includes:
type ConnectionPool struct {
    connections map[string][]*http.Client
    *Opts
}
connections: A map of server names to slices of HTTP clients.
Opts: A pointer to an embedded Opts struct containing configuration options like maxConnection and timeout.
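The Opts struct itself isn't listed above, but from how it's used in main later, it looks something like this (a minimal sketch):

type Opts struct {
    maxConnection int           // maximum clients kept per server
    timeout       time.Duration // per-request timeout for each http.Client
}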
Implementing the Load Balancer
Connection Management
The ConnectionPool struct provides methods to get and push HTTP clients:
func (cp *ConnectionPool) Get(server string) *http.Client {
    // Reuse a pooled client if one is available (pop from the end).
    // Note: the map itself is not synchronized here.
    if clients, ok := cp.connections[server]; ok && len(clients) > 0 {
        client := clients[len(clients)-1]
        clients = clients[:len(clients)-1]
        cp.connections[server] = clients
        return client
    }
    // Otherwise create a fresh client with the configured timeout
    return &http.Client{
        Timeout: cp.timeout,
    }
}

func (cp *ConnectionPool) Push(server string, client *http.Client) error {
    // Refuse to grow the pool past its limit; the client is simply dropped
    if len(cp.connections[server]) > cp.maxConnection {
        return fmt.Errorf("connection pool limit exceeded for server '%s'", server)
    }
    cp.connections[server] = append(cp.connections[server], client)
    return nil
}
Get(server string) *http.Client: Retrieves an available client from the pool or creates a new one if none are available.
Push(server string, client *http.Client) error: Returns a client to the pool, ensuring the pool size does not exceed maxConnection.
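To make the checkout/return pattern concrete, here's a hypothetical fragment (assuming cp is the ConnectionPool from main and a backend is listening on port 8000):

// Check a client out of the pool, make a request, then return the client
client := cp.Get("server-1")
res, err := client.Get("http://localhost:8000/healthcheck")
if err == nil {
    res.Body.Close()
}
_ = cp.Push("server-1", client) // a non-nil error just means the pool is full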
Server Health Check
The LoadBalancer struct includes methods to monitor server health:
func (lb *LoadBalancer) hasUnhealthy() bool {
    for _, server := range lb.servers {
        if !server.healthy {
            return true
        }
    }
    return false
}

func (lb *LoadBalancer) HealthCheck() {
    for _, server := range lb.servers {
        res, err := http.Get(server.url + server.healthCheckEndPoint)
        if err != nil || res.StatusCode != http.StatusOK {
            server.healthy = false
            fmt.Printf("Server [%s] is down\n", server.url)
        } else {
            server.healthy = true
            fmt.Printf("Server [%s] is up\n", server.url)
        }
        if res != nil {
            res.Body.Close() // avoid leaking connections from the probe
        }
    }
}

func (lb *LoadBalancer) RunHealthCheck() {
    // Probe every server once every 10 seconds in a background goroutine
    ticker := time.NewTicker(10 * time.Second)
    go func() {
        for {
            <-ticker.C
            lb.HealthCheck()
        }
    }()
}
hasUnhealthy() bool: Checks if there are any unhealthy servers.
HealthCheck(): Performs health checks on all servers and updates their status.
RunHealthCheck(): Periodically runs health checks using a ticker.
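The backends themselves just need to answer the /healthcheck probe. Here's a minimal hypothetical backend you could run twice (once on port 8000, once on 8001) to test against:

package main

import (
    "fmt"
    "net/http"
    "os"
)

func main() {
    port := os.Args[1] // e.g. go run backend.go 8000

    // The endpoint the load balancer probes every 10 seconds
    http.HandleFunc("/healthcheck", func(w http.ResponseWriter, r *http.Request) {
        w.WriteHeader(http.StatusOK)
    })

    // A regular route so we can see which backend served the request
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "hello from backend on port %s\n", port)
    })

    http.ListenAndServe(":"+port, nil)
}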
Request Routing
The LoadBalancer struct routes requests to healthy servers:
func (lb *LoadBalancer) NextServer() (*Server, error) {
    lb.mu.Lock()
    defer lb.mu.Unlock()
    // If any server is down, skip ahead to the first healthy one
    if lb.hasUnhealthy() {
        idx := 0
        for ; idx < len(lb.servers); idx++ {
            if lb.servers[idx].healthy {
                break
            }
        }
        lb.idx = idx
    }
    // The scan ran off the end: no healthy server was found
    if lb.idx == len(lb.servers) {
        lb.idx = 0
        return nil, errors.New("no healthy servers")
    }
    server := lb.servers[lb.idx]
    lb.idx = (lb.idx + 1) % len(lb.servers) // round-robin advance
    return server, nil
}

func (lb *LoadBalancer) ForwardRequest(client *http.Client, serverUrl string, uri string) (*http.Response, error) {
    u, err := url.Parse(serverUrl)
    if err != nil {
        return nil, err
    }
    // Note: resolving via Path only preserves the path, not query strings
    fullUrl := u.ResolveReference(&url.URL{Path: uri})
    res, err := client.Get(fullUrl.String())
    if err != nil {
        return nil, err
    }
    return res, nil
}
func (lb *LoadBalancer) ServeHTTP(response http.ResponseWriter, request *http.Request) {
    nextServer, err := lb.NextServer()
    if err != nil {
        http.Error(response, err.Error(), http.StatusServiceUnavailable)
        return
    }
    client := nextServer.cp.Get(nextServer.name)
    res, err := lb.ForwardRequest(client, nextServer.url, request.RequestURI)
    nextServer.cp.Push(nextServer.name, client) // return the client to the pool
    if err != nil {
        http.Error(response, "backend request failed", http.StatusBadGateway)
        return
    }
    defer res.Body.Close()
    fmt.Println("Request forwarded to " + nextServer.name)
    body, err := io.ReadAll(res.Body)
    if err != nil {
        http.Error(response, "failed to read backend response", http.StatusBadGateway)
        return
    }
    response.WriteHeader(res.StatusCode) // propagate the backend's status code
    if _, err = response.Write(body); err != nil {
        fmt.Println("failed to write response:", err)
    }
}
NextServer() (*Server, error): Selects the next healthy server for request forwarding.
ForwardRequest(client *http.Client, serverUrl string, uri string) (*http.Response, error): Forwards a request to the specified server and returns the response.
ServeHTTP(response http.ResponseWriter, request *http.Request): Handles incoming HTTP requests, forwards them to the appropriate server, and writes the response back to the client.
Main Function
The main function sets up the load balancer and starts the HTTP server:
func main() {
    connectionOpts := Opts{
        maxConnection: 10,
        timeout:       60 * time.Second,
    }
    connection := ConnectionPool{
        connections: make(map[string][]*http.Client),
        Opts:        &connectionOpts,
    }
    loadBalancer := LoadBalancer{
        servers: []*Server{
            {
                name:                "server-1 port : 8000",
                url:                 "http://localhost:8000",
                healthCheckEndPoint: "/healthcheck",
                cp:                  connection,
            },
            {
                name:                "server-2 port : 8001",
                url:                 "http://localhost:8001",
                healthCheckEndPoint: "/healthcheck",
                cp:                  connection,
            },
        },
    }
    loadBalancer.RunHealthCheck()
    fmt.Println("Load balancer started @ port 8080")
    if err := http.ListenAndServe(":8080", &loadBalancer); err != nil {
        fmt.Println("Load balancer failed:", err)
    }
}
1. Define connection options with maxConnection and timeout.
2. Initialize a ConnectionPool with the defined options.
3. Create a LoadBalancer with a list of Server instances.
4. Start periodic health checks with RunHealthCheck().
5. Start the HTTP server on port 8080 using http.ListenAndServe.
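Assuming the two hypothetical backends from earlier are running on ports 8000 and 8001, you can watch the round-robin in action:

curl http://localhost:8080/   # hello from backend on port 8000
curl http://localhost:8080/   # hello from backend on port 8001
curl http://localhost:8080/   # hello from backend on port 8000 again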
Conclusion
Building a load balancer from scratch in Go is a rewarding experience that helps you understand how traffic is managed across servers. By learning about key parts like server health checks, connection pools, and request routing, you get practical insights into how load balancers work. This hands-on project not only makes the technology clearer but also gives you the skills to create scalable and reliable systems. Whether you're a developer wanting to improve your backend infrastructure or just curious about load balancing, this project teaches valuable lessons in software architecture and system design. As you keep experimenting and improving your setup, you'll be better equipped to handle real-world challenges in distributed computing.
For the complete code of the load balancer project, you can visit the GitHub repository at https://github.com/saravanasai/loadbalancer. Feel free to explore the code and share any updates or improvements you make.