Horizontally Scaling Socket.IO: Architecting Real-Time Apps for Scale

Varun Kumawat
5 min read

Real-time applications have become fundamental in modern web and mobile apps — think live chats, multiplayer games, collaboration tools, notifications, and more. Socket.IO is one of the most popular libraries for building real-time bidirectional communication between clients and servers using WebSockets and fallback transports.

But what happens when your app grows beyond a single server? How do you horizontally scale Socket.IO to handle thousands or millions of concurrent connections without losing messages or client state?

In this post, we’ll dive deep into the challenges of horizontally scaling Socket.IO and explore practical solutions for building robust, scalable real-time systems.

💡 Note: I haven’t implemented scaling for the Socket.IO server in my Twitter clone yet, but I wanted to share what I’ve learned after thorough research, in case I need it in the future. For more details, you can also refer to the official Socket.IO documentation.

The Challenge of Scaling Socket.IO Horizontally

What Does Horizontal Scaling Mean?

Horizontal scaling in this context means running multiple instances of your Socket.IO server across multiple machines or containers. This is essential for:

  • Increasing capacity to handle more concurrent connections.

  • Improving availability and fault tolerance.

  • Load balancing clients across servers.

Why Is It Hard to Scale Socket.IO?

The core challenge comes down to state and message synchronization. When your clients connect to different Socket.IO instances, how do those instances:

  • Know which client is connected where?

  • Emit events to the correct sockets, even if they are connected to a different server?

  • Share room and namespace membership information?

  • Handle broadcasts and private messaging consistently?

Without synchronization, a message sent from one server instance won't reach clients connected to another.

Understanding Socket.IO’s Architecture

Socket.IO is built on Engine.IO, which uses the WebSocket protocol when available and falls back to HTTP long-polling otherwise. On top of that transport layer, it provides many extra features like:

  • Namespaces (logical channels)

  • Rooms (groups of sockets)

  • Automatic reconnection

  • Event acknowledgments

Each connected client is represented by a socket object on the server. These sockets are stored in memory by the Socket.IO server instance that accepted the connection.

When you run multiple instances, these in-memory stores are isolated per instance — thus the need for inter-server communication.

The Key: A Shared Adapter for Socket.IO Instances

Socket.IO provides an adapter interface responsible for managing rooms and broadcasting events.

By default, the adapter is in-memory and local to a single instance, which is why it doesn't scale out of the box.

The Solution: Use a Pub/Sub Adapter

To scale Socket.IO horizontally, you replace the default adapter with one that uses a central message broker, typically Redis.

  • Each Socket.IO instance subscribes to Redis channels.

  • When one instance emits to a room or broadcasts, it publishes the event to Redis.

  • Other instances receive the event and forward it to their connected clients.

This ensures messages, rooms, and broadcasts are synchronized across all server instances.
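The forwarding described above can be illustrated with a toy in-process broker. This is not real Socket.IO or Redis code — just a sketch of the routing a pub/sub adapter performs, with all names illustrative:

```javascript
// A stand-in for Redis pub/sub: fan a published message out to every subscriber.
class Broker {
  constructor() { this.subscribers = []; }
  subscribe(fn) { this.subscribers.push(fn); }
  publish(msg) { this.subscribers.forEach((fn) => fn(msg)); }
}

// A stand-in for one Socket.IO server instance.
class Instance {
  constructor(broker) {
    this.clients = [];   // clients connected to THIS instance only
    this.received = [];  // what was delivered to those clients
    this.broker = broker;
    broker.subscribe((msg) => this.deliver(msg)); // listen for cross-instance events
  }
  connect(clientId) { this.clients.push(clientId); }
  deliver(msg) {
    // forward the event to every locally connected client
    this.clients.forEach((c) => this.received.push(`${c}:${msg}`));
  }
  broadcast(msg) {
    // instead of looping only over local clients, publish to the broker
    // so ALL instances (including this one) deliver it
    this.broker.publish(msg);
  }
}

const broker = new Broker();
const a = new Instance(broker);
const b = new Instance(broker);
a.connect("alice");
b.connect("bob");
a.broadcast("hello");
// a.received is ["alice:hello"], b.received is ["bob:hello"] —
// the broadcast from instance A reached Bob on instance B.
```

The real Redis adapter does essentially this, plus serialization and room/namespace filtering, over Redis channels instead of an in-process array.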

Step-by-Step: Scaling Socket.IO with Redis Adapter

1. Set up Redis

You need a Redis server accessible to all Socket.IO instances.

2. Install Redis Adapter

npm install @socket.io/redis-adapter

3. Configure Your Socket.IO Server

import { createClient } from "redis";
import { Server } from "socket.io";
import { createAdapter } from "@socket.io/redis-adapter";

const pubClient = createClient({ url: "redis://localhost:6379" });
const subClient = pubClient.duplicate();

await Promise.all([
  pubClient.connect(),
  subClient.connect()
]);

const io = new Server({
  adapter: createAdapter(pubClient, subClient)
});

io.listen(3000);

4. Run Multiple Instances

Run your app on multiple ports or machines:

node server.js --port=3000
node server.js --port=3001

Use a load balancer (e.g., NGINX, AWS ELB) to distribute clients among these instances.
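For the commands above to work, server.js has to read the --port flag itself. A minimal sketch of that parsing (the flag format follows the commands above; the helper name is illustrative):

```javascript
// Parse a --port=NNNN flag from the command-line arguments so each
// instance can bind to its own port, falling back to a default.
function parsePort(argv, fallback = 3000) {
  const arg = argv.find((a) => a.startsWith("--port="));
  const port = arg ? Number(arg.slice("--port=".length)) : fallback;
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid --port value: ${arg}`);
  }
  return port;
}

// Usage in server.js:
// io.listen(parsePort(process.argv.slice(2)));
```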

5. Test Cross-Instance Messaging

Clients connected to different instances will receive broadcasts and room events transparently.

NGINX Load Balancer with Sticky Sessions

To distribute incoming WebSocket connections across multiple Socket.IO instances, we’ll use NGINX as a reverse proxy and load balancer.

Step 1: Install NGINX

Install it via your package manager or using a container.

sudo apt install nginx

Step 2: Enable Sticky Sessions (IP Hashing)

Sticky sessions are required whenever the HTTP long-polling transport is enabled, because every request in a polling session must reach the same instance that opened it. They also give you consistent routing if you haven't added the Redis adapter yet. The ip_hash directive is a simple way to get them.

Here’s an example nginx.conf snippet:

http {
  upstream socketio_backend {
    ip_hash;  # Enables sticky sessions based on client IP
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
  }

  server {
    listen 80;

    location /socket.io/ {
      proxy_pass http://socketio_backend;

      proxy_http_version 1.1;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection "upgrade";
      proxy_set_header Host $host;

      proxy_read_timeout 60s;
      proxy_send_timeout 60s;
    }
  }
}

Beyond Redis: Other Scaling Considerations

Scaling Redis

Redis can become a bottleneck if your app has extremely high message volumes. Consider Redis clustering or using other message brokers like Kafka or NATS if needed.

An example using a Redis cluster:

import { createCluster } from "redis";
import { Server } from "socket.io";
import { createAdapter } from "@socket.io/redis-adapter";

const pubClient = createCluster({
  rootNodes: [
    { url: "redis://localhost:7000" },
    { url: "redis://localhost:7001" },
    { url: "redis://localhost:7002" },
  ],
});
const subClient = pubClient.duplicate();

await Promise.all([
  pubClient.connect(),
  subClient.connect()
]);

const io = new Server({
  adapter: createAdapter(pubClient, subClient)
});

io.listen(3000);

Using Other Adapters

Socket.IO supports custom adapters, so you can build adapters for different brokers or databases.

Deploying on Kubernetes

Run the Socket.IO servers as a Deployment with multiple replicas; with the Redis adapter in place, instances hold no state that must survive a restart, so a plain Deployment usually suffices over a StatefulSet. Ensure Redis is highly available and the adapter is configured to reach it.
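A minimal Deployment sketch for such a setup. The image name, replica count, and Redis service URL are all assumptions to adapt to your cluster:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: socketio-server
spec:
  replicas: 3                      # multiple Socket.IO instances
  selector:
    matchLabels:
      app: socketio-server
  template:
    metadata:
      labels:
        app: socketio-server
    spec:
      containers:
        - name: socketio-server
          image: myregistry/socketio-server:latest   # hypothetical image
          ports:
            - containerPort: 3000
          env:
            - name: REDIS_URL
              value: "redis://redis:6379"   # assumes an in-cluster Redis Service
```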

Additional Tips for Scaling Socket.IO

  • Monitor and profile your Redis and server instances.

  • Use namespaces and rooms wisely to limit message propagation.

  • Compress messages to reduce bandwidth.

  • Handle reconnections gracefully.

  • Consider rate limiting and authentication at the gateway layer.

  • Implement health checks and circuit breakers to detect failing instances.
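As one concrete example of the health-check tip, here is a minimal endpoint built on Node's standard http module. The path, port, and readiness criterion (a live Redis connection) are all illustrative choices:

```javascript
import { createServer } from "node:http";

// Decide what "healthy" means for an instance; here a live Redis
// connection is the readiness criterion (an assumption for this sketch).
function healthStatus(redisConnected, connectionCount) {
  return {
    status: redisConnected ? "ok" : "degraded",
    connections: connectionCount,
  };
}

// Expose it at /healthz so a load balancer or orchestrator can poll it.
const server = createServer((req, res) => {
  if (req.url === "/healthz") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(healthStatus(true, 0)));
  } else {
    res.writeHead(404);
    res.end();
  }
});
// server.listen(8080);
```

A load balancer can then stop routing new connections to any instance whose check reports "degraded".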

Conclusion

Scaling Socket.IO horizontally involves addressing the core challenge of synchronizing socket state and events across multiple server instances. By leveraging a centralized Pub/Sub adapter, most commonly Redis, you enable Socket.IO servers to broadcast and emit events seamlessly across instances.

With this approach and the right infrastructure (load balancer, Redis cluster, monitoring), your real-time app can scale to handle millions of concurrent users and complex event flows with minimal latency.

If you’re building real-time features for a fast-growing product, mastering horizontal scaling with Socket.IO is essential.

For more such articles, follow the DevHub blog, and join our free tech community on Discord.


Written by

Varun Kumawat
Developer. Founder, DevHub. Mentor.