Part 1: Building Health-Aware Load Balancing for IBM Storage Ceph Object Gateway with Consul

As object storage deployments grow, maintaining high availability and resilient access to S3-compatible endpoints becomes increasingly complex. Traditional load balancers are often centralized, statically configured, and unaware of backend health; they can introduce performance bottlenecks, operational overhead, and single points of failure. These limitations are especially problematic in distributed or scale-out environments, where services must adapt to topology changes and failures in real time.
IBM Storage Ceph 8.1 introduces ingress concentrators: per-node HAProxy services deployed via cephadm. Each concentrator fronts the local pool of Object Gateway (RGW) daemons, allowing every Ceph node to act as an S3 ingress point. This distributed design improves fault tolerance and simplifies scaling by localizing traffic handling.
But ingress concentrators alone don’t provide dynamic routing or automatic failover. That’s where HashiCorp Consul and CoreDNS come in. By integrating Consul’s service discovery and health checks with CoreDNS’s programmable DNS, we can enable intelligent, health-aware load balancing, all without relying on static configuration or external DNS infrastructure.
In this post, you'll learn how to:
Understand and deploy Ceph ingress concentrators.
Register and monitor them dynamically using Consul.
Integrate CoreDNS to enable DNS-based health-aware routing.
Validate routing logic and failover scenarios.
Disclaimer:
HashiCorp Consul and CoreDNS are not included components of IBM Storage Ceph and are not officially supported by IBM. While this blog demonstrates how to integrate them using cephadm custom containers for ease of deployment, their configuration, operation, and lifecycle management remain outside the scope of IBM Storage Ceph support.
Why Static Load Balancing Falls Short
A common starting point for Ceph Object Gateway (RGW) deployments involves placing one or two HAProxy nodes in front of a set of gateway daemons. Often, these HAProxy services are paired with Keepalived to create a shared virtual IP (VIP), providing basic high availability. While this approach is functional, it quickly reveals limitations as object storage environments grow:
Centralized bottleneck: With a single VIP, all S3 traffic is funneled through one node at a time. This becomes a scalability and performance bottleneck, particularly under heavy workloads.
Manual load distribution: While you can configure multiple VIPs across nodes, clients must be explicitly directed to spread requests across them. This typically requires either custom logic in clients or static DNS round-robin entries, neither of which adapts automatically to node health or cluster changes.
Lack of service awareness: Traditional DNS and VIP failover mechanisms operate at the network layer. They don’t natively detect whether individual IBM Storage Ceph Object Gateway (RGW) daemons are responsive or healthy. If a node is reachable but the RGW service is degraded or unresponsive, clients may still be routed to it.
Operational friction: Adding or removing gateway nodes often involves reconfiguring HAProxy and DNS records, which introduces downtime risk and complicates automation workflows.
To scale efficiently and provide resilient S3 access, we need a distributed ingress model that can adapt to topology changes, automatically detect service health, and dynamically direct client requests. This is where the combination of Ceph ingress concentrators, Consul, and CoreDNS provides a modern, flexible alternative.
Cephadm Ingress and Concentrators
The Traditional IBM Storage Ceph Ingress Architecture
As part of the orchestration layer in IBM Storage Ceph, cephadm offers built-in support for deploying ingress services. This provides a convenient, production-ready way to place a load balancer, typically HAProxy, in front of a set of IBM Storage Ceph Object Gateway (RGW) daemons.
The IBM Storage Ceph ingress service behaves like a standard load-balancer setup: it runs a centralized HAProxy service (paired with Keepalived for high availability) that distributes traffic across multiple RGWs. Ceph automatically generates the HAProxy configuration and targets all RGW instances registered to a specific service. This model works well in many environments, for example:
When you prefer centralized traffic control and don’t need node-level ingress.
When you want to expose a single external VIP to clients and manage TLS termination centrally.
When your number of RGWs is relatively static, or when changes are well managed through automation.
Here’s an example of such a setup:
service_type: rgw
service_id: foo
placement:
  label: rgw
  count_per_host: 2
spec:
  rgw_frontend_port: 8080
This configuration deploys two RGW daemons on each node carrying the rgw label: the first listens on port 8080, the second on 8081. An ingress HAProxy service (defined separately) can be layered on top, forwarding client traffic to all backend RGWs in the cluster.
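For reference, that separate ingress service gets its own spec. A minimal sketch might look like the following; the service names, virtual IP, and ports here are illustrative assumptions, not values from this environment:
service_type: ingress
service_id: rgw.foo
placement:
  count: 2
  label: rgw
spec:
  backend_service: rgw.foo        # the RGW service defined above
  virtual_ip: 192.168.122.100/24  # shared VIP managed by Keepalived (assumed value)
  frontend_port: 443              # port clients connect to
  monitor_port: 1967              # HAProxy status/health endpoint
Clients then target the VIP, and HAProxy spreads their requests across every RGW instance belonging to rgw.foo.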
However, as object storage environments grow, especially in large-scale or edge-distributed scenarios, centralized load balancing can introduce operational complexity:
Network traffic must traverse multiple hops, reducing locality.
A single VIP means a single logical bottleneck unless you introduce multiple entry points and external load balancing logic.
Dynamic scaling of RGWs may still require updating client-facing load balancers or external DNS.
A New Feature in IBM Storage Ceph 8.1: Ingress Concentrators
IBM Storage Ceph 8.1 now supports concentrators, a new cephadm feature that deploys per-node ingress. A concentrator is a local HAProxy instance that runs directly on each Ceph node and fronts that node's local RGW daemons. It provides a single IP:PORT endpoint per host and automatically balances traffic across the RGWs on that same node.
This brings several advantages:
Traffic locality: Each HAProxy only routes to local RGWs.
Simplified gateway scaling: Increase count_per_host and the local HAProxy updates automatically.
No need for Keepalived: Each node serves its own traffic; failover happens naturally at the DNS layer (more on this when we integrate Consul).
Here’s an example of a concentrator-enabled RGW service configuration:
service_type: rgw
service_id: client
service_name: rgw.client
placement:
  label: rgw
  count_per_host: 2
networks:
- 192.168.122.0/24
spec:
  rgw_frontend_port: 8088
  rgw_realm: multisite
  rgw_zone: madrid
  rgw_zonegroup: europe
  zonegroup_hostnames:
  - s3.cephlab.com
  concentrator: haproxy
  concentrator_frontend_port: 8080
  concentrator_monitor_port: 1967
  concentrator_monitor_user: admin
With this configuration:
Each node in the rgw placement group runs two Object Gateway daemons: one on 8088, another on 8089.
A local HAProxy concentrator listens on 8080, forwarding traffic only to the local node's RGWs.
The HAProxy monitor port (1967) exposes local health status, which we’ll later integrate into Consul for DNS-aware load balancing.
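As a quick sanity check, you can hit one node's concentrator frontend and its monitor endpoint directly. This is just a sketch using a lab IP from this example; the /health path is the same one the Consul check points at later:
# Frontend: should answer with an S3-style response from a local RGW
$ curl -s -o /dev/null -w "%{http_code}\n" http://192.168.122.12:8080
# Monitor port: HAProxy health endpoint used by the Consul check below
$ curl -s http://192.168.122.12:1967/health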
Note:
This setup focuses on DNS-based routing and service awareness. It does not cover SSL/TLS termination; in a production deployment, TLS should be terminated at the IBM Storage Ceph Object Gateway (RGW) level. To support virtual-host-style S3 access (e.g., https://bucket1.s3.cephlab.com), your RGW TLS certificate must include a Subject Alternative Name (SAN) entry for *.s3.cephlab.com.
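If you do terminate TLS at the RGW, you can confirm the wildcard SAN is present with a quick openssl check (a sketch; rgw.crt is a placeholder for your certificate file):
$ openssl x509 -in rgw.crt -noout -text | grep -A1 'Subject Alternative Name'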
Dynamic DNS-Based Load Balancing with Consul
To route S3 traffic intelligently based on service health and node availability, we need more than traditional DNS. This is where HashiCorp Consul comes in.
Consul is a distributed service mesh and service discovery tool. At its core, it provides:
A control plane for registering services
Built-in health checks
DNS-based service discovery with support for real-time updates
By integrating Consul into our Ceph cluster, we can dynamically answer DNS queries for names like s3.cephlab.com with a list of healthy ingress nodes, updated in real time. This enables distributed, load-balanced access to S3 endpoints without the need for an external load balancer or static DNS configuration.
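As a preview of what that looks like, Consul serves registered services under the .consul domain on its DNS port 8600. Once the agents configured below are running, a query like this (a sketch using one of the lab node IPs and the service name we register later) returns only instances that pass their health checks:
$ dig @192.168.122.12 -p 8600 ingress-rgw-s3.service.consul +short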
Why Consul?
In our case, each node in the Ceph cluster runs a local HAProxy concentrator in front of its IBM Storage Ceph Object Gateway (RGW) daemons. We want these endpoints to be:
Automatically registered as services
Continuously monitored for health
Returned via DNS only when healthy
Consul handles all of this through its agent-based architecture.
Control Plane and Quorum
Consul operates in a distributed mode. It requires a quorum of server agents (typically 3 or 5) to function as the control plane. These servers maintain cluster state and perform leader elections.
Every participating node (e.g. an RGW node) also runs a Consul agent, which communicates with the control plane and performs local tasks such as registering services and running health checks.
In small clusters, the same nodes can act as both server and agent. For our example, we will deploy 3 Consul agents in server mode, each running inside a cephadm-managed container. These will form the core control plane and also register the local HAProxy ingress services.
We will use the following nodes for Consul:
ceph-node-00
→192.168.122.12
ceph-node-01
→192.168.122.179
ceph-node-02
→192.168.122.94
Each of these nodes is also running a local Ceph Object Gateway (RGW) and a corresponding HAProxy concentrator. So each node will:
Run a Consul server agent.
Register its own local HAProxy (ingress concentrator).
Perform a health check on the HAProxy monitor port.
Join the Consul cluster for service propagation.
Next, we’ll walk through configuring the Consul agents, deploying them as containers using cephadm, and validating that they’re correctly advertising healthy ingress services.
Step 1: Configure Consul on our IBM Storage Ceph Nodes with Cephadm
On each of the three Ceph nodes, place the following file at /etc/consul.d/consul.hcl. Here is an example for ceph-node-00 (192.168.122.12):
datacenter = "madrid"
data_dir = "/opt/consul"
bind_addr = "192.168.122.12"
client_addr = "0.0.0.0"
retry_join = [
  "192.168.122.12",
  "192.168.122.179",
  "192.168.122.94"
]
server = true
bootstrap_expect = 3
services = [
  {
    name = "ingress-rgw-s3"
    port = 8080
    check = {
      id = "http-check"
      name = "HAProxy Check"
      http = "http://192.168.122.12:1967/health"
      interval = "10s"
      timeout = "2s"
    }
  }
]
Make sure to adjust bind_addr and the HAProxy health check URL for each node.
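Optionally, you can sanity-check the file syntax before deploying with Consul's built-in validator. This is a sketch that reuses the same container image we deploy in the next step:
$ podman run --rm -v /etc/consul.d:/consul/config:ro \
    docker.io/hashicorp/consul:latest consul validate /consul/config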
Step 2: Deploy Consul via Cephadm custom containers
Create the following consul.yml cephadm spec file on ceph-node-00:
service_type: container
service_id: consul
placement:
  hosts:
  - ceph-node-00
  - ceph-node-01
  - ceph-node-02
spec:
  image: docker.io/hashicorp/consul:latest
  entrypoint: '["consul", "agent", "-config-dir=/consul/config"]'
  args:
  - "--net=host"
  ports:
  - 8500
  - 8600
  - 8300
  - 8301
  - 8302
  bind_mounts:
  - ['type=bind','source=/etc/consul.d','destination=/consul/config', 'ro=false']
Apply it using the ceph orch apply command:
$ ceph orch apply -i consul.yml
Verify the Consul services/containers have started:
$ ceph orch ps --daemon_type container
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
container.consul.ceph-node-00 ceph-node-00 *:8500,8600,8300,8301,8302,8500,8600,8300,8301,8302 running (20h) 5m ago 20h 35.8M - <unknown> ee6d75ac9539 78753a2f9a4a
container.consul.ceph-node-01 ceph-node-01 *:8500,8600,8300,8301,8302,8500,8600,8300,8301,8302 running (20h) 7m ago 20h 37.6M - <unknown> ee6d75ac9539 68bb749e0022
container.consul.ceph-node-02 ceph-node-02 *:8500,8600,8300,8301,8302,8500,8600,8300,8301,8302 running (20h) 7m ago 20h 35.5M - <unknown> ee6d75ac9539 b42ea94cd403
Check that all Consul containers are running and have joined the cluster:
$ podman exec -it $(podman ps |grep consul| awk '{print $1}') consul members
Node Address Status Type Build Protocol DC Partition Segment
ceph-node-00 192.168.122.12:8301 alive server 1.21.4 2 madrid default <all>
ceph-node-01 192.168.122.179:8301 alive server 1.21.4 2 madrid default <all>
ceph-node-02 192.168.122.94:8301 alive server 1.21.4 2 madrid default <all>
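For scripted checks, the same information is available from the Consul HTTP API on port 8500 (a sketch; any of the three node IPs will answer):
# Current raft leader and the set of server peers
$ curl -s http://192.168.122.12:8500/v1/status/leader
$ curl -s http://192.168.122.12:8500/v1/status/peers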
Confirm the services registered in Consul:
$ podman exec -it $(podman ps |grep consul| awk '{print $1}') consul catalog services
consul
ingress-rgw-s3
Check which nodes are serving the ingress service:
$ podman exec -it $(podman ps |grep consul| awk '{print $1}') consul catalog nodes -service ingress-rgw-s3
Node ID Address DC
ceph-node-00 cdd2e96c 192.168.122.12 madrid
ceph-node-01 ac868e52 192.168.122.179 madrid
ceph-node-02 d16446b3 192.168.122.94 madrid
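You can also ask Consul for only the instances that are currently passing their health checks, which is the same view the DNS layer will expose (a sketch; jq is assumed to be installed for readability):
$ curl -s 'http://192.168.122.12:8500/v1/health/service/ingress-rgw-s3?passing' \
    | jq -r '.[].Node.Address'
Stop HAProxy on one node and that node's address should drop out of the list within the 10-second check interval configured above.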
Conclusion. A Foundation for Health-Aware Ingress with Consul
In this first part, we addressed a critical challenge for scalable S3 deployments on IBM Storage Ceph: how to build a fault-tolerant, dynamic, and decentralized ingress layer that adapts automatically to node and service health.
By leveraging:
Ingress concentrators, deployed per node via cephadm
Local HAProxy instances, routing only to the node's own RGW daemons
HashiCorp Consul, providing distributed service registration and health monitoring
…we’ve laid the groundwork for a robust, health-aware access layer. Each Ceph node now serves as an intelligent ingress point, dynamically tracked by Consul and exposed through its built-in DNS interface.
But Consul alone isn’t enough. Most applications and S3 clients don’t query the .consul DNS domain or port 8600; they need to resolve names like mybucket.s3.cephlab.com using standard DNS mechanisms. Without a DNS bridge, this dynamic service awareness remains confined within the Consul ecosystem.
Coming Up in Part 2. Bridging DNS and Enabling Virtual-Host Buckets with CoreDNS
In the next part of this blog series, we’ll:
Introduce CoreDNS as a programmable DNS bridge between clients and Consul
Deploy CoreDNS containers using cephadm custom containers on each node for high availability
Configure DNS rewrites to support virtual-host-style S3 access
Integrate enterprise DNS through stub zone delegation
Validate that clients receive only healthy endpoints using DNS tools
Simulate a node failure and demonstrate transparent access for the clients.
Continue to Part 2: Bridging DNS to Ceph Ingress with CoreDNS