Achieving True DNS Resilience with DNSMASQ & the All-Servers Parameter

Ronald Bartels

When it comes to ensuring reliable and resilient name resolution, traditional approaches to DNS often fall short. Many businesses and service providers still rely on outdated methods such as configuring two or three static resolvers, typically pointing to an ISP’s DNS or a single public resolver. In practice this legacy setup behaves like a single point of failure: when one of these resolvers has issues, it can cripple network connectivity, even when alternative network paths exist.

The DNS Problem | A Single Point of Failure Hiding in Plain Sight

Most traditional DNS setups work as follows:

  • A client queries its local resolver (often a router or a firewall).

  • The local resolver has a list of two or three upstream DNS servers configured.

  • If the first upstream server does not respond, the local resolver will retry using the second, and if necessary, the third.

The problem is that many resolvers (especially those built into routers, firewalls, or even Windows) do not treat a slow or degraded response as a failure, and only move on to an alternative resolver once the primary one times out completely. Worse, some devices will "stick" to a failing resolver for an extended period, even if it is responding with broken results.

This means:

  1. Random Connectivity Issues – If one resolver is broken or hijacked, devices will still send queries to it, leading to intermittent failures.

  2. Unnecessary Outages – A business might experience a full-site outage if the configured resolvers are unreachable, even if the internet connection is working fine.

  3. Slow Failover – Some resolvers introduce failover delays, causing DNS resolution to slow down significantly.
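The cost of slow failover can be put in rough numbers. What follows is a minimal Python sketch of a one-at-a-time failover policy, not the behaviour of any specific router or operating system. The function name `sequential_lookup_time`, the latencies, and the timeout are illustrative assumptions.

```python
from typing import Optional, Sequence


def sequential_lookup_time(
    latencies: Sequence[Optional[float]], timeout: float
) -> Optional[float]:
    """Seconds until an answer arrives when upstreams are tried one at a time.

    latencies[i] is the response time of upstream i in seconds,
    or None if that upstream never answers (hypothetical test values).
    """
    elapsed = 0.0
    for latency in latencies:
        if latency is not None and latency <= timeout:
            # This upstream answered in time, however slowly.
            return elapsed + latency
        # Dead or too slow: the full timeout is spent before moving on.
        elapsed += timeout
    return None  # every upstream failed


if __name__ == "__main__":
    # Dead primary, healthy secondary.
    print(sequential_lookup_time([None, 0.25], timeout=2.0))
    # Degraded (slow but alive) primary, healthy secondary.
    print(sequential_lookup_time([1.5, 0.25], timeout=2.0))
```

In this model a dead primary adds the whole timeout to every uncached lookup, and a primary that is merely slow is never bypassed at all, because it still answers before the timeout expires.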

The Solution | DNSMASQ and the All-Servers Parameter

To avoid these DNS-related pitfalls, a more robust approach is to use DNSMASQ as a local caching resolver with the all-servers parameter. By default, DNSMASQ forwards each query to just one of its upstream servers; with all-servers set, it sends every query to all configured upstream resolvers simultaneously and returns the first reply it receives.

How It Works

  • Instead of relying on one DNS server at a time, all-servers queries all configured resolvers at once.

  • The fastest response wins, eliminating delays caused by a slow or failing resolver.

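  • If one DNS provider is experiencing issues (e.g., downtime, slow responses, or upstream routing problems), DNSMASQ will take the reply from another provider without the client even noticing. Because the first reply wins, this guards against slow or dead resolvers; a resolver that quickly returns a wrong answer (a bad cache entry or a hijacked result) can still win the race unless answers are validated, for example with DNSSEC.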
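The race described in this list can be modelled in a few lines. This is a hedged Python sketch of the "first reply wins" idea using simulated upstreams, not DNSMASQ's actual implementation. The names `fake_upstream` and `race`, the delays, and the returned address are assumptions made for illustration.

```python
import asyncio
from typing import Dict, Optional, Tuple


async def fake_upstream(name: str, delay: Optional[float]) -> Tuple[str, str]:
    """Stand-in for one upstream resolver; delay=None means it never answers."""
    if delay is None:
        await asyncio.Event().wait()  # blocks until cancelled, like a dead server
    else:
        await asyncio.sleep(delay)
    return name, "192.0.2.10"  # documentation address (RFC 5737), not a real answer


async def race(
    upstreams: Dict[str, Optional[float]], timeout: float
) -> Optional[Tuple[str, str]]:
    """Query every simulated upstream at once and return the first reply."""
    tasks = [asyncio.create_task(fake_upstream(n, d)) for n, d in upstreams.items()]
    done, pending = await asyncio.wait(
        tasks, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
    )
    # The losers are no longer needed once one reply has arrived.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    if not done:
        return None  # nobody answered within the timeout
    return next(iter(done)).result()


if __name__ == "__main__":
    winner = asyncio.run(race({"slow": 0.2, "dead": None, "fast": 0.01}, timeout=1.0))
    print(winner)
```

The slow and dead upstreams cost nothing here: the lookup completes as soon as the fastest one replies. The sketch also makes the limitation visible, since nothing in `race` checks whether the winning answer is correct.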

Example Configuration for DNSMASQ

server=9.9.9.9  # Quad9 Primary (Filtered)
server=149.112.112.112  # Quad9 Secondary
server=208.67.222.222  # OpenDNS Primary
server=208.67.220.220  # OpenDNS Secondary
server=1.1.1.1  # Cloudflare DNS
server=1.0.0.1  # Cloudflare Secondary
server=8.8.8.8  # Google Primary (queried in parallel with the rest under all-servers)
server=8.8.4.4  # Google Secondary
all-servers

This setup ensures that:

  • Redundant Resolvers – Quad9, OpenDNS, Cloudflare, and Google provide multiple independent resolution paths.

  • Load Balancing – The fastest response is always used.

  • Failure Mitigation – If one DNS provider is suffering from downtime or slow responses, another will seamlessly take over.
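Before relying on a list like this, it can help to confirm that each upstream is reachable from your own network. The sketch below is a standalone Python check, not part of DNSMASQ: it hand-builds a minimal A-record query in the RFC 1035 wire format and times one UDP round trip per server. The helper names `build_query` and `probe`, the test hostname `example.com`, and the 2-second timeout are assumptions. The server addresses are the ones from the configuration above.

```python
import random
import socket
import struct
import time
from typing import Optional

# Upstream addresses copied from the example configuration.
SERVERS = [
    "9.9.9.9", "149.112.112.112", "208.67.222.222", "208.67.220.220",
    "1.1.1.1", "1.0.0.1", "8.8.8.8", "8.8.4.4",
]


def build_query(hostname: str, txid: int) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server: str, hostname: str = "example.com",
          timeout: float = 2.0) -> Optional[float]:
    """Return the round-trip time in seconds, or None on timeout or error."""
    txid = random.getrandbits(16)
    query = build_query(hostname, txid)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # includes socket timeouts
            return None
        elapsed = time.monotonic() - start
    if len(reply) < 12:
        return None
    reply_id, flags = struct.unpack("!HH", reply[:4])
    # Accept only a response (QR bit set) with our ID and RCODE 0 (NOERROR).
    if reply_id != txid or not (flags & 0x8000) or (flags & 0x000F) != 0:
        return None
    return elapsed


if __name__ == "__main__":
    for server in SERVERS:
        rtt = probe(server)
        status = "no usable reply" if rtt is None else f"{rtt * 1000:.1f} ms"
        print(f"{server:<16} {status}")
```

A server that consistently shows no usable reply from your site adds little to the pool and can be dropped from the list.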

Why This Approach Works

  1. True Resilience – Traditional DNS redundancy is an illusion if one resolver is degraded but not fully down. With all-servers, you get true, real-time redundancy.

  2. Faster Resolution – Instead of waiting for timeouts, DNSMASQ uses the first answer to arrive, which comes from whichever upstream is fastest at that moment.

  3. Better Security – By using multiple providers, you reduce your reliance on any single DNS provider that might be compromised or degraded.

Legacy DNS | The Cause of Unnecessary Site Outages

Many businesses have experienced mysterious connectivity failures, where internet browsing or critical services stop working despite the link being active. Upon closer inspection, the cause is often DNS failure rather than network failure.

  • If your DNS resolver is down, websites and cloud services won’t load—even if your internet is fine.

  • If a resolver is returning stale or hijacked results, you might suffer invisible outages or security risks.

  • If a resolver is misconfigured by an ISP, your users will experience random failures with no clear explanation.

Instead of relying on traditional resolver redundancy, businesses should proactively prevent DNS failures by implementing all-servers with DNSMASQ.
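When one of these mysterious outages occurs, a quick way to tell DNS failure from network failure is to test name resolution and direct IP reachability separately. The following Python sketch shows one way to do that. The function names, the test hostname `example.com`, and the choice of 1.1.1.1 on port 443 as a reachability target are assumptions, and the operating system's own DNS cache may mask a failing resolver.

```python
import socket


def can_resolve(hostname: str = "example.com") -> bool:
    """True if the system resolver returns any address for hostname."""
    try:
        socket.getaddrinfo(hostname, 443)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def can_reach(ip: str = "1.1.1.1", port: int = 443, timeout: float = 3.0) -> bool:
    """True if a TCP connection to a literal IP succeeds (no DNS involved)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(resolve_ok: bool, reach_ok: bool) -> str:
    """Classify the outcome of the two independent checks."""
    if resolve_ok and reach_ok:
        return "healthy"
    if reach_ok:
        return "DNS failure: the link works but names do not resolve"
    if resolve_ok:
        return "path problem: names resolve but the test IP is unreachable"
    return "network failure: neither names nor direct IP connections work"


if __name__ == "__main__":
    print(diagnose(can_resolve(), can_reach()))
```

The case of interest is the second one: the link is up, yet users see an outage because nothing resolves.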

Wrap | SD-WAN and DNSMASQ – The Perfect Last-Mile Duo

While SD-WAN optimises the last mile, it does not fix DNS failures by itself. The combination of SD-WAN for path resilience and DNSMASQ for DNS resilience ensures that both connectivity and name resolution remain rock solid.

If your business is still relying on traditional DNS redundancy, it’s time to switch to DNSMASQ + all-servers and eliminate DNS failures for good. 🚀
