How I Debugged a Live Issue with No Logs


I still remember staring at an empty log file, heart pounding.

Our users were locked out, and I had nothing but silence from our monitoring tools.

That’s when I discovered: you don’t always need logs to debug a live problem.

You need creativity, the right questions, and a few handy techniques.

Stick with me, and by the end you’ll know exactly how to:

  • Spot anomalies in your metrics without a single log line.

  • Narrow down causes with feature‑flag binary search.

  • Deploy temporary instrumentation safely in production.

I’ll share my real‑world story, break down each method step by step, and give you concrete next steps you can use today.

I was on call and sleeping until my phone buzzed.

The alert said “500 errors spiking,” but our logs were blank.

No exceptions…

No stack dumps…

Nothing…

Panic set in.

Without logs, how do you even start?

I took a deep breath and remembered something my mentor once said:

“When logs fail you, your metrics and flags become your compass.”

So that’s exactly where I began.

1. Follow the Metrics Breadcrumbs

First, I jumped into our dashboards.

I compared request rates, error percentages, and latency graphs across services.

Within minutes, I saw our cache‑service latency tripling, while all other services looked normal.

That told me where the problem lived.

Action Step:

  • Plot error rate vs. request volume over time.

  • Overlay service‑specific latency charts.

  • Watch for outliers; those spikes are your first clue (see the sketch below).
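If you want to automate that first scan, here's a rough sketch in Python, assuming a Prometheus‑style metrics backend. The URL, query, and spike threshold are placeholders for whatever your own stack exposes.

```python
# A rough sketch of hunting for latency outliers, assuming a Prometheus-style
# backend. The endpoint, metric query, and spike factor are placeholders.
import time

import requests

PROM_URL = "http://prometheus:9090/api/v1/query_range"   # hypothetical endpoint
QUERY = (
    "histogram_quantile(0.99, sum by (service, le) "
    "(rate(http_request_duration_seconds_bucket[5m])))"
)

def find_latency_outliers(window_sec=3600, step="60s", spike_factor=3.0):
    """Flag services whose latest p99 latency is >= spike_factor x their recent median."""
    end = time.time()
    resp = requests.get(PROM_URL, params={
        "query": QUERY, "start": end - window_sec, "end": end, "step": step,
    })
    resp.raise_for_status()
    outliers = []
    for series in resp.json()["data"]["result"]:
        service = series["metric"].get("service", "unknown")
        samples = sorted(float(v) for _, v in series["values"])
        if not samples:
            continue
        median = samples[len(samples) // 2]
        latest = float(series["values"][-1][1])
        if median > 0 and latest / median >= spike_factor:
            outliers.append((service, latest, median))
    return outliers

if __name__ == "__main__":
    for service, latest, median in find_latency_outliers():
        print(f"{service}: p99 {latest:.3f}s vs median {median:.3f}s -- look here first")
```

Even a crude ratio check like this points you at the right service far faster than eyeballing a wall of dashboards.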

2. Binary Search with Feature Flags

Once I knew the cache service was the culprit, I needed to find out what changed.

We had a recent config rollout behind a flag.

I flipped the flag off in half our pods with a single click in our feature‑flag dashboard.

Error rates in that group plummeted.

Bingo: the new config was broken.

Action Step:

  • Protect new code behind flags.

  • Toggle flags in small batches to isolate issues.

  • Automate rollback triggers when errors exceed thresholds, as sketched below.
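To make that concrete, here's a sketch of what flag bisection plus an automatic rollback guard could look like. The `flag_client` and `get_error_rate` objects are hypothetical stand‑ins, since every flag provider's SDK is different.

```python
# A rough sketch of flag bisection plus an automatic rollback guard.
# `flag_client` and `get_error_rate` are hypothetical stand-ins for your
# feature-flag SDK and your metrics source.
import time

ERROR_THRESHOLD = 0.05   # e.g. roll back when more than 5% of requests fail

def bisect_flags(flag_client, candidate_flags, get_error_rate, settle_sec=30):
    """Binary-search recently rolled-out flags to find the one causing errors."""
    suspects = list(candidate_flags)
    while len(suspects) > 1:
        half, rest = suspects[: len(suspects) // 2], suspects[len(suspects) // 2:]
        for flag in half:
            flag_client.disable(flag)    # hypothetical SDK call
        for flag in rest:
            flag_client.enable(flag)
        time.sleep(settle_sec)           # give traffic and metrics time to settle
        # If errors drop with this half off, the culprit is in it; otherwise it's in the rest.
        suspects = half if get_error_rate() < ERROR_THRESHOLD else rest
    return suspects[0]

def rollback_guard(flag_client, flag_name, get_error_rate):
    """Kill switch: disable the flag everywhere once errors cross the threshold."""
    if get_error_rate() > ERROR_THRESHOLD:
        flag_client.disable(flag_name)   # hypothetical SDK call
        return True
    return False
```

The key design choice is toggling in batches and letting metrics settle between steps, so each comparison actually tells you something.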

3. Live‑Patch with Temporary Instrumentation

To confirm my theory, I needed more detail, but still no logs.

Using our hot‑patch tool, I pushed a tiny, safe instrumentation snippet straight into the running service.

Within seconds, I had real‑time counters showing which code path was failing.

I tweaked the config, watched the counters normalize, and our errors vanished.

Action Step:

  • Build a minimal hot‑patch or sidecar tool for quick instrumentation (see the sketch after this list).

  • Capture only essential data to avoid performance hits.

  • Remove the patch immediately once you’re done.
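The hot‑patch mechanism itself will be specific to your shop, but the instrumentation payload can be tiny. Here's a sketch using the prometheus_client library as one option for cheap, real‑time counters; the metric names, the decorator, and the demo lookup are illustrative, not part of any particular tool.

```python
# TEMPORARY debug counters for a suspect code path -- rip out after the incident.
import functools
import time

from prometheus_client import Counter, start_http_server

PATH_HITS = Counter(
    "debug_cache_path_total",
    "Requests per cache code path and outcome (temporary debug metric)",
    ["path", "outcome"],
)

def count_path(path):
    """Decorator: count successes and failures on a suspect code path, no logging needed."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            try:
                result = fn(*args, **kwargs)
                PATH_HITS.labels(path=path, outcome="ok").inc()
                return result
            except Exception:
                PATH_HITS.labels(path=path, outcome="error").inc()
                raise
        return inner
    return wrap

if __name__ == "__main__":
    start_http_server(9100)            # counters visible at :9100/metrics

    @count_path("new_config")
    def lookup(key):                   # stand-in for the real cache lookup
        return key.upper()

    while True:                        # keep the demo alive; inside a service this loop isn't needed
        lookup("example")
        time.sleep(1)
```

Counters like these give you the real‑time "which branch is failing" signal I got from our hot patch, and they're trivial to delete once the incident is closed.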

Wrapping Up & Next Steps

Logs are great, but they can fail.

When they do, rely on:

  1. Metrics as your map.

  2. Feature flags for fast binary search.

  3. Safe, temporary instrumentation.

Give these methods a try:

  • Audit your dashboards, and add granular metrics today.

  • Protect new changes with flags before deployment.

  • Invest in a hot‑patch tool or sidecar for emergency instrumentation.

With these tricks in your toolkit, you’ll conquer live issues even when your logs go silent.
