Journey Down Memory Lane: How I Misused Java Streams and Wrecked My Tracker API

Harshavardhanan
5 min read

There was a time I was completely in love with Java Streams.
They looked so clean, so elegant, so functional and so new.
Until I used them lavishly… and wrecked a real-time tracking system in production.


The Setup

At a large logistics company, we built a real-time shipment tracking system. Here's how it worked:

  • Pull live tender IDs from the database.

  • Enrich each with live location using an external API.

  • Push the enriched data to a Kafka topic.

  • A downstream service consumes that topic and calls customer webhook APIs.

A heartbeat-style pipeline. Every few seconds, the system had to hum.
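
To picture it, here's a rough sketch of that heartbeat (simplified and illustrative; tenderRepository and runTrackingCycle are stand-in names, not the production code):

import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative heartbeat: every few seconds, pull live tender IDs and run one tracking cycle.
// tenderRepository and runTrackingCycle are placeholder names for this sketch.
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleAtFixedRate(() -> {
    List<String> tenderIds = tenderRepository.findLiveTenderIds(); // pull live tender IDs from the DB
    runTrackingCycle(tenderIds);                                   // enrich with live location, push to Kafka
}, 0, 5, TimeUnit.SECONDS);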


The First Version (Looked Beautiful, Ran Ugly)

📝 Note:
The code shown here is adapted and simplified to focus on the core lesson. It’s not a copy of the production implementation, which was more complex and context-specific.

This is what I originally wrote:

tenderIds.stream()
    .map(this::getLiveLocation)     // blocking HTTP call, one tender ID at a time
    .map(this::buildKafkaEvent)
    .forEach(this::pushToKafka);    // blocking Kafka send, one event at a time

It passed tests.
It ran clean in dev.
I felt like a functional programming god.

Until the latency metrics came in.


The Problem

What used to take 5–10ms per tracking cycle…
...spiked to 200–300ms under load.

Why?

  • getLiveLocation() was making external HTTP calls.

  • .map() was lazy and sequential.

  • pushToKafka() was also running one event at a time.

Result: A pipeline that looked clean but silently serialized every single I/O call.

In a system where every second counts, I had accidentally created a traffic jam at the heart of our real-time architecture.


The Fix

We restructured the flow:

Step 1: Get out of the stream for I/O

Instead of chaining everything, we parallelized just the enrichment step — and wrapped the risky parts.

List<LiveLocationEvent> enriched = tenderIds.parallelStream()
    .map(this::getLiveLocationWithTimeout)   // wrapped, time-boxed API call
    .filter(Objects::nonNull)                // drop tenders whose lookup failed or timed out
    .map(this::buildKafkaEvent)
    .collect(Collectors.toList());

  • getLiveLocationWithTimeout() used CompletableFuture.supplyAsync(), which gave each API call its own thread and timeout — meaning no single tender ID could hold the rest of the system hostage anymore (see the sketch after this list).

  • We enforced strict timeouts and fallback behavior.

  • We collected events into a buffer before publishing.
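
Here's roughly what that wrapper looked like (a minimal sketch of the idea, not the production code; locationClient, ioExecutor, the logger, and the 2-second budget are illustrative):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

private LiveLocation getLiveLocationWithTimeout(String tenderId) {
    return CompletableFuture
            .supplyAsync(() -> locationClient.fetch(tenderId), ioExecutor)    // each call gets its own thread
            .orTimeout(2, TimeUnit.SECONDS)                                    // hard deadline per tender ID
            .exceptionally(ex -> {
                log.warn("Live location failed for tender {}", tenderId, ex);  // slow or failed calls stay visible
                return null;                                                   // fallback; dropped by filter(Objects::nonNull)
            })
            .join();
}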


Step 2: Push to Kafka cleanly

for (LiveLocationEvent event : enriched) {
    kafkaProducer.send(buildKafkaRecord(event));   // async send; the producer batches records under the hood
}

  • Used Kafka’s batch configuration to group sends efficiently.

  • Added retry and failure hooks.

  • Logged critical errors separately for visibility.
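
For context, this is the shape of that setup (a simplified sketch; the broker address, batch values, and helper names are illustrative, not our tuned production config):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");               // placeholder broker address
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32 * 1024);   // group sends into larger batches
props.put(ProducerConfig.LINGER_MS_CONFIG, 10);           // wait a few ms so batches can fill
props.put(ProducerConfig.RETRIES_CONFIG, 3);              // retry transient broker errors
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(props);

// Send with a callback so failures get logged separately instead of disappearing.
kafkaProducer.send(buildKafkaRecord(event), (metadata, exception) -> {
    if (exception != null) {
        log.error("Failed to publish live location event", exception);
    }
});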


What I Learned

Clean code can be dangerous when you don’t fully understand how it flows.

I learned (the hard way) that Java Streams are not just a fancy loop replacement. They come with powerful abstractions, but also with quiet traps — especially in production systems.

Here’s what actually broke us, and what I carry forward now:

Streams Are Lazy

Operations like .map() and .filter() don’t execute when you write them — they execute only when the terminal operation (collect, forEach, etc.) is reached.

In my original code, this meant none of the HTTP calls ran until .forEach() started — and then they all ran in strict, one-by-one order. I wasn’t building a pipeline. I had unknowingly created a serialized job queue hidden behind “elegant” lambdas.
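
A tiny demo of that laziness, unrelated to the project:

import java.util.List;
import java.util.stream.Stream;

List<String> ids = List.of("T1", "T2", "T3");

Stream<String> pipeline = ids.stream()
        .map(id -> {
            System.out.println("calling API for " + id); // nothing prints here yet
            return id;
        });

System.out.println("pipeline built, no calls made");

pipeline.forEach(id -> System.out.println("published " + id)); // only now do the maps run, one ID at a time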

Streams Are Synchronous by Default

Unless you explicitly use .parallelStream() (which comes with its own risks), your stream is strictly sequential. I had 10, 50, sometimes 100 tender IDs — and each one waited for the previous API call to return before continuing.

In real-time systems, waiting is a tax.
I had coded a beautiful pipeline that taxed itself with every ID.
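
To make the tax concrete, here's a throwaway sketch where slowCall stands in for the HTTP round trip (numbers are illustrative; the parallel speed-up depends on the common ForkJoinPool size):

import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// slowCall stands in for the external lookup: it just sleeps 100 ms.
Function<Integer, Integer> slowCall = id -> {
    try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    return id;
};

List<Integer> ids = IntStream.rangeClosed(1, 50).boxed().collect(Collectors.toList());

long start = System.currentTimeMillis();
ids.stream().map(slowCall).collect(Collectors.toList());          // ~50 x 100 ms, strictly one after another
System.out.println("sequential: " + (System.currentTimeMillis() - start) + " ms");

start = System.currentTimeMillis();
ids.parallelStream().map(slowCall).collect(Collectors.toList());  // bounded by the worker thread count instead
System.out.println("parallel:   " + (System.currentTimeMillis() - start) + " ms");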

Streams Are Not for Side Effects

This is the big one.

Streams were built to transform data, not handle side effects like HTTP calls, Kafka writes, or logging. When I placed getLiveLocation() (an external API call) inside .map(), I lost all control:

  • No visibility into response times

  • No easy way to apply timeouts

  • No retry or fallback

  • No logging without dirtying the chain

I wasn’t just coding in a functional style — I was burying side effects inside abstractions that didn’t want them there.

The Real Risk: Silent Failures Under Load

Under low load, this design passed tests. It even looked fast. But the moment our system scaled, and a few API calls got slow, the whole chain crawled.

No one saw it coming. There were no exceptions, no logs, no red flags — just latency bleeding into every call.

The Fix Taught Me More Than the Failure

When I pulled those side effects out, wrapped them with CompletableFuture.supplyAsync(), and enforced timeouts and retries — the system stabilized.

Yes, it still made an external call — it’s still a side effect.
But now it was a controlled side effect:

  • It ran on a separate thread

  • It had a timeout

  • It failed fast and returned null if needed

  • It didn’t block the rest of the system

  • It logged slow calls and gave us visibility

But more importantly, I finally understood the difference between clean code and responsible code.

Final Thought

I still use Streams. They’re great for transforming collections, chaining filters, and building expressive logic.

But in real-time systems where:

  • Every millisecond matters

  • I/O is unpredictable

  • Failure must be isolated

…I don’t put side-effects inside streams anymore.
I treat Streams like math.
And I treat side-effects like fire — beautiful, but handled with care.

Write what you mean. Understand what runs.
Clean code is only powerful when it’s also clear.


Summary: A Few Hard-Learned Rules

  • Don't .map() network calls unless you're handling timeouts and retries.

  • Don’t stream I/O unless you understand the cost of every line.

  • Avoid hiding latency in “clean” code.

  • Separate transformation from side-effects.

  • Always instrument real-time pipelines — logs, metrics, traceability.


Want More?

I'm thinking of doing a Part 2:

  • How we used CompletableFuture for controlled parallelism

  • How we wrapped retries with back-off and timeouts

  • Our Resilience4J setup and what actually worked

Let me know if you’d read that.
