When an Index Isn’t Enough!!

Rahul MRahul M
3 min read

Welcome to Patch Monkey! If you’re reading this, you’ve probably hit that classic moment:
"Why is my query so slow even after I added an index?"
Let’s break down how I hit that wall, why “just index it” didn’t work, and the little tweak that made everything fast.

The Setup: Simple Feed, Not-So-Simple Query

I was building a feed where users see all relevant items, newest first. The query looked like this:

qs = (
    Item.objects.annotate(
        sort_timestamp=db_functions.Coalesce('published_at', 'created_at')
    )
    .filter(
        Q(published_at__isnull=True) | Q(published_at__lte=timezone.now()),
        category=category,
        region=region_config['region_code'].lower(),
    )
    .order_by('-sort_timestamp', 'id')
)
SELECT *,
       COALESCE(published_at, created_at) AS sort_timestamp
FROM item
WHERE (published_at IS NULL OR published_at <= $1)
  AND category = $2
  AND region = $3
ORDER BY sort_timestamp DESC, id ASC;

What’s going on?

  • Show items if published_at is null (never scheduled) or in the past.

  • Sort by a display timestamp (either published or created).

  • And it took almost 799ms (even with a limit).

Classic Move: Add an Index

Of course, I did what everyone says add an index!
So I threw this into my model:

models.Index(
    fields=['region', 'category', 'published_at'],
    name='idx_region_category_pub',
)
CREATE INDEX idx_region_category_pub
ON item (region, category, published_at);

Did it help?
Barely.

Looking Deeper: The Query Plan (The “Aha” Moment)

When I checked the query plan (EXPLAIN always do this!), I saw:

Limit  (cost=20049.49..20057.11 rows=3041 width=265)
  ->  Sort  (cost=20049.49..20125.68 rows=30409 width=265)
        Sort Key: (COALESCE(published_at, created_at)) DESC, id
        ->  Bitmap Heap Scan on item  (cost=10318.84..19823.06 rows=30409 width=265)
              Recheck Cond: (((region = $2) AND (category = $3) AND (published_at IS NULL)) OR ((region = $2) AND (category = $3) AND (published_at <= $1)))
              ->  BitmapOr  (cost=10318.84..10318.84 rows=30683 width=0)
                    ->  Bitmap Index Scan on idx_region_category_pub  (cost=0.00..93.81 rows=628 width=0)
                          Index Cond: ((region = $2) AND (category = $3) AND (published_at IS NULL))
                    ->  Bitmap Index Scan on idx_region_category_pub  (cost=0.00..1030.08 rows=30055 width=0)
                          Index Cond: ((region = $2) AND (category = $3) AND (published_at <= $1))
  • BitmapOr

  • Bitmap Heap Scan

  • A huge “Recheck” cost

Turns out, that filter with published_at IS NULL OR published_at <= NOW() messes things up.
Postgres has to grab two separate index slices (one for IS NULL, one for <= NOW()), merge them (BitmapOr), and then bounce around the table to double-check all the matches (“recheck”).
No straight walk through the index = not fast.

The Fix: Change the Query, Not Just the Index

Here’s where the “patch monkey” spirit kicked in.
Instead of trying to index around the filter, I rewrote the query so filter and sort use the same value:

qs = (
    Item.objects.annotate(
        sort_timestamp=Coalesce('published_at', 'created_at')
    )
    .filter(
        category=category,
        region=region_config['region_code'].lower(),
        sort_timestamp__lte=timezone.now(),
    )
    .order_by('-sort_timestamp', 'id')
)
SELECT *,
       COALESCE(published_at, created_at) AS sort_timestamp
FROM item
WHERE category = $1
  AND region = $2
  AND COALESCE(published_at, created_at) <= $3
ORDER BY sort_timestamp DESC, id ASC;

No more OR—just a single, straightforward condition on sort_timestamp. Now both filter and sort use the same expression, which means...

Now the Index Actually Works For You

Here is an updated index to cover the sort_timestamp

Index(
    "region",
    "category",
    Coalesce("published_at", "created_at").desc(),  # matches -sort_timestamp
    F("id"),
    name="idx_region_category_sort",
)
CREATE INDEX idx_region_category_sort
ON item (
    region,
    category,
    COALESCE(published_at, created_at) DESC,
    id
);

Finally,

With the updated query & index, query plan looks like:

Limit  (cost=0.56..3197.94 rows=3014 width=265)
  ->  Index Scan using idx_region_category_sort on item  (cost=0.56..31978.64 rows=30144 width=265)
        Index Cond: ((region = $2) AND (category = $3) AND (COALESCE(published_at, created_at) <= $1))

Now it’s an Index Scan directly matches our filters and sort—no BitmapOr, no heap scan, no expensive sort step.

And we brought down the query execution time from ~799ms to 5ms 🎉

8
Subscribe to my newsletter

Read articles from Rahul M directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Rahul M
Rahul M

Founding Eng at herostuff.com . Past: Product Hunt . Apple WWDC scholar . Found security bugs in Facebook, Google, Yahoo, Apple, United Airlines & more.. Newbie ML Eng