Is your PostgreSQL Analytics Slow? Here’s Why and How to Fix It

This post is part of the blog series “Database to Analytics-Ready Data”.

If you are reading this, there is a high chance you are frustrated with slow analytics queries on your PostgreSQL database. Especially when you have gigabytes (or even terabytes) of data, running complex joins and aggregations directly on Postgres for analytics, or for real-time analytics, can feel like watching paint dry.

And if you are using BI tools like Metabase or Tableau directly on top of your Postgres instance, you might even see dashboards that refuse to load or time out.

In this blog, we’ll cover:

  1. Why Postgres often struggles with large-scale analytical queries (the result: slow Postgres analytics).

  2. Common methods people use to run analytics on Postgres (direct queries, data virtualization, data warehouses, etc.).

  3. The limitations of these traditional approaches.

  4. A comparison between querying data directly on Postgres vs. Datazip’s OneStack Data (powered by ClickHouse).

  5. How to connect your BI tools to these two setups and see for yourself the difference in speed.

By the end, you’ll understand exactly where Postgres fits, why it’s not built for some advanced analytical use cases, and how you can supercharge your analytics using a modern, columnar database under the hood.

The Pain Points of Running Analytics on Postgres

Postgres is a fantastic relational database. It’s open-source, reliable, and known for its strong support of ACID transactions. Many developers love it for their online transaction processing (OLTP) workloads. However, when it comes to analytics, several pain points arise:

  1. Row-Oriented Storage:

    • Postgres stores data in a row-oriented manner. Analytical queries typically scan large portions of your dataset (e.g., SELECT SUM(...) FROM ... WHERE ... GROUP BY ...), and row-based storage means the engine reads every column of every row, even when the query only needs a few specific columns (see the example query after this list).
  2. No Built-In Horizontal Scaling:

    • Postgres scales “up” by adding more CPU and RAM to a single node, but it doesn’t natively distribute queries across multiple nodes. For large-scale analytics, this becomes a bottleneck and requires complex sharding or third-party extensions.
  3. Index & Vacuum Overheads:

    • Analytical queries often do large table scans, and you can’t rely on indexes for everything (especially if your query filters span many columns). Postgres also requires periodic VACUUM operations to clean up dead rows; this overhead adds complexity and can slow down queries during maintenance windows. VACUUM ANALYZE can take anywhere from 30 minutes to days, depending on the size of the database and how often you run it.
  4. Concurrency Bottlenecks:

    • Postgres handles transactional concurrency really well. But for heavy, long-running analytics queries, concurrency can suffer. If multiple analysts run big queries at the same time, you may see significant slowdown or even locks.
  5. High CPU Costs for Aggregations:

    • As your data size grows, complex queries that involve multiple joins and aggregations (SUM, COUNT DISTINCT, etc.) can take minutes or hours. Postgres was not designed to handle these queries at massive scale with the sub-second response times you get from an OLAP data warehouse.
  6. Materialized Views Are Not Real-Time:

    • Updating materialized views involves a resource-heavy refresh, and it’s not automatic. For real-time or near-real-time dashboards, repeatedly refreshing views can become impractical at scale.
  7. No Native Vectorization:

    • Postgres processes data row-by-row rather than in vectorized batches. Modern OLAP engines use vectorization (SIMD) to greatly speed up analytical workloads.
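
To make the row-storage point concrete, here is a minimal sketch, using a hypothetical orders table and columns, of the kind of aggregation where a row store struggles. Even with an index on the date column, a 12-month scan over millions of rows typically falls back to a sequential scan, and Postgres reads every column of every row just to aggregate three of them:

-- Hypothetical table and columns, shown only to illustrate the pain point.
EXPLAIN (ANALYZE, BUFFERS)
SELECT region, SUM(total_amount) AS revenue
FROM orders
WHERE order_date >= NOW() - INTERVAL '12 months'
GROUP BY region;
-- Things to look for in the plan: "Seq Scan on orders" and a large
-- "Buffers: shared read=..." figure, i.e., full rows pulled from disk
-- even though only three columns are needed.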

Devs on Reddit discuss at length why and when you should not use Postgres for analytics.

Traditional Ways to Run Analytics on Postgres

Over the years, teams have tackled these performance issues using different strategies:

  1. Running Queries Directly on Postgres

    • Easiest approach: just point your BI tool to your production Postgres.

    • Pain points: performance slowdowns, risks to production stability, no real optimization for analytics.

  2. Data Virtualization Tools

    • Tools like Denodo or Data Virtuality (now CData) can create a virtual layer on top of multiple databases (including Postgres).

    • Advantage: no heavy data movement; you can query across sources.

    • Disadvantage: the actual query workload still runs on Postgres, so large queries remain slow unless you build a model on top of it (hand-picked tables and columns for the analytics query, refreshed on a periodic basis) and store it in an OLAP warehouse.

  3. Data Warehouses via ETL

    • Popular data warehouses: ClickHouse, Snowflake, BigQuery, Redshift.

    • You extract data from Postgres (using Debezium + Kafka or ETL / ELT tools), transform it for analytics, and load it into a warehouse.

    • Pros: columnar storage, massively parallel processing, optimized for large-scale analytics.

    • Cons: setting up ETL pipelines can be time-consuming and requires specialized knowledge, unless you use a managed service (e.g., Datazip, Fivetran, Airbyte).

Fig: Offloading the analytics use case to a separate warehouse, or using a read replica (to avoid load on the prod DB) and then CDCing to a warehouse.

  4. Data Lakes or Lakehouses

    • Use systems like Apache Hadoop, Spark, Delta Lake, or Iceberg.

    • Great for unstructured data and large-scale processing.

    • Typically more complex to set up and maintain. Overkill if you only have relational data and want a straightforward SQL analytics experience.

  5. Other Ways

    • Some teams create custom scripts or use open-source replication tools, then run analytics on a separate system.

    • Others try sharding or partitioning Postgres, but that can get complicated quickly.

Where a Database Outperforms a Warehouse (and Vice Versa)

Let’s be real: a transactional database like Postgres is not a silver bullet for big, complex analytics. However, it’s perfect for:

  • High-speed transactions (INSERT, UPDATE, DELETE) in real-time applications that touch only a handful of rows at a time.

  • Ensuring data integrity with ACID compliance.

  • Handling small to medium analytical tasks, especially under a few GBs of data (Note: for use cases other than analytics, Postgres works great at a few hundred GBs as well).

A specialized data warehouse, on the other hand, excels at:

  • Massively parallel processing of large datasets. [Note: ClickHouse does MPP differently]

  • Batch or streaming ingestion without impacting production environments.

  • Columnar storage that speeds up analytical queries.

  • Data-skipping techniques such as secondary indexes (MinMax, Bloom filters) and partitioning, which make warehouses much faster to scan, join, and group (see the ClickHouse sketch after this list).
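
As a rough illustration (a hypothetical table, not a prescription), this is what data skipping looks like in ClickHouse: partitions are pruned by date, the sparse primary index narrows range scans, and MinMax or Bloom-filter skip indexes let whole granules be ignored:

-- Hypothetical ClickHouse table definition; tune keys and indexes to your data.
CREATE TABLE sales
(
    order_date   Date,
    customer_id  UInt64,
    region       LowCardinality(String),
    amount       Decimal(18, 2),
    -- Data-skipping indexes: granules whose min/max range or Bloom filter
    -- cannot match the WHERE clause are never read.
    INDEX idx_amount amount TYPE minmax GRANULARITY 4,
    INDEX idx_region region TYPE bloom_filter(0.01) GRANULARITY 4
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(order_date)   -- whole months are pruned for date filters
ORDER BY (region, order_date);      -- sparse primary index for range scans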

If your primary goal is real-time analytics or large-scale aggregation queries, a warehouse or columnar store is a far better choice.

Read more here: Why Move from a Database to a Data Warehouse for Analytics?

Enter Datazip’s OneStack Data (Powered by ClickHouse)

a) What Is OneStack?

OneStack Data is our solution for making data engineers 10x more productive. It lets you ingest, store, organize, and query all your data in one place, while it takes care of infrastructure scaling, data quality, and data access.

It uses ClickHouse under the hood—an open-source, columnar database designed for speed. Because of its vectorized execution, near-real-time data ingestion, and focus on large-scale aggregations, ClickHouse excels at OLAP workloads.

Key Features of Datazip’s OneStack:

  1. Columnar Storage: Reads only the specific columns needed, greatly speeding up aggregations.

  2. Vectorized Execution: Processes data in batches, leveraging modern CPU instructions.

  3. On Demand Warehouse Horizontal Scaling using OpenEngine.

  4. Efficient Compression: Minimizes storage cost while speeding up I/O operations.

  5. Designed for Concurrency: Handles multiple analysts running queries without bogging down.

  6. Faster Ingestion

b) Replicating Data from Postgres to OneStack

OneStack provides 150+ connectors that let you replicate data from various sources—including Postgres—into ClickHouse.

We can sync data in near real time, with a sync frequency as low as 1 minute, which is a big advantage since roughly 95% of your analytics use cases can be served this way.

Here’s how Postgres real-time analytics works with OneStack:

  1. Full Load: On initial setup, OneStack extracts all your existing data from Postgres.

  2. CDC (Change Data Capture): After the full load, we capture any incremental changes (INSERTs, UPDATEs, DELETEs) and apply them to ClickHouse. This keeps your analytics data fresh.

  3. No Downtime: The replication runs without disrupting your production Postgres workload. Because it reads data from the WAL (write-ahead log), the effect on Postgres is negligible.

In just a few clicks, you can connect your Postgres database, select the tables you want to replicate, and set up an automatic or scheduled sync. Once the data lands in ClickHouse, it’s ready for near real-time analytics.
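
OneStack handles this setup for you, but if you are curious what WAL-based CDC typically requires on the Postgres side, it usually boils down to a few settings and grants. Here is a rough sketch; the names are placeholders and the exact prerequisites for OneStack’s connector may differ, so check its docs:

-- Rough sketch of typical prerequisites for WAL-based (logical) CDC.
-- User name, password, and table list below are placeholders.
ALTER SYSTEM SET wal_level = 'logical';        -- needs a server restart
CREATE USER cdc_user WITH REPLICATION PASSWORD 'choose-a-strong-password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO cdc_user;
-- A publication listing the tables you want replicated
CREATE PUBLICATION analytics_pub FOR TABLE sales, customers, products;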

Demonstration of Slow Queries on Postgres

Imagine you have a 50 GB Postgres database, with a table containing 50 million rows of sales data. You connect to your Postgres instance using DBeaver or DataGrip and run a complex query:

Let’s assume you run a query that joins multiple tables (like “sales,” “customers,” and “products”), aggregates sales amounts by region, and filters by a date range of the last 12 months. On top of that, you might need a couple of subqueries or window functions.

Even with indexes, Postgres will often do a sequential scan on multiple tables (especially if you’re scanning millions of rows or have written an unoptimised query). This might take anywhere from 30 seconds to several minutes—depending on your hardware, indexing strategy, and concurrency load.
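
For reference, a query of the shape described above might look like this (hypothetical schema and column names; your real query will differ):

-- Hypothetical sales/customers/products schema, for illustration only.
SELECT
    c.region,
    p.category,
    SUM(s.amount) AS revenue,
    RANK() OVER (PARTITION BY c.region
                 ORDER BY SUM(s.amount) DESC) AS category_rank
FROM sales s
JOIN customers c ON c.customer_id = s.customer_id
JOIN products  p ON p.product_id  = s.product_id
WHERE s.sale_date >= NOW() - INTERVAL '12 months'
GROUP BY c.region, p.category;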

Some common complaints from the community:

  • “We use Metabase on Postgres. Our dashboards load very slowly. Often, queries timeout.”

  • “Even with indexes and partitioning, big group-by queries can lock our tables (creating an index locks the table in some cases).”

  • “We tried materialized views, but refreshing them regularly is a pain.”

At the end of the day, Postgres does a great job at OLTP. But it’s not optimized for large-scale OLAP (Online Analytical Processing).

We used the TPC-H benchmark and generated the data using this.

We ran a complex join across multiple tables (e.g., orders joined with other tables, filtered by region and date, grouped by month, and aggregated by revenue and count of orders).

  • Directly on Postgres:
    Most such queries take 60+ seconds. You might see CPU spiking on the Postgres server (the server crashed on some of our queries). Your BI dashboards might even time out or show partial results.

  • On Datazip’s OneStack (ClickHouse):
    With the same logical query, we often see results in seconds. By leveraging columnar storage, vectorized execution, and MPP capabilities, ClickHouse reduces the query time drastically. Now your analysts can actually explore data interactively, drill down into details, and run ad-hoc queries without dreading slow response times.

Query1: Top 5 Customers by Revenue Contribution Over a Year

1. Direct run on Postgres [with 2 indexed columns, orderdate and custkey]

2. Run on Datazip’s OneStack

RESULT: Postgres ran the query in 2 minutes and 49 seconds.

An unoptimised ClickHouse query for the same task took 1 minute to execute, but the optimised ClickHouse query (shown above) did it in 14.2 seconds. With better query writing, you can get blazing-fast analytics with ClickHouse.
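
For reference, on the TPC-H schema a “top 5 customers by revenue contribution over a year” query generally takes a shape like this (a sketch; the exact query we ran, shown above, may differ in details):

-- Sketch on the standard TPC-H schema; revenue uses the usual
-- l_extendedprice * (1 - l_discount) convention.
SELECT
    c.c_name,
    SUM(l.l_extendedprice * (1 - l.l_discount)) AS revenue
FROM customer c
JOIN orders   o ON o.o_custkey  = c.c_custkey
JOIN lineitem l ON l.l_orderkey = o.o_orderkey
WHERE o.o_orderdate >= DATE '1995-01-01'
  AND o.o_orderdate <  DATE '1996-01-01'
GROUP BY c.c_name
ORDER BY revenue DESC
LIMIT 5;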

Query2: Top 5 Parts by Average Extended Price

1. Direct run on Postgres

2. Run on Datazip’s OneStack

RESULT: 33x faster results in ClickHouse with one INNER JOIN, GROUP BY, ORDER BY.
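
As a sketch (the exact benchmark query may differ), “top 5 parts by average extended price” on TPC-H is a single join plus an aggregate:

-- Sketch on the standard TPC-H schema.
SELECT
    p.p_name,
    AVG(l.l_extendedprice) AS avg_extended_price
FROM lineitem l
INNER JOIN part p ON p.p_partkey = l.l_partkey
GROUP BY p.p_name
ORDER BY avg_extended_price DESC
LIMIT 5;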

Query3: Total Revenue per Region for Orders in 1995

1. Direct run on Postgres [with 3 indexes on custkey, nationkey and regionkey]

2. Run on Datazip’s OneStack

RESULT: 6x faster results in ClickHouse for a query involving multiple JOINs, GROUP BYs, and various filters.
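
A “total revenue per region for 1995 orders” query walks the TPC-H key chain from region down to lineitem; a sketch (the benchmark query may differ) looks like this:

-- Sketch on the standard TPC-H schema.
SELECT
    r.r_name AS region,
    SUM(l.l_extendedprice * (1 - l.l_discount)) AS revenue
FROM region   r
JOIN nation   n ON n.n_regionkey = r.r_regionkey
JOIN customer c ON c.c_nationkey = n.n_nationkey
JOIN orders   o ON o.o_custkey   = c.c_custkey
JOIN lineitem l ON l.l_orderkey  = o.o_orderkey
WHERE o.o_orderdate >= DATE '1995-01-01'
  AND o.o_orderdate <  DATE '1996-01-01'
GROUP BY r.r_name;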

We also looked into ClickBench: a Benchmark For Analytical Databases. Take a look.

Source: ClickBench (Q0, Q1, Q2, etc. are standard queries; hovering over them shows the exact query being tested).

So, you’ve executed your queries and got the intended results. Now what? The next step is getting the data into a BI tool so you can visualise it.

Assuming you have cleaned up the data, applied some transformations, and moved it from 🥉 Bronze -> 🥈 Silver -> 🥇 Gold following the medallion architecture, it’s time to make sense of this data.

Connecting BI Tools (Metabase / Tableau / Superset / PowerBI / Redash) and Seeing Immediate Improvements in chart rendering times

Refer to our guide on “Connecting your BI tool directly on Postgres vs with OneStack Data for faster chart loading times” here.

Running Queries Across a SaaS Data Source and Multiple Databases (MongoDB + Postgres + Google Analytics)

Modern analytics often requires combining data from multiple sources—maybe you want to join MongoDB data with a SQL database or with a SaaS application’s API data.

Since Datazip supports 150+ source connectors, you can pull data from multiple DBs and SaaS sources into a unified analytical layer. Suddenly, joining across sources and building holistic dashboards becomes trivial and fast.

If your final query requires aggregating data from multiple database sources and a SaaS source, you can run those queries here as well.

  1. Sync Database1 with OneStack

Fig: 4 of 150+ connectors

  2. Sync Database2 with OneStack

Fig: Add database details to connect and sync data instantly.

  3. Sync SaaS source data with OneStack (click here to see all the SaaS source connectors we support [oh, we support many])

  4. Decide the joining key (assuming your data has some common key to make a JOIN)

  5. Run the query.

  6. Yeah, that’s basically it.

Refer to the picture below.

The query:

SELECT *
FROM
(
    SELECT *
    FROM
    (
        -- Data from MongoDB
        SELECT
            _id,
            _timestamp AS timestamp,
            delete,
            entities
        FROM ch_db.otterdb_stream_3
    ) t1
    JOIN
    (
        -- Data from Postgres DB
        SELECT *
        FROM ch_db.ankittable_dz_stag_plan_raw
    ) t2
        ON DATE(t1.timestamp) = DATE(t2._timestamp)
) t3  -- Data after joining different tables from two databases
JOIN
(
    -- Google Analytics data
    SELECT *
    FROM ch_db.daily_active_users_raw
) t4
    ON DATE(t3.timestamp) = DATE(t4._timestamp);

Here, we used data from MongoDB, Postgres, and Google Analytics to show that joining multiple databases together, layering a SaaS source on top, and running the query in under 1 second is possible with OneStack Data. So, what are you waiting for?

Summary: Postgres vs. OneStack Data

| Task | Postgres (Row Store) | OneStack (ClickHouse) |
| --- | --- | --- |
| Storage | Row-oriented | Columnar storage |
| Initial Load | Already in Postgres | Full load + CDC replication |
| Query 1: Top 5 Customers by Revenue Contribution Over a Year | ~120+ seconds | ~14.2 seconds |
| Query 2: Top 5 Parts by Average Extended Price | ~80 seconds | ~2.7 seconds |
| Query 3: Total Revenue per Region for Orders in 1995 | ~78 seconds | ~13.8 seconds |
| Materialized Views | Manual refresh, not real-time | Near real-time if needed |
| Concurrency | Slows down with big queries | Scales better for analytics |
| Ideal Use Case | OLTP, small analytics | OLAP, large-scale analytics |

(These are indicative numbers. Actual performance may vary based on your schema, hardware, and concurrency.)

Conclusion: Your Postgres Is Now Analytics-Ready

If you’re only dealing with a few million rows and your queries are straightforward, Postgres might do fine. But once your data grows into gigabytes or your queries get more complex, you’ll feel the heat.

That’s where Datazip’s OneStack Data steps in:

  • Leverage the power of ClickHouse without needing deep expertise in columnar databases.

  • Simple replication from Postgres (full load + CDC).

  • Fast query performance that can handle real-time or near real-time analytics.

  • Seamless integration with BI tools like Metabase, Tableau, and more.

Ready to Try It?

If you’re tired of waiting on queries and want to see your dashboards load in seconds, give OneStack a shot. You’ll still keep Postgres for your transactional needs, but now you’ll have a specialized engine for all your analytic workloads.

In our next post, we’ll do a hands-on benchmarking comparison (with real SQL queries and execution times). Stay tuned and see how you can level up your data game with Datazip.

Final Thoughts

  • Postgres is a sturdy OLTP database, but large-scale analytics isn’t its forte.

  • Slow dashboards and long-running queries can kill productivity.

  • OneStack Data (powered by ClickHouse) can turn minutes-long queries into sub-second responses.

  • Easy replication ensures your analytics data remains up-to-date.

  • No more frustrated data teams—just fast insights at your fingertips.

If you have any questions or want to learn more, drop us a line at hello@datazip.io or book a quick demo meeting with us.

We’re here to help you make the most of your data, from Postgres to OneStack and beyond.

P.S. If you’re reading this, there’s a high chance you’re frustrated with slow queries on your PostgreSQL data. We get it—nobody likes staring at a loading spinner. Check out Datazip’s OneStack and let your queries breathe again.


Disclaimer: Actual query speeds depend on data size, cluster configuration, and query complexity. The examples given are based on common industry patterns and ClickHouse’s known performance characteristics.

Written by Priyansh Khodiyar, Building Composable Lakehouse | DevRel at Datazip. LinkedIn: https://www.linkedin.com/in/zriyansh