Efficient Pagination for Large Datasets in Rails: Avoiding the COUNT Trap


When building applications with infinite scroll or large data tables, pagination is essential. However, the default pagination methods in Rails (like will_paginate
or kaminari
) often rely on running a COUNT(*)
query to determine the total number of entries. For large datasets, this can become a significant performance bottleneck.
The Problem with COUNT in Pagination
By default, paginating a query in Rails will execute a SELECT COUNT(*) FROM ...
to determine the total number of records. This is fine for small tables, but as your data grows, this count query can become slow and resource-intensive, especially if your query involves joins or complex filters.
A Smarter Approach: Loading Just Enough (Similar approach used in Cursor-Based Pagination, a.k.a "keyset pagination")
Instead of running a full count, you can optimize pagination by only loading the records you need for the current page, plus one extra record to check if there’s a next page. Here’s how it works:
Calculate the number of records that should be loaded for the current page:
current_records_loaded_count = page * per_page
Check if there is a next page by fetching one extra record:
next_page_first_element = records.offset(current_records_loaded_count).limit(1).first has_next_page = next_page_first_element.present?
Set the total_entries for pagination:
If there is a next page, set total_entries to the current loaded count plus one.
If not, set it to the actual number of records loaded.
total_entries = has_next_page ? current_records_loaded_count + 1 : records.size
Paginate using the calculated total_entries:
records = records.paginate(page: page, per_page: per_page, total_entries: total_entries)
Why This Works
Performance: You avoid the expensive
COUNT(*)
query, which can be slow on large tables.User Experience: For infinite scroll or "load more" interfaces, you don’t need the exact total count just whether there’s another page.
Simplicity: The logic is easy to maintain.
Example Implementation
Here’s a simplified version of the approach:
def records_search_improved
records = Records.valid_and_active
# ... apply filters and sorting ...
current_records_loaded_count = page * per_page
next_page_first_element = records.offset(current_records_loaded_count).limit(1).first
has_next_page = next_page_first_element.present?
total_entries = has_next_page ? current_records_loaded_count + 1 : records.size
records = records.paginate(page: page, per_page: per_page, total_entries: total_entries)
# ... further processing ...
end
When to Use This Pattern
When you have very large datasets.
When the exact total count is not critical for the UI.
When you want to improve response times for paginated endpoints.
Alternative Solutions
While the above method is a great fit for many use cases, there are other strategies you can consider for efficient pagination with large datasets:
1. Cursor-Based Pagination
Instead of using page numbers and offsets, cursor-based pagination uses a unique, sequential value (like an ID or timestamp) to fetch the next set of records. This method is highly performant for large or frequently changing datasets, as it avoids the performance issues of large offsets.
Example:
# Fetch records after a given ID
Record.where('id > ?', last_seen_id).limit(per_page)
Cursor-based pagination is ideal for APIs and feeds where you only need to move forward (or backward) through a dataset.
2. Estimate Count Instead of Exact Count
If you still need to display an approximate total, you can use PostgreSQL’s system tables to get a fast estimate:
result = ActiveRecord::Base.connection.execute("SELECT reltuples::bigint FROM pg_class WHERE relname = 'models';")
approximate_count = result[0]['reltuples']
This query returns an estimated row count for the table, which is much faster than a full COUNT(*)
. While not exact, it’s often “good enough” for user interfaces that just need to show a rough idea of the total.
Conclusion
By rethinking how you calculate the total number of entries for pagination, you can dramatically improve the performance of your Rails applications, especially as your data grows. This approach is particularly well-suited for infinite scroll and similar interfaces where knowing the exact total isn’t necessary.
Subscribe to my newsletter
Read articles from Alejandro Monasterios Rios directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by

Alejandro Monasterios Rios
Alejandro Monasterios Rios
Passionate Software Engineer with over 12 years of experience in software development across diverse projects and industries, including SaaS, B2B, WMF, Social Platforms, Startups, and the Gaming Industry. In recent years, my focus has revolved around full-stack engineering. My responsibilities encompass crucial aspects of software development, including architectural decision-making, starting projects from scratch with scalability in mind, code quality maintenance, analyzing and setting up UI testing strategies, and implementing code standard enforcement tools. On the front end, I worked with JavaScript/TypeScript and frameworks/libraries such as React.js and Next.js. Simultaneously, I have demonstrated proficiency on the server, working with SQL and NoSQL databases, Ruby on Rails, NodeJS, Python fastAPI, and Java Spring, often in conjunction with cloud platforms such as AWS and Azure.