Rails Query Optimization: Part 2— Advanced Indexing and Query Refactoring

In Part 1, we explored essential query optimization techniques and tools like Bullet and Prosopite gem. In this part, we delve into advanced indexing strategies and query refactoring methods that can significantly boost performance.
1. Advanced Indexing Techniques

Indexes speed up query execution by allowing the database to find rows faster. However, adding unnecessary indexes can slow down write operations.

1.1 Composite Indexes

Composite indexes are multi-column indexes that enhance performance for queries filtering on multiple columns.

Example — Adding a Composite Index:

add_index :orders, [:user_id, :created_at]

This index speeds up queries like:

Order.where(user_id: 1).order(created_at: :desc)

Best Practice: Place the most selective column first (i.e., the column with the highest number of unique values).

1.2 Partial Indexes

Partial indexes improve performance by indexing only a subset of rows that match a condition.

Example — Indexing Active Users Only:

add_index :users, :email, unique: true, where: "active = true"

This index is used only when querying active users:

User.where(active: true, email: "example@example.com")

Benefits: Reduced index size and faster writes.

1.3 Indexing JSONB Columns

For PostgreSQL, indexing JSONB columns can significantly enhance performance when querying JSON data.

Example — JSONB Index:

add_index :products, "(data->>'category')", using: :gin

This index is used when querying JSONB fields:

Product.where("data->>'category' = ?", 'Electronics')

1.4 Covering Indexes

A covering index includes all the columns required for a query, eliminating the need to access the table rows.

Example — Covering Index:

add_index :posts, [:user_id, :status, :created_at]

This index covers the following query:

Post.where(user_id: 1, status: 'published').order(created_at: :desc).pluck(:id)

Benefit: Reduced I/O operations, improving performance.

2. Query Refactoring Techniques

Refactoring queries can lead to significant performance improvements by reducing complexity and making better use of database features.

2.1 Using Subselects Efficiently

Subselects are useful for filtering data before joining tables, reducing the number of rows scanned.

Example — Refactored Subselect:

active_users = User.select(:id).where(active: true)
Post.where(user_id: active_users)

This approach is more efficient than using .joins or .includes.

2.2 Avoiding SELECT * in Queries

Fetching unnecessary columns increases memory usage and slows down queries.

Example — Optimized Query:

# Bad
Post.where(active: true)

# Good
Post.select(:id, :title).where(active: true)

Benefit: Reduced memory usage and faster query execution.

2.3 Using pluck Instead of map

Use .pluck to retrieve specific columns directly from the database, avoiding unnecessary object creation.

Example — Efficient Column Selection:

# Bad – Loads entire Active Record objects
titles = Post.where(active: true).map(&:title)

# Good – Only loads required columns
titles = Post.where(active: true).pluck(:title)

Benefit: Lower memory usage and faster query execution.

2.4 Refactoring Complex Conditions with merge

Using .merge keeps queries DRY and maintainable by reusing query scopes.

Example — Using merge for Complex Queries:

class Comment < ApplicationRecord
  scope :approved, -> { where(approved: true) }
end

Post.joins(:comments).merge(Comment.approved)

Benefit: Reusable and cleaner query logic.

3. Using Database Views and Materialized Views

Views and materialized views are powerful tools to optimize complex queries and enhance performance.

3.1 Database Views

Views are virtual tables defined by a query. They simplify complex joins and calculations.

Example — Creating a View:

CREATE VIEW active_posts AS
SELECT id, title, created_at FROM posts WHERE active = true;

In Rails, use the scenic gem to manage views:

gem 'scenic'

Then, create a view migration:

rails generate scenic:view active_posts

3.2 Materialized Views

Materialized views store query results physically, improving performance for complex calculations.

Example — Creating a Materialized View:

CREATE MATERIALIZED VIEW post_stats AS
SELECT user_id, COUNT(*) AS post_count FROM posts GROUP BY user_id;
  • Refreshing the View:
REFRESH MATERIALIZED VIEW post_stats;
  • In Rails, trigger refreshes using callbacks or background jobs.

When to Use:

  • Use database views for frequently used complex joins.

  • Use materialized views for expensive aggregations that don’t need real-time updates.

4. Real-World Query Refactoring Examples

Example 1 — Optimizing N+1 Queries

Before:

# N+1 Query Problem
posts = Post.all
posts.each do |post|
  puts post.comments.count
end

After:

# Optimized with Eager Loading
posts = Post.includes(:comments).all
posts.each do |post|
  puts post.comments.size
end

Example 2 — Refactoring Complex Joins

Before:

User.joins(:orders).where('orders.status = ?', 'completed')

After — Using merge:

class Order < ApplicationRecord
  scope :completed, -> { where(status: 'completed') }
end

User.joins(:orders).merge(Order.completed)

Example 3 — Efficient Counting

Before:

# Loads all records into memory
Post.where(active: true).size

After — Using count:

# Uses SQL COUNT(*) for better performance
Post.where(active: true).count

5. Performance Monitoring Tools

Tracking performance metrics helps identify bottlenecks and optimize application performance. Here are some recommended tools:

5.1 Rack Mini Profiler

rack-mini-profiler provides real-time performance metrics, including SQL queries, rendering time, and memory usage.

Installation:

Add to your Gemfile:

gem 'rack-mini-profiler'

Run:

bundle install

Usage:

  • Displays performance information as an overlay in the browser.

  • Identifies slow queries, N+1 queries, and memory bloat.

5.2 ScoutAPM

ScoutAPM provides in-depth insights into application performance, including detailed SQL analysis.

Features:

  • SQL query breakdown with execution times.

  • Historical performance trends.

  • Memory leak detection.

Installation:

Add to your Gemfile:

gem 'scout_apm'

Set up environment variables and run:

bundle install

5.3 Skylight

Skylight is another powerful APM tool that tracks endpoint performance, database queries, and external API calls.

Features:

  • Endpoint-specific query analysis.

  • Query breakdowns with detailed timing.

  • Historical performance metrics.

Installation:

gem 'skylight'

Run:

bundle install

6. Identifying Memory Leaks in Queries

Memory leaks can occur when queries load large datasets or Active Record objects unnecessarily.

6.1 Using the Memory Profiler

Memory Profiler helps identify memory bloat caused by inefficient queries.

Installation:

gem 'memory_profiler'

Usage:

require 'memory_profiler'
report = MemoryProfiler.report do
  Post.where(active: true).to_a
end
report.pretty_print

Output Analysis

  • Look for high memory allocation related to Active Record models.

  • Consider using .pluck or .select to optimize memory usage.

6.2 Using Rack Mini Profiler for Memory Leaks

Rack Mini Profiler also tracks memory usage by displaying detailed memory reports in the browser.

Example — Memory Optimization:

# Bad – Loads entire Active Record objects
titles = Post.where(active: true).map(&:title)
# Good – Only loads required columns
titles = Post.where(active: true).pluck(:title)

6.3 Optimizing Memory Usage with Batch Processing

For large datasets, use .find_in_batches or .in_batches to avoid loading all records at once.

Example — Using .find_in_batches:

Post.where(active: true).find_in_batches(batch_size: 100) do |batch|
  batch.each do |post|
    puts post.title
  end
end

Benefits:

  • Efficient memory usage.

  • Reduced risk of memory leaks.

Summary

Advanced indexing and query refactoring techniques can significantly enhance Rails application performance. Use composite, partial, JSONB, and covering indexes to optimize queries while minimizing unnecessary indexing. Refactor queries with subselects, pluck, and merge to reduce memory usage and improve efficiency. Leverage database views and materialized views for complex queries, and monitor performance with tools like Rack Mini Profiler, ScoutAPM, and Skylight. Finally, prevent memory leaks by using batch processing and memory profiling tools to optimize large dataset handling.

0
Subscribe to my newsletter

Read articles from NonStop io Technologies directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

NonStop io Technologies
NonStop io Technologies

Product Development as an Expertise Since 2015 Founded in August 2015, we are a USA-based Bespoke Engineering Studio providing Product Development as an Expertise. With 80+ satisfied clients worldwide, we serve startups and enterprises across San Francisco, Seattle, New York, London, Pune, Bangalore, Tokyo and other prominent technology hubs.