Why We Ditched the ActiveRecord Pattern

Early in building our event-driven microservice architecture, we made a fundamental decision: to abandon the traditional ActiveRecord pattern in favor of a more explicit approach that combines the Repository and Record patterns. This article explains why we made that choice and what it has brought to our system.

The Problem with ActiveRecord

The ActiveRecord pattern, popularized by frameworks like Ruby on Rails, combines data access and business logic into a single object. While this approach offers simplicity and rapid development for smaller applications, it introduces several challenges as systems grow:

Implicit Database Access: ActiveRecord objects can load and persist data implicitly, making it difficult to track and optimize database operations.
N+1 Query Problems: The ease of accessing associations often leads to inefficient query patterns.
Mutable State: ActiveRecord objects maintain a mutable state, which can lead to unexpected side effects.
Mixed Concerns: Query logic, business rules, and data structure are all combined in one class.
Testing Complexity: The tight coupling to the database makes unit testing more challenging.

Our Alternative: Repository and Record Patterns

Instead of ActiveRecord, we adopted a combination of two patterns:

Record Pattern

Records are immutable value objects that represent data entities. They:

Define the structure of data with typed fields
Represent relationships between entities
Are immutable, preventing unexpected state changes
Focus solely on data structure, not data access

module Account
  class Record < Verse::Model::Record::Base
    type "iam/accounts" # Used for JSON API serialization

    field :id, type: Integer, primary: true
    field :email, type: String
    field :account_type, type: String
    # Note: readonly keyword is used for reflection only, to generate update schema.
    # A record is always a read-only structure
    field :created_at, type: Time, readonly: true 
    field :updated_at, type: Time, readonly: true
    field :password_digest, type: String, visible: false

    belongs_to :person, repository: "Person::Repository", foreign_key: :person_id
    has_many :roles, repository: "Account::Role::Repository", foreign_key: :account_id
  end
end
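To make the immutability property concrete without depending on the framework, here is a minimal plain-Ruby sketch. AccountRecord and its with method are hypothetical names chosen for illustration; this is not how Verse::Model::Record::Base is implemented.

```ruby
# Hypothetical stand-in for a record class; not Verse's implementation.
class AccountRecord
  attr_reader :id, :email, :account_type

  def initialize(id:, email:, account_type:)
    @id = id
    @email = email
    @account_type = account_type
    freeze # shallow freeze: no instance variable can be reassigned afterwards
  end

  # Returns a new record with some fields replaced; the receiver is untouched.
  def with(**changes)
    self.class.new(**to_h.merge(changes))
  end

  def to_h
    { id: id, email: email, account_type: account_type }
  end
end

account = AccountRecord.new(id: 1, email: "old@example.com", account_type: "admin")
changed = account.with(email: "new@example.com")

begin
  account.instance_variable_set(:@email, "other@example.com")
rescue FrozenError => e
  puts "mutation rejected: #{e.class}"
end

puts account.email # old@example.com
puts changed.email # new@example.com
```

Because there are no writer methods and the object is frozen, the only way to get different data is to build a new record, which is the property the rest of this article relies on.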

Repository Pattern

Repositories handle data access and persistence. They:

Provide CRUD operations for records
Encapsulate query logic
Handle database-specific concerns
Manage transactions and consistency
Control authorization and access rules

module Account
  class Repository < Verse::Sequel::Repository
    self.table = "accounts" # Database table name
    self.resource = "iam:accounts" # Type used for event publishing & security scoping

    def scoped(action)
      # Scope the resource accessible based on auth_context provided when 
      # creating the repository
      auth_context.can!(action, self.class.resource) do |scope|
        scope.all? { table }
        scope.own? { table.where(id: auth_context.metadata[:id]) }
      end
    end

    # Custom filter for `index` and `find_by` queries
    custom_filter :role_name do |collection, value|
      frag = <<-SQL
        EXISTS (
          SELECT 1
          FROM account_roles
          WHERE account_roles.account_id = accounts.id
          AND account_roles.name IN (?)
        )
      SQL

      value = [value] unless value.is_a?(Array)
      collection.where(Sequel.lit(frag, value))
    end
  end
end

Key Advantages Over ActiveRecord

1. Immutable Records = Explicit Database Actions

With ActiveRecord, it's easy to modify an object and forget to save it, or conversely, accidentally persist changes:

# ActiveRecord approach
user = User.find(1)
user.email = "new@example.com"  # Changed but not saved!
# ... later in the code ...
user.save  # Oops, saved without realizing it had been changed

With our Record/Repository approach, all database operations are explicit:

# Repository/Record approach
user = user_repo.find(1)
# user.email = "new@example.com"  # Error! Records are immutable
updated_user = user_repo.update!(1, { email: "new@example.com" })  # Explicit database operation

This explicitness is crucial when database operations can be slow or expensive. There's no ambiguity about when data is being read from or written to the database.
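The sketch below illustrates that explicitness with an in-memory stand-in. InMemoryUserRepo, UserRecord, and the log are hypothetical; only the find and update! method shapes are taken from the example above.

```ruby
# Hypothetical in-memory repository; it only mimics the find/update! shape.
UserRecord = Struct.new(:id, :email, keyword_init: true)

class InMemoryUserRepo
  attr_reader :log

  def initialize(rows)
    @rows = rows # { id => { id:, email: } }
    @log = []    # every simulated database round trip is recorded here
  end

  def find(id)
    @log << [:read, id]
    UserRecord.new(**@rows.fetch(id)).freeze
  end

  def update!(id, attrs)
    @log << [:write, id]
    @rows[id] = @rows.fetch(id).merge(attrs)
    UserRecord.new(**@rows[id]).freeze
  end
end

repo = InMemoryUserRepo.new({ 1 => { id: 1, email: "old@example.com" } })
user = repo.find(1)
updated = repo.update!(1, { email: "new@example.com" })

puts user.email    # the record read earlier still holds the old value
puts updated.email # the returned record reflects the write
p repo.log         # one read and one write, each tied to a visible call
```

Every entry in the log corresponds to a method call you can point at in the source, which is what makes slow or expensive operations easy to spot in review.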

2. Prevention of N+1 Queries

The N+1 query problem is a common performance issue with ActiveRecord:

# ActiveRecord approach - generates N+1 queries
users = User.all
users.each do |user|
  puts user.posts # Each iteration triggers a new query
end

Our Repository pattern makes it almost impossible to fall into this trap because associations aren't automatically loaded:

# Repository approach
users = user_repo.index({})
# users.each { |user| puts user.posts.count }  # Error! No implicit loading

# Instead, you must explicitly include associations or use optimized queries
users_with_posts = user_repo.index({}, included: ["posts"])
users_with_posts.first.posts # It is accessible and loaded now.

# Or better yet, create a specific query method
users_with_post_counts = user_repo.index_with_post_counts

This forces developers to think about data access patterns upfront, leading to more efficient queries.
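To see the difference in query counts, here is a small sketch using a fake store that counts round trips. FakeDb and its methods are hypothetical and stand in for a real database client.

```ruby
# Hypothetical fake store that counts queries; not a real database client.
class FakeDb
  attr_reader :query_count

  def initialize(users, posts)
    @users = users
    @posts = posts
    @query_count = 0
  end

  def all_users
    @query_count += 1
    @users
  end

  def posts_for(user_ids)
    @query_count += 1
    @posts.select { |post| user_ids.include?(post[:user_id]) }
  end
end

users = [{ id: 1 }, { id: 2 }, { id: 3 }]
posts = [{ id: 10, user_id: 1 }, { id: 11, user_id: 1 }, { id: 12, user_id: 3 }]

# N+1: one query for the users, then one more per user.
lazy_db = FakeDb.new(users, posts)
lazy_db.all_users.each { |user| lazy_db.posts_for([user[:id]]) }

# Explicit include: one query for the users, one batched query for all posts.
eager_db = FakeDb.new(users, posts)
loaded = eager_db.all_users
by_user = eager_db.posts_for(loaded.map { |u| u[:id] }).group_by { |post| post[:user_id] }

puts lazy_db.query_count  # 4
puts eager_db.query_count # 2
```

The lazy version grows linearly with the number of users, while the batched version stays at two queries regardless of the result size.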

3. Separation of Query Logic from Business Logic

In ActiveRecord, complex query logic often gets mixed with business logic:

# ActiveRecord approach - query logic mixed with business logic
class User < ActiveRecord::Base
  def self.active_premium_users_with_recent_activity
    where(status: 'active', plan: 'premium')
      .joins(:activities)
      .where('activities.created_at > ?', 30.days.ago)
      .distinct
  end

  def can_access_premium_feature?
    premium? && active?
  end
end

Our approach cleanly separates these concerns:

# Repository - handles query logic
class User::Repository < Verse::Sequel::Repository
  def active_premium_users_with_recent_activity
    scoped(:read)
      .where(status: 'active', plan: 'premium')
      .join(:activities, user_id: :id)
      .where(Sequel[:activities][:created_at] > Sequel.lit('NOW() - INTERVAL \'30 days\''))
      .distinct
  end
end

# Service - handles business logic
class UserService < Verse::Service::Base
  use_repo repo: User::Repository

  def can_access_premium_feature?(user_id)
    user = repo.find(user_id)
    user.plan == 'premium' && user.status == 'active'
  end
end

This separation makes code more maintainable and easier to test. It also lets us optimize a query later in development without touching the business logic that depends on it.
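As an illustration of the testing benefit, the sketch below exercises a business rule against a stub repository with no database involved. PremiumAccess, StubRepo, and PlanUser are hypothetical plain-Ruby classes; they do not use Verse::Service::Base.

```ruby
# Hypothetical plain-Ruby service and stub; not built on Verse classes.
PlanUser = Struct.new(:id, :plan, :status, keyword_init: true)

class PremiumAccess
  def initialize(repo)
    @repo = repo
  end

  # Business rule only: no query logic lives here.
  def can_access_premium_feature?(user_id)
    user = @repo.find(user_id)
    user.plan == "premium" && user.status == "active"
  end
end

# Stub repository: any object that responds to find works.
class StubRepo
  def initialize(users)
    @users = users.to_h { |user| [user.id, user] }
  end

  def find(id)
    @users.fetch(id)
  end
end

repo = StubRepo.new([
  PlanUser.new(id: 1, plan: "premium", status: "active"),
  PlanUser.new(id: 2, plan: "premium", status: "suspended"),
  PlanUser.new(id: 3, plan: "free", status: "active")
])
service = PremiumAccess.new(repo)

puts service.can_access_premium_feature?(1) # true
puts service.can_access_premium_feature?(2) # false
```

Since the service only depends on the repository's interface, the rule can be verified in isolation and the real query can change without this test changing.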

4. Virtual Repositories for Complex Data Access

One powerful feature of our approach is the ability to create "virtual" repositories that aren't tied to a specific table but represent a projection or a complex query:

module QueryResult
  class Repository < Verse::Sequel::Repository
    attr_accessor :query_id

    def initialize(auth_context, query_id, metadata: {})
      super(auth_context, metadata:)
      @query_id = query_id
    end

    # Redefine table as a query with complex from-clause.
    def table
      # Complex SQL query that joins multiple tables and calculates relevance scores
      sql_statement = Sequel.lit(
        complex_query_fragment,
        query_id: query_id,
        # other parameters...
      )

      client { |db| db.from(sql_statement) }
    end

    def scoped(action)
      # Use this repo as a read-only repo
      raise ArgumentError, "is read-only" unless action == :read
      super
    end
  end
end

This allows us to encapsulate complex data access patterns in a clean, reusable way. The repository can handle the complexity of joining multiple tables, calculating derived values, or even accessing external services, while still presenting a consistent interface to the rest of the application.

5. Automatic Event Publishing

In a microservice architecture, communication between services is crucial. Our repository pattern automatically publishes events to an event bus on mutative actions:

module Instance
  class Repository < Verse::Sequel::Repository
    self.resource = "quiz:instances"

    event(name: "completed")
    def complete!(instance_id)
      no_event do # Optionally, prevent the `updated` event from being published,
                  # as we replace it with `completed`
        update!(
          instance_id,
          {
            ended_at: Time.now,
            status: "completed"
          }
        )
      end
    end
  end
end

When complete! is called, it automatically publishes a "completed" event to the event bus after the database operation succeeds. Other services can subscribe to these events to react accordingly.

In Verse, the parameters passed to the repository method are sent in the event payload. In the case above, we get the event quiz:instances:completed(resource_id=instance_id, payload={}).

The no_event block allows us to perform nested operations without triggering additional events, preventing event cascades.
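For readers curious how such a macro can work, here is a minimal plain-Ruby sketch of the idea. It is an assumption about the general technique (wrapping the next defined method via method_added), not Verse's actual implementation; EventPublishing, publish, and published are hypothetical names.

```ruby
# Hypothetical sketch of an `event` macro; Verse's real implementation may differ.
module EventPublishing
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Remember the event name; the next method defined will be wrapped.
    def event(name:)
      @pending_event = name
    end

    def method_added(method_name)
      super
      event_name = @pending_event
      return unless event_name

      @pending_event = nil
      original = instance_method(method_name)
      # Redefining the method re-enters method_added, but @pending_event is
      # already cleared, so the wrapper itself is not wrapped again.
      define_method(method_name) do |*args|
        result = original.bind(self).call(*args)
        publish(event_name, args) # only reached if the original did not raise
        result
      end
    end
  end

  def published
    @published ||= []
  end

  def publish(name, args)
    published << [name, args] unless @muted
  end

  # Suppress events published inside the block.
  def no_event
    previous = @muted
    @muted = true
    yield
  ensure
    @muted = previous
  end
end

class InstanceRepo
  include EventPublishing

  def update!(id, attrs)
    publish("updated", [id, attrs])
    attrs
  end

  event(name: "completed")
  def complete!(id)
    no_event { update!(id, { status: "completed" }) }
  end
end

repo = InstanceRepo.new
repo.complete!(42)
p repo.published # [["completed", [42]]]
```

The nested update! call stays silent because it runs inside no_event, so subscribers see a single completed event rather than an updated event followed by a completed one.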

6. Query/Event Method Flagging for Master/Replica Setups

In a distributed system with read replicas, it's important to direct read queries to replicas and write operations to the master. Our approach makes this explicit:

module Instance
  class Repository < Verse::Sequel::Repository
    # Write operation - goes to master
    def update_status!(id, status)
      update!(id, { status: })
    end

    # Flag this method as read operation - can go to replica
    query
    def exists_for_quiz?(quiz_id)
      scoped(:read)
        .where(quiz_id: quiz_id)
        .select(1)
        .limit(1)
        .any?
    end
  end
end

Methods marked with query are automatically routed to read replicas, while other methods go to the master. This simple annotation makes it easy to optimize database load without complex configuration or middleware.

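There is a catch: when a read is followed by a write, the read may need to run on the master as well, for example so that the write does not act on stale replica data. In that case, you can use Repository#with_db_mode(:rw, &block) to force usage of the master node.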
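The routing rule can be sketched in a few lines of plain Ruby. Only the query annotation and with_db_mode(:rw) come from the text above; RoutedRepo, db_mode_for, read_methods, and the :r mode name are hypothetical and chosen for illustration.

```ruby
# Hypothetical sketch of read/write routing; not Verse's implementation.
class RoutedRepo
  class << self
    # Flag the next method defined as a read operation.
    def query
      @flag_next = true
    end

    def read_methods
      @read_methods ||= []
    end

    def method_added(name)
      super
      return unless @flag_next

      @flag_next = false
      read_methods << name
    end
  end

  # Pick the node a method would run on: :r for replica, :rw for master.
  def db_mode_for(method_name)
    return @forced_mode if @forced_mode

    self.class.read_methods.include?(method_name) ? :r : :rw
  end

  # Force every call inside the block onto the given node.
  def with_db_mode(mode)
    previous = @forced_mode
    @forced_mode = mode
    yield
  ensure
    @forced_mode = previous
  end
end

class InstanceRepo < RoutedRepo
  def update_status!(id, status); end

  query
  def exists_for_quiz?(quiz_id); end
end

repo = InstanceRepo.new
puts repo.db_mode_for(:exists_for_quiz?) # r
puts repo.db_mode_for(:update_status!)   # rw
repo.with_db_mode(:rw) { puts repo.db_mode_for(:exists_for_quiz?) } # rw
```

Defaulting unflagged methods to the master is the conservative choice: forgetting the annotation costs some replica offloading, but never sends a write to a read-only node.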

7. Table-Level Authorization with Scoped Methods

Authorization is a cross-cutting concern that's often awkwardly implemented in ActiveRecord. Our repository pattern elegantly handles this with scoped methods:

module Account
  class Repository < Verse::Sequel::Repository
    def scoped(action)
      auth_context.can!(action, "iam:accounts") do |scope|
        scope.all? { table }  # Admins can access all accounts

        scope.by_ou? do # Scoped by organizational units, with the specific ou stored in the context itself
          ou = auth_context[:ou]
          auth_context.reject! unless ou
          Service::TableQuery.by_related_ou(table, ou, related_table: :people, foreign_key: :person_id)
        end

        scope.own? { table.where(id: auth_context.metadata[:id]) }  # Users can access their own account
      end
    end
  end
end

This approach:

Centralizes authorization logic in the repository
Makes it hard to accidentally bypass authorization, because every query starts from scoped
Allows for fine-grained access control based on user context
Keeps authorization logic close to the data it protects
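To show the mechanics of a scope block, here is a minimal sketch over an in-memory table. AuthContext, Scope, and Unauthorized are hypothetical classes that imitate the can!/scope shape used above; they are not Verse's implementation, and arrays stand in for database tables.

```ruby
# Hypothetical auth context imitating the can!/scope shape; not Verse's code.
class Unauthorized < StandardError; end

class Scope
  attr_reader :result

  def initialize(level)
    @level = level
    @matched = false
  end

  def matched?
    @matched
  end

  def all?(&block)
    resolve(:all, &block)
  end

  def own?(&block)
    resolve(:own, &block)
  end

  private

  # Run the block only for the scope level granted to this user.
  def resolve(level)
    return unless @level == level

    @matched = true
    @result = yield
  end
end

class AuthContext
  attr_reader :metadata

  def initialize(rights, metadata)
    @rights = rights # e.g. { ["iam:accounts", :read] => :own }
    @metadata = metadata
  end

  # Yields a scope; raises if no branch matched the user's rights.
  def can!(action, resource)
    scope = Scope.new(@rights[[resource, action]])
    yield scope
    raise Unauthorized, "#{action} on #{resource}" unless scope.matched?

    scope.result
  end
end

ACCOUNTS = [{ id: 1, email: "a@example.com" }, { id: 2, email: "b@example.com" }].freeze

def scoped(auth_context, action)
  auth_context.can!(action, "iam:accounts") do |scope|
    scope.all? { ACCOUNTS }
    scope.own? { ACCOUNTS.select { |row| row[:id] == auth_context.metadata[:id] } }
  end
end

admin = AuthContext.new({ ["iam:accounts", :read] => :all }, { id: 1 })
user = AuthContext.new({ ["iam:accounts", :read] => :own }, { id: 2 })

p scoped(admin, :read).size # 2
p scoped(user, :read)       # only the row with id 2
```

A user with no matching right falls through every branch and gets an error instead of an unfiltered table, so the failure mode is a denial rather than a leak.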

Real-World Comparison

Let's compare a typical ActiveRecord implementation with our Repository/Record approach for a common task: finding users with a specific role and updating their status.

ActiveRecord Approach

# ActiveRecord implementation
class User < ActiveRecord::Base
  has_many :roles

  def self.with_role(role_name)
    joins(:roles).where(roles: { name: role_name })
  end
end

# Usage
admin_users = User.with_role('admin')
admin_users.update_all(status: 'active')

Issues with this approach:

Authorization is not enforced
update_all skips callbacks and validations, and no event is published
It's not clear if this should run on master or replica
Complex queries would mix with the User model

Repository/Record Approach

# Repository implementation
module User
  class Repository < Verse::Sequel::Repository
    self.table = "users"
    self.resource = "iam:users"

    def scoped(action)
      auth_context.can!(action, "iam:users") do |scope|
        scope.all? { table }
        scope.own? { table.where(id: auth_context.metadata[:id]) }
      end
    end

    custom_filter :role_name do |collection, value|
      collection.join(:user_roles, user_id: :id).where(Sequel[:user_roles][:name] => value)
    end

    event(name: "status_updated")
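    def update_status_for_role!(role_name, status)
      # Filter with a subquery: custom filters only apply to `index` and
      # `find_by`, and `users` has no role_name column
      has_role = Sequel.lit(
        "EXISTS (SELECT 1 FROM user_roles WHERE user_roles.user_id = users.id AND user_roles.name = ?)",
        role_name
      )
      # Prevent the `updated` event from being published; it is superseded by `status_updated`
      no_event { scoped(:update).where(has_role).update(status: status) }
    end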
  end
end

# Usage
user_repo.update_status_for_role!('admin', 'active')

Benefits of this approach:

Authorization is automatically enforced
An event is published for other services
The method name makes it clear it's a write operation
Query logic is encapsulated in the repository

Conclusion

Switching from ActiveRecord to the Repository/Record pattern has been transformative for our microservice architecture. While it required more upfront design and slightly more code, the benefits have far outweighed the costs:

Explicit database operations prevent accidental queries and make performance bottlenecks obvious
Immutable records lead to more predictable code with fewer side effects
Separation of concerns makes our codebase more maintainable and testable
Built-in event publishing facilitates communication between microservices
Query/event flagging optimizes database load in distributed systems
Integrated authorization ensures consistent access control

For complex, distributed systems, especially those with microservice architectures, the explicitness and separation of concerns provided by the Repository/Record pattern offer significant advantages over the traditional ActiveRecord approach.

What's Next?

In future articles, we'll dive deeper into specific aspects of our architecture:

How we implement complex role-based authorization across microservices
Strategies for testing repositories and records effectively
Techniques for optimizing database queries and transactions
Real-time performance monitoring

Have you experimented with alternatives to ActiveRecord in your projects? We'd love to hear about your experiences and challenges. Share your thoughts in the comments below or reach out to our team to discuss how these patterns might benefit your architecture.

This article is part of our series on event-driven microservice architecture. Stay tuned for more insights into how we've built a scalable, maintainable system.
