Understanding MongoDB Aggregation Pipeline with an Example

Rishav Kumar

MongoDB's aggregation pipeline is a framework for processing documents through a sequence of stages that filter, group, compute, and reshape data. In this article, I'll break down how it works using a specific piece of code that gathers information about a user's channel, including its subscriber and subscription counts. I learned this example from Hitesh Choudhary.

The code below performs an aggregation query on a collection of users (User), gathering detailed information about their subscriptions, subscribers, and other fields:

const channel = await User.aggregate([
  {
    $match: {
      username: username?.toLowerCase(),
    },
  },
  {
    $lookup: {
      from: "subscriptions",
      localField: "_id",
      foreignField: "channel",
      as: "peopleSeeingMe", // NOTE: subscribers
    },
  },
  {
    $lookup: {
      from: "subscriptions",
      localField: "_id",
      foreignField: "subscriber",
      as: "seeingWhom", // NOTE: subscribedTo (whom I'm seeing)
    },
  },
  {
    $addFields: {
      subscribersCount: {
        $size: "$peopleSeeingMe",
      },
      channelSubscribedTocount: {
        $size: "$seeingWhom",
      },
      isSubscribed: {
        $cond: {
          if: { $in: [req.user?._id, "$peopleSeeingMe.subscriber"] },
          then: true,
          else: false,
        },
      },
    },
  },
  {
    $project: {
      fullname: 1,
      username: 1,
      email: 1,
      subscribersCount: 1,
      channelSubscribedTocount: 1,
      isSubscribed: 1,
      avatar: 1,
      coverImage: 1,
    },
  },
]);

Step-by-Step Explanation of Each Stage

The code is divided into several stages, each performing a specific operation on the user data:

1. $match Stage

{ $match: { username: username?.toLowerCase() } }
  • Usage: This stage filters the User collection down to the document with a specific username. The incoming username is lowercased before the comparison, which makes the search effectively case-insensitive only if usernames are also stored in lowercase.

  • Result: Only the user with the matching username proceeds to the next stage.
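
A minimal sketch of how this stage could be built a little more defensively. The helper name buildMatchStage and the trimming step are hypothetical additions (they are not part of the original code), and the sketch assumes usernames are stored lowercased:

```javascript
// Hypothetical helper: builds the $match stage for a channel lookup.
// Assumption: usernames are stored lowercased and trimmed in the users collection.
function buildMatchStage(username) {
  if (typeof username !== "string" || username.trim() === "") {
    // Fail early instead of running the query with a missing username.
    throw new Error("username is required");
  }
  return { $match: { username: username.trim().toLowerCase() } };
}

console.log(JSON.stringify(buildMatchStage("  RishavKumar ")));
// {"$match":{"username":"rishavkumar"}}
```

Validating before the query is a design choice here; the original code relies on optional chaining (username?.toLowerCase()) instead.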

2. $lookup Stage: Gathering Subscribers (peopleSeeingMe)

{
  $lookup: {
    from: "subscriptions",
    localField: "_id",
    foreignField: "channel",
    as: "peopleSeeingMe",
  },
}
  • Usage: This stage performs a left outer join between the User collection and the subscriptions collection, where the user's _id matches the channel field in the subscriptions collection.

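  • Result: The resulting array, peopleSeeingMe, contains one subscription document for each user who is subscribed to the current user's channel. If nobody has subscribed, the array is empty and the user document still passes through.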
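
To make the left outer join behaviour concrete, here is a rough plain-JavaScript emulation of an equality $lookup. It is only a sketch: the lookup function and the sample data are made up for illustration, ids are plain strings rather than ObjectIds, and the real stage handles more cases (arrays, missing fields) than this does.

```javascript
// Plain-JavaScript emulation of an equality $lookup (left outer join).
// Sample data is invented; ids are strings here, not ObjectIds.
function lookup(docs, foreignDocs, { localField, foreignField, as }) {
  return docs.map((doc) => ({
    ...doc,
    // Every input document is kept; unmatched ones get an empty array.
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const users = [
  { _id: "u1", username: "alice" },
  { _id: "u2", username: "bob" },
];
const subscriptions = [
  { _id: "s1", subscriber: "u2", channel: "u1" }, // bob subscribes to alice
];

const joined = lookup(users, subscriptions, {
  localField: "_id",
  foreignField: "channel",
  as: "peopleSeeingMe",
});

console.log(joined[0].peopleSeeingMe.length); // 1 (alice has one subscriber)
console.log(joined[1].peopleSeeingMe.length); // 0 (bob has none, but is still returned)
```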

3. $lookup Stage: Gathering Subscribed Channels (seeingWhom)

{
  $lookup: {
    from: "subscriptions",
    localField: "_id",
    foreignField: "subscriber",
    as: "seeingWhom",
  },
}
  • Usage: Similar to the previous $lookup, this stage retrieves the subscriptions that the user has made (i.e., channels the user is subscribed to). It looks for entries in the subscriptions collection where the user's _id matches the subscriber field.

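  • Result: The array seeingWhom holds one subscription document for each channel the user is subscribed to. Each entry is a subscription, not a channel document; the channel itself is identified by the entry's channel field.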
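
As a small illustration of that shape, the sketch below assumes a hypothetical user document after this stage (string ids for brevity, sample values invented) and pulls the channel ids out of seeingWhom:

```javascript
// Hypothetical shape of one user after the second $lookup (string ids for brevity).
const userAfterLookup = {
  _id: "u2",
  username: "bob",
  seeingWhom: [
    { _id: "s1", subscriber: "u2", channel: "u1" },
    { _id: "s2", subscriber: "u2", channel: "u3" },
  ],
};

// Each entry is a subscription document; the channel's user id lives in `channel`.
const channelIds = userAfterLookup.seeingWhom.map((sub) => sub.channel);

console.log(channelIds); // [ 'u1', 'u3' ]
```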

4. $addFields Stage: Calculating Additional Data

{
  $addFields: {
    subscribersCount: {
      $size: "$peopleSeeingMe",
    },
    channelSubscribedTocount: {
      $size: "$seeingWhom",
    },
    isSubscribed: {
      $cond: {
        if: { $in: [req.user?._id, "$peopleSeeingMe.subscriber"] },
        then: true,
        else: false,
      },
    },
  },
}
  • Usage: This stage introduces new fields into the documents:
  1. subscribersCount: The total number of subscribers is calculated using $size, which returns the length of the peopleSeeingMe array.

  2. channelSubscribedTocount: Similarly, the count of channels the user is subscribed to is derived from the seeingWhom array.

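  3. isSubscribed: A boolean field that indicates whether the currently authenticated user (req.user?._id) is subscribed to this user's channel. The path "$peopleSeeingMe.subscriber" resolves to an array of the subscriber values from every document in peopleSeeingMe, and the $in operator checks whether req.user?._id appears in that array.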
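
One possible simplification, shown as a sketch: the aggregation $in operator already evaluates to true or false, so the $cond wrapper can be dropped without changing the result. The helper name buildAddFieldsStage and its viewerId parameter are hypothetical; viewerId stands in for req.user?._id.

```javascript
// Hypothetical helper: builds the $addFields stage for a given viewer id.
// `viewerId` stands in for req.user?._id; null is used when nobody is logged in.
function buildAddFieldsStage(viewerId) {
  return {
    $addFields: {
      subscribersCount: { $size: "$peopleSeeingMe" },
      channelSubscribedTocount: { $size: "$seeingWhom" },
      // $in evaluates to true or false, so no $cond wrapper is needed here.
      isSubscribed: { $in: [viewerId ?? null, "$peopleSeeingMe.subscriber"] },
    },
  };
}

console.log(JSON.stringify(buildAddFieldsStage("u2").$addFields.isSubscribed));
// {"$in":["u2","$peopleSeeingMe.subscriber"]}
```

Note that Mongoose does not cast values inside aggregation pipelines, so in a real application the viewer id needs to have the same type as the stored subscriber field (typically an ObjectId) for the comparison to match.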

5. $project Stage: Selecting Fields to Return

{
  $project: {
    fullname: 1,
    username: 1,
    email: 1,
    subscribersCount: 1,
    channelSubscribedTocount: 1,
    isSubscribed: 1,
    avatar: 1,
    coverImage: 1,
  },
}
  • Usage: This stage controls which fields should be included in the final output. Only the specified fields are included, such as fullname, username, email, subscribersCount, channelSubscribedTocount, isSubscribed, avatar, and coverImage.

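  • Result: The output document contains only the specified fields, plus _id, which MongoDB includes by default unless it is explicitly excluded with _id: 0. Everything else, including the peopleSeeingMe and seeingWhom arrays, is left out of the result.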
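
The inclusion behaviour can be approximated in plain JavaScript. This is a simplified sketch, not how MongoDB implements it: the project function is made up for illustration, it only handles top-level fields, and the sample document is invented.

```javascript
// Plain-JavaScript emulation of an inclusion-only $project.
// Simplification: top-level fields only; `_id` is kept unless the spec sets `_id: 0`.
function project(doc, spec) {
  const out = {};
  if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const user = {
  _id: "u1",
  username: "alice",
  email: "a@example.com",
  password: "hashed",
  peopleSeeingMe: [],
};

// Keeps _id, username, and email; drops password and peopleSeeingMe.
console.log(project(user, { username: 1, email: 1 }));
```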


Aggregation Pipeline in Action

Each stage in the MongoDB aggregation pipeline builds upon the results of the previous stage, transforming the data step-by-step. In this example:

  1. The pipeline first filters the user by username.

  2. It then gathers the user's subscribers and the channels they are subscribed to via two separate $lookup operations.

  3. Additional fields, such as the number of subscribers, subscriptions, and whether the current user is subscribed to the channel, are computed.

  4. Finally, the result is projected to only include the relevant information.
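
Putting the four steps together, the pipeline can be assembled by a small function so that it is easy to inspect or reuse. This is a sketch under assumptions: buildChannelPipeline is a hypothetical helper name, and the stages simply mirror the ones discussed in this article.

```javascript
// Hypothetical helper assembling the five stages discussed in this article.
function buildChannelPipeline(username, viewerId) {
  return [
    { $match: { username: username?.toLowerCase() } },
    { $lookup: { from: "subscriptions", localField: "_id", foreignField: "channel", as: "peopleSeeingMe" } },
    { $lookup: { from: "subscriptions", localField: "_id", foreignField: "subscriber", as: "seeingWhom" } },
    {
      $addFields: {
        subscribersCount: { $size: "$peopleSeeingMe" },
        channelSubscribedTocount: { $size: "$seeingWhom" },
        isSubscribed: {
          $cond: { if: { $in: [viewerId, "$peopleSeeingMe.subscriber"] }, then: true, else: false },
        },
      },
    },
    {
      $project: {
        fullname: 1, username: 1, email: 1, subscribersCount: 1,
        channelSubscribedTocount: 1, isSubscribed: 1, avatar: 1, coverImage: 1,
      },
    },
  ];
}

// Usage inside an async route handler (User and req come from the article's code):
// const [channel] = await User.aggregate(buildChannelPipeline(username, req.user?._id));
// if (!channel) { /* no user with that username */ }

console.log(buildChannelPipeline("Alice", "u2").map((stage) => Object.keys(stage)[0]));
// [ '$match', '$lookup', '$lookup', '$addFields', '$project' ]
```

Because aggregate resolves to an array of results, destructuring the first element and checking that it exists is one way to handle the case where no user matches.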
