Understanding MongoDB Aggregation Pipeline

Ankit Dwivedi

What is an Aggregation Pipeline?

In MongoDB, an aggregation pipeline is a way to process data and run complex queries as an ordered sequence of stages. Each stage takes in documents (records), performs an operation on them, and passes the result to the next stage. The final output is the result of all these transformations applied in order.
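
To make the stage-by-stage flow concrete, here is a minimal in-memory sketch in plain JavaScript. It is only an analogy: runPipeline and the two stage functions are hypothetical helpers invented for this illustration, not MongoDB APIs, and the sample documents are made up.

```javascript
// Hypothetical helper: feed the output of each stage into the next one.
function runPipeline(docs, stages) {
    return stages.reduce((current, stage) => stage(current), docs);
}

// Two toy stages written as plain functions over arrays of documents.
const onlyAdults = (docs) => docs.filter((doc) => doc.age >= 18);
const withInitial = (docs) => docs.map((doc) => ({ ...doc, initial: doc.name[0] }));

// Made-up sample documents.
const people = [
    { name: "Asha", age: 31 },
    { name: "Ravi", age: 12 }
];

const pipelineResult = runPipeline(people, [onlyAdults, withInitial]);
console.log(pipelineResult);
```

The order of the stages matters: each one only sees what the previous stage produced.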

What is Aggregation?

Aggregation is the process of gathering data and summarizing it in a meaningful way. For example, you might want to count the number of users, find the average age of users, or sum up sales figures. In MongoDB, these operations are expressed as aggregation pipelines.
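
As a rough sketch of those summaries, the count, average, and sum can be expressed with a single $group stage. The age and purchaseTotal field names and the sample documents are assumptions made for illustration. The summarize helper is hypothetical too; it recomputes the same numbers in plain JavaScript so the snippet runs without a database.

```javascript
// A $group stage that summarizes all input documents (_id: null means one group).
// The field names `age` and `purchaseTotal` are made up for this example.
const summaryStage = {
    $group: {
        _id: null,
        userCount: { $sum: 1 },
        averageAge: { $avg: "$age" },
        totalSales: { $sum: "$purchaseTotal" }
    }
};

// Hypothetical in-memory equivalent of the stage above, for illustration only.
// Assumes a non-empty array; on empty input, $group would emit no document at all.
function summarize(users) {
    const userCount = users.length;
    const totalAge = users.reduce((sum, user) => sum + user.age, 0);
    const totalSales = users.reduce((sum, user) => sum + user.purchaseTotal, 0);
    return { _id: null, userCount, averageAge: totalAge / userCount, totalSales };
}

const summary = summarize([
    { age: 20, purchaseTotal: 50 },
    { age: 30, purchaseTotal: 150 }
]);
console.log(summary);
```

Against a real collection, the stage object would be passed in a pipeline array, for example User.aggregate([summaryStage]).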

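Example: fetching a channel profile for a given username. Here, username is the requested channel's username and req.user is the currently logged-in user.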

const channel = await User.aggregate([
    {
        $match: {username: username?.toLowerCase()}
    },
    {
        $lookup: {
            from: "subscriptions",
            localField: "_id",
            foreignField: "channel",
            as: "subscribers"
        }
    },
    {
        $lookup: {
            from: "subscriptions",
            localField: "_id",
            foreignField: "subscriber",
            as: "subscribedTo"
        }
    },
    {
        $addFields: {
            subscribersCount: {$size: "$subscribers"},
            channelsSubscribedToCount: {$size: "$subscribedTo"},
            isSubscribed: {
                $cond: {
                    if: {$in: [req.user._id, "$subscribers.subscriber"]},
                    then: true,
                    else: false
                }
            }
        }
    },
    {
        $project: {
            fullname: 1,
            username: 1,
            subscribersCount: 1,
            channelsSubscribedToCount: 1,
            isSubscribed: 1,
            email: 1,
            avatar: 1,
            coverImage: 1
        }
    }
])

Stage 1: "$match"

{
    $match: {username: username?.toLowerCase()}
}

The $match stage filters the documents, keeping only those that satisfy a condition. In this case, it looks for the user whose username equals the provided username converted to lowercase. If the provided username is "Ankit", it finds the user document with username: "ankit". The comparison is an exact, case-sensitive match by default, so this relies on usernames being stored in lowercase.
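
A plain-JavaScript analogy of this filter might look like the sketch below. The matchStage helper and the sample users are hypothetical, and the helper only covers simple equality conditions, not the full $match query language.

```javascript
// Hypothetical in-memory analogue of { $match: { field: value, ... } }
// for simple equality conditions only.
function matchStage(docs, condition) {
    return docs.filter((doc) =>
        Object.entries(condition).every(([field, value]) => doc[field] === value)
    );
}

// Made-up sample users; usernames are assumed to be stored in lowercase.
const sampleUsers = [
    { _id: 1, username: "ankit" },
    { _id: 2, username: "priya" }
];

const requestedUsername = "Ankit";
const matchedUsers = matchStage(sampleUsers, { username: requestedUsername.toLowerCase() });
console.log(matchedUsers);
```

Without the toLowerCase() call, the same lookup would return an empty array, because "Ankit" and "ankit" are different strings.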

Stage 2: "$lookup" (subscribers)

{
    $lookup: {
        from: "subscriptions",
        localField: "_id",
        foreignField: "channel",
        as: "subscribers"
    }
}

The $lookup stage performs a left outer join with another collection. Here, it searches the subscriptions collection for documents whose channel field equals the user's _id and stores the matches in a new array field called subscribers. Each matching document represents one user who has subscribed to this user's channel.
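
The join can be pictured with a small in-memory sketch. The lookupStage helper is hypothetical, it takes the foreign documents as an array rather than a collection name, and the ids are plain strings standing in for ObjectIds.

```javascript
// Hypothetical in-memory analogue of the equality form of $lookup.
// `from` is an array of documents here, not a collection name.
function lookupStage(docs, { from, localField, foreignField, as }) {
    return docs.map((doc) => ({
        ...doc,
        [as]: from.filter((other) => other[foreignField] === doc[localField])
    }));
}

// Made-up subscription documents; ids are plain strings instead of ObjectIds.
const sampleSubscriptions = [
    { subscriber: "u2", channel: "u1" },
    { subscriber: "u3", channel: "u1" },
    { subscriber: "u1", channel: "u3" }
];

const [channelDoc] = lookupStage([{ _id: "u1", username: "ankit" }], {
    from: sampleSubscriptions,
    localField: "_id",
    foreignField: "channel",
    as: "subscribers"
});
console.log(channelDoc.subscribers.length); // 2
```

A user with no matching subscription documents still passes through the stage; the subscribers array is simply empty.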

Stage 3: "$lookup" (subscribedTo)

{
    $lookup: {
        from: "subscriptions",
        localField: "_id",
        foreignField: "subscriber",
        as: "subscribedTo"
    }
}

This $lookup stage mirrors the previous one, but it matches on the subscriber field instead of channel. The matching documents are stored in a new array field called subscribedTo, with one document for each channel this user has subscribed to.
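
Both lookups read the same subscriptions collection and differ only in the field they match on. The sketch below assumes each subscription document stores two user ids, subscriber and channel; the sample data is made up and uses strings in place of ObjectIds.

```javascript
// Assumed shape of a subscription document: both fields hold user ids.
const subscriptionDocs = [
    { subscriber: "u2", channel: "u1" },
    { subscriber: "u1", channel: "u3" },
    { subscriber: "u1", channel: "u4" }
];

const userId = "u1";

// Stage 2 direction: documents where this user is the channel.
const subscribers = subscriptionDocs.filter((doc) => doc.channel === userId);

// Stage 3 direction: documents where this user is the subscriber.
const subscribedTo = subscriptionDocs.filter((doc) => doc.subscriber === userId);

console.log(subscribers.map((doc) => doc.subscriber)); // [ 'u2' ]
console.log(subscribedTo.map((doc) => doc.channel));   // [ 'u3', 'u4' ]
```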

Stage 4: "$addFields"

{
    $addFields: {
        subscribersCount: {$size: "$subscribers"},
        channelsSubscribedToCount: {$size: "$subscribedTo"},
        isSubscribed: {
            $cond: {
                if: {$in: [req.user._id, "$subscribers.subscriber"]},
                then: true,
                else: false
            }
        }
    }
}

The $addFields stage adds new fields to each document:

  • subscribersCount: Counts the number of documents in the subscribers array using the $size operator.

  • channelsSubscribedToCount: Counts the number of documents in the subscribedTo array using the $size operator.

  • isSubscribed: Checks whether the current user's _id (req.user._id) appears in "$subscribers.subscriber", which resolves to an array containing the subscriber value of every document in subscribers. If it does, isSubscribed is set to true; otherwise, it is set to false.
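
The three computed fields can be sketched in plain JavaScript as follows. The addChannelFields helper and the sample document are hypothetical, and ids are plain strings here rather than ObjectIds.

```javascript
// Hypothetical in-memory analogue of the $addFields stage above.
function addChannelFields(doc, currentUserId) {
    // "$subscribers.subscriber" resolves to the subscriber value of every array element.
    const subscriberIds = doc.subscribers.map((sub) => sub.subscriber);
    return {
        ...doc,
        subscribersCount: doc.subscribers.length,            // $size
        channelsSubscribedToCount: doc.subscribedTo.length,  // $size
        isSubscribed: subscriberIds.includes(currentUserId)  // $in
    };
}

// Made-up document as it might look after the two $lookup stages.
const enriched = addChannelFields(
    {
        _id: "u1",
        subscribers: [
            { subscriber: "u2", channel: "u1" },
            { subscriber: "u3", channel: "u1" }
        ],
        subscribedTo: [{ subscriber: "u1", channel: "u3" }]
    },
    "u2"
);
console.log(enriched.subscribersCount, enriched.channelsSubscribedToCount, enriched.isSubscribed);
// 2 1 true
```

Because $in already evaluates to true or false, the $cond wrapper makes the intent explicit but does not change the result. In the real pipeline the comparison is between ObjectIds, so req.user._id needs to be an ObjectId rather than its string form for the check to match.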

Stage 5: "$project"

{
    $project: {
        fullname: 1,
        username: 1,
        subscribersCount: 1,
        channelsSubscribedToCount: 1,
        isSubscribed: 1,
        email: 1,
        avatar: 1,
        coverImage: 1
    }
}

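The $project stage specifies which fields appear in the final output. A value of 1 includes a field, while 0 would exclude it. The _id field is included by default unless it is explicitly set to 0. In this case, the output includes the fullname, username, subscribersCount, channelsSubscribedToCount, isSubscribed, email, avatar, and coverImage fields, and every other field is dropped, including the subscribers and subscribedTo arrays.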
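
An inclusion-style projection can be sketched in memory like this. The projectStage helper is hypothetical and handles only top-level 1 flags plus the _id: 0 case; the password field in the sample document is an assumption added to show a field being dropped.

```javascript
// Hypothetical in-memory analogue of an inclusion-style $project.
function projectStage(doc, spec) {
    const output = {};
    // _id is kept unless the spec explicitly sets _id: 0.
    if (spec._id !== 0 && "_id" in doc) output._id = doc._id;
    for (const [field, flag] of Object.entries(spec)) {
        if (flag === 1 && field in doc) output[field] = doc[field];
    }
    return output;
}

// Made-up document; `password` is an assumed field that should not be returned.
const profile = projectStage(
    { _id: "u1", username: "ankit", email: "ankit@example.com", password: "hashed-value", subscribersCount: 2 },
    { username: 1, email: 1, subscribersCount: 1 }
);
console.log(profile);
```

Listing the fields to include, rather than the ones to exclude, means that any field added to the user document later stays out of the response until it is added to the projection.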

Conclusion

The MongoDB aggregation pipeline is a powerful tool for transforming and analyzing your data. In my example, I used the pipeline to fetch a user's channel profile, including subscriber counts, subscription counts, and whether the current user is subscribed to the channel.
