Aggregation Pipelines in MongoDB

Aggregation pipelines are one of the more complex topics in MongoDB. For that reason, they are more often expected in SDE II or higher job roles and only rarely in SDE I roles.

  • An aggregation pipeline consists of one or more stages that process documents.

  • Each stage performs an operation on its input documents, such as filtering, joining, or reshaping them.

  • The documents that are output from a stage are passed to the next stage.

  • An aggregation pipeline can return results for groups of documents.

// How to write an aggregation pipeline: each object in the array is one stage.
db.collection.aggregate([
    {}, // first stage
    {}  // second stage
])
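To make the "output of one stage is the input of the next" idea concrete, here is a minimal plain-JavaScript sketch that runs without a database. It is only an in-memory imitation, not how MongoDB works internally: the sample data and the names runPipeline, matchSciFi, and keepTitleOnly are hypothetical and exist purely for illustration.

```javascript
// In-memory stand-in for a collection (hypothetical sample data).
const books = [
  { title: "Dune", genre: "Science Fiction", pages: 412 },
  { title: "Neuromancer", genre: "Science Fiction", pages: 271 },
  { title: "Emma", genre: "Romance", pages: 474 },
];

// Each "stage" is modelled as a function: array of documents in, array out.
const matchSciFi = (docs) => docs.filter((d) => d.genre === "Science Fiction");
const keepTitleOnly = (docs) => docs.map((d) => ({ title: d.title }));

// Run the stages in order, feeding each stage's output into the next one.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

const result = runPipeline(books, [matchSciFi, keepTitleOnly]);
console.log(result); // [ { title: 'Dune' }, { title: 'Neuromancer' } ]
```

The order of the stages matters here: the second function only ever sees the documents that the first one let through.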

Some stages used in aggregation pipelines:

  1. $match : In MongoDB, the $match stage is used within the aggregation framework to filter documents based on specified criteria. It is similar to the WHERE clause in SQL. Here's how you can use it:

     db.collection.aggregate([
       { $match: { field: value } }
     ])
    

    In this example:

    • db.collection.aggregate() is used to perform aggregation operations on the collection.

    • { $match: { field: value } } is the stage where you specify the filtering criteria. Replace field with the field you want to filter on and value with the value you want to match.

For instance, if you have a collection of documents representing books and you want to find books with a specific genre:

    db.books.aggregate([
      { $match: { genre: "Science Fiction" } }
    ])

This query will return all documents from the books collection where the genre field is equal to "Science Fiction".
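As a rough mental model, an equality $match behaves like a filter over the incoming documents. The sketch below imitates that in plain JavaScript so it can be run without a database. The matchStage helper and the sample data are hypothetical, and the helper only covers simple equality on top-level fields, whereas the real $match stage also accepts query operators.

```javascript
// Hypothetical sample documents standing in for the books collection.
const sampleBooks = [
  { title: "Dune", genre: "Science Fiction", year: 1965 },
  { title: "Foundation", genre: "Science Fiction", year: 1951 },
  { title: "Emma", genre: "Romance", year: 1815 },
];

// Hypothetical helper: keeps the documents whose fields equal every value
// listed in `criteria` (several fields are combined with a logical AND).
function matchStage(docs, criteria) {
  return docs.filter((doc) =>
    Object.entries(criteria).every(([field, value]) => doc[field] === value)
  );
}

const sciFi = matchStage(sampleBooks, { genre: "Science Fiction" });
console.log(sciFi.map((b) => b.title)); // [ 'Dune', 'Foundation' ]

const sciFi1965 = matchStage(sampleBooks, { genre: "Science Fiction", year: 1965 });
console.log(sciFi1965.map((b) => b.title)); // [ 'Dune' ]
```

Because $match reduces the number of documents, it is usually worth placing it as early in the pipeline as possible so that later stages have less work to do.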

  2. $lookup : In MongoDB, the $lookup stage is used within the aggregation framework to perform a left outer join between documents from two collections. This allows you to combine data from multiple collections in a single query. Here's how you can use it:

     db.collection1.aggregate([
       {
         $lookup: {
           from: "collection2",
           localField: "field1",
           foreignField: "field2",
           as: "outputField"
         }
       }
     ])
    

    In this example:

    • db.collection1.aggregate() is used to perform aggregation operations on collection1.

    • $lookup is the stage where you specify the details of the join.

    • from specifies the name of the collection to join with (collection2).

    • localField specifies the field from the input documents (collection1) to join on (field1).

    • foreignField specifies the field from the documents of the "from" collection (collection2) to join on (field2).

    • as specifies the name of the output field that will contain the joined array.

For instance, if you have two collections, orders and products, and you want to retrieve all orders with details of the corresponding products:

    db.orders.aggregate([
      {
        $lookup: {
          from: "products",
          localField: "productId",
          foreignField: "_id",
          as: "productDetails"
        }
      }
    ])

In this example, orders and products are the collections, productId is the field in the orders collection that is matched against the _id field in the products collection, and productDetails is the name of the output field that will contain the array of matching product documents. Because this is a left outer join, an order with no matching product is still returned, with an empty productDetails array.
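The join behaviour of $lookup can be imitated in plain JavaScript to see what the output documents look like. This is a simplified sketch under stated assumptions: lookupStage and the sample data are hypothetical, and here `from` is an in-memory array, whereas in MongoDB it is the name of a collection.

```javascript
// Hypothetical sample data standing in for the two collections.
const orders = [
  { _id: 1, productId: "p1", quantity: 2 },
  { _id: 2, productId: "p9", quantity: 1 }, // no product with this _id
];
const products = [
  { _id: "p1", name: "Keyboard", price: 50 },
  { _id: "p2", name: "Mouse", price: 20 },
];

// Hypothetical helper imitating a left outer join: every input document is
// kept, and the matching "foreign" documents are attached as an array.
function lookupStage(docs, { from, localField, foreignField, as }) {
  return docs.map((doc) => ({
    ...doc,
    [as]: from.filter((other) => other[foreignField] === doc[localField]),
  }));
}

const joined = lookupStage(orders, {
  from: products,
  localField: "productId",
  foreignField: "_id",
  as: "productDetails",
});

console.log(joined[0].productDetails); // [ { _id: 'p1', name: 'Keyboard', price: 50 } ]
console.log(joined[1].productDetails); // []
```

Note that productDetails is always an array, even when exactly one product matches.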

  3. $addFields : In MongoDB, the $addFields stage is used within the aggregation framework to add new fields to documents in the pipeline while keeping all of their existing fields. This stage is particularly useful when you want to include computed fields or transform existing fields. Here's how you can use it:

     db.collection.aggregate([
       {
         $addFields: {
           newField: expression
         }
       }
     ])
    

    In this example:

    • db.collection.aggregate() is used to perform aggregation operations on the collection.

    • $addFields is the stage where you specify the fields to be added.

    • newField is the name of the new field you want to add.

    • expression is the expression used to compute the value of the new field.

For instance, if you have a collection of documents representing employees and you want to add a new field totalSalary that combines salary and bonus:

    db.employees.aggregate([
      {
        $addFields: {
          totalSalary: { $sum: ["$salary", "$bonus"] }
        }
      }
    ])

In this example, $sum is an aggregation operator that adds up the values in the provided array, ignoring any that are not numeric. "$salary" and "$bonus" refer to the existing salary and bonus fields of each document (the $ prefix marks a field path), and totalSalary is the new field that will contain the sum of salary and bonus for each document.

You can use any valid aggregation expression to compute the value of the new field, including arithmetic operators (such as $add or $multiply) and string concatenation ($concat).
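The effect of the totalSalary example can be sketched in plain JavaScript, which makes it easy to see that the existing fields are kept and one field is added. This is only an imitation under stated assumptions: addTotalSalary and the sample employees are hypothetical, and skipping non-numeric values is meant to mirror how $sum treats them.

```javascript
// Hypothetical sample documents standing in for the employees collection.
const employees = [
  { name: "Asha", salary: 50000, bonus: 5000 },
  { name: "Ravi", salary: 42000 }, // this document has no bonus field
];

// Hypothetical helper: copies each document and adds a totalSalary field.
// Values that are not numbers (including missing fields) are skipped.
function addTotalSalary(docs) {
  return docs.map((doc) => ({
    ...doc,
    totalSalary: [doc.salary, doc.bonus]
      .filter((value) => typeof value === "number")
      .reduce((sum, value) => sum + value, 0),
  }));
}

const withTotals = addTotalSalary(employees);
console.log(withTotals[0]); // { name: 'Asha', salary: 50000, bonus: 5000, totalSalary: 55000 }
console.log(withTotals[1]); // { name: 'Ravi', salary: 42000, totalSalary: 42000 }
```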

  4. $project : In MongoDB, the $project stage is used within the aggregation framework to shape documents by including, excluding, or renaming fields. It allows you to reshape documents before passing them to the next stage in the aggregation pipeline. Here's how you can use it:

     db.collection.aggregate([
       {
         $project: {
           field1: 1,          // include field1
           field2: 1,          // include field2
           newField: "$field3", // include field3 and rename it as newField
           _id: 0             // exclude _id field
         }
       }
     ])
    

    In this example:

    • db.collection.aggregate() is used to perform aggregation operations on the collection.

    • $project is the stage where you specify the fields to be included, excluded, or renamed.

    • field1: 1 and field2: 1 include the fields field1 and field2 in the output document.

    • newField: "$field3" includes field3 in the output document but renames it as newField.

    • _id: 0 excludes the _id field from the output document.

For instance, if you have a collection of documents representing employees and you only want to include their name and age fields in the output:

    db.employees.aggregate([
      {
        $project: {
          name: 1,
          age: 1,
          _id: 0
        }
      }
    ])

This will output documents containing only the name and age fields, with the _id field excluded.

Additionally, you can use $project to create computed fields, apply expressions, or reshape documents according to your requirements.
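For comparison, the name-and-age projection above can be imitated in plain JavaScript. Unlike the $addFields imitation, which copies every field, a projection builds a new document containing only the listed fields. The projectNameAndAge helper and the sample data below are hypothetical and only serve as an illustration.

```javascript
// Hypothetical sample documents standing in for the employees collection.
const staff = [
  { _id: 1, name: "Asha", age: 31, salary: 50000 },
  { _id: 2, name: "Ravi", age: 27, salary: 42000 },
];

// Hypothetical helper imitating { $project: { name: 1, age: 1, _id: 0 } }:
// only name and age are copied into the output document.
function projectNameAndAge(docs) {
  return docs.map(({ name, age }) => ({ name, age }));
}

const projected = projectNameAndAge(staff);
console.log(projected); // [ { name: 'Asha', age: 31 }, { name: 'Ravi', age: 27 } ]
```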

The database is always on another continent, so always use await: every database call goes over the network and takes time to complete.

Content Resources:

  • Courtesy: Hitesh Choudhary

  • For a detailed video explanation, see:

    https://youtu.be/SUZKhBvxW5c?si=rR4azTi6cJ4rNHn7

Written by

Agnibha Chakraborty