Transform Your Data: A Practical Guide to MongoDB Aggregation

Nitin SainiNitin Saini
11 min read

Aggregation in MongoDB is a robust framework that allows you to process and transform data within your collections to derive meaningful insights and summarized results. It's like having a miniature data processing engine directly within your database, enabling complex data manipulations that go beyond simple queries.

What is Aggregation Pipeline?

At its heart, MongoDB's aggregation operates through an aggregation pipeline. Imagine a conveyor belt where your documents (data records) are fed in at one end. As they travel along the belt, they pass through a series of specialized workstations, each called a stage. Each stage performs a specific operation on the documents it receives and then passes its modified output to the next stage in the sequence.

Aggregation Pipeline

This sequential processing allows you to build sophisticated data transformations by chaining together multiple operations. The output of an aggregation pipeline is typically a cursor. Just like with a standard find() operation, a cursor is a pointer to the result set. This means that MongoDB doesn't necessarily return all the aggregated data at once, especially for large results. Instead, it provides a cursor that you can iterate over to retrieve the results one by one, managing memory efficiently and allowing for flexible consumption of the processed data.

Why Do We Need Aggregation in MongoDB?

Aggregation is essential in MongoDB for several key reasons:

  • Summarizing and Analyzing Data: It helps in calculating totals, averages, counts, minimums, maximums, and other statistics across sets of documents. For example, you can easily get total revenue per region or average product price within a category.

  • Shaping and Transforming Documents: Aggregation allows you to modify the structure of documents, by creating new fields, removing irrelevant ones, or renaming fields. This is especially useful when tailoring data for a specific application or generating customized reports.

  • Joining Data Across Collections: Even though MongoDB uses a flexible schema, you can still combine data from multiple collections using $lookup. This is useful for enriching data with related information or flattening data structures temporarily.

  • Handling Complex Data Workflows: Aggregation is powerful for multi-step processing, like identifying patterns, filtering based on dynamic conditions, or calculating moving averages and other trends.

  • Server-Side Efficiency: Since all aggregation operations are executed on the MongoDB server, data is processed where it lives, reducing network overhead and boosting performance, particularly for large-scale datasets.

Common Aggregation Pipeline Stages

Here are some of the most frequently used stages in an aggregation pipeline, along with their features and examples. Assuming, you already know how to work with mongoDB basic document insert, update, get, and delete operations. Below is a dataset based on we will create an aggregation pipeline. But, first let’s know about the pipeline stages.

[
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f0" },
    "transactionId": "TRX-0000001",
    "userId": "user_A",
    "orderDate": { "$date": "2025-07-10T14:30:00Z" },
    "items": [
      { "productId": "PROD-ABC", "productName": "Laptop", "category": "Electronics", "quantity": 1, "unitPrice": 1200.00 },
      { "productId": "PROD-XYZ", "productName": "Wireless Mouse", "category": "Electronics", "quantity": 1, "unitPrice": 25.50 }
    ],
    "totalAmount": 1225.50,
    "paymentMethod": "Credit Card",
    "shippingAddress": {
      "street": "101 Cyber City",
      "city": "Bengaluru",
      "state": "Karnataka",
      "zipCode": "560001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f1" },
    "transactionId": "TRX-0000002",
    "userId": "user_B",
    "orderDate": { "$date": "2025-07-15T09:15:00Z" },
    "items": [
      { "productId": "PROD-123", "productName": "History Book", "category": "Books", "quantity": 2, "unitPrice": 18.00 }
    ],
    "totalAmount": 36.00,
    "paymentMethod": "UPI",
    "shippingAddress": {
      "street": "22 Library Road",
      "city": "Delhi",
      "state": "Delhi",
      "zipCode": "110001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f2" },
    "transactionId": "TRX-0000003",
    "userId": "user_C",
    "orderDate": { "$date": "2025-06-20T11:00:00Z" },
    "items": [
      { "productId": "PROD-456", "productName": "Cotton T-Shirt", "category": "Apparel", "quantity": 3, "unitPrice": 22.00 },
      { "productId": "PROD-789", "productName": "Running Shoes", "category": "Sports", "quantity": 1, "unitPrice": 75.00 }
    ],
    "totalAmount": 141.00,
    "paymentMethod": "Debit Card",
    "shippingAddress": {
      "street": "500 Lake View",
      "city": "Mumbai",
      "state": "Maharashtra",
      "zipCode": "400001",
      "country": "India"
    },
    "status": "shipped"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f3" },
    "transactionId": "TRX-0000004",
    "userId": "user_D",
    "orderDate": { "$date": "2025-07-01T16:45:00Z" },
    "items": [
      { "productId": "PROD-JKL", "productName": "Smartwatch", "category": "Electronics", "quantity": 1, "unitPrice": 199.99 }
    ],
    "totalAmount": 199.99,
    "paymentMethod": "Credit Card",
    "shippingAddress": {
      "street": "30 Green Street",
      "city": "Hyderabad",
      "state": "Telangana",
      "zipCode": "500001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f4" },
    "transactionId": "TRX-0000005",
    "userId": "user_E",
    "orderDate": { "$date": "2025-07-18T10:00:00Z" },
    "items": [
      { "productId": "PROD-MNO", "productName": "Coffee Maker", "category": "Home & Kitchen", "quantity": 1, "unitPrice": 95.00 },
      { "productId": "PROD-PQR", "productName": "Mug Set", "category": "Home & Kitchen", "quantity": 1, "unitPrice": 30.00 }
    ],
    "totalAmount": 125.00,
    "paymentMethod": "Net Banking",
    "shippingAddress": {
      "street": "45 Park Avenue",
      "city": "Chennai",
      "state": "Tamil Nadu",
      "zipCode": "600001",
      "country": "India"
    },
    "status": "pending"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f5" },
    "transactionId": "TRX-0000006",
    "userId": "user_F",
    "orderDate": { "$date": "2025-06-05T18:20:00Z" },
    "items": [
      { "productId": "PROD-STU", "productName": "Gaming Console", "category": "Electronics", "quantity": 1, "unitPrice": 499.00 },
      { "productId": "PROD-VWX", "productName": "Gaming Headset", "category": "Accessories", "quantity": 1, "unitPrice": 79.99 }
    ],
    "totalAmount": 578.99,
    "paymentMethod": "Credit Card",
    "shippingAddress": {
      "street": "88 Tech Park",
      "city": "Bengaluru",
      "state": "Karnataka",
      "zipCode": "560001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f6" },
    "transactionId": "TRX-0000007",
    "userId": "user_G",
    "orderDate": { "$date": "2025-07-02T21:00:00Z" },
    "items": [
      { "productId": "PROD-YZA", "productName": "Yoga Mat", "category": "Sports", "quantity": 1, "unitPrice": 30.00 },
      { "productId": "PROD-BCD", "productName": "Dumbbell Set", "category": "Sports", "quantity": 1, "unitPrice": 60.00 }
    ],
    "totalAmount": 90.00,
    "paymentMethod": "UPI",
    "shippingAddress": {
      "street": "15 Fitness Road",
      "city": "Mumbai",
      "state": "Maharashtra",
      "zipCode": "400001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f7" },
    "transactionId": "TRX-0000008",
    "userId": "user_H",
    "orderDate": { "$date": "2025-04-25T07:00:00Z" },
    "items": [
      { "productId": "PROD-EFG", "productName": "E-Reader", "category": "Electronics", "quantity": 1, "unitPrice": 120.00 },
      { "productId": "PROD-HIJ", "productName": "USB-C Cable", "category": "Accessories", "quantity": 2, "unitPrice": 8.00 }
    ],
    "totalAmount": 136.00,
    "paymentMethod": "Credit Card",
    "shippingAddress": {
      "street": "700 Old Delhi Road",
      "city": "Delhi",
      "state": "Delhi",
      "zipCode": "110001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f8" },
    "transactionId": "TRX-0000009",
    "userId": "user_I",
    "orderDate": { "$date": "2025-07-19T05:00:00Z" },
    "items": [
      { "productId": "PROD-KLM", "productName": "Blender", "category": "Home & Kitchen", "quantity": 1, "unitPrice": 70.00 }
    ],
    "totalAmount": 70.00,
    "paymentMethod": "UPI",
    "shippingAddress": {
      "street": "23 Food Street",
      "city": "Kolkata",
      "state": "West Bengal",
      "zipCode": "700001",
      "country": "India"
    },
    "status": "completed"
  },
  {
    "_id": { "$oid": "6699a9e3e7f4a5b6c7d8e9f9" },
    "transactionId": "TRX-0000010",
    "userId": "user_J",
    "orderDate": { "$date": "2025-03-01T13:00:00Z" },
    "items": [
      { "productId": "PROD-NOP", "productName": "Jeans", "category": "Apparel", "quantity": 1, "unitPrice": 45.00 },
      { "productId": "PROD-QRS", "productName": "Jacket", "category": "Apparel", "quantity": 1, "unitPrice": 90.00 }
    ],
    "totalAmount": 135.00,
    "paymentMethod": "Debit Card",
    "shippingAddress": {
      "street": "10 Mall Road",
      "city": "Lucknow",
      "state": "Uttar Pradesh",
      "zipCode": "226001",
      "country": "India"
    },
    "status": "cancelled"
  }
]
  1. $match: This is the first stage of aggregation pipeline and used to filters documents to pass only those that match the specified conditions to the next stage. It's like the where clause in SQL or a find() query in MongoDB.

    Query: Find all transactions with status: "completed"

     db.ecommerce_transactions.aggregate([
       {
         $match: {
           status: "completed"
         }
       }
     ])
    

    Output: Subset of the 10 documents with _id ending in e9f0, e9f1, e9f3, e9f5, e9f6, e9f7, e9f8, e9f9 would be returned.

     [
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f0"),
         "transactionId": "TRX-0000001",
         "userId": "user_A",
         "orderDate": ISODate("2025-07-10T14:30:00Z"),
         "items": [
           { "productId": "PROD-ABC", "productName": "Laptop", "category": "Electronics", "quantity": 1, "unitPrice": 1200 },
           { "productId": "PROD-XYZ", "productName": "Wireless Mouse", "category": "Electronics", "quantity": 1, "unitPrice": 25.5 }
         ],
         "totalAmount": 1225.5,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "101 Cyber City", "city": "Bengaluru", "state": "Karnataka", "zipCode": "560001", "country": "India" },
         "status": "completed"
       },
       ...
       ...
     ]
    
  2. $unwind: It deconstructs an array field from the input documents to output a document for each element. If a document has an array with three elements, $unwind would produce three separate documents, each containing one of the array elements.

    Query: Deconstruct the items array.

     db.ecommerce_transactions.aggregate([
       {
         $unwind: "$items"
       },
       {
         $limit: 2
       }
     ])
    

    Output: The above query will return the two documents after unwind. Notice how, first document from the original collection has been split into two, one for each item from items array.

     [
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f0"),
         "transactionId": "TRX-0000001",
         "userId": "user_A",
         "orderDate": ISODate("2025-07-10T14:30:00Z"),
         "items": { "productId": "PROD-ABC", "productName": "Laptop", "category": "Electronics", "quantity": 1, "unitPrice": 1200 },
         "totalAmount": 1225.5,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "101 Cyber City", "city": "Bengaluru", "state": "Karnataka", "zipCode": "560001", "country": "India" },
         "status": "completed"
       },
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f0"),
         "transactionId": "TRX-0000001",
         "userId": "user_A",
         "orderDate": ISODate("2025-07-10T14:30:00Z"),
         "items": { "productId": "PROD-XYZ", "productName": "Wireless Mouse", "category": "Electronics", "quantity": 1, "unitPrice": 25.5 },
         "totalAmount": 1225.5,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "101 Cyber City", "city": "Bengaluru", "state": "Karnataka", "zipCode": "560001", "country": "India" },
         "status": "completed"
       },
    
  3. $group: This stage groups input documents by a specified _id expression and applies accumulator expressions to each group. This is where you calculate aggregates like sums, averages, counts, etc.

    Query: Calculate the total totalAmount for each paymentMethod

     db.ecommerce_transactions.aggregate([
       {
         $group: {
           _id: "$paymentMethod",
           totalRevenue: { $sum: "$totalAmount" }, 
           numberOfTransactions: { $sum: 1 }
         }
       }
     ])
    

    Output: The above query will do the following operations:

    1. Group by the 'paymentMethod' field

    2. $sum: Sum the 'totalAmount' for each group

    3. Count the number of documents (transactions) in each group

       [
         { "_id": "Debit Card", "totalRevenue": 276, "numberOfTransactions": 2 },
         { "_id": "UPI", "totalRevenue": 196, "numberOfTransactions": 3 },
         { "_id": "Credit Card", "totalRevenue": 2039.49, "numberOfTransactions": 4 },
         { "_id": "Net Banking", "totalRevenue": 125, "numberOfTransactions": 1 }
       ]
      
  4. $project: It Reshapes each document in the stream. You can include, exclude, or rename fields, and also add new computed fields using expressions.

    Query: Display transactionId, userId, totalAmount, and calculate a new field gstAmount (18% of totalAmount). Exclude the _id field.

     db.ecommerce_transactions.aggregate([
       {
         $project: {
           _id: 0, // 0 to exclude and 1 to keep
           transactionId: 1,
           userId: 1,
           totalAmount: 1,
           gstAmount: { $multiply: ["$totalAmount", 0.18] }, // here we Calculate 18% GST
           deliveryCity: "$shippingAddress.city" // Rename a nested field for simplicity
         }
       },
       {
         $limit: 3 
       }
     ])
    

    Output (first 3 documents):

     [
       {
         "transactionId": "TRX-0000001",
         "userId": "user_A",
         "totalAmount": 1225.5,
         "gstAmount": 220.59,
         "deliveryCity": "Bengaluru"
       },
       {
         "transactionId": "TRX-0000002",
         "userId": "user_B",
         "totalAmount": 36,
         "gstAmount": 6.48,
         "deliveryCity": "Delhi"
       },
       {
         "transactionId": "TRX-0000003",
         "userId": "user_C",
         "totalAmount": 141,
         "gstAmount": 25.38,
         "deliveryCity": "Mumbai"
       }
     ]
    
  5. $sort: Reorders the document stream by a specified sort key.

    Query: Get the 3 most recent transactions.

     db.ecommerce_transactions.aggregate([
       {
         $sort: { orderDate: -1 } // Sort by orderDate in descending order (most recent first)
       },
       {
         $limit: 3
       }
     ])
    

    Output:

     [
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f4"),
         "transactionId": "TRX-0000005",
         "userId": "user_E",
         "orderDate": ISODate("2025-07-18T10:00:00Z"),
         "items": [
           { "productId": "PROD-MNO", "productName": "Coffee Maker", "category": "Home & Kitchen", "quantity": 1, "unitPrice": 95 },
           { "productId": "PROD-PQR", "productName": "Mug Set", "category": "Home & Kitchen", "quantity": 1, "unitPrice": 30 }
         ],
         "totalAmount": 125,
         "paymentMethod": "Net Banking",
         "shippingAddress": { "street": "45 Park Avenue", "city": "Chennai", "state": "Tamil Nadu", "zipCode": "600001", "country": "India" },
         "status": "pending"
       },
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f1"),
         "transactionId": "TRX-0000002",
         "userId": "user_B",
         "orderDate": ISODate("2025-07-15T09:15:00Z"),
         "items": [
           { "productId": "PROD-123", "productName": "History Book", "category": "Books", "quantity": 2, "unitPrice": 18 }
         ],
         "totalAmount": 36,
         "paymentMethod": "UPI",
         "shippingAddress": { "street": "22 Library Road", "city": "Delhi", "state": "Delhi", "zipCode": "110001", "country": "India" },
         "status": "completed"
       },
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f0"),
         "transactionId": "TRX-0000001",
         "userId": "user_A",
         "orderDate": ISODate("2025-07-10T14:30:00Z"),
         "items": [
           { "productId": "PROD-ABC", "productName": "Laptop", "category": "Electronics", "quantity": 1, "unitPrice": 1200 },
           { "productId": "PROD-XYZ", "productName": "Wireless Mouse", "category": "Electronics", "quantity": 1, "unitPrice": 25.5 }
         ],
         "totalAmount": 1225.5,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "101 Cyber City", "city": "Bengaluru", "state": "Karnataka", "zipCode": "560001", "country": "India" },
         "status": "completed"
       }
     ]
    
  6. $limit: Passes the first n documents unmodified to the pipeline.

    Query: Get the first 5 transactions recorded (in their natural order).

     db.ecommerce_transactions.aggregate([
       {
         $limit: 5
       }
     ])
    

    Output: The first 5 documents from the provided sample dataset will be returned: TRX-0000001 through TRX-0000005.

  7. $skip: Skips the first n documents and passes the remaining documents unmodified to the pipeline.

    Query: Skip the first 5 transactions and get the next 3.

     db.ecommerce_transactions.aggregate([
       {
         $sort: { orderDate: 1 }
       },
       {
         $skip: 5
       },
       {
         $limit: 3
       }
     ])
    

    Output (based on sorted orderDate): The documents with orderDate from 2025-06-05, 2025-06-20, and 2025-07-01 would likely be next after skipping the earliest 5.

     [
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f5"),
         "transactionId": "TRX-0000006",
         "userId": "user_F",
         "orderDate": ISODate("2025-06-05T18:20:00Z"),
         "items": [
           { "productId": "PROD-STU", "productName": "Gaming Console", "category": "Electronics", "quantity": 1, "unitPrice": 499 },
           { "productId": "PROD-VWX", "productName": "Gaming Headset", "category": "Accessories", "quantity": 1, "unitPrice": 79.99 }
         ],
         "totalAmount": 578.99,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "88 Tech Park", "city": "Bengaluru", "state": "Karnataka", "zipCode": "560001", "country": "India" },
         "status": "completed"
       },
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f2"),
         "transactionId": "TRX-0000003",
         "userId": "user_C",
         "orderDate": ISODate("2025-06-20T11:00:00Z"),
         "items": [
           { "productId": "PROD-456", "productName": "Cotton T-Shirt", "category": "Apparel", "quantity": 3, "unitPrice": 22 },
           { "productId": "PROD-789", "productName": "Running Shoes", "category": "Sports", "quantity": 1, "unitPrice": 75 }
         ],
         "totalAmount": 141,
         "paymentMethod": "Debit Card",
         "shippingAddress": { "street": "500 Lake View", "city": "Mumbai", "state": "Maharashtra", "zipCode": "400001", "country": "India" },
         "status": "shipped"
       },
       {
         "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9f3"),
         "transactionId": "TRX-0000004",
         "userId": "user_D",
         "orderDate": ISODate("2025-07-01T16:45:00Z"),
         "items": [
           { "productId": "PROD-JKL", "productName": "Smartwatch", "category": "Electronics", "quantity": 1, "unitPrice": 199.99 }
         ],
         "totalAmount": 199.99,
         "paymentMethod": "Credit Card",
         "shippingAddress": { "street": "30 Green Street", "city": "Hyderabad", "state": "Telangana", "zipCode": "500001", "country": "India" },
         "status": "completed"
       }
     ]
    
  8. $lookup: Performs a left outer join to an unsharded collection in the same database. This allows you to combine data from two different collections.

    Query: Join ecommerce_transactions with a another users collection to get user emails. We'll assume you have a users collection with a userId field.

    Assumed users collection data:

     [
       { "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9fa"), "userId": "user_A", "userName": "Alice Smith", "email": "alice.s@example.com" },
       { "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9fb"), "userId": "user_B", "userName": "Bob Johnson", "email": "bob.j@example.com" },
       ...
       ...
       { "_id": ObjectId("6699a9e3e7f4a5b6c7d8e9fc"), "userId": "user_C", "userName": "Charlie Brown", "email": "charlie.b@example.com" }
     ]
    
     db.ecommerce_transactions.aggregate([
       {
         $match: { transactionId: "TRX-0000003" } // Select a specific transaction for demonstration
       },
       {
         $lookup: {
           from: "users", // The collection to join with
           localField: "userId", // Field from the current collection (ecommerce_transactions)
           foreignField: "userId", // Field from the 'users' collection
           as: "userInfo" // The name of the new array field to add to the input documents
         }
       },
       {
         $project: {
           _id: 0,
           transactionId: 1,
           userId: 1,
           totalAmount: 1,
           orderDate: 1,
           userEmail: { $arrayElemAt: ["$userInfo.email", 0] } // Get the email from the first element of userInfo array
         }
       }
     ])
    

    Output:

     [
       {
         "transactionId": "TRX-0000003",
         "userId": "user_C",
         "totalAmount": 141,
         "orderDate": ISODate("2025-06-20T11:00:00Z"),
         "userEmail": "charlie.b@example.com"
       }
     ]
    

Wrap Up

These examples showcase how you can combine various aggregation stages to perform complex data analysis and transformations on a large dataset within MongoDB. Remember that for actual large datasets, proper indexing on the fields used in $match, $sort, and $lookup is absolutely critical for performance.

0
Subscribe to my newsletter

Read articles from Nitin Saini directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Nitin Saini
Nitin Saini

A Full Stack Web Developer, possessing a strong command of React.js, Node.js, Express.js, MongoDB, and AWS, alongside Next.js, Redux, and modern JavaScript (ES6+)