MongoDB Aggregation Made Easy: A Beginner’s Guide

Anjali SainiAnjali Saini
7 min read

Why Do We Need Aggregation?

MongoDB’s aggregation framework helps to analyze and transform data efficiently. While simple find queries can retrieve documents, aggregation allows:
✔ Summarizing data (e.g., total sales, average ratings).
✔ Filtering and restructuring results.
✔ Performing complex calculations.
✔ Joining data from multiple sources (though this guide focuses on single collection processing).


Normal Query vs Aggregation

FeatureNormal Query (find())Aggregation (aggregate())
PurposeFetching raw documentsData processing & transformation
ComplexitySimple filtering ($eq, $gte)Multi-step operations
Grouping DataNot possibleUses $group stage
Transforming OutputLimited (projection)$project, $lookup, etc.
ComputationNot possibleCan calculate sums, averages, counts

Example:
📌 A find() query can return users with age > 25, but aggregation can also compute the average age of those users.


What is Aggregation?

Aggregation is a data processing technique in MongoDB that allows performing operations like filtering, grouping, sorting, and transformations within a pipeline.


What is an Aggregation Pipeline?

The aggregation pipeline is a sequence of stages where each stage modifies the data and passes it to the next stage.
It works like a conveyor belt where:
1️⃣ The input documents are fetched.
2️⃣ Various operations are performed.
3️⃣ The final output is returned.

Aggregation Visulization

Example Pipeline:

db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $group: { _id: "$customerId", totalAmount: { $sum: "$amount" } } },
  { $sort: { totalAmount: -1 } }
]);

✅ Filters only "shipped" orders.
✅ Groups by customerId and calculates total amount spent.
✅ Sorts customers by highest spending.


Key Aggregation Stages

StagePurpose
$matchFilters documents (like find())
$groupGroups documents and performs calculations
$projectModifies output fields, includes/excludes fields
$sortSorts results (ascending or descending)
$limitLimits the number of documents returned
$skipSkips a specified number of documents
$countCounts the number of matching documents
$unwindDeconstructs array fields into separate documents
$addFieldsAdds new computed fields
$lookup(Not applicable for single collection) Joins another collection

Sample Dataset (10 Documents)

📌 Consider a products collection:

[
  { "_id": 1, "name": "Laptop", "category": "Electronics", "price": 800, "stock": 50, "ratings": [4, 5, 3] },
  { "_id": 2, "name": "Headphones", "category": "Electronics", "price": 100, "stock": 200, "ratings": [5, 5, 4] },
  { "_id": 3, "name": "Chair", "category": "Furniture", "price": 150, "stock": 20, "ratings": [3, 4] },
  { "_id": 4, "name": "Table", "category": "Furniture", "price": 250, "stock": 15, "ratings": [5, 4, 4] },
  { "_id": 5, "name": "T-shirt", "category": "Clothing", "price": 20, "stock": 300, "ratings": [4, 3, 5] },
  { "_id": 6, "name": "Jeans", "category": "Clothing", "price": 40, "stock": 100, "ratings": [5, 5, 4] },
  { "_id": 7, "name": "Smartphone", "category": "Electronics", "price": 1000, "stock": 30, "ratings": [5, 4, 5] },
  { "_id": 8, "name": "Monitor", "category": "Electronics", "price": 300, "stock": 25, "ratings": [4, 5, 5] },
  { "_id": 9, "name": "Sofa", "category": "Furniture", "price": 700, "stock": 10, "ratings": [4, 3, 5] },
  { "_id": 10, "name": "Jacket", "category": "Clothing", "price": 60, "stock": 50, "ratings": [5, 4, 3] }
]

Possible Aggregation Queries

1️⃣ $match - Filter products in Electronics

db.products.aggregate([
  { $match: { category: "Electronics" } }
]);

2️⃣ $group - Total stock per category

db.products.aggregate([
  { $group: { _id: "$category", totalStock: { $sum: "$stock" } } }
]);

3️⃣ $project - Show only name and price

db.products.aggregate([
  { $project: { _id: 0, name: 1, price: 1 } }
]);

4️⃣ $sort - Sort by price (highest to lowest)

db.products.aggregate([
  { $sort: { price: -1 } }
]);

5️⃣ $limit - Get top 3 expensive products

db.products.aggregate([
  { $sort: { price: -1 } },
  { $limit: 3 }
]);

6️⃣ $unwind - Get each rating as a separate document

db.products.aggregate([
  { $unwind: "$ratings" }
]);

7️⃣ $addFields - Add average rating per product

db.products.aggregate([
  {
    $addFields: {
      avgRating: { $avg: "$ratings" }
    }
  }
]);

8️⃣ $count - Count total products

db.products.aggregate([
  { $count: "totalProducts" }
]);

9️⃣ $skip - Skip first 5 products

db.products.aggregate([
  { $skip: 5 }
]);


Example Scenario & Aggregation Routes using Node Express Mongoose

Dataset: Orders Collection

Sample orders collection (10 documents):

[
  { "_id": 1, "customer": "Alice", "totalAmount": 500, "status": "Completed", "items": [ "Laptop", "Mouse" ], "date": "2024-02-10" },
  { "_id": 2, "customer": "Bob", "totalAmount": 150, "status": "Pending", "items": [ "Keyboard" ], "date": "2024-02-12" },
  { "_id": 3, "customer": "Charlie", "totalAmount": 700, "status": "Completed", "items": [ "Monitor", "Mousepad" ], "date": "2024-02-08" },
  { "_id": 4, "customer": "Alice", "totalAmount": 250, "status": "Completed", "items": [ "Headphones" ], "date": "2024-02-15" },
  { "_id": 5, "customer": "David", "totalAmount": 300, "status": "Pending", "items": [ "Tablet" ], "date": "2024-02-14" },
  { "_id": 6, "customer": "Bob", "totalAmount": 800, "status": "Completed", "items": [ "Smartphone", "Charger" ], "date": "2024-02-05" },
  { "_id": 7, "customer": "Charlie", "totalAmount": 120, "status": "Cancelled", "items": [ "USB Drive" ], "date": "2024-02-13" },
  { "_id": 8, "customer": "Alice", "totalAmount": 650, "status": "Completed", "items": [ "Monitor", "Keyboard" ], "date": "2024-02-07" },
  { "_id": 9, "customer": "David", "totalAmount": 200, "status": "Completed", "items": [ "Mouse" ], "date": "2024-02-09" },
  { "_id": 10, "customer": "Bob", "totalAmount": 450, "status": "Completed", "items": [ "Laptop Stand" ], "date": "2024-02-11" }
]

1. Route: Get Total Revenue of Completed Orders

💡 Scenario: We need to calculate the total revenue from all Completed orders.
🔧 Aggregation Used: $match, $group

app.get("/orders/revenue", async (req, res) => {
  try {
    const revenue = await Order.aggregate([
      { $match: { status: "Completed" } }, // Filter only completed orders
      { $group: { _id: null, totalRevenue: { $sum: "$totalAmount" } } } // Sum up all amounts
    ]);

    res.json(revenue[0] || { totalRevenue: 0 });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

📌 Example Output:

{ "totalRevenue": 3600 }

2. Route: Get Customer-wise Total Spending

💡 Scenario: We want to group orders by customers and sum up their total spending.
🔧 Aggregation Used: $group, $sort

app.get("/orders/customer-spending", async (req, res) => {
  try {
    const spending = await Order.aggregate([
      { $group: { _id: "$customer", totalSpent: { $sum: "$totalAmount" } } }, // Group by customer
      { $sort: { totalSpent: -1 } } // Sort by total spent (descending)
    ]);

    res.json(spending);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

📌 Example Output:

[
  { "_id": "Alice", "totalSpent": 1400 },
  { "_id": "Bob", "totalSpent": 1400 },
  { "_id": "Charlie", "totalSpent": 820 },
  { "_id": "David", "totalSpent": 500 }
]

3. Route: Get Orders Placed in a Date Range

💡 Scenario: Retrieve all orders placed between two dates.
🔧 Aggregation Used: $match

app.get("/orders/date-range", async (req, res) => {
  try {
    const { start, end } = req.query; // Expecting query params like ?start=2024-02-07&end=2024-02-12

    const orders = await Order.aggregate([
      { 
        $match: { 
          date: { $gte: start, $lte: end } 
        } 
      }
    ]);

    res.json(orders);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

📌 Example Query:
GET /orders/date-range?start=2024-02-07&end=2024-02-12

📌 Example Output:

[
  { "_id": 8, "customer": "Alice", "totalAmount": 650, "status": "Completed", "date": "2024-02-07" },
  { "_id": 1, "customer": "Alice", "totalAmount": 500, "status": "Completed", "date": "2024-02-10" },
  { "_id": 10, "customer": "Bob", "totalAmount": 450, "status": "Completed", "date": "2024-02-11" },
  { "_id": 2, "customer": "Bob", "totalAmount": 150, "status": "Pending", "date": "2024-02-12" }
]

💡 Scenario: Find out which items are most frequently sold across orders.
🔧 Aggregation Used: $unwind, $group, $sort

app.get("/orders/popular-items", async (req, res) => {
  try {
    const popularItems = await Order.aggregate([
      { $unwind: "$items" }, // Split items array into multiple documents
      { $group: { _id: "$items", count: { $sum: 1 } } }, // Count occurrences
      { $sort: { count: -1 } } // Sort by most sold items
    ]);

    res.json(popularItems);
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

📌 Example Output:

[
  { "_id": "Mouse", "count": 2 },
  { "_id": "Keyboard", "count": 2 },
  { "_id": "Monitor", "count": 2 },
  { "_id": "Laptop", "count": 1 },
  { "_id": "Mousepad", "count": 1 }
]

Conclusion

MongoDB Aggregation is a powerful tool for processing and analyzing data within a single collection.
✅ The Aggregation Pipeline allows performing complex queries in a structured way.
✅ Different stages like $match, $group, $project, $sort, and $unwind help in filtering, transforming, and summarizing data.
✅ Practicing with realistic datasets improves understanding of MongoDB's aggregation framework.

0
Subscribe to my newsletter

Read articles from Anjali Saini directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Anjali Saini
Anjali Saini

I am an Enthusiastic and self-motivated web-Developer . Currently i am learning to build end-to-end web-apps.