Introduction to MongoDB: Everything You Need to Know

Table of contents
- What is MongoDB?
- MongoDB Installation
- How to Create Database in MongoDB?
- Embedded Documents in MongoDB (Nested Documents Limit)
- CRUD Operations in MongoDB (All Methods)
- Find vs FindOne in MongoDB
- How to Insert Document in Collection (Insert vs InsertOne vs InsertMany)
- How to Update Document in MongoDB (UpdateOne vs UpdateMany)
- How to Delete Documents in MongoDB (DeleteOne vs DeleteMany)
- Select Column Query (Projection in MongoDB)
- Is MongoDB Really Schemaless?
- Datatypes in MongoDB
- How to delete database
- Ordered option in insert command
- Schema Validation in MongoDB
- Write concern in MongoDB
- Atomicity in MongoDB
- MongoImport in MongoDB (Import JSON in MongoDB)
- Comparison operators ( $eq, $ne, $lt, $gt, $lte, $gte, $in & $nin )
- Logical Operators( $not, $and, $or & $nor)
- Mastering MongoDB: Understanding the $exists and $type Operators
- From Beginner to Pro: Querying Arrays in MongoDB
- Advanced Update ( $inc, $min, $max, $mul, $unset, $rename & Upsert )
- Update Nested Arrays and Use $pop, $pull, $push and $addToSet Operators
- Master MongoDB Indexing
- MongoDB Aggregation Guide
- $bucket operator in MongoDB
- $lookup : How to Join Collections in MongoDB
- $project in MongoDB
- Capped Collection in MongoDB
- The Complete Guide to Authentication ( RBAC )
- MongoDB Replication & Sharding
- Replicate MongoDB Database Like a Pro
- Transactions in MongoDB: Complete Walkthrough
- Mastering Date Queries in MongoDB
- Managed & Unmanaged Database
What is MongoDB?
MongoDB is a NoSQL, document-oriented database designed to store and manage large volumes of unstructured, semi-structured, or structured data. Unlike traditional relational databases that use tables and rows, MongoDB stores data in JSON-like BSON (Binary JSON) documents, making it highly flexible and scalable.
MongoDB is widely used for modern applications because of its high performance, easy scalability, and rich feature set that supports various data types and complex queries.
Key Features of MongoDB:
Document-Based Storage: Data is stored in flexible, JSON-like documents.
Schema-less Design: No fixed schema, making it easier to modify data structures.
High Scalability: Built-in sharding and replication features to handle large amounts of data.
Indexing: Supports indexing to improve query performance.
Aggregation Framework: Allows for data transformations and analysis.
Replication: Automatic data replication for high availability.
Transactions: ACID-compliant multi-document transactions.
Integration: Easily integrates with various programming languages like Python, Java, Node.js, etc.
Why Use MongoDB?
MongoDB is particularly popular for applications that require:
Real-time analytics
Big data processing
Content management systems
Internet of Things (IoT) applications
Mobile applications
Social networks
MongoDB's flexible data model and horizontal scalability make it a great choice for projects where data structures may evolve over time.
MongoDB Installation
To install MongoDB on Windows, follow these steps:
Step 1: Download MongoDB
Visit the official MongoDB website: https://www.mongodb.com/try/download/community
Choose the Community Server version.
Select your Windows operating system and download the .msi installer package.
Step 2: Install MongoDB
Run the downloaded .msi file.
Follow the installation wizard and select the Complete setup type.
Choose whether to install MongoDB as a service (recommended) or run manually.
Set the installation path (default: C:\Program Files\MongoDB\Server\<version>).
Step 3: Set Environment Variables
Go to System Properties → Environment Variables.
Add the MongoDB bin directory to the Path variable.
Example: C:\Program Files\MongoDB\Server\<version>\bin
Step 4: Run MongoDB
Open Command Prompt.
Type the following command to start the MongoDB server:
mongod
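The server will start running on the default port 27017. If mongod exits because its data directory is missing, create the directory first (\data\db on the current drive by default on Windows) or point to another location with the --dbpath option.
Alternatively, if MongoDB was installed as a service, start the MongoDB service from the Windows Services console.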
Step 5: Verify Installation
Open another Command Prompt.
Type the following command to connect to MongoDB:
mongosh
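If everything is set up correctly, the MongoDB shell will open. If the mongosh command is not found, install the MongoDB Shell (mongosh) separately from the official website and add it to the Path variable as well.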
Optional: MongoDB Compass
MongoDB Compass is a graphical user interface to interact with MongoDB databases. You can download it from the official website and install it for easier database management.
How to Create Database in MongoDB?
Creating a database in MongoDB is straightforward because MongoDB automatically creates databases when data is inserted.
Steps to Create a Database:
Open Command Prompt.
Start the MongoDB server by typing:
mongod
In another Command Prompt, open the MongoDB shell by typing:
mongosh
To create a new database, use the use command followed by the database name:
use myDatabase
This command will switch to the specified database. If the database does not exist, MongoDB will create it when data is added.
Verify the current database using the command:
db
It will display the current active database.
Insert data to create the database:
db.myCollection.insertOne({name: "John", age: 30})
Once data is inserted, the database will be created automatically.
To view all available databases, use:
show dbs
Note: The newly created database will not appear in the list until it contains at least one document.
Embedded Documents in MongoDB (Nested Documents Limit)
MongoDB allows documents to be embedded within other documents, creating nested documents. This feature is used to represent complex data relationships in a single document without the need for separate collections.
Example of Embedded Documents:
db.users.insertOne({
name: "Alice",
address: {
street: "123 Main St",
city: "New York",
zip: "10001"
},
contact: {
email: "alice@example.com",
phone: "123-456-7890"
}
})
Nested Documents Limit:
MongoDB allows documents to be nested up to 100 levels deep.
The maximum size of a document is 16 MB.
Large or deeply nested documents can impact performance and make queries slower.
Accessing Nested Documents:
MongoDB provides dot notation to access fields inside embedded documents. Example:
db.users.find({"address.city": "New York"})
Updating Nested Documents:
To update fields inside nested documents, use the dot notation with the $set operator. Example:
db.users.updateOne(
{name: "Alice"},
{$set: {"address.city": "Los Angeles"}}
)
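To make the dot notation less abstract, here is a plain JavaScript sketch of how a dotted path such as "address.city" can be resolved against a nested object. It is only an illustration and not MongoDB's implementation: getPath and setPath are hypothetical helper names, and arrays are left out on purpose.

```javascript
// Hypothetical helpers that mimic how a dotted path such as "address.city"
// addresses a field inside an embedded document. Plain objects only.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

function setPath(doc, path, newValue) {
  const keys = path.split(".");
  const last = keys.pop();
  let target = doc;
  for (const key of keys) {
    if (target[key] === undefined) {
      // Create a missing intermediate embedded document.
      target[key] = {};
    } else if (typeof target[key] !== "object" || target[key] === null) {
      throw new Error(`Cannot create a field under non-document "${key}"`);
    }
    target = target[key];
  }
  target[last] = newValue;
  return doc;
}

const user = { name: "Alice", address: { city: "New York", zip: "10001" } };
console.log(getPath(user, "address.city")); // reads the nested field
setPath(user, "address.city", "Los Angeles");
console.log(getPath(user, "address.city")); // reads the updated value
```

Only the addressed field changes; sibling fields such as address.zip are left as they were, which is the same effect the $set example above has on the stored document.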
Advantages of Embedded Documents:
Improves performance by reducing the need for joins.
Stores related data together for faster access.
Simplifies data retrieval.
Disadvantages of Embedded Documents:
Can cause large document sizes if not managed properly.
Difficult to update if embedded data changes frequently.
May lead to data redundancy in some cases.
Best Practices for Using Embedded Documents:
Use embedded documents for one-to-few relationships.
Avoid deep nesting beyond 2-3 levels.
Use references instead of embedding for large or unbounded one-to-many relationships and for many-to-many relationships.
Review document sizes periodically so that growing embedded data does not approach the 16 MB limit.
MongoDB's embedded documents provide a powerful way to structure data but should be used carefully to avoid performance issues and redundancy.
CRUD Operations in MongoDB (All Methods)
CRUD operations represent the basic operations for interacting with a database:
1. Create
insertOne() - Inserts a single document.
insertMany() - Inserts multiple documents.
Example:
db.users.insertOne({name: "John", age: 25})
db.users.insertMany([{name: "Alice", age: 30}, {name: "Bob", age: 28}])
2. Read
find() - Retrieves documents that match a query.
findOne() - Retrieves a single document.
Example:
db.users.find()
db.users.find({age: {$gt: 25}})
db.users.findOne({name: "Alice"})
3. Update
updateOne() - Updates a single document.
updateMany() - Updates multiple documents.
$set - Modifies specific fields.
$unset - Removes fields.
Example:
db.users.updateOne({name: "John"}, {$set: {age: 26}})
db.users.updateMany({age: {$lt: 30}}, {$set: {status: "Active"}})
4. Delete
deleteOne() - Deletes a single document.
deleteMany() - Deletes multiple documents.
Example:
db.users.deleteOne({name: "John"})
db.users.deleteMany({status: "Inactive"})
CRUD operations are the foundation of working with MongoDB, enabling users to manage and manipulate data effectively.
Find vs FindOne in MongoDB
In MongoDB, both the find() and findOne() methods are used to retrieve data from a collection, but they serve different purposes and return different types of results.
1. find() Method
The find() method is used to retrieve multiple documents that match a specified query from a collection.
Syntax:
db.collection.find(query, projection)
query: Specifies the filter criteria to select documents.
projection: (Optional) Specifies the fields to include or exclude from the result.
Example:
Retrieve all users with age greater than 25:
db.users.find({age: {$gt: 25}})
Output:
{_id: ObjectId("123"), name: "Alice", age: 30}
{_id: ObjectId("124"), name: "Bob", age: 28}
Important Points:
It returns a cursor object containing all matching documents.
You can iterate through the cursor to access each document.
If no documents are found, it returns an empty cursor.
2. findOne() Method
The findOne() method is used to retrieve a single document that matches the specified query.
Syntax:
db.collection.findOne(query, projection)
Example:
Retrieve one user with age greater than 25:
db.users.findOne({age: {$gt: 25}})
Output:
{_id: ObjectId("123"), name: "Alice", age: 30}
Important Points:
It returns the first matching document found.
If no documents are found, it returns null.
It does not return a cursor object, only the document itself.
Differences at a Glance:
Feature | find() | findOne() |
Result Type | Cursor Object | Single Document |
Number of Results | Multiple Documents | One Document |
Return if No Match | Empty Cursor | null |
Use Case | Retrieve multiple records | Retrieve a single record |
When to Use Which?
Use find() when you expect multiple documents to match your query.
Use findOne() when you only need a single document or want to check if a document exists.
Both methods are essential for data retrieval in MongoDB and should be used based on the application's requirements.
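The difference in return types can be sketched in plain JavaScript without a database. This is only a model under simplifying assumptions: the functions below are in-memory stand-ins, the filter is an ordinary predicate rather than a query document, and a generator plays the role of the cursor.

```javascript
// In-memory stand-ins for find() and findOne(); `matches` is a plain
// predicate here, whereas MongoDB takes a query document.
function* find(collection, matches) {
  // A generator plays the role of the cursor: documents are produced lazily.
  for (const doc of collection) {
    if (matches(doc)) yield doc;
  }
}

function findOne(collection, matches) {
  // First match or null, never a cursor.
  for (const doc of collection) {
    if (matches(doc)) return doc;
  }
  return null;
}

const users = [
  { name: "John", age: 25 },
  { name: "Alice", age: 30 },
  { name: "Bob", age: 28 },
];
const olderThan25 = (doc) => doc.age > 25;

const allMatches = [...find(users, olderThan25)]; // drain the "cursor"
const firstMatch = findOne(users, olderThan25);
const noMatch = findOne(users, (doc) => doc.age > 100);
const emptyCursor = [...find(users, (doc) => doc.age > 100)];
console.log(allMatches.length, firstMatch.name, noMatch, emptyCursor.length);
```

The practical consequence is that code calling findOne() has to handle null, while code calling find() has to iterate, even when it expects a single result.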
How to Insert Document in Collection (Insert vs InsertOne vs InsertMany)
In MongoDB, inserting documents into a collection is one of the fundamental operations. MongoDB provides different methods to insert documents depending on the number of documents to be inserted.
1. insert() Method (Deprecated)
The insert() method was used to insert one or multiple documents into a collection.
Syntax:
db.collection.insert(document)
db.collection.insert([document1, document2, ...])
Example:
db.users.insert({name: "John", age: 25})
db.users.insert([{name: "Alice", age: 30}, {name: "Bob", age: 28}])
Note: This method is deprecated in the latest versions of MongoDB and is replaced by insertOne() and insertMany().
2. insertOne() Method
The insertOne() method is used to insert a single document into a collection.
Syntax:
db.collection.insertOne(document)
Example:
db.users.insertOne({name: "John", age: 25})
Output:
{ acknowledged: true, insertedId: ObjectId("123abc") }
Key Points:
Inserts only one document.
Returns an acknowledgment object with insertedId.
The natural choice when there is exactly one document to write.
3. insertMany() Method
The insertMany() method is used to insert multiple documents into a collection at once.
Syntax:
db.collection.insertMany([document1, document2, ...])
Example:
db.users.insertMany([
{name: "Alice", age: 30},
{name: "Bob", age: 28},
{name: "Charlie", age: 35}
])
Output:
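{
acknowledged: true,
insertedIds: {
'0': ObjectId("123abc"),
'1': ObjectId("124abc"),
'2': ObjectId("125abc")
}
}
Key Points:
Inserts multiple documents.
Returns an acknowledgment object with insertedIds, which mongosh prints as a map from each document's position in the input array to its _id.
Faster than inserting documents one by one, because the documents are sent to the server in batches instead of one round trip each.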
Differences at a Glance:
Method | Number of Documents | Output Type | Performance | Status |
insert() | Single or Multiple | Acknowledgment | Moderate | Deprecated |
insertOne() | Single | Acknowledgment | Fast | Active |
insertMany() | Multiple | Acknowledgment | Fastest | Active |
When to Use Which?
Use insertOne() when inserting a single document.
Use insertMany() when inserting multiple documents at once.
Avoid using the deprecated insert() method in new projects.
These methods provide flexibility and performance optimization depending on the application's needs.
How to Update Document in MongoDB (UpdateOne vs UpdateMany)
In MongoDB, updating documents is an essential operation to modify existing data in collections. MongoDB provides different methods to update documents based on the number of documents to be modified.
1. updateOne() Method
The updateOne() method is used to update a single document that matches the specified query.
Syntax:
db.collection.updateOne(filter, update, options)
filter: Specifies the condition to select the document.
update: Defines the modifications to apply.
options: (Optional) Additional options like upsert.
Example:
Update the age of a user named "John":
db.users.updateOne(
{name: "John"},
{$set: {age: 26}}
)
Output:
{ acknowledged: true, matchedCount: 1, modifiedCount: 1 }
Key Points:
Only updates the first matching document.
Returns an acknowledgment with matchedCount and modifiedCount.
If no document matches, nothing will be updated unless the upsert option is used.
2. updateMany() Method
The updateMany() method is used to update multiple documents that match the specified query.
Syntax:
db.collection.updateMany(filter, update, options)
Example:
Update the status of all users with age greater than 25:
db.users.updateMany(
{age: {$gt: 25}},
{$set: {status: "Active"}}
)
Output:
{ acknowledged: true, matchedCount: 3, modifiedCount: 3 }
Key Points:
Updates all matching documents.
Returns an acknowledgment with matchedCount and modifiedCount.
Can be combined with filters to apply bulk updates.
Differences at a Glance:
Method | Number of Documents | Output Type | Performance | Use Case |
updateOne() | One | Acknowledgment | Faster | Update a single document |
updateMany() | Multiple | Acknowledgment | Slower | Update multiple documents |
Optional upsert Option
Both updateOne() and updateMany() support the upsert option. When set to true, MongoDB will insert a new document if no matching document is found.
Example:
db.users.updateOne(
{name: "David"},
{$set: {age: 40}},
{upsert: true}
)
Output:
{ acknowledged: true, matchedCount: 0, modifiedCount: 0, upsertedId: ObjectId("123abc") }
When to Use Which?
Use updateOne() when you want to modify only one matching document.
Use updateMany() when you need to modify multiple documents.
Set the upsert option to true when a new document should be inserted if no match is found.
These methods help maintain data consistency and optimize performance in MongoDB applications.
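The counters in the update result are easier to read once the three possible outcomes are laid side by side. The sketch below is an in-memory model, not the server's logic: the updateOne function is a hypothetical stand-in that supports only equality filters and $set-style field assignment, and ids come from a simple counter instead of ObjectId.

```javascript
// Simplified model of updateOne(filter, {$set: ...}, {upsert}) over an array.
// Only equality filters and $set are modelled; ids are a simple counter.
let nextId = 1;

function updateOne(collection, filter, setFields, options = {}) {
  const isMatch = (doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value);
  const doc = collection.find(isMatch);

  if (doc) {
    // A matched document only counts as modified if a value really changes.
    const changed = Object.entries(setFields).some(([k, v]) => doc[k] !== v);
    Object.assign(doc, setFields);
    return { matchedCount: 1, modifiedCount: changed ? 1 : 0, upsertedId: null };
  }
  if (options.upsert) {
    // The new document is built from the equality filter plus the $set fields.
    const inserted = { _id: nextId++, ...filter, ...setFields };
    collection.push(inserted);
    return { matchedCount: 0, modifiedCount: 0, upsertedId: inserted._id };
  }
  return { matchedCount: 0, modifiedCount: 0, upsertedId: null };
}

const people = [{ _id: 100, name: "John", age: 25 }];
const updated = updateOne(people, { name: "John" }, { age: 26 });
const unchanged = updateOne(people, { name: "John" }, { age: 26 });
const missed = updateOne(people, { name: "David" }, { age: 40 });
const upserted = updateOne(people, { name: "David" }, { age: 40 }, { upsert: true });
console.log(updated, unchanged, missed, upserted);
```

Note the second call: the document matches but already holds the requested value, so matchedCount is 1 while modifiedCount is 0. MongoDB reports the same distinction.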
How to Delete Documents in MongoDB (DeleteOne vs DeleteMany)
In MongoDB, deleting documents is an essential operation to remove unnecessary or outdated data from collections. MongoDB provides two methods to delete documents based on the number of documents to be removed.
1. deleteOne() Method
The deleteOne() method is used to delete a single document that matches the specified query.
Syntax:
db.collection.deleteOne(filter)
filter: Specifies the condition to select the document to delete.
Example:
Delete one user named "John":
db.users.deleteOne({name: "John"})
Output:
{ acknowledged: true, deletedCount: 1 }
Key Points:
Deletes only the first matching document.
Returns an acknowledgment object with the deletedCount field.
If no matching document is found, deletedCount will be 0.
2. deleteMany() Method
The deleteMany() method is used to delete multiple documents that match the specified query.
Syntax:
db.collection.deleteMany(filter)
Example:
Delete all users with age greater than 25:
db.users.deleteMany({age: {$gt: 25}})
Output:
{ acknowledged: true, deletedCount: 3 }
Key Points:
Deletes all documents that match the filter condition.
Returns an acknowledgment object with the deletedCount field.
If no matching documents are found, deletedCount will be 0.
Differences at a Glance:
Method | Number of Documents | Output Type | Use Case |
deleteOne() | One | Acknowledgment | Delete a single document |
deleteMany() | Multiple | Acknowledgment | Delete multiple documents |
Important Notes:
Always use filters carefully to avoid accidental deletion of unintended documents.
If the filter passed to deleteMany() is empty ({}), all documents in the collection will be deleted.
The deleteOne() method will delete the first document only, even if multiple documents match the query.
Example: Delete All Documents
To delete all documents from a collection:
db.users.deleteMany({})
Output:
{ acknowledged: true, deletedCount: X }
Where X is the total number of documents deleted.
When to Use Which?
Use deleteOne() when you need to remove only one document.
Use deleteMany() when you want to delete multiple documents that meet the filter criteria.
Always test your queries before execution to avoid data loss.
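One low-risk way to get a feel for these rules is to try them on an in-memory array first. The functions below are hypothetical stand-ins written in plain JavaScript, with a predicate in place of a MongoDB filter document; they only mirror the deletedCount behaviour described above.

```javascript
// In-memory stand-ins for deleteOne()/deleteMany(); the filter is a predicate.
function deleteOne(collection, matches) {
  const index = collection.findIndex(matches);
  if (index === -1) return { deletedCount: 0 };
  collection.splice(index, 1); // remove only the first match
  return { deletedCount: 1 };
}

function deleteMany(collection, matches) {
  const before = collection.length;
  // Walk backwards so removing an element does not shift unvisited ones.
  for (let i = collection.length - 1; i >= 0; i--) {
    if (matches(collection[i])) collection.splice(i, 1);
  }
  return { deletedCount: before - collection.length };
}

const members = [
  { name: "John", age: 25 },
  { name: "Alice", age: 30 },
  { name: "Bob", age: 28 },
  { name: "John", age: 40 },
];
const one = deleteOne(members, (doc) => doc.name === "John"); // second John survives
const many = deleteMany(members, (doc) => doc.age > 29);
const everything = deleteMany(members, () => true); // like an empty filter {}
console.log(one, many, everything, members.length);
```

The last call shows why an empty filter deserves care: a predicate that matches everything empties the collection in one step.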
Select Column Query (Projection in MongoDB)
In MongoDB, Projection is used to select specific fields (columns) from documents in a collection rather than retrieving the entire document. This helps in optimizing performance by fetching only the required data.
What is Projection?
Projection is used with the find() or findOne() methods to filter which fields should be included or excluded in the query result.
Syntax:
db.collection.find(query, projection)
query: Specifies the filter to select documents.
projection: Specifies which fields should be included or excluded.
1. Include Specific Columns
To include specific fields, set the field value to 1.
Example:
Retrieve only the name and age of users:
db.users.find({}, {name: 1, age: 1, _id: 0})
Output:
{ name: "Alice", age: 30 }
{ name: "Bob", age: 28 }
Key Points:
By default, MongoDB always includes the _id field.
Use _id: 0 to exclude the _id field.
2. Exclude Specific Columns
To exclude specific fields, set the field value to 0.
Example:
Exclude the age field:
db.users.find({}, {age: 0})
Output:
{ _id: ObjectId("123"), name: "Alice" }
{ _id: ObjectId("124"), name: "Bob" }
Note: You cannot mix inclusion and exclusion in the same projection except for the _id field.
3. Nested Documents Projection
You can select or exclude fields from nested documents using dot notation.
Example:
Retrieve only the address.city field:
db.users.find({}, {"address.city": 1, _id: 0})
Output:
{ address: { city: "New York" } }
4. Projection with Conditions
Projection can be combined with filter conditions.
Example:
Retrieve only the name of users whose age is greater than 25:
db.users.find({age: {$gt: 25}}, {name: 1, _id: 0})
Output:
{ name: "Alice" }
{ name: "Bob" }
Differences at a Glance:
Type | Inclusion | Exclusion |
Syntax | { field: 1 } | { field: 0 } |
Mixed Fields | Not Allowed | Not Allowed |
_id Field | Can be excluded | Can be excluded |
Nested Fields | Supported | Supported |
When to Use Projection?
Use Projection to optimize query performance.
Use Include Projection to fetch only necessary data.
Use Exclude Projection when you need to hide sensitive data.
Projection plays a significant role in reducing network traffic and enhancing application performance in MongoDB.
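The inclusion, exclusion and _id rules can be summarised in a few lines of plain JavaScript. This is a rough model only: the project function is a hypothetical helper that handles top-level fields with 1 and 0 flags, and it does not cover dotted paths or projection operators.

```javascript
// Simplified model of a find() projection on top-level fields only.
// Dotted paths and projection operators are not modelled in this sketch.
function project(doc, projection) {
  const fields = Object.entries(projection).filter(([key]) => key !== "_id");
  const including =
    fields.some(([, flag]) => flag === 1) ||
    (fields.length === 0 && projection._id === 1);
  const excluding = fields.some(([, flag]) => flag === 0);
  if (including && excluding) {
    throw new Error("Cannot mix inclusion and exclusion (except for _id)");
  }

  const result = {};
  const keepId = projection._id !== 0; // _id is returned unless switched off
  for (const [key, value] of Object.entries(doc)) {
    if (key === "_id") {
      if (keepId) result._id = value;
    } else if (including ? projection[key] === 1 : projection[key] !== 0) {
      result[key] = value;
    }
  }
  return result;
}

const alice = { _id: 1, name: "Alice", age: 30, email: "alice@example.com" };
const included = project(alice, { name: 1, age: 1, _id: 0 });
const excluded = project(alice, { age: 0 });
console.log(included, excluded);
```

Passing a projection such as { name: 1, age: 0 } to this helper throws, mirroring the rule that inclusion and exclusion cannot be combined.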
Is MongoDB Really Schemaless?
MongoDB is often referred to as a schemaless database, but the term schemaless can be misleading. Understanding what schemaless means in MongoDB requires clarifying how MongoDB handles data structure.
What Does Schemaless Mean?
In traditional relational databases, data must follow a strict schema where each table has fixed columns with predefined data types. However, MongoDB offers more flexibility by not enforcing a predefined schema for its collections.
Schemaless in MongoDB means:
Documents in the same collection can have different fields.
No fixed structure is required during insertion.
New fields can be added dynamically without altering the existing documents.
Example:
In a relational database, every row in a table must follow the same structure:
Name | Age | Address |
Alice | 25 | New York |
Bob | 30 | Los Angeles |
In MongoDB, documents in the same collection can have different fields:
{ "name": "Alice", "age": 25, "address": "New York" }
{ "name": "Bob", "email": "bob@email.com" }
Is MongoDB Completely Schemaless?
MongoDB is not entirely schemaless because:
Collections can have Schema Validation rules using JSON Schema.
Some data consistency checks can be applied.
Fields like _id are always present.
Applications often enforce their own schema during data insertion.
How to Apply Schema Validation?
MongoDB provides Schema Validation using the validator option during collection creation.
Example:
db.createCollection("users", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email"],
properties: {
name: {
bsonType: "string",
description: "Name must be a string"
},
email: {
bsonType: "string",
description: "Email must be a string"
}
}
}
}
})
Pros and Cons of Schemaless Design
Pros | Cons |
High flexibility | Difficult to maintain data consistency |
Easy to modify | No built-in foreign key relationships |
Fast development | Complex validation requires custom code |
Conclusion:
MongoDB is schemaless by default but offers optional schema validation for data consistency. The flexible document model makes MongoDB ideal for dynamic applications, but developers must enforce schema rules at the application level or through validation to maintain data integrity.
Datatypes in MongoDB
MongoDB supports a wide range of data types to store different kinds of data within documents. Each field in a document can hold different types of data, making MongoDB a flexible NoSQL database.
List of Data Types in MongoDB
MongoDB supports the following data types:
Data Type | Description | Example |
String | Stores text data | "MongoDB" |
Integer | Stores numeric data (32-bit/64-bit) | 25 |
Double | Stores floating-point numbers | 45.67 |
Boolean | Stores true or false | true |
Array | Stores multiple values in a list | ["red", "green", "blue"] |
Object | Stores embedded/nested documents | { address: "New York" } |
ObjectId | Stores unique ID for documents | ObjectId("507f1f77bcf86cd799439011") |
Date | Stores a date and time (new Date() yields the current moment) | new Date() |
Null | Stores null value | null |
Binary Data | Stores binary data (images, files) | BinData() |
Regular Expression | Stores regex expressions | /pattern/ |
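Timestamp | Stores internal timestamps, used mainly by MongoDB itself (for example in replication) | Timestamp(1617685561) |
Decimal128 | Stores high-precision decimal numbers | Decimal128("99.99") |
Min/Max Keys | Special values that compare lower/higher than every other BSON value | MinKey() / MaxKey() |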
Examples of Data Types:
1. String
db.products.insertOne({name: "Laptop", brand: "Dell"})
2. Integer
db.products.insertOne({name: "Laptop", price: 50000})
3. Boolean
db.products.insertOne({name: "Laptop", available: true})
4. Array
db.products.insertOne({name: "Laptop", colors: ["Black", "Silver"]})
5. Object (Embedded Document)
db.products.insertOne({name: "Laptop", specifications: {RAM: "8GB", Storage: "512GB"}})
6. Date
db.orders.insertOne({orderDate: new Date()})
7. ObjectId
Every document automatically gets a unique _id field:
ObjectId("507f1f77bcf86cd799439011")
Important Notes:
MongoDB automatically assigns the _id field with the ObjectId type if not provided.
Dates are stored as 64-bit millisecond counts since the Unix epoch and displayed by the shell in ISODate format.
Arrays can hold mixed data types.
MongoDB uses BSON (Binary JSON) format internally to store data.
Conclusion
MongoDB provides a variety of data types to handle different forms of data efficiently. The dynamic schema and flexible data type support make MongoDB suitable for various applications, from simple text storage to complex nested documents.
How to delete database
Deleting a database in MongoDB is simple and can be done using the dropDatabase() method.
Syntax:
db.dropDatabase()
Steps to Delete Database:
Select the Database: Before deleting the database, you need to switch to the database you want to delete using the use command.
Example:
use mydatabase
This will switch the current database context to mydatabase.
Delete the Database: After switching, execute the following command:
db.dropDatabase()
Output:
{ "dropped": "mydatabase", "ok": 1 }
The "ok": 1 message indicates that the database was successfully deleted.
How db.dropDatabase() Works:
The db.dropDatabase() method deletes the currently selected database.
If no database is selected, MongoDB will default to the test database, which will be deleted.
This method drops all collections and their associated data within the selected database.
Important Points:
The dropDatabase() method will delete the entire database, including all collections and documents inside it.
You must switch to the database before executing the command.
Always backup your data before deleting the database.
If the selected database has no collections, MongoDB will still delete the empty database.
Example:
Delete a database named studentdb:
use studentdb
// Check the current database
db
// Drop the database
db.dropDatabase()
How to Verify Deletion?
To check if the database is deleted or not, use the following command:
show dbs
The deleted database name will no longer appear in the list.
Conclusion
MongoDB provides a simple method to delete databases using the dropDatabase() command. Always ensure that the correct database is selected before deletion to avoid accidental data loss.
Ordered option in insert command
The Ordered option in MongoDB allows you to control how MongoDB processes multiple insert operations within a single insertMany() command.
By default, MongoDB performs insert operations sequentially and stops on the first error. However, the ordered option allows you to decide whether MongoDB should continue inserting the remaining documents even if one document fails.
Syntax:
db.collection.insertMany([documents], { ordered: <boolean> })
Parameter | Description |
documents | Array of documents to insert |
ordered | Boolean value (true or false ) to specify the order of execution |
How Ordered Option Works
ordered: true (Default): MongoDB stops the insertion process when the first error occurs.
ordered: false: MongoDB continues inserting the remaining documents even if some documents fail.
Example:
Insert with ordered: true (Default Behavior):
db.products.insertMany([
{ name: "Laptop", price: 50000 },
{ name: "Mouse", price: "InvalidPrice" },
{ name: "Keyboard", price: 1500 }
], { ordered: true })
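Output: This example assumes the products collection has a schema validator that requires price to be a number; without one, MongoDB accepts the string value and inserts all three documents. With the validator in place, only the first document will be inserted, and the insertion will stop when the second document fails validation.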
Insert with ordered: false:
db.products.insertMany([
{ name: "Laptop", price: 50000 },
{ name: "Mouse", price: "InvalidPrice" },
{ name: "Keyboard", price: 1500 }
], { ordered: false })
Output:
The first and third documents will be inserted.
The second document will be skipped and reported as a write error, assuming a schema validator on the collection rejects its non-numeric price field.
Performance Impact
Ordered | Performance | Error Handling |
true | Slower | Stops on first error |
false | Faster | Continues inserting remaining documents |
When to Use:
Use ordered: true when data consistency is more important than speed.
Use ordered: false when performance is more critical, and you can tolerate partial inserts.
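The stop-or-continue behaviour can be modelled in a few lines of plain JavaScript. This is a sketch under stated assumptions, not the server's bulk write logic: insertMany here is a hypothetical in-memory function, a duplicate _id stands in for any per-document write error, and documents are processed strictly in sequence even in the unordered case.

```javascript
// In-memory model of insertMany() with the ordered option. A duplicate _id
// stands in for any per-document write error.
function insertMany(collection, docs, { ordered = true } = {}) {
  const insertedIds = [];
  const writeErrors = [];
  for (const [index, doc] of docs.entries()) {
    if (collection.some((existing) => existing._id === doc._id)) {
      writeErrors.push({ index, message: `duplicate _id ${doc._id}` });
      if (ordered) break; // ordered: stop at the first failure
      continue; // unordered: skip it and keep going
    }
    collection.push(doc);
    insertedIds.push(doc._id);
  }
  return { insertedIds, writeErrors };
}

const batch = [
  { _id: 1, name: "Laptop" },
  { _id: 1, name: "Mouse" }, // clashes with the first document
  { _id: 2, name: "Keyboard" },
];
const orderedResult = insertMany([], batch, { ordered: true });
const unorderedResult = insertMany([], batch, { ordered: false });
console.log(orderedResult.insertedIds, unorderedResult.insertedIds);
```

In both runs the failing document is reported with its position in the batch, which is what lets an application retry or log only the rejected entries.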
Conclusion
The Ordered option in MongoDB gives you better control over bulk insert operations. It helps balance between data consistency and performance depending on your application needs.
Schema Validation in MongoDB
Schema validation in MongoDB is used to enforce data integrity by specifying rules for document structure, field types, and constraints.
MongoDB uses JSON Schema validation to define the expected structure of documents inside collections.
Why Use Schema Validation?
Maintain data consistency.
Prevent incorrect data entries.
Enforce mandatory fields.
Define data types for fields.
How to Enable Schema Validation
Schema validation is applied during collection creation using the validator option.
Syntax:
db.createCollection("collection_name", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["field1", "field2"],
properties: {
field1: {
bsonType: "string",
description: "Field1 must be a string"
},
field2: {
bsonType: "int",
description: "Field2 must be an integer"
}
}
}
}
})
Example:
Create a users collection with schema validation:
db.createCollection("users", {
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "age"],
properties: {
name: {
bsonType: "string",
description: "Name must be a string"
},
age: {
bsonType: "int",
description: "Age must be an integer"
}
}
}
}
})
Testing Schema Validation
Insert valid and invalid documents:
Valid Document:
db.users.insertOne({ name: "Alice", age: 25 })
Invalid Document (Missing Required Field):
db.users.insertOne({ name: "Alice" })
Output:
MongoServerError: Document failed validation
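To see what such a rule set checks, the following plain JavaScript sketch applies a tiny subset of the schema to a document. It is an assumption-laden illustration rather than MongoDB's validator: the validate function is hypothetical and understands only required and a bsonType of "string" or "int".

```javascript
// Tiny subset of $jsonSchema: only `required` and per-field `bsonType`
// ("string" or "int") are checked. Real validation supports far more.
function validate(doc, schema) {
  const errors = [];
  for (const field of schema.required ?? []) {
    if (!(field in doc)) errors.push(`missing required field: ${field}`);
  }
  for (const [field, rule] of Object.entries(schema.properties ?? {})) {
    if (!(field in doc)) continue; // absence is handled by `required`
    const value = doc[field];
    const ok =
      rule.bsonType === "string" ? typeof value === "string"
      : rule.bsonType === "int" ? Number.isInteger(value)
      : true; // other types are not modelled here
    if (!ok) errors.push(`${field} must be of type ${rule.bsonType}`);
  }
  return errors;
}

const userSchema = {
  required: ["name", "age"],
  properties: { name: { bsonType: "string" }, age: { bsonType: "int" } },
};
console.log(validate({ name: "Alice", age: 25 }, userSchema)); // no errors
console.log(validate({ name: "Alice" }, userSchema)); // age is missing
console.log(validate({ name: "Alice", age: "25" }, userSchema)); // wrong type
```

A helper like this can also be useful on the application side to reject bad input before it reaches the database, with the collection validator acting as the final check.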
Update Schema Validation for Existing Collections
You can update schema validation rules on existing collections using the collMod command:
db.runCommand({
collMod: "users",
validator: {
$jsonSchema: {
bsonType: "object",
required: ["name", "email"],
properties: {
name: { bsonType: "string" },
email: { bsonType: "string" }
}
}
}
})
Validation Levels
Level | Description |
strict | Default. Validation rules apply to all inserts and all updates |
moderate | Validation rules apply to inserts and to updates of documents that already satisfy the rules; updates to existing documents that do not satisfy them are not checked |
Example:
db.createCollection("users", {
validator: { $jsonSchema: { bsonType: "object" } },
validationLevel: "strict"
})
Conclusion
Schema validation in MongoDB ensures that documents follow a consistent structure, improving data reliability and application stability. It is a powerful feature that helps maintain data quality in dynamic applications.
Write concern in MongoDB
Write Concern in MongoDB defines the level of acknowledgment the server must receive from the replica set members before considering a write operation successful.
It helps ensure data durability and consistency across MongoDB deployments.
Why Use Write Concern?
Ensure data reliability.
Define acknowledgment levels for writes.
Balance between performance and durability.
Prevent data loss in distributed systems.
Write Concern Parameters
MongoDB provides several levels of write concern acknowledgment:
Option | Description |
w: 0 | No acknowledgment from the server (Fastest but unsafe) |
w: 1 | Acknowledgment from Primary server only |
w: 2 | Acknowledgment from Primary + 1 Secondary |
w: majority | Acknowledgment from Majority of Replica Set Members |
Syntax:
Write concern can be applied in the insert, update, and delete operations using the writeConcern option.
db.collection.insertOne(
{ name: "John", age: 30 },
{ writeConcern: { w: 1, j: true, wtimeout: 1000 } }
)
Explanation:
w: Number of nodes (or "majority") that must acknowledge the write.
j: Wait for the write to be committed to the journal.
wtimeout: Time in milliseconds to wait for acknowledgment before throwing an error.
Example:
1. Write Concern with w: 1
db.users.insertOne(
{ name: "Alice", age: 25 },
{ writeConcern: { w: 1 } }
)
Output:
{ acknowledged: true, insertedId: ObjectId("...") }
2. Write Concern with w: majority
db.users.insertOne(
{ name: "Bob", age: 28 },
{ writeConcern: { w: "majority", wtimeout: 5000 } }
)
Output:
{ acknowledged: true, insertedId: ObjectId("...") }
Unacknowledged Write Concern (w: 0)
If you want the fastest performance without waiting for acknowledgment:
db.logs.insertOne(
{ message: "Server Restarted" },
{ writeConcern: { w: 0 } }
)
Note: This is not recommended for critical data.
Journaled Write Concern
To ensure the write is committed to the journal before acknowledgment:
db.orders.insertOne(
{ item: "Laptop", qty: 1 },
{ writeConcern: { w: 1, j: true } }
)
Write Timeout
Set a timeout period to avoid indefinite waiting:
db.transactions.insertOne(
{ user: "David", amount: 1000 },
{ writeConcern: { w: 2, wtimeout: 3000 } }
)
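How w and wtimeout interact can be reasoned about with a small model. The JavaScript below is a sketch, not driver or server code: requiredAcks and outcome are hypothetical helpers, the majority is computed as a simple floor(n / 2) + 1 over the voting members (arbiters and other special cases are ignored), and ackTimesMs is an assumed list of the times at which each member confirms the write.

```javascript
// Model of when a write concern is satisfied. Member counts include the primary.
function requiredAcks(w, votingMembers) {
  if (w === "majority") return Math.floor(votingMembers / 2) + 1;
  return w; // numeric w: that many members must confirm
}

// ackTimesMs: time (ms after the write) at which each member confirms it.
function outcome(w, ackTimesMs, votingMembers, wtimeout = Infinity) {
  const needed = requiredAcks(w, votingMembers);
  if (needed === 0) return "unacknowledged"; // w: 0 does not wait at all
  const sorted = [...ackTimesMs].sort((a, b) => a - b);
  const satisfiedAt = sorted[needed - 1] ?? Infinity; // too few members: never
  if (satisfiedAt === Infinity) {
    return wtimeout === Infinity ? "blocked" : "timeout";
  }
  return satisfiedAt <= wtimeout ? "acknowledged" : "timeout";
}

const acks = [5, 40, 900]; // primary, fast secondary, slow secondary
console.log(outcome(1, acks, 3)); // primary alone is enough
console.log(outcome("majority", acks, 3, 5000)); // two of three members
console.log(outcome(3, acks, 3, 500)); // slow secondary misses the deadline
console.log(outcome(2, [5], 3)); // no secondary ever answers, no wtimeout set
```

A timeout in this model, as in MongoDB, only means the acknowledgment did not arrive in time; the write itself is not undone on the members that already applied it.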
Conclusion
Write Concern in MongoDB provides flexibility and reliability in data storage by controlling acknowledgment levels. Choosing the appropriate write concern depends on your application's balance between performance and data durability.
Atomicity in MongoDB
Atomicity in MongoDB ensures that certain operations are performed as all-or-nothing transactions. This means that either the entire operation is successfully applied or none of it is applied, preventing partial changes.
MongoDB provides atomicity at the document level by default.
What is Atomicity?
Atomicity guarantees that a group of operations are treated as a single unit. If one operation in the group fails, the entire group is rolled back, leaving the database unchanged.
Is MongoDB Fully Atomic?
Single Document Operations: MongoDB guarantees atomic operations on single documents.
Multi-Document Operations: MongoDB does not guarantee atomicity by default but provides transactions to achieve atomicity across multiple documents.
How Atomicity Works on Single Document Operations
Single document operations like insert, update, and delete are atomic.
Example:
db.users.updateOne(
{ _id: 1 },
{ $set: { balance: 1000 } }
)
If this update operation is interrupted, the document will not be partially updated.
Atomicity in Embedded Documents
Atomicity also applies to embedded documents (Nested Documents).
Example:
db.orders.updateOne(
{ _id: 101 },
{ $set: { address: { city: "New York", zip: "10001" } } }
)
Both the city and zip fields will be updated together or not at all.
Multi-Document Transactions
MongoDB supports multi-document transactions on replica sets from version 4.0 and on sharded clusters from version 4.2. They are not available on a standalone server.
Syntax for Multi-Document Transactions:
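// Start a session and get a database handle bound to it ("mydatabase" is a placeholder name)
const session = db.getMongo().startSession()
const sessionDb = session.getDatabase("mydatabase")
session.startTransaction()
try {
  // Run the operations through the session-bound handle so they join the transaction
  sessionDb.orders.insertOne({ item: "Laptop", qty: 1 })
  sessionDb.payments.insertOne({ user: "John", amount: 50000 })
  session.commitTransaction()
} catch (error) {
  session.abortTransaction()
} finally {
  session.endSession()
}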
In this example, either both the orders and payments collections are updated or neither is.
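The all-or-nothing idea can be sketched in plain JavaScript, without a database. This is only a model of the behaviour, not how MongoDB implements transactions; the helper runAtomically and the in-memory state object are hypothetical.

```javascript
// Plain-JavaScript model of all-or-nothing behaviour (not the MongoDB API).
// Hypothetical helper: applies every step to a copy and keeps it only if all steps succeed.
function runAtomically(state, steps) {
  const draft = JSON.parse(JSON.stringify(state)); // work on a private copy
  try {
    for (const step of steps) step(draft);
    return { committed: true, state: draft };
  } catch (error) {
    return { committed: false, state };            // original state is untouched
  }
}

const before = { orders: [], payments: [] };
const failed = runAtomically(before, [
  (s) => s.orders.push({ item: "Laptop", qty: 1 }),
  () => { throw new Error("payment failed"); },
]);
console.log(failed.committed);           // false
console.log(failed.state.orders.length); // 0, the order insert was discarded too
```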
When to Use Transactions
Banking applications (Money transfers)
Inventory management
E-commerce orders
Booking systems
Important Points
Feature | Supported |
Single Document Atomicity | ✅ |
Multi-Document Transactions | ✅ (Replica sets from MongoDB 4.0, sharded clusters from 4.2) |
Rollback in Transactions | ✅ |
Cross-Collection Transactions | ✅ (Replica sets and sharded clusters) |
Conclusion
Atomicity in MongoDB ensures data consistency and integrity by guaranteeing all-or-nothing execution on documents. With the introduction of multi-document transactions, MongoDB can now support complex ACID-compliant operations across multiple collections.
MongoImport in MongoDB (Import JSON in MongoDB)
MongoImport is a command-line tool provided by MongoDB to import data from various formats like JSON, CSV, and TSV into MongoDB collections.
Why Use MongoImport?
Import bulk data into MongoDB.
Migrate data from external sources.
Automate data seeding during project setup.
Reload data exported with mongoexport (for full backups, mongodump and mongorestore are the recommended tools).
Syntax
mongoimport "<file_location>" -d <database_name> -c <collection_name> --jsonArray --drop
Option | Description |
-d | Target database name |
-c | Target collection name |
--jsonArray | Import JSON array documents |
--drop | Drops existing collection before import |
Import JSON File into MongoDB using VS Code
Create JSON File in VS Code
Open VS Code.
Create a file named users.json.
Add the following data:
[
{"name": "Alice", "age": 25},
{"name": "Bob", "age": 30}
]
Run MongoImport Command
Open Command Prompt or Terminal.
Navigate to the file location.
Execute the following command:
mongoimport "C:\Users\username\Desktop\users.json" -d mydatabase -c users --jsonArray --drop
Output:
connected to: mongodb://localhost:27017/
imported 2 documents
Drop Existing Collection Before Import
The --drop option deletes the existing collection before importing the new data.
Import CSV File
MongoImport can also import CSV files:
- Create products.csv in VS Code:
name,price
Laptop,50000
Mouse,700
- Import CSV File:
mongoimport "C:\Users\username\Desktop\products.csv" -d mydatabase -c products --type csv --headerline --drop
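As a rough illustration of what --headerline means, the following plain-JavaScript sketch turns CSV text into documents by using the first line as field names. It is not how mongoimport is implemented: the function csvToDocuments is hypothetical, the comma split is naive (no quoted fields), and the rule that numeric-looking values become numbers is a simplification.

```javascript
// Hypothetical helper: use the first CSV line as field names, one document per remaining line.
function csvToDocuments(text) {
  const [header, ...rows] = text.trim().split(/\r?\n/);
  const fields = header.split(",");
  return rows.map((row) => {
    const values = row.split(",");
    const doc = {};
    fields.forEach((field, i) => {
      const value = values[i];
      // Simplification: numeric-looking values become numbers, everything else stays a string
      const isNumeric = value !== "" && !Number.isNaN(Number(value));
      doc[field] = isNumeric ? Number(value) : value;
    });
    return doc;
  });
}

const docs = csvToDocuments("name,price\nLaptop,50000\nMouse,700");
console.log(docs); // two documents with a string name and a numeric price
```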
MongoImport with Authentication
If your MongoDB server requires authentication:
mongoimport "C:\Users\username\Desktop\users.json" -d mydatabase -c users --jsonArray --drop --username admin --password admin123
Then start the shell with mongosh and query the collection to check the imported data.
Conclusion
MongoImport is a powerful utility to quickly import data into MongoDB from various file formats like JSON, CSV, and TSV. It simplifies data migration and seeding, making it a handy tool when loading large datasets.
Comparison operators ( $eq, $ne, $lt, $gt, $lte, $gte, $in & $nin )
MongoDB provides Comparison Operators to filter documents based on specific conditions in queries. These operators are used in the find() method to compare field values.
List of Comparison Operators
Operator | Description | Example Usage |
$eq | Matches values that are equal | { age: { $eq: 25 } } |
$ne | Matches values that are not equal | { age: { $ne: 30 } } |
$lt | Matches values less than | { age: { $lt: 25 } } |
$gt | Matches values greater than | { age: { $gt: 25 } } |
$lte | Matches values less than or equal | { age: { $lte: 30 } } |
$gte | Matches values greater than or equal | { age: { $gte: 18 } } |
$in | Matches any value from an array of values | { age: { $in: [25, 30, 35] } } |
$nin | Matches any value not in an array | { age: { $nin: [25, 30] } } |
Examples
1. $eq (Equal To)
Find users with age equal to 25:
db.users.find({ age: { $eq: 25 } })
2. $ne (Not Equal To)
Find users whose age is not equal to 30:
db.users.find({ age: { $ne: 30 } })
3. $lt (Less Than)
Find users younger than 25:
db.users.find({ age: { $lt: 25 } })
4. $gt (Greater Than)
Find users older than 25:
db.users.find({ age: { $gt: 25 } })
5. $lte (Less Than or Equal To)
Find users whose age is 30 or younger:
db.users.find({ age: { $lte: 30 } })
6. $gte (Greater Than or Equal To)
Find users whose age is 18 or older:
db.users.find({ age: { $gte: 18 } })
7. $in
Find users whose age is either 25, 30, or 35:
db.users.find({ age: { $in: [25, 30, 35] } })
8. $nin
Find users whose age is not 25, 30, or 35:
db.users.find({ age: { $nin: [25, 30, 35] } })
Combine Multiple Operators
You can list several comparison operators for the same field; MongoDB combines them with an implicit AND.
Example: Find users between 25 and 30 years old:
db.users.find({ age: { $gte: 25, $lte: 30 } })
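The operator semantics can be modelled in a few lines of plain JavaScript. This is a sketch for intuition only, not the MongoDB query engine: the names comparisons and matchesField are hypothetical, the sample users are made up, and the model only handles simple scalar values.

```javascript
// Plain-JavaScript model of the comparison operators (scalar values only).
const comparisons = {
  $eq: (value, arg) => value === arg,
  $ne: (value, arg) => value !== arg,
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $in: (value, arg) => arg.includes(value),
  $nin: (value, arg) => !arg.includes(value),
};

// Every operator listed for a field must hold (implicit AND).
function matchesField(value, condition) {
  return Object.entries(condition).every(([op, arg]) => comparisons[op](value, arg));
}

const sampleUsers = [
  { name: "A", age: 22 },
  { name: "B", age: 25 },
  { name: "C", age: 30 },
  { name: "D", age: 41 },
];
const inRange = sampleUsers.filter((u) => matchesField(u.age, { $gte: 25, $lte: 30 }));
console.log(inRange.map((u) => u.name)); // B and C
```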
Conclusion
MongoDB's Comparison Operators allow flexible and powerful filtering of documents based on field values. These operators are essential for querying data in MongoDB efficiently.
Logical Operators( $not, $and, $or & $nor)
MongoDB provides Logical Operators to combine multiple query conditions or negate certain conditions. These operators are primarily used in the find() method to filter documents.
List of Logical Operators
Operator | Description | Example Usage |
$and | Joins multiple conditions (Both must be true) | { $and: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] } |
$or | Joins multiple conditions (Any one must be true) | { $or: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] } |
$not | Negates a condition | { age: { $not: { $gt: 25 } } } |
$nor | Joins multiple conditions (None should be true) | { $nor: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] } |
1. $and
The $and operator filters documents where all conditions must be true.
Example: Find users older than 25 who live in Kolkata:
db.users.find({ $and: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] })
2. $or
The $or operator filters documents where at least one condition must be true.
Example: Find users who are either older than 25 or live in Kolkata:
db.users.find({ $or: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] })
3. $not
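The $not operator negates a condition. It also matches documents that do not contain the field at all.
Example: Find users who are not older than 25 (including users with no age field):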
db.users.find({ age: { $not: { $gt: 25 } } })
4. $nor
The $nor operator filters documents where none of the conditions are true.
Example: Find users who are neither older than 25 nor live in Kolkata:
db.users.find({ $nor: [ { age: { $gt: 25 } }, { city: "Kolkata" } ] })
Combine Logical Operators
You can combine multiple logical operators in a single query.
Example: Find users whose age is greater than 25 but not from Kolkata:
db.users.find({ $and: [ { age: { $gt: 25 } }, { city: { $not: { $eq: "Kolkata" } } } ] })
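To see how the four operators differ, the sketch below models them as ordinary boolean logic over an in-memory array. It is plain JavaScript rather than a MongoDB query, and the sample people and helper names are invented for illustration. Note that Meera has no age field, which is why the $not case still selects her.

```javascript
// Two conditions expressed as predicates
const olderThan25 = (u) => u.age > 25;          // models { age: { $gt: 25 } }
const inKolkata = (u) => u.city === "Kolkata";  // models { city: "Kolkata" }

const people = [
  { name: "Asha", age: 31, city: "Kolkata" },
  { name: "Ravi", age: 22, city: "Delhi" },
  { name: "Meera", city: "Kolkata" },           // no age field
];
const pick = (pred) => people.filter(pred).map((u) => u.name);

const andResult = pick((u) => olderThan25(u) && inKolkata(u));    // $and
const orResult = pick((u) => olderThan25(u) || inKolkata(u));     // $or
const notResult = pick((u) => !olderThan25(u));                   // $not
const norResult = pick((u) => !(olderThan25(u) || inKolkata(u))); // $nor

console.log(andResult); // Asha
console.log(orResult);  // Asha, Meera
console.log(notResult); // Ravi, Meera
console.log(norResult); // Ravi
```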
Conclusion
Logical operators in MongoDB allow you to build complex queries by combining multiple conditions. These operators provide greater flexibility in filtering documents and are essential for advanced queries.
Mastering MongoDB: Understanding the $exists and $type Operators
MongoDB provides powerful query operators to filter documents based on the existence of fields and their data types. Two essential operators are $exists and $type.
1. $exists Operator
The $exists operator checks whether a specified field exists in a document or not.
Syntax
{ field: { $exists: <boolean> } }
true: Selects documents that contain the field, including documents where its value is null.
false: Selects documents that do not contain the field.
Example
Find users who have an email field:
db.users.find({ email: { $exists: true } })
Find users who do not have an email field:
db.users.find({ email: { $exists: false } })
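The difference between a missing field and a field set to null is easy to get wrong, so here is a plain-JavaScript model of it. A field whose value is null still counts as existing. The sample accounts and the helper hasField are hypothetical.

```javascript
const accounts = [
  { name: "Alice", email: "alice@example.com" },
  { name: "Bob", email: null },   // field present, value null
  { name: "Carol" },              // field absent
];

// Models $exists: true means the key is present, whatever its value
const hasField = (doc, field) => Object.prototype.hasOwnProperty.call(doc, field);

const withEmail = accounts.filter((a) => hasField(a, "email")).map((a) => a.name);
const withoutEmail = accounts.filter((a) => !hasField(a, "email")).map((a) => a.name);

console.log(withEmail);    // Alice, Bob
console.log(withoutEmail); // Carol
```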
2. $type Operator
The $type operator selects documents where the field is of a specified BSON data type.
Syntax
{ field: { $type: <type> } }
Common Data Type Aliases
Type | Description |
string | String |
int | 32-bit Integer |
double | Double (Floating Point) |
bool | Boolean |
array | Array |
object | Embedded Document |
date | Date |
Example
Find users where the age field is an integer:
db.users.find({ age: { $type: "int" } })
Find users where the status field is a boolean:
db.users.find({ status: { $type: "bool" } })
Combine $exists and $type
You can combine both operators to filter documents based on field existence and type.
Example: Find users where the email field exists and is of type string:
db.users.find({ email: { $exists: true, $type: "string" } })
Conclusion
The $exists and $type operators allow fine-tuned control over document filtering by verifying both the existence and data type of fields. These operators are highly useful when working with unstructured or dynamic data in MongoDB.
From Beginner to Pro: Querying Arrays in MongoDB
MongoDB provides several methods to query array fields efficiently. Arrays can store multiple values and documents, making them highly flexible for data storage.
1. Query Array Elements by Exact Match
MongoDB can match an array field with exact array content.
Example
Find users whose skills exactly match ['Python', 'MongoDB']:
db.users.find({ skills: ['Python', 'MongoDB'] })
This query will match documents with the exact array order and values.
2. Query Array Elements with $in
The $in operator matches documents where the array contains any of the specified elements.
Example
Find users with Python or Java skills:
db.users.find({ skills: { $in: ['Python', 'Java'] } })
3. Query Array Elements with $all
The $all operator matches documents where the array contains all specified elements, regardless of order.
Example
Find users with both Python and MongoDB skills:
db.users.find({ skills: { $all: ['Python', 'MongoDB'] } })
4. Query Array Elements with $size
The $size operator matches arrays with an exact number of elements.
Example
Find users with exactly 2 skills:
db.users.find({ skills: { $size: 2 } })
5. Query Array Elements with $elemMatch
The $elemMatch operator matches documents in which at least one array element satisfies all the specified conditions.
Example
Find users who have projects with a budget greater than 5000 and approved status:
db.users.find({ projects: { $elemMatch: { budget: { $gt: 5000 }, status: "approved" } } })
6. Query Specific Array Index
MongoDB allows querying elements by index position.
Example
Find users where the first element in the skills array is Python:
db.users.find({ "skills.0": "Python" })
7. Combination Queries with Arrays
You can combine array operators with logical operators like $and, $or, and $not.
Example
Find users who know Python and have more than 2 skills. $size accepts only an exact number, not a range, so the query checks that a third element (index 2) exists:
db.users.find({ $and: [ { skills: "Python" }, { "skills.2": { $exists: true } } ] })
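For intuition, the array operators in this section can be modelled with ordinary JavaScript array methods. This is only a sketch over made-up in-memory data; the variable names are hypothetical and it does not call MongoDB.

```javascript
const members = [
  { name: "John", skills: ["Python", "MongoDB"] },
  { name: "Alice", skills: ["MongoDB", "Python", "Java"] },
  { name: "Bob", skills: ["Java"] },
];
const select = (pred) => members.filter(pred).map((m) => m.name);

// $all: every listed value is present, order ignored
const hasAll = select((m) => ["Python", "MongoDB"].every((s) => m.skills.includes(s)));
// $in: at least one listed value is present
const hasAny = select((m) => ["Python", "Java"].some((s) => m.skills.includes(s)));
// $size: exact length
const exactlyTwo = select((m) => m.skills.length === 2);
// "skills.0": value at a specific index
const pythonFirst = select((m) => m.skills[0] === "Python");
// "skills.2" exists: more than 2 elements
const moreThanTwo = select((m) => m.skills.length > 2);

console.log(hasAll, hasAny, exactlyTwo, pythonFirst, moreThanTwo);
```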
Conclusion
MongoDB's array query operators offer flexible and powerful ways to query data stored in arrays. Whether you're searching for exact matches or filtering elements by conditions, mastering these operators will help you build efficient MongoDB queries.
Advanced Update ( $inc, $min, $max, $mul, $unset, $rename & Upsert )
MongoDB provides Advanced Update Operators that allow you to modify documents dynamically without replacing entire documents. These operators are useful for incrementing values, setting fields, renaming fields, or even removing them.
1. $inc Operator
The $inc operator increments the value of a field by a specified amount.
Syntax
{ $inc: { field: value } }
Example
Increment the age by 2:
db.users.updateOne({ name: "John" }, { $inc: { age: 2 } })
2. $min Operator
The $min operator updates the field only if the specified value is less than the current field value.
Syntax
{ $min: { field: value } }
Example
Set the price to the lower value between 200 and the current price:
db.products.updateOne({ product: "Laptop" }, { $min: { price: 200 } })
3. $max Operator
The $max operator updates the field only if the specified value is greater than the current field value.
Syntax
{ $max: { field: value } }
Example
Update the salary only if the new value is higher:
db.employees.updateOne({ name: "Alice" }, { $max: { salary: 10000 } })
4. $mul Operator
The $mul operator multiplies the value of the field by a specified number.
Syntax
{ $mul: { field: value } }
Example
Double the price of a product:
db.products.updateOne({ product: "Phone" }, { $mul: { price: 2 } })
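The four numeric operators above can be summarised with a small plain-JavaScript model. It is a sketch, not MongoDB code: updateOps and applyUpdate are hypothetical names, and the model assumes the target field already exists and holds a number.

```javascript
// Model of the numeric update operators; each returns the new field value.
const updateOps = {
  $inc: (current, arg) => current + arg,
  $mul: (current, arg) => current * arg,
  $min: (current, arg) => (arg < current ? arg : current), // only lowers the value
  $max: (current, arg) => (arg > current ? arg : current), // only raises the value
};

// Returns an updated copy; assumes every targeted field exists and is numeric.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [op, fields] of Object.entries(update)) {
    for (const [field, arg] of Object.entries(fields)) {
      result[field] = updateOps[op](result[field], arg);
    }
  }
  return result;
}

const laptop = { product: "Laptop", price: 500 };
console.log(applyUpdate(laptop, { $min: { price: 200 } }).price); // 200
console.log(applyUpdate(laptop, { $min: { price: 900 } }).price); // 500 (unchanged)
console.log(applyUpdate(laptop, { $max: { price: 900 } }).price); // 900
console.log(applyUpdate(laptop, { $mul: { price: 2 } }).price);   // 1000
console.log(applyUpdate(laptop, { $inc: { price: -50 } }).price); // 450
```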
5. $unset Operator
The $unset operator removes the specified field from the document.
Syntax
{ $unset: { field: "" } }
Example
Remove the email field from the user document:
db.users.updateOne({ name: "John" }, { $unset: { email: "" } })
6. $rename Operator
The $rename operator renames a field to the specified name.
Syntax
{ $rename: { oldField: "newField" } }
Example
Rename name field to fullName:
db.users.updateOne({ name: "John" }, { $rename: { name: "fullName" } })
7. Upsert Option
The upsert option updates a document if it exists, or inserts a new document if it doesn't.
Syntax
db.collection.updateOne(query, update, { upsert: true })
Example
Insert a user if the name Alice doesn't exist:
db.users.updateOne({ name: "Alice" }, { $set: { age: 25 } }, { upsert: true })
Conclusion
Advanced update operators in MongoDB provide efficient methods for modifying documents. Whether you're incrementing values, renaming fields, or removing fields, these operators make data manipulation more dynamic and performance-friendly.
Update Nested Arrays and Use $pop, $pull, $push and $addToSet Operators
MongoDB provides special operators to manipulate array fields within documents. These operators allow you to add, remove, or modify elements in arrays, including nested arrays.
1. Update Nested Arrays
To update nested arrays, MongoDB uses dot notation to access array elements inside embedded documents.
Example
Update the first skill in the skills array:
db.users.updateOne({ name: "John" }, { $set: { "skills.0": "NodeJS" } })
2. $push Operator
The $push operator appends a value to an array.
Syntax
{ $push: { field: value } }
Example
Add MongoDB to the skills array:
db.users.updateOne({ name: "John" }, { $push: { skills: "MongoDB" } })
3. $addToSet Operator
The $addToSet operator adds a value to an array only if the value does not already exist.
Syntax
{ $addToSet: { field: value } }
Example
Add Python to the skills array (if not already present):
db.users.updateOne({ name: "John" }, { $addToSet: { skills: "Python" } })
4. $pop Operator
The $pop operator removes an element from an array based on its position:
1: Removes the last element.
-1: Removes the first element.
Syntax
{ $pop: { field: 1 or -1 } }
Example
Remove the last element from the skills array:
db.users.updateOne({ name: "John" }, { $pop: { skills: 1 } })
5. $pull Operator
The $pull operator removes all elements from an array that match a specified condition.
Syntax
{ $pull: { field: value } }
Example
Remove Python from the skills array:
db.users.updateOne({ name: "John" }, { $pull: { skills: "Python" } })
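The four array operators behave like the following plain-JavaScript functions. This is a model for intuition only, with hypothetical helper names, and it handles simple values rather than the condition documents that $pull can also accept.

```javascript
// Each helper returns a new array, mirroring what the operator does to the stored array.
const push = (arr, value) => [...arr, value];                                     // $push
const addToSet = (arr, value) => (arr.includes(value) ? arr : [...arr, value]);   // $addToSet
const pop = (arr, direction) => (direction === 1 ? arr.slice(0, -1) : arr.slice(1)); // $pop
const pull = (arr, value) => arr.filter((item) => item !== value);                // $pull

let skillList = ["Python", "NodeJS"];
skillList = push(skillList, "MongoDB");    // Python, NodeJS, MongoDB
skillList = addToSet(skillList, "Python"); // unchanged, Python is already present
skillList = push(skillList, "Python");     // $push allows the duplicate
skillList = pull(skillList, "Python");     // removes every Python: NodeJS, MongoDB
skillList = pop(skillList, -1);            // removes the first element: MongoDB
console.log(skillList);
```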
Combining Operators
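You can combine multiple operators in a single update query, provided each operator targets a different field. MongoDB rejects an update in which two operators modify the same field, so adding to and removing from the same array takes two updates.
Example
Add Java to the skills array, then remove Python:
db.users.updateOne({ name: "John" }, { $push: { skills: "Java" } })
db.users.updateOne({ name: "John" }, { $pull: { skills: "Python" } })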
Conclusion
MongoDB's array update operators provide powerful ways to manipulate arrays dynamically. Whether you need to append, remove, or ensure uniqueness, these operators offer flexible solutions for array-based fields.
Master MongoDB Indexing
Indexing is a powerful feature in MongoDB that improves the performance of queries by creating data structures that allow faster search operations.
1. What is Indexing?
Indexes are special data structures that store a small portion of the dataset in an easy-to-traverse form. Indexes enhance query performance by reducing the amount of data that MongoDB needs to scan.
Without indexes, MongoDB performs a collection scan by searching every document in a collection, which can be time-consuming for large datasets.
2. Creating an Index
You can create an index using the createIndex() method.
Syntax
db.collection.createIndex({ field: 1 })
1: Ascending Order
-1: Descending Order
Example
Create an index on the name field in ascending order:
db.users.createIndex({ name: 1 })
3. View Indexes
To view all indexes in a collection:
db.collection.getIndexes()
Example
db.users.getIndexes()
4. Drop Index
To remove an index, use the dropIndex() method.
Syntax
db.collection.dropIndex({ field: 1 })
Example
Remove the index on the name field:
db.users.dropIndex({ name: 1 })
5. Types of Indexes
MongoDB supports several types of indexes:
Type | Description |
Single Field Index | Index on a single field |
Compound Index | Index on multiple fields |
Multikey Index | Index on array fields |
Text Index | Index for text search |
Unique Index | Ensures unique field values |
Sparse Index | Index only on documents with the field |
TTL Index | Automatically deletes documents after a period |
6. Unique Index
The Unique Index ensures that no two documents have the same value for the indexed field.
Example
Create a unique index on the email field:
db.users.createIndex({ email: 1 }, { unique: true })
7. Compound Index
A Compound Index includes multiple fields and can improve performance when filtering by multiple criteria.
Example
Create an index on name and age fields:
db.users.createIndex({ name: 1, age: -1 })
8. Multikey Index
MongoDB automatically makes an index multikey when the indexed field holds array values; no special option is needed.
Example
Create an index on the tags array field:
db.products.createIndex({ tags: 1 })
9. Text Index
Text indexes allow full-text search on string fields.
Example
Create a text index on the description field:
db.products.createIndex({ description: "text" })
10. TTL Index
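A TTL (Time-To-Live) index automatically removes documents a set number of seconds after the date stored in the indexed field. The field must hold a date (or an array of dates), and a background task performs the removal about once a minute, so deletion is not instantaneous.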
Example
Automatically delete documents after 3600 seconds (1 hour):
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
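The expiry rule itself is simple date arithmetic, sketched here in plain JavaScript. The function isExpired and the sample document are hypothetical; in MongoDB the actual removal is done by a background task, so a document can outlive its expiry time briefly.

```javascript
// A document becomes eligible for removal once createdAt + expireAfterSeconds has passed.
function isExpired(doc, expireAfterSeconds, now) {
  const expiresAt = doc.createdAt.getTime() + expireAfterSeconds * 1000;
  return now.getTime() >= expiresAt;
}

const sessionDoc = { createdAt: new Date("2024-01-01T10:00:00Z") };
console.log(isExpired(sessionDoc, 3600, new Date("2024-01-01T10:30:00Z"))); // false
console.log(isExpired(sessionDoc, 3600, new Date("2024-01-01T11:00:01Z"))); // true
```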
Conclusion
Indexing is essential for optimizing query performance in MongoDB. By using different types of indexes, you can improve query efficiency and ensure data integrity. Always choose the right type of index based on your application's requirements.
MongoDB Aggregation Guide
Aggregation in MongoDB is a powerful framework for processing and transforming data within collections. It performs data aggregation operations like filtering, grouping, sorting, and reshaping.
1. What is Aggregation?
Aggregation operations process data records and return computed results. MongoDB's aggregation pipeline provides an efficient and flexible way to perform complex data transformations.
2. Aggregation Pipeline
The Aggregation Pipeline is a sequence of stages, where each stage performs a specific operation on documents.
Basic Syntax
db.collection.aggregate([ { stage1 }, { stage2 }, ... ])
3. Stages in Aggregation Pipeline
Stage | Description |
$match | Filters documents |
$project | Selects specific fields |
$group | Groups documents |
$sort | Sorts documents |
$limit | Limits the number of documents |
$skip | Skips documents |
$lookup | Performs joins |
$unwind | Deconstructs arrays |
$out | Writes documents to a collection |
4. $match
Filters documents based on specified criteria.
Example
Get users older than 25:
db.users.aggregate([
{ $match: { age: { $gt: 25 } } }
])
5. $project
Select specific fields from documents.
Example
Display only name and age fields:
db.users.aggregate([
{ $project: { name: 1, age: 1, _id: 0 } }
])
6. $group
Groups documents by a specified field and performs aggregations.
Example
Count users by age:
db.users.aggregate([
{ $group: { _id: "$age", totalUsers: { $sum: 1 } } }
])
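What the $group stage above computes can be modelled in plain JavaScript with a Map. This is a sketch over made-up data, not the aggregation engine, and the real stage does not guarantee any particular output order.

```javascript
const groupInput = [
  { name: "John", age: 25 },
  { name: "Alice", age: 30 },
  { name: "Bob", age: 25 },
];

// Count documents per distinct age, like { $group: { _id: "$age", totalUsers: { $sum: 1 } } }
const counts = new Map();
for (const user of groupInput) {
  counts.set(user.age, (counts.get(user.age) ?? 0) + 1);
}
const grouped = [...counts].map(([age, totalUsers]) => ({ _id: age, totalUsers }));
console.log(grouped); // one entry for age 25 (2 users) and one for age 30 (1 user)
```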
7. $sort
Sorts documents in ascending or descending order.
Example
Sort users by age:
db.users.aggregate([
{ $sort: { age: 1 } }
])
8. $limit
Limits the number of documents returned.
Example
Return only the first 3 documents:
db.users.aggregate([
{ $limit: 3 }
])
9. $skip
Skips a specified number of documents.
Example
Skip the first 2 documents:
db.users.aggregate([
{ $skip: 2 }
])
10. $lookup
Performs joins with other collections.
Example
Join orders collection with users collection:
db.orders.aggregate([
{
$lookup: {
from: "users",
localField: "userId",
foreignField: "_id",
as: "userDetails"
}
}
])
11. $unwind
Deconstructs an array field into multiple documents.
Example
Unwind tags array:
db.products.aggregate([
{ $unwind: "$tags" }
])
12. $out
Writes the aggregation result into a collection, replacing the collection if it already exists. $out must be the last stage in the pipeline.
Example
Save aggregation results into topUsers collection:
db.users.aggregate([
{ $match: { age: { $gt: 25 } } },
{ $out: "topUsers" }
])
Conclusion
Aggregation in MongoDB provides powerful tools for data transformation, filtering, and analysis. By chaining multiple stages in the aggregation pipeline, you can perform complex operations efficiently.
$bucket operator in MongoDB
The $bucket operator in MongoDB is used within the aggregation pipeline to categorize documents into specified groups or ranges based on a particular field's value.
1. What is $bucket?
The $bucket operator divides documents into groups or buckets based on defined boundaries. It works like a histogram, where documents are grouped into specified ranges.
Syntax
{
$bucket: {
groupBy: <expression>,
boundaries: [ <lower_bound>, <upper_bound>, ... ],
default: <bucket_name>,
output: { <field1>: { <accumulator> }, <field2>: { <accumulator> } }
}
}
Field | Description |
groupBy | Field to group by |
boundaries | Array of boundary values |
default | Bucket name for out-of-range documents |
output | Accumulation fields for grouped data |
2. Example
Group users by their age into predefined ranges:
Sample Data
db.users.insertMany([
{ name: "John", age: 20 },
{ name: "Alice", age: 25 },
{ name: "Bob", age: 35 },
{ name: "Charlie", age: 40 },
{ name: "David", age: 50 }
])
Query
db.users.aggregate([
{
$bucket: {
groupBy: "$age",
boundaries: [20, 30, 40, 50],
default: "Others",
output: {
totalUsers: { $sum: 1 }
}
}
}
])
Output
{
"_id": 20,
"totalUsers": 2
}
{
"_id": 30,
"totalUsers": 1
}
{
"_id": 40,
"totalUsers": 1
}
{
"_id": "Others",
"totalUsers": 1
}
3. Explanation
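groupBy: The age field is used to categorize documents.
boundaries: Documents are grouped into the ranges [20-30), [30-40), and [40-50). Each lower bound is inclusive and each upper bound is exclusive.
default: Documents outside these ranges are grouped under the "Others" bucket. Here that is David, whose age of 50 equals the exclusive upper bound.
output: The totalUsers field counts the documents in each bucket.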
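The boundary rule (lower bound inclusive, upper bound exclusive) can be checked with a short plain-JavaScript model. This is a sketch, not the aggregation engine; bucketId is a hypothetical helper, and the ages are the five values from the sample data.

```javascript
// Returns the lower bound of the range containing value, or the default bucket name.
function bucketId(value, boundaries, defaultBucket) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (value >= boundaries[i] && value < boundaries[i + 1]) return boundaries[i];
  }
  return defaultBucket;
}

const ages = [20, 25, 35, 40, 50];
const totals = {};
for (const age of ages) {
  const id = bucketId(age, [20, 30, 40, 50], "Others");
  totals[id] = (totals[id] ?? 0) + 1;
}
console.log(totals); // bucket 20: 2, bucket 30: 1, bucket 40: 1, Others: 1
```

Ages 20 and 25 land in the 20 bucket, 35 in the 30 bucket, 40 in the 40 bucket, and 50 falls outside every range because the last boundary is exclusive.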
4. Use Case
Age Group Classification
Price Ranges
Salary Brackets
Product Reviews Ratings
Conclusion
The $bucket operator is a powerful tool for classifying documents into custom ranges, making it easier to analyze grouped data efficiently. It is especially useful for statistical and analytical purposes in MongoDB.
$lookup : How to Join Collections in MongoDB
The $lookup operator in MongoDB is used in the aggregation pipeline to perform JOIN operations between two collections, similar to SQL joins.
1. What is $lookup?
The $lookup operator allows you to join documents from one collection with documents from another collection based on a specified field.
2. Syntax
{
$lookup: {
from: <foreignCollection>,
localField: <localField>,
foreignField: <foreignField>,
as: <newFieldName>
}
}
Field | Description |
from | The collection to join with |
localField | Field in the current collection |
foreignField | Field in the foreign collection |
as | Name of the new array field to store matched documents |
3. Example
Suppose we have two collections:
users Collection
db.users.insertMany([
{ _id: 1, name: "John", userId: 101 },
{ _id: 2, name: "Alice", userId: 102 }
])
orders Collection
db.orders.insertMany([
{ _id: 1, userId: 101, product: "Laptop" },
{ _id: 2, userId: 101, product: "Mouse" },
{ _id: 3, userId: 102, product: "Keyboard" }
])
Query to Join Collections
Join users with orders where users.userId matches orders.userId:
db.users.aggregate([
{
$lookup: {
from: "orders",
localField: "userId",
foreignField: "userId",
as: "userOrders"
}
}
])
Output
{
"_id": 1,
"name": "John",
"userId": 101,
"userOrders": [
{ "_id": 1, "userId": 101, "product": "Laptop" },
{ "_id": 2, "userId": 101, "product": "Mouse" }
]
}
{
"_id": 2,
"name": "Alice",
"userId": 102,
"userOrders": [
{ "_id": 3, "userId": 102, "product": "Keyboard" }
]
}
4. Explanation
from: The orders collection is the foreign collection.
localField: The userId field from the users collection.
foreignField: The userId field from the orders collection.
as: The result is stored in the userOrders array.
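The join performed by $lookup is a left outer join: every document from the local collection is kept, even when nothing matches. The plain-JavaScript sketch below models that with the sample data, plus one extra user (Bob, added here purely for illustration) who has no orders.

```javascript
const localUsers = [
  { _id: 1, name: "John", userId: 101 },
  { _id: 2, name: "Alice", userId: 102 },
  { _id: 3, name: "Bob", userId: 103 },   // hypothetical user with no orders
];
const foreignOrders = [
  { _id: 1, userId: 101, product: "Laptop" },
  { _id: 2, userId: 101, product: "Mouse" },
  { _id: 3, userId: 102, product: "Keyboard" },
];

// For each user, collect the orders whose userId matches (models localField/foreignField/as)
const joined = localUsers.map((user) => ({
  ...user,
  userOrders: foreignOrders.filter((order) => order.userId === user.userId),
}));

console.log(joined.map((u) => [u.name, u.userOrders.length])); // John 2, Alice 1, Bob 0
```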
5. Unwind the Result
If you want to display each joined document separately, use the $unwind operator:
db.users.aggregate([
{
$lookup: {
from: "orders",
localField: "userId",
foreignField: "userId",
as: "userOrders"
}
},
{ $unwind: "$userOrders" }
])
6. Multiple Joins
You can use multiple $lookup stages to join multiple collections.
7. Limitations
$lookup only works with collections in the same database.
Performance may decrease for large datasets.
Only left outer joins are supported.
Conclusion
The $lookup operator is a powerful tool for performing join operations in MongoDB, making it easier to combine data from multiple collections efficiently.
$project in MongoDB
The $project operator in MongoDB is used in the aggregation pipeline to include, exclude, or transform fields in the documents.
1. What is $project?
The $project operator reshapes documents by selecting specific fields, adding computed fields, or excluding fields from the result set.
2. Syntax
{
$project: {
<field1>: <value>,
<field2>: <value>,
...
}
}
Value | Description |
1 | Include field |
0 | Exclude field |
<expression> | Add or compute fields |
3. Example: Include Specific Fields
Select only name and age fields from the users collection:
Sample Data
db.users.insertMany([
{ name: "John", age: 25, city: "New York" },
{ name: "Alice", age: 30, city: "London" }
])
Query
db.users.aggregate([
{
$project: {
name: 1,
age: 1,
_id: 0
}
}
])
Output
{
"name": "John",
"age": 25
}
{
"name": "Alice",
"age": 30
}
4. Exclude Fields
Exclude the city field:
db.users.aggregate([
{
$project: {
city: 0
}
}
])
5. Add Computed Fields
Calculate ageInMonths field (_id is excluded so that the result contains only the two listed fields):
db.users.aggregate([
  {
    $project: {
      name: 1,
      ageInMonths: { $multiply: [ "$age", 12 ] },
      _id: 0
    }
  }
])
Output
{
"name": "John",
"ageInMonths": 300
}
{
"name": "Alice",
"ageInMonths": 360
}
6. Rename Fields
Rename name to userName:
db.users.aggregate([
{
$project: {
userName: "$name",
age: 1
}
}
])
7. Conditional Fields
Use $cond to conditionally modify fields:
db.users.aggregate([
{
$project: {
name: 1,
isAdult: { $cond: { if: { $gte: [ "$age", 18 ] }, then: "Yes", else: "No" } }
}
}
])
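Conceptually, $project maps every input document to a new shape. The plain-JavaScript sketch below models the computed, renamed, and conditional fields from the examples above using Array.prototype.map; it runs on the sample data in memory and is not a MongoDB call.

```javascript
const projectInput = [
  { name: "John", age: 25, city: "New York" },
  { name: "Alice", age: 30, city: "London" },
];

const projected = projectInput.map((user) => ({
  userName: user.name,                    // rename: userName: "$name"
  ageInMonths: user.age * 12,             // computed: { $multiply: ["$age", 12] }
  isAdult: user.age >= 18 ? "Yes" : "No", // conditional: $cond with $gte
}));

console.log(projected); // city is dropped because it was not listed
```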
8. Exclude _id Field
To exclude _id, set it explicitly to 0:
db.users.aggregate([
{
$project: {
name: 1,
_id: 0
}
}
])
9. Combining $project with Other Stages
You can combine $project with $match, $sort, and other stages.
Example
db.users.aggregate([
{ $match: { age: { $gt: 25 } } },
{ $project: { name: 1, age: 1, _id: 0 } }
])
Conclusion
The $project operator is a flexible tool to control the shape of documents, making it easier to select, exclude, or transform data in MongoDB aggregation pipelines.
Capped Collection in MongoDB
A Capped Collection in MongoDB is a fixed-size collection that automatically overwrites the oldest documents when it reaches its maximum size.
1. What is a Capped Collection?
Capped collections maintain insertion order.
It works like a circular queue.
Once the collection reaches its maximum size or document count, the oldest documents are automatically deleted to make room for new documents.
Ideal for logging and caching purposes.
2. Create Capped Collection
To create a capped collection, use the createCollection() method with the capped option.
Syntax
db.createCollection("logs", {
  capped: true,
  size: 1024, // Size in bytes
  max: 5      // Optional: Maximum number of documents
})
Explanation
Field | Description |
capped | Set to true to enable capped collection |
size | Maximum size of the collection in bytes |
max | (Optional) Maximum number of documents |
3. Insert Data
db.logs.insertMany([
{ message: "Log 1" },
{ message: "Log 2" },
{ message: "Log 3" }
])
4. Verify Capped Collection
To check if a collection is capped:
db.logs.isCapped()
Output:
true
5. Automatic Deletion
If the collection reaches its size limit or maximum document limit, MongoDB automatically removes the oldest documents.
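The circular-queue behaviour can be modelled with a tiny plain-JavaScript class. This is a sketch only: the class CappedLog is hypothetical and enforces just the document-count limit (max), not the byte-size limit that a real capped collection also applies.

```javascript
// Model of a capped collection limited by document count.
class CappedLog {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }
  insert(doc) {
    this.docs.push(doc);                  // insertion order is preserved
    if (this.docs.length > this.max) {
      this.docs.shift();                  // the oldest document is dropped
    }
  }
}

const logs = new CappedLog(5);
for (let i = 1; i <= 7; i++) {
  logs.insert({ message: `Log ${i}` });
}
console.log(logs.docs.map((d) => d.message)); // Log 3 through Log 7
```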
6. Converting to Capped Collection
You can convert an existing collection to capped using:
db.runCommand({ convertToCapped: "logs", size: 1024 })
7. Restrictions
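Before MongoDB 5.0, documents cannot be removed from a capped collection with delete commands; to clear it, drop and recreate the collection.
Document updates cannot increase document size.
Capped collections cannot be sharded. Secondary indexes are allowed, and the _id index is created by default.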
8. Use Cases
Log Files
Sensor Data
Temporary Data Storage
Conclusion
Capped collections in MongoDB are perfect for scenarios where you need a fixed amount of storage and automatic data rotation, making them ideal for logging and caching purposes.
The Complete Guide to Authentication ( RBAC )
MongoDB uses Role-Based Access Control (RBAC) to manage user authentication and authorization. This method assigns users specific roles that determine their level of access to databases and collections.
1. What is RBAC?
Role-Based Access Control (RBAC) is a security model that restricts system access based on predefined roles assigned to users. MongoDB uses this model to protect data and control access to resources.
2. How Authentication Works in MongoDB
Authentication verifies the identity of users before granting access to the MongoDB server.
Steps:
Client connects to MongoDB.
User provides username and password.
MongoDB verifies credentials.
If authenticated, MongoDB grants the privileges of the roles assigned to that user.
3. Enable Authentication in MongoDB
To enable authentication, follow these steps:
Edit the mongod.cfg file.
Add the following lines:
security:
  authorization: enabled
- Restart MongoDB Service:
net stop MongoDB
net start MongoDB
4. Create Admin User
To create the first admin user:
Switch to Admin Database
use admin
Create User
db.createUser({
user: "admin",
pwd: "admin123",
roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
})
Authenticate User
db.auth("admin", "admin123")
5. Create Regular User with Roles
Create a user with read-only access:
db.createUser({
user: "readonlyUser",
pwd: "readonly123",
roles: [ { role: "read", db: "mydb" } ]
})
6. Built-in Roles in MongoDB
Role | Description |
read | Allows read-only access |
readWrite | Allows read and write operations |
dbAdmin | Database administration rights |
userAdmin | User administration rights |
clusterAdmin | Administer the cluster |
readAnyDatabase | Read access to all databases |
dbOwner | Full control over a database |
7. List All Users
To list all users in the current database:
db.getUsers()
8. Delete User
To delete a user:
db.dropUser("readonlyUser")
9. Update User
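To update user roles (the roles array you pass replaces the user's existing roles; use db.grantRolesToUser() to add roles without removing the current ones):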
db.updateUser("readonlyUser", { roles: [ { role: "readWrite", db: "mydb" } ] })
10. Authentication Methods
MongoDB supports multiple authentication methods:
SCRAM (Default)
LDAP (MongoDB Enterprise)
Kerberos (MongoDB Enterprise)
X.509 Certificates
11. Testing Authentication
To test user authentication with mongosh (the legacy mongo shell, removed in MongoDB 6.0, accepts the same options):
mongosh -u admin -p admin123 --authenticationDatabase admin
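Applications usually pass the same credentials through a connection string instead of command-line flags. The sketch below shows one way to build such a string in JavaScript; buildMongoUri and the sample password are hypothetical, while authSource is the connection-string counterpart of --authenticationDatabase.

```javascript
// Builds a MongoDB connection string with credentials.
// Reserved characters in the username or password must be percent-encoded.
function buildMongoUri({ user, password, host, port, authSource }) {
  const u = encodeURIComponent(user);
  const p = encodeURIComponent(password);
  return `mongodb://${u}:${p}@${host}:${port}/?authSource=${authSource}`;
}

const uri = buildMongoUri({
  user: "admin",
  password: "p@ss:word/1", // hypothetical password containing reserved characters
  host: "localhost",
  port: 27017,
  authSource: "admin",
});

console.log(uri);
// mongodb://admin:p%40ss%3Aword%2F1@localhost:27017/?authSource=admin
```

Encoding matters because characters such as @, : and / would otherwise be read as URI delimiters.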
Conclusion
RBAC in MongoDB provides a flexible and secure way to control data access. By assigning appropriate roles, you can ensure that users only have the permissions necessary for their tasks.
MongoDB Replication & Sharding
MongoDB uses Replication and Sharding to ensure data availability, scalability, and fault tolerance in distributed systems.
1. What is Replication?
Replication is the process of synchronizing data across multiple servers to provide redundancy and high availability.
Key Features:
Automatic Failover
Data Redundancy
Increased Read Capacity
2. Replica Set Architecture
A Replica Set is a group of MongoDB servers where one server acts as the Primary node, and others are Secondary nodes.
Replica Set Components:
Component | Description |
Primary | Handles all write operations |
Secondary | Replicates data from the Primary and can serve reads when the read preference allows it |
Arbiter | Participates in elections but doesn't store data |
3. Create a Replica Set
- Start MongoDB instances with the --replSet option:
mongod --port 27017 --dbpath /data/db1 --replSet rs0
mongod --port 27018 --dbpath /data/db2 --replSet rs0
mongod --port 27019 --dbpath /data/db3 --replSet rs0
- Initiate the Replica Set:
rs.initiate()
- Add Secondary Nodes:
rs.add("localhost:27018")
rs.add("localhost:27019")
- Check Replica Set Status:
rs.status()
4. Read Preference
By default, all read operations are directed to the Primary node. However, you can configure read preferences to direct reads to secondary nodes.
Example:
db.getMongo().setReadPref("secondary")
5. What is Sharding?
Sharding is the method of distributing large datasets across multiple servers to achieve horizontal scalability.
Why Use Sharding?
Handle large data volumes
High throughput queries
Balanced workload distribution
6. Sharding Architecture
Component | Description |
Shard | Stores actual data |
Config Server | Stores metadata of the cluster |
Mongos | Acts as a query router |
7. Enable Sharding
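- Start the Config Server (config servers must run as a replica set; configRS is an example name, and the set must be initiated with rs.initiate() after startup):
mongod --configsvr --replSet configRS --port 27020 --dbpath /data/configdb
- Start Shards (each shard is also a replica set that must be initiated with rs.initiate(); shard1RS and shard2RS are example names):
mongod --shardsvr --replSet shard1RS --port 27021 --dbpath /data/shard1
mongod --shardsvr --replSet shard2RS --port 27022 --dbpath /data/shard2
- Start Mongos Router:
mongos --configdb configRS/localhost:27020
- Connect to Mongos and Add the Shards:
sh.addShard("shard1RS/localhost:27021")
sh.addShard("shard2RS/localhost:27022")
- Enable Sharding for the Database:
sh.enableSharding("mydb")
- Shard Collection:
sh.shardCollection("mydb.mycollection", { key: 1 })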
8. Shard Key Selection
Choosing the correct Shard Key is crucial for performance.
High Cardinality: Many distinct values
Low Frequency: No single value dominates, which avoids hot spots
Even Distribution: Balance data across shards
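The effect of cardinality can be seen in a small simulation. This is a rough sketch in plain JavaScript, not how MongoDB actually places chunks: toyHash is a stand-in hash invented for the example, and the key names status and userId are hypothetical.

```javascript
// Toy hash: NOT MongoDB's hashed-index function, just a stand-in for this sketch.
function toyHash(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit integer
  }
  return h;
}

// Counts how many documents would land on each of `shardCount` shards.
function distribute(keys, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const key of keys) {
    counts[toyHash(key) % shardCount] += 1;
  }
  return counts;
}

const shardCount = 4;

// Low cardinality: only two distinct values, so at most two shards receive data.
const statusKeys = Array.from({ length: 1000 }, (_, i) =>
  i % 2 === 0 ? "active" : "inactive"
);

// High cardinality: every document has its own key value.
const userIdKeys = Array.from({ length: 1000 }, (_, i) => `user${i}`);

console.log("status:", distribute(statusKeys, shardCount));
console.log("userId:", distribute(userIdKeys, shardCount));
```

With only two possible key values, half of the shards in this model stay empty no matter how many documents are inserted, which is the hot-spot problem the guidelines above warn about.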
9. Monitor Sharded Cluster
To check the cluster status:
sh.status()
Conclusion
Replication and Sharding are critical features of MongoDB that provide high availability, fault tolerance, and scalability. Understanding these concepts is essential for building distributed and reliable systems.
Replicate MongoDB Database Like a Pro
Replication in MongoDB provides redundancy and high availability by maintaining multiple copies of the same data across different servers. This guide walks through advanced replication concepts and best practices.
1. What is MongoDB Replication?
Replication is a method of copying data from one MongoDB server (Primary) to one or more servers (Secondaries). If the Primary server fails, one of the Secondaries is elected as the new Primary.
2. Why Use Replication?
High Availability
Fault Tolerance
Data Redundancy
Increased Read Capacity
3. Replica Set Architecture
Role | Description |
Primary | Handles write operations |
Secondary | Synchronizes data from the Primary and can serve read queries when the read preference allows it |
Arbiter | Participates in elections but does not store data |
4. Setting Up Replica Set
Step 1: Start MongoDB Instances
mongod --port 27017 --dbpath /data/db1 --replSet myReplicaSet
mongod --port 27018 --dbpath /data/db2 --replSet myReplicaSet
mongod --port 27019 --dbpath /data/db3 --replSet myReplicaSet
Step 2: Initiate Replica Set
rs.initiate({
_id: "myReplicaSet",
members: [
{ _id: 0, host: "localhost:27017" },
{ _id: 1, host: "localhost:27018" },
{ _id: 2, host: "localhost:27019" }
]
})
Step 3: Verify Replica Set
rs.status()
5. Priority-Based Election
You can configure which server is more likely to become Primary by setting priority:
// Start from the current configuration so no other settings are lost
cfg = rs.conf()
cfg.members[0].priority = 2
cfg.members[1].priority = 1
cfg.members[2].priority = 0.5
rs.reconfig(cfg)
6. Read Preferences
Control how clients read data from Replica Sets.
Read Preference | Description |
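primary | Default; reads only from the Primary |
primaryPreferred | Reads from the Primary, falling back to a Secondary if the Primary is unavailable |
secondary | Reads only from Secondaries |
secondaryPreferred | Reads from Secondaries, falling back to the Primary if no Secondary is available |
nearest | Reads from the member with the lowest network latency, Primary or Secondary |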
Example:
db.getMongo().setReadPref("secondary")
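As a rough illustration of what each mode means, the sketch below models which members may serve a read under the five standard read preference modes. It is a simplification written for this article: the members array and eligibleHosts helper are hypothetical, and real drivers also consider latency windows, tag sets, and staleness limits.

```javascript
// Hypothetical snapshot of a three-member replica set.
const members = [
  { host: "localhost:27017", state: "PRIMARY" },
  { host: "localhost:27018", state: "SECONDARY" },
  { host: "localhost:27019", state: "SECONDARY" },
];

// Returns the hosts eligible to serve a read for the given mode.
// Simplified: ignores latency windows, tag sets, and maxStalenessSeconds.
function eligibleHosts(memberList, mode) {
  const primaries = memberList.filter((m) => m.state === "PRIMARY");
  const secondaries = memberList.filter((m) => m.state === "SECONDARY");
  const hosts = (list) => list.map((m) => m.host);
  switch (mode) {
    case "primary":
      return hosts(primaries);
    case "primaryPreferred":
      return hosts(primaries.length ? primaries : secondaries);
    case "secondary":
      return hosts(secondaries);
    case "secondaryPreferred":
      return hosts(secondaries.length ? secondaries : primaries);
    case "nearest":
      return hosts([...primaries, ...secondaries]);
    default:
      throw new Error(`Unknown read preference mode: ${mode}`);
  }
}

console.log(eligibleHosts(members, "primary"));   // [ 'localhost:27017' ]
console.log(eligibleHosts(members, "secondary")); // [ 'localhost:27018', 'localhost:27019' ]
```

The "Preferred" variants differ from the strict modes only in what happens when the preferred kind of member is missing, which is why they are the safer choice when availability matters more than consistency.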
7. Delayed Members
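Configure delayed nodes to maintain a backup of older data (the option is named secondaryDelaySecs in MongoDB 5.0 and later; earlier versions call it slaveDelay):
rs.add({ host: "localhost:27020", priority: 0, hidden: true, secondaryDelaySecs: 3600 })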
8. Hidden Members
Hidden members replicate data but are invisible to client applications. A hidden member must have priority 0 so it can never become Primary:
rs.add({ host: "localhost:27021", priority: 0, hidden: true })
9. Arbiter Configuration
To add an Arbiter:
rs.addArb("localhost:27022")
10. Replica Set Failover Testing
To simulate Primary failure:
Shut down the Primary node.
Use rs.status() to check the new Primary.
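Failover only succeeds if a majority of voting members is still reachable. The arithmetic is easy to sketch in JavaScript; the helper names majority and faultTolerance are made up for this example.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be down while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(`${n} members: majority=${majority(n)}, tolerance=${faultTolerance(n)}`);
}
// 3 members: majority=2, tolerance=1
// 4 members: majority=3, tolerance=1
// 5 members: majority=3, tolerance=2
```

Adding a fourth member does not improve fault tolerance over three, which is why replica sets are normally deployed with an odd number of voting members.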
Conclusion
Mastering MongoDB replication requires understanding replica set architecture, failover mechanisms, and advanced configurations. With the right setup, your MongoDB database will be highly available and fault-tolerant.
Transactions in MongoDB: Complete Walkthrough
MongoDB supports Multi-Document Transactions to ensure Atomicity, Consistency, Isolation, and Durability (ACID) properties across multiple documents and collections.
1. What is a Transaction?
A Transaction is a sequence of database operations that must execute entirely or not at all. If any operation fails, the whole transaction is rolled back.
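The all-or-nothing behaviour can be sketched without a database at all. The example below is a conceptual model in plain JavaScript, not MongoDB's implementation: runAtomically, the accounts object, and the transfer helper are hypothetical names used only for illustration.

```javascript
// Applies every operation to a working copy and keeps the copy only if
// all operations succeed; otherwise the original state is returned untouched.
function runAtomically(state, operations) {
  const working = JSON.parse(JSON.stringify(state)); // deep copy of plain data
  try {
    for (const op of operations) op(working);
    return { committed: true, state: working };
  } catch (err) {
    return { committed: false, state, error: err.message };
  }
}

const accounts = { alice: 100, bob: 50 };

// Hypothetical transfer: credit the receiver first, then debit the sender.
const transfer = (from, to, amount) => [
  (s) => { s[to] += amount; },
  (s) => {
    if (s[from] < amount) throw new Error("insufficient funds");
    s[from] -= amount;
  },
];

const ok = runAtomically(accounts, transfer("alice", "bob", 30));
console.log(ok.committed, ok.state);         // true { alice: 70, bob: 80 }

const failed = runAtomically(accounts, transfer("alice", "bob", 500));
console.log(failed.committed, failed.state); // false { alice: 100, bob: 50 }
```

In the failing case the credit to bob had already been applied to the working copy, yet it never becomes visible, which is exactly what a rollback guarantees.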
2. Why Use Transactions?
Maintain Data Integrity
Perform Consistent Multi-Document Updates
Group Multiple Operations into One Unit
Rollback on Failure
3. How Transactions Work in MongoDB
MongoDB transactions follow these rules:
Transactions are supported on Replica Sets (MongoDB 4.0 and later) and Sharded Clusters (MongoDB 4.2 and later).
They work only with WiredTiger Storage Engine.
Transactions can span across multiple documents and collections.
4. Start a Transaction
To begin a transaction, start a session with startSession() and call startTransaction() on it:
const session = db.getMongo().startSession()
session.startTransaction()
5. Commit a Transaction
Use the commitTransaction() method to save all changes. In the shell, operations are part of the transaction only when they run through collections obtained from the session (mydb is an example database name):
const mydb = session.getDatabase("mydb")
try {
  mydb.collection1.insertOne({ name: "John" })
  mydb.collection2.updateOne({ name: "John" }, { $set: { age: 30 } })
  session.commitTransaction()
  print("Transaction Committed")
} catch (e) {
  print("Transaction Failed: " + e)
  session.abortTransaction()
} finally {
  session.endSession()
}
6. Abort a Transaction
To discard changes:
session.abortTransaction()
print("Transaction Aborted")
7. Transactions in Sharded Clusters
To use transactions in Sharded Clusters:
Run MongoDB 4.2 or later and connect through mongos.
Shard the collections as usual; a single transaction can read and write documents on multiple shards.
Example:
sh.enableSharding("mydb")
sh.shardCollection("mydb.users", {"userId": 1})
8. Retryable Writes
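MongoDB drivers can automatically retry certain write operations once when they fail due to a transient network error or a replica set election.
Retryable writes are a client-side setting (enabled by default in mongosh and current drivers), so they are controlled through the connection string rather than a mongod option:
mongosh "mongodb://localhost:27017/?replicaSet=rs0&retryWrites=true"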
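The retry-once idea can be sketched in a few lines of JavaScript. This is only a conceptual model: NetworkError, withSingleRetry, and flakyInsert are hypothetical, and real drivers additionally tag each write so the server can detect and ignore a duplicate.

```javascript
// Marks an error as a transient network failure (hypothetical class for this sketch).
class NetworkError extends Error {}

// Runs `operation`; if it fails with a NetworkError, tries exactly once more.
// Any other error, or a second failure, is passed on to the caller.
function withSingleRetry(operation) {
  try {
    return operation();
  } catch (err) {
    if (err instanceof NetworkError) return operation();
    throw err;
  }
}

// Simulated write that fails on the first attempt only.
let attempts = 0;
function flakyInsert() {
  attempts += 1;
  if (attempts === 1) throw new NetworkError("connection reset");
  return { acknowledged: true };
}

const result = withSingleRetry(flakyInsert);
console.log(result, "attempts:", attempts); // { acknowledged: true } attempts: 2
```

Limiting the retry to a single attempt keeps a persistent outage from turning into an endless loop; the application still sees the error if the second try fails.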
9. Limitations of Transactions
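Maximum Transaction Time: 60 seconds by default (configurable with transactionLifetimeLimitSeconds)
DDL is restricted: before MongoDB 4.4, collections and indexes cannot be created inside a transaction
Writes to Capped Collections are not supported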
10. Check Transaction Status
To monitor transactions:
db.currentOp({ "transaction": { $exists: true } })
Conclusion
MongoDB transactions provide robust ACID guarantees, making it easier to perform complex multi-document operations. However, they come with certain limitations that developers must consider when designing applications.
Mastering Date Queries in MongoDB
MongoDB provides extensive support for Date Queries using its ISODate format and various operators. This guide explains how to effectively perform date-based queries in MongoDB.
1. MongoDB Date Data Type
MongoDB stores dates as the BSON Date type, a 64-bit integer counting milliseconds since the Unix epoch (January 1, 1970, UTC). The shell displays these values as ISODate(...).
Example:
db.orders.insertOne({ orderId: 101, orderDate: new Date("2025-03-06") })
To display the current date:
new Date()
ISODate()
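Since a stored date is just a count of milliseconds, the same value can be inspected with a plain JavaScript Date. The short sketch below assumes nothing beyond the standard Date object; the variable names are illustrative.

```javascript
// A date-only ISO string is parsed as midnight UTC.
const orderDate = new Date("2025-03-06");

// Milliseconds since 1970-01-01T00:00:00Z, the number a BSON Date stores.
const millis = orderDate.getTime();
console.log(millis); // 1741219200000

// The same instant rebuilt from the raw number.
console.log(new Date(millis).toISOString()); // 2025-03-06T00:00:00.000Z
```

Because the stored value is always UTC, any conversion to a local time zone has to happen in the application or in the query itself.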
2. Date Operators
MongoDB provides several operators to work with dates.
Operator | Description |
$eq | Matches exact date |
$ne | Not equal to date |
$lt | Less than date |
$lte | Less than or equal |
$gt | Greater than date |
$gte | Greater than or equal |
$in | Matches any date in the array |
$nin | Does not match any date in the array |
3. Querying Dates
3.1 Find Documents Before a Certain Date
db.orders.find({ orderDate: { $lt: ISODate("2025-03-06") } })
3.2 Find Documents After a Certain Date
db.orders.find({ orderDate: { $gt: ISODate("2025-03-06") } })
3.3 Find Documents Between Two Dates
db.orders.find({ orderDate: { $gte: ISODate("2025-03-01"), $lt: ISODate("2025-03-06") } })
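The $gte and $lt pair forms a half-open range: the start date is included and the end date is excluded. A minimal JavaScript sketch of the same comparison, using a hypothetical inRange helper and made-up sample orders, shows which boundary values match.

```javascript
// Mirrors { $gte: start, $lt: end }: start is included, end is excluded.
function inRange(date, start, end) {
  return date >= start && date < end;
}

const rangeOrders = [
  { orderId: 1, orderDate: new Date("2025-02-28T23:59:59Z") },
  { orderId: 2, orderDate: new Date("2025-03-01T00:00:00Z") },
  { orderId: 3, orderDate: new Date("2025-03-05T18:30:00Z") },
  { orderId: 4, orderDate: new Date("2025-03-06T00:00:00Z") },
];

const rangeStart = new Date("2025-03-01");
const rangeEnd = new Date("2025-03-06");

const matched = rangeOrders
  .filter((o) => inRange(o.orderDate, rangeStart, rangeEnd))
  .map((o) => o.orderId);

console.log(matched); // [ 2, 3 ]
```

Half-open ranges are convenient for consecutive periods: using the end of one range as the start of the next never counts a document twice.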
4. Date Projection
Project only date fields in the result set:
db.orders.find({}, { orderDate: 1, _id: 0 })
5. Aggregation with Dates
MongoDB Aggregation Pipeline provides additional functionality for date manipulation.
5.1 Filter by Date
db.orders.aggregate([
{ $match: { orderDate: { $gte: ISODate("2025-03-01"), $lt: ISODate("2025-03-06") } } }
])
5.2 Group by Year and Month
db.orders.aggregate([
{ $group: { _id: { year: { $year: "$orderDate" }, month: { $month: "$orderDate" } }, totalOrders: { $sum: 1 } } }
])
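To see what the $group stage computes, the same grouping can be approximated in plain JavaScript. This is only an illustration of the logic with hypothetical sample data and a made-up countByYearMonth helper; like $year and $month without a timezone option, it works in UTC.

```javascript
const sampleOrders = [
  { orderId: 1, orderDate: new Date("2025-02-10T09:00:00Z") },
  { orderId: 2, orderDate: new Date("2025-03-01T12:00:00Z") },
  { orderId: 3, orderDate: new Date("2025-03-15T08:30:00Z") },
  { orderId: 4, orderDate: new Date("2024-03-20T17:45:00Z") },
];

// Counts documents per (year, month) pair, keyed as "YYYY-M".
function countByYearMonth(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const year = doc.orderDate.getUTCFullYear();
    const month = doc.orderDate.getUTCMonth() + 1; // $month is 1-based, getUTCMonth is 0-based
    const key = `${year}-${month}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const totals = countByYearMonth(sampleOrders);
console.log(totals.get("2025-3")); // 2
```

Grouping on both year and month keeps March 2024 and March 2025 apart, which grouping on month alone would not.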
6. Date Functions
Function | Description |
$dateToString | Converts date to string |
$year | Extracts Year |
$month | Extracts Month |
$dayOfMonth | Extracts Day |
$hour | Extracts Hour |
$minute | Extracts Minute |
$second | Extracts Second |
Example:
db.orders.aggregate([
{ $project: { orderDate: { $dateToString: { format: "%Y-%m-%d", date: "$orderDate" } } } }
])
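For comparison, the "%Y-%m-%d" format used above corresponds to the first ten characters of an ISO 8601 timestamp. The JavaScript sketch below relies on that; toYmd is a hypothetical helper name, and the result is in UTC, matching the default behaviour of $dateToString.

```javascript
// Formats a Date as YYYY-MM-DD in UTC, like $dateToString with "%Y-%m-%d".
function toYmd(date) {
  return date.toISOString().slice(0, 10);
}

console.log(toYmd(new Date("2025-03-06T15:45:00Z"))); // 2025-03-06
```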
7. Update Dates
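Set a date field with $set and new Date(), or let the server fill in the current time with the $currentDate operator:
db.orders.updateOne({ orderId: 101 }, { $set: { shippedDate: new Date() } })
db.orders.updateOne({ orderId: 101 }, { $currentDate: { shippedDate: true } })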
8. Delete Documents by Date
Delete documents before a certain date:
db.orders.deleteMany({ orderDate: { $lt: ISODate("2025-03-01") } })
9. Indexing Dates
Improve date query performance by creating indexes:
db.orders.createIndex({ orderDate: 1 })
Conclusion
Mastering date queries in MongoDB helps optimize search performance and ensures accurate data retrieval. Use the correct operators, functions, and indexes to manage time-sensitive data efficiently.
Managed & Unmanaged Database
Understanding Managed and Unmanaged Databases is crucial for selecting the right database solution based on your project requirements.
1. What is a Managed Database?
A Managed Database is a cloud-based database service where the service provider handles all administrative tasks such as setup, maintenance, backups, and security.
Key Features:
Automatic Backups
Scalability
High Availability
Security Patches
Performance Monitoring
Examples:
MongoDB Atlas
AWS RDS
Google Cloud Firestore
Azure Cosmos DB
Advantages:
Minimal Maintenance Effort
High Uptime Guarantee
Automatic Scaling
Expert Support
Disadvantages:
Higher Cost
Limited Customization
Vendor Lock-in
2. What is an Unmanaged Database?
An Unmanaged Database is a self-hosted database where the user is responsible for installation, configuration, maintenance, and security.
Key Features:
Full Control over Configuration
Manual Backups
Cost-Effective
Custom Security Policies
Examples:
MongoDB Community Edition
MySQL Self-Hosted
PostgreSQL Self-Hosted
Advantages:
Complete Customization
Cost-Efficient
No Vendor Lock-in
Disadvantages:
High Maintenance Effort
Requires Technical Knowledge
Manual Scaling
3. Key Differences between Managed & Unmanaged Database
Features | Managed Database | Unmanaged Database |
Maintenance | Automatic | Manual |
Backup | Automated | Manual |
Scalability | Automatic | Manual |
Cost | Expensive | Cost-Effective |
Security | Provider Managed | User Managed |
Customization | Limited | Full Control |
4. Which One to Choose?
Use Case | Recommended Database |
Small Projects | Unmanaged Database |
High Traffic Apps | Managed Database |
Budget Constraints | Unmanaged Database |
Security Priority | Managed Database |
Custom Configuration | Unmanaged Database |
Conclusion
Choosing between Managed and Unmanaged Databases depends on the project's size, budget, technical expertise, and performance requirements. Managed databases are ideal for hassle-free operations, while unmanaged databases provide more control and flexibility.
Written by Arijit Das
