GraphLookup in MongoDB
$graphLookup is MongoDB's way of doing recursion. It is an aggregation stage that runs a recursive search: starting from a value in each input document, it repeatedly follows a field to matching documents, either within a single collection or across collections.
Syntax:
{
  $graphLookup: {
    from: <collection name>,
    startWith: <expression>,
    connectFromField: <field name>,
    connectToField: <field name>,
    as: <new field name to store the recursion result>,
    maxDepth: <maximum recursion depth>, // Optional
    depthField: <string>, // Optional
    restrictSearchWithMatch: <document> // Optional
  }
}
Parameter explanation:
from: The collection in which to perform the graph lookup, i.e. the recursive search.
startWith: An expression, usually a field of the input document such as ‘$senior_managers’, whose value is where the recursion starts for each input document. Its value is matched against the ‘connectToField’ of the documents in the ‘from’ collection.
connectFromField: The field of each matched document in the ‘from’ collection whose value drives the next step. $graphLookup takes that value and matches it against the connectToField of the other documents in the ‘from’ collection, and so the recursion goes on.
connectToField: The field in the ‘from’ collection that the startWith and connectFromField values are matched against.
as: A new field name that stores the result for each input document as an array.
maxDepth: The maximum depth of the recursion. It is an optional parameter.
depthField: A new field name that stores, in each returned document, the depth at which it was found. The depth value starts at zero. It is an optional parameter.
restrictSearchWithMatch: A query document with additional conditions that documents must satisfy to be included in the recursive search. It is an optional parameter.
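To make the interplay of these parameters more concrete, here is a minimal in-memory sketch of the traversal in plain JavaScript. It is only a simplified model of the idea and not how MongoDB implements the stage. The function name graphLookupSketch, the option handling, and the sample ids are made up for illustration, restrictSearchWithMatch is left out, and ids are assumed to be plain strings or numbers.

```javascript
// Simplified model of a $graphLookup-style traversal (an illustration only,
// NOT MongoDB's implementation). All names here are hypothetical.
function graphLookupSketch(fromDocs, startWith, options) {
  const { connectFromField, connectToField, maxDepth = Infinity, depthField } = options;
  // Treat a missing value as "nothing to follow" and a scalar as a one-element list.
  const toArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);

  const visited = new Set(); // documents already returned, so cycles terminate
  const results = [];
  let frontier = toArray(startWith); // values to match against connectToField
  let depth = 0;

  while (frontier.length > 0 && depth <= maxDepth) {
    const next = [];
    for (const doc of fromDocs) {
      if (visited.has(doc)) continue;
      if (!frontier.includes(doc[connectToField])) continue;
      visited.add(doc);
      // Record the depth only when a depthField name was given.
      results.push(depthField ? { ...doc, [depthField]: depth } : { ...doc });
      // The matched document's connectFromField feeds the next round.
      next.push(...toArray(doc[connectFromField]));
    }
    frontier = next;
    depth += 1;
  }
  return results;
}

// Hypothetical sample: 1 reports to 2, 2 reports to 3, 4 is unrelated to the search.
const sampleManagers = [
  { _id: 1, senior_managers: [2] },
  { _id: 2, senior_managers: [3] },
  { _id: 3, senior_managers: [] },
  { _id: 4, senior_managers: [1] }
];

const found = graphLookupSketch(sampleManagers, sampleManagers[0].senior_managers, {
  connectFromField: 'senior_managers',
  connectToField: '_id',
  depthField: 'depth'
});
console.log(found.map((d) => [d._id, d.depth])); // document 2 at depth 0, document 3 at depth 1
```

Passing maxDepth: 0 to the same call would stop after the first round and return only document 2.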
Considerations:
Sharded Collections:
- Sharding is a method of distributing data across multiple machines when the data set is huge. Starting in MongoDB 5.1, the ‘from’ parameter of $graphLookup can also name a sharded collection.
Max Depth:
- The maxDepth parameter of the $graphLookup stage sets the maximum depth of recursion for the query. Setting it to 0 means no recursion at all: only the documents that directly match the startWith value are returned.
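As a hedged illustration of maxDepth, the sketch below builds the same kind of stage twice, once with maxDepth: 0 and once with maxDepth: 1, and adds a depthField so that each returned document records the depth it was found at. The helper name seniorManagersStage and the field name ‘level’ are assumptions chosen for this example; the ‘managers’ collection and the ‘senior_managers’ field are the ones used in the demonstration further down.

```javascript
// Builds a $graphLookup stage over the 'managers' collection.
// `seniorManagersStage` and the 'level' field name are hypothetical.
function seniorManagersStage(maxDepth) {
  return {
    $graphLookup: {
      from: 'managers',
      startWith: '$senior_managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers',
      maxDepth: maxDepth,   // 0 = direct senior managers only
      depthField: 'level'   // each result document gets level: 0, 1, ...
    }
  };
}

// Only the direct senior managers, no recursion.
const directOnly = [seniorManagersStage(0)];
// Direct senior managers plus their senior managers.
const twoLevels = [seniorManagersStage(1)];

// In mongosh these would be run as, for example:
//   db.managers.aggregate(directOnly)
//   db.managers.aggregate(twoLevels)
```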
Memory:
The $graphLookup stage must stay within the 100-megabyte memory limit.
The aggregate() operation takes an allowDiskUse option as its input.
If it is { allowDiskUse: true }, a pipeline stage that requires more than 100 megabytes of memory is allowed to write temporary files to disk.
If it is { allowDiskUse: false }, a pipeline stage that requires more than 100 megabytes of memory raises an error.
However, $graphLookup ignores the option even if the aggregate() operation has { allowDiskUse: true }. It only takes effect for the other stages in the pipeline.
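A small sketch of how the option is passed may help here. It assumes a collection handle with an aggregate(pipeline, options) method, as in mongosh or the Node.js driver; the wrapper name runWithDiskUse is hypothetical. Following the behaviour described above, the option would matter for the $sort stage but not for the $graphLookup stage.

```javascript
// Pipeline with a $graphLookup stage followed by a $sort stage.
const pipeline = [
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers'
    }
  },
  // A later stage that may benefit from allowDiskUse.
  { $sort: { _id: 1 } }
];

// Hypothetical wrapper: `collection` is any object exposing
// aggregate(pipeline, options), e.g. db.employees in mongosh.
function runWithDiskUse(collection) {
  return collection.aggregate(pipeline, { allowDiskUse: true });
}

// In mongosh: runWithDiskUse(db.employees)
```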
Views and Collation:
If an aggregation involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.
Collation lets users specify language-specific rules for string comparison, such as rules for letter case and accent marks.
Demonstration:
Consider two collections - employees and managers.
Schema diagram:
In the employees collection, the managers field lists the managers associated with a particular employee. It contains an array of manager ids, i.e. _id values from the ‘managers’ collection.
In the managers collection, the senior_managers field lists the senior managers of a particular manager. It holds an array of manager ids, i.e. _id values from the ‘managers’ collection.
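For readers who want to try this out, documents of that shape could look like the sketch below. Everything in it is invented for illustration: the ids, the people, and the ‘name’ field are assumptions, and this is not the data shown in the article's pictures. Only the ‘managers’ and ‘senior_managers’ array fields follow the schema described above.

```javascript
// Hypothetical sample documents; ids, names and the `name` field are assumptions.
const managers = [
  { _id: 'm1', name: 'Priya', senior_managers: [] },          // top of the hierarchy
  { _id: 'm2', name: 'Omar', senior_managers: ['m1'] },
  { _id: 'm3', name: 'Lena', senior_managers: ['m1', 'm2'] }
];

const employees = [
  // `managers` holds _id values from the managers collection.
  { _id: 'e1', name: 'Dana', managers: ['m3'] }
];

// In mongosh, with a scratch database selected, they could be loaded with:
//   db.managers.insertMany(managers)
//   db.employees.insertMany(employees)
```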
[Image: Managers collection]
[Image: Employees collection]
Structure of employees and managers in my database:
[Image: Employees Example]
[Image: Managers Example]
Scenarios:
Case 1: Within the collection
Case 2: Across collections
Case 1: Within the collection
To find all the senior managers of any particular manager, we have to recurse within the ‘managers’ collection.
Note: Run the $graphLookup query below on the ‘managers’ collection.
Example: For Jack, his senior managers are Tom, Laura, Steve, Mary and Chris, 5 documents, as depicted in the 'Managers Example' picture.
// Run in mongosh against the database that holds the collections
db.managers.aggregate([
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$senior_managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers'
    }
  }
])
Output: For each manager, the ‘AllParentManagers’ field holds the corresponding senior managers.
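To look at a single manager such as Jack rather than every manager, the stage can be combined with $match and $project, roughly as sketched below. The ‘name’ field is an assumption, since the document structure was only shown in a picture; adjust it to whatever field holds the manager's name. The order of the documents inside the ‘as’ array is not guaranteed, so the names may come back in any order.

```javascript
// Sketch: senior managers of one manager, reduced to a list of names.
// The `name` field and the 'seniorManagerNames' output field are hypothetical.
const seniorManagersOfJack = [
  // Keep only the manager we are interested in.
  { $match: { name: 'Jack' } },
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$senior_managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers'
    }
  },
  // '$AllParentManagers.name' collects the name of every document in the array.
  { $project: { name: 1, seniorManagerNames: '$AllParentManagers.name' } }
];

// In mongosh: db.managers.aggregate(seniorManagersOfJack)
```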
Case 2: Across collections
To find all the managers of any particular employee, we have to recurse across the ‘employees’ and ‘managers’ collections.
Note: Run the $graphLookup query below on the ‘employees’ collection.
Example: For Alex, his managers are Jack, Julie, Tom, Steve, Laura, Mary, Chris, Sarah and Matt, 9 documents.
// Run in mongosh against the database that holds the collections
db.employees.aggregate([
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers'
    }
  }
])
Output: For each employee, the ‘AllParentManagers’ field holds the corresponding managers and their senior managers.
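One way to check a count such as the 9 documents mentioned for Alex is to add a $project stage with $size after the lookup. This is only a sketch; the output field name ‘managerCount’ is an assumption made for this example.

```javascript
// Sketch: number of managers (direct and indirect) per employee.
// 'managerCount' is a hypothetical output field name.
const managerCountPerEmployee = [
  {
    $graphLookup: {
      from: 'managers',
      startWith: '$managers',
      connectFromField: 'senior_managers',
      connectToField: '_id',
      as: 'AllParentManagers'
    }
  },
  // $size returns the number of elements in the AllParentManagers array.
  { $project: { managerCount: { $size: '$AllParentManagers' } } }
];

// In mongosh: db.employees.aggregate(managerCountPerEmployee)
```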
--------------------- THANKS FOR READING😇 ---------------------
Written by Aishwarya S