The $lookup Stage
The $lookup stage in the aggregation pipeline performs a left outer join with another collection in the same database, adding the matching documents from the joined collection to each input document for further processing. The basic $lookup syntax is as follows:
db.collection.aggregate([
  {
    $lookup: {
      from: "collection to join",
      localField: "field from the input documents",
      foreignField: "field from the documents of the 'from' collection",
      as: "output array field"
    }
  }
]);
Let’s write a query to join books with authors:
db.books.aggregate([
  {
    $lookup: {
      from: "authors",
      localField: "author_id",
      foreignField: "_id",
      as: "authorDetails"
    }
  }
]).pretty();
This would produce documents where each book now includes an array called ‘authorDetails’ containing the joined author document(s).
Handling Multiple Matches
What happens if our join condition matches multiple documents? In such cases, the $lookup stage appends all matching documents to the output array. In our example each book has a single author, so we should get exactly one match per book. If the joined collection can contain several matches for one input document, each matching document becomes a separate element of the output array, and an input document with no match gets an empty array.
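To make the zero, one, and many-match cases concrete, here is a small in-memory sketch of how an equality $lookup fills the output array. It is an illustration only, not how MongoDB implements the stage: the `lookup` helper and the sample documents are hypothetical, and the sketch assumes scalar join fields (the real stage also handles array fields and null or missing values).

```javascript
// Simplified in-memory model of an equality $lookup (illustration only).
// Assumes scalar join fields; the real stage also handles arrays and null/missing values.
function lookup(inputDocs, fromDocs, { localField, foreignField, as }) {
  return inputDocs.map((doc) => ({
    ...doc,
    // Every matching foreign document is appended; no match leaves an empty array.
    [as]: fromDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

// Hypothetical sample data: joining authors to their books is one-to-many.
const authors = [
  { _id: 1, name: "Jane Austen" },
  { _id: 2, name: "Author With No Books" },
];
const books = [
  { _id: 10, title: "Pride and Prejudice", author_id: 1 },
  { _id: 11, title: "Emma", author_id: 1 },
];

const joined = lookup(authors, books, {
  localField: "_id",
  foreignField: "author_id",
  as: "books",
});

console.log(JSON.stringify(joined, null, 2));
```

The first author ends up with a two-element `books` array, while the second keeps an empty one. This is the left outer join behaviour: input documents are never dropped by the join itself.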
Combining $lookup with Other Stages
The real power of the aggregation pipeline comes from combining various stages. Let’s use $lookup with $match to filter results:
db.books.aggregate([
  {
    $match: { title: 'Pride and Prejudice' }
  },
  {
    $lookup: {
      from: 'authors',
      localField: 'author_id',
      foreignField: '_id',
      as: 'authorDetails'
    }
  }
]).pretty();
We first filter books where the title is ‘Pride and Prejudice’ and then perform our $lookup.
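Building on the query above, the joined array can be flattened and trimmed with two more stages. The following is a sketch rather than a definitive recipe: it assumes author documents carry a `name` field (not stated in this tutorial), and it keeps the pipeline in a variable so that the same file can be loaded outside the MongoDB shell, where `db` is not defined.

```javascript
// Filter first, join, then flatten the array produced by $lookup.
const pipeline = [
  { $match: { title: "Pride and Prejudice" } },
  {
    $lookup: {
      from: "authors",
      localField: "author_id",
      foreignField: "_id",
      as: "authorDetails",
    },
  },
  // $unwind emits one document per element of authorDetails.
  { $unwind: "$authorDetails" },
  // Assumption: author documents have a `name` field.
  { $project: { _id: 0, title: 1, author: "$authorDetails.name" } },
];

// `db` exists only inside the MongoDB shell; the guard lets this file load elsewhere.
if (typeof db !== "undefined") {
  db.books.aggregate(pipeline).forEach((doc) => printjson(doc));
}
```

Note that $unwind drops documents whose array is empty by default, which turns the left outer join into an inner join. Setting its `preserveNullAndEmptyArrays` option to true keeps books without a matching author.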
Optimizing $lookup Performance
Although $lookup stages are powerful, they may cause performance hits, especially with large datasets or many lookups. Optimizations include:
- Making sure that the foreign field (the field from the joined collection) is indexed.
- Limiting the amount of data processed at each stage by using $match, $project, etc., before the $lookup stage.
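The two optimizations listed above might look like this in practice. In the earlier examples the foreign field is `_id`, which MongoDB always indexes, so this sketch assumes the reverse join instead: authors joined to books on `author_id`, a field that has no index unless one is created. The `name` field on author documents is likewise an assumption.

```javascript
// Assumption: authors are joined to books on books.author_id,
// which is not indexed unless we create an index ourselves.
const indexSpec = { author_id: 1 };

const pipeline = [
  // Narrow the input first so fewer documents reach $lookup.
  { $match: { name: "Jane Austen" } },
  // Keep only the fields needed downstream (_id is retained by default).
  { $project: { name: 1 } },
  {
    $lookup: {
      from: "books",
      localField: "_id",
      foreignField: "author_id",
      as: "books",
    },
  },
];

// `db` exists only inside the MongoDB shell; the guard lets this file load elsewhere.
if (typeof db !== "undefined") {
  db.books.createIndex(indexSpec); // supports the equality match on foreignField
  db.authors.aggregate(pipeline).forEach((doc) => printjson(doc));
}
```

Whether the index is actually used is best confirmed with the pipeline's explain output on your own data rather than assumed.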
Conclusion
In this tutorial, we’ve seen how to use the $lookup aggregation stage in MongoDB to join documents from separate collections, effectively resolving the references between them. By mastering $lookup and combining it with other pipeline stages, you can perform complex data retrievals and transformations that make full use of MongoDB’s flexible document model.