ISSUE DESCRIPTION AND IMPACT
Matching by $type on a looked up field after a $lookup stage in an aggregation operation leads to incorrect results, because the $match on $type is pushed into the $lookup without removing the reference to the new field created by $lookup. This failure is unique to the $type operator.
For example, the following operation matches on "lookedup.name" after a $lookup into the "lookedup" field:
[ { $lookup: { from: "second", localField: "field", foreignField: "foreignField", as: "lookedup", }, }, { $unwind: "$lookedup" }, { $match: { "lookedup.name": { $type: "string" } } }, ]
MongoDB incorrectly optimizes the aggregation pipeline by attempting to match on "lookedup.name" in the "second" collection, where "lookedup.name" does not exist.
The correct behavior is for the optimized lookup stage to match on “name” instead.
DIAGNOSIS AND AFFECTED VERSIONS
All versions since 3.4.0 are affected by this issue.
Running the .explain() method will help identify the incorrect matching stage.
db.first.explain().aggregate([ { $lookup: { from: "second", localField: "field", foreignField: "foreignField", as: "lookedup", }, }, { $unwind: "$lookedup" }, { $match: { "lookedup.name": { $type: "string" } } }, ]);
If the query optimizer moves the $match stage into the $lookup stage while still referencing the field created by $lookup, the operation is impacted. For example:
{ $lookup: { from: "second", as: "lookedup", localField: "field", foreignField: "foreignField", unwinding: { preserveNullAndEmptyArrays: false }, matching: { "lookedup.name": { $type: [2] } }, }, };
where
matching: {“lookedup.name”: { $type: [2] } }
should be
matching: {“name”: { $type: [2] } }
REMEDIATION AND WORKAROUNDS
This fix will be included in 4.4.2, 4.2.11, 4.0.22, 3.6.21. To work around this issue, you can add a new field "newField" that represents "lookedup.name", run the match on "newField", then unset "newField".
db.first.aggregate([ { $lookup: { from: "second", localField: "field", foreignField: "foreignField", as: "lookedup", }, }, { $unwind: "$lookedup" }, { $addFields: { newField: "$lookedup.name" }}, { $match: { "newField": { $type: "string" } } }, { $unset: "newField" } ]);
Original Description
A $lookup stage followed by a $match on the type of a looked up field does not work. It seems to be using the wrong field name when the $match gets pushed inside the $lookup stage.
Steps to Reproduce
1. Create two collections as follows
- collection "first" contains: { "field" : "value" }
- collection "second" contains: { "foreignField" : "value", "name" : "Thomas" }
2. Run the following aggregation on collection "first"
db.first.aggregate( [ {$lookup: { from: "second", localField: "field", foreignField: "foreignField", as: "lookedup"} }, {$unwind: "$lookedup"}, {$match: {"lookedup.name": {$type: "string"}}} ]);
3. In the explain, you can see that the $match gets pushed into the lookup stage but uses the original field name "lookedup.name" which doesn't exist in the remote collection.
{ { "$lookup" : { "from" : "second", "as" : "lookedup", "localField" : "field", "foreignField" : "foreignField", "unwinding" : { "preserveNullAndEmptyArrays" : false }, "matching" : { "lookedup.name" : { "$type" : [ 2 ] } } } }
If a non-$type query is used (e.g. equality match), the field instead is just "name", which is the correct behavior.
{{ "$lookup" : { "from" : "second", "as" : "lookedup", "localField" : "field", "foreignField" : "foreignField", "unwinding" : { "preserveNullAndEmptyArrays" : false }, "matching" : { "name" : { "$eq" : "Thomas" } } } }
Expected Results
Aggregation returns the document because the field lookedup.name is a string
Actual Results
Aggregation returns no documents.
—
Hat tip to nathan.smyth who discovered this bug.
- is related to
-
SERVER-1475 {field: {$type:"array"}} should return documents where "field" is an array
- Closed
-
SERVER-21612 Combine post $lookup $match on looked up field
- Closed
- links to