- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
- Fully Compatible
- ALL
In SERVER-34741 we added this optimization:
{$group: {_id: "$foo", ...}}, {$match: {_id: ...}} -> {$match: {foo: ...}}, {$group: {_id: "$foo", ...}}
When a $match only touches the group key (_id), we move it before the group (and rename appropriately).
This rewrite is incorrect for a $type predicate, because $type can distinguish between values that compare equal, such as NumberLong(5) and the double 5.
For example:
> db.c.find()
{ "_id" : 1, "a" : NumberLong(5) }
{ "_id" : 2, "a" : 5 }
> db.c.aggregate([ {$group: {_id: "$a", n: {$count: {}} }} ])
{ "_id" : NumberLong(5), "n" : 2 }
> db.c.aggregate([ {$group: {_id: "$a", n: {$count: {}} }}, {$match: {_id: {$type: 'long'}}} ])
{ "_id" : NumberLong(5), "n" : 1 }
The $match after the $group should only be able to keep/drop whole groups, but here it changed the count 'n' within a group.
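The difference between the two orderings can be reproduced outside the server with a toy model. The sketch below is plain JavaScript, not server code: the `{type, value}` representation of BSON numbers and the `groupCount` and `isLong` helpers are hypothetical names invented for illustration. It assumes grouping treats values with the same numeric value as one key, while the type check looks only at the type tag.

```javascript
// Toy model: each value carries a BSON-like type tag (an assumption of this sketch).
const docs = [
  { _id: 1, a: { type: "long", value: 5 } },
  { _id: 2, a: { type: "double", value: 5 } },
];

// Hypothetical helper: group by `a` and count, treating values that compare
// equal (same numeric value) as one group. The first value seen becomes the key.
function groupCount(input) {
  const groups = [];
  for (const doc of input) {
    const existing = groups.find((g) => g._id.value === doc.a.value);
    if (existing) {
      existing.n += 1;
    } else {
      groups.push({ _id: doc.a, n: 1 });
    }
  }
  return groups;
}

// Hypothetical stand-in for {$type: 'long'}: inspects only the type tag.
const isLong = (v) => v.type === "long";

// Original pipeline order: $group, then $match on the group key.
const groupThenMatch = groupCount(docs).filter((g) => isLong(g._id));

// Rewritten order: $match pushed in front of $group.
const matchThenGroup = groupCount(docs.filter((d) => isLong(d.a)));

console.log(groupThenMatch[0].n, matchThenGroup[0].n);
```

In this model the original order keeps the whole group with a count of 2, while the pushed-down order drops the double before grouping and reports a count of 1, mirroring the shell output above.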
We should only push down the predicate when it treats equal values the same. And since we may add predicates over time, we should enable the optimization only in cases we know work, rather than disabling it in specific cases we know don't work.
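One way to picture the allowlist approach is a check that permits the pushdown only when every operator in the predicate is known to be safe. This is a minimal sketch, not the server's implementation: `SAFE_OPERATORS` and `canPushDown` are hypothetical names, and the operators listed are an assumption that would need to be verified against the server's actual semantics (including collation).

```javascript
// Hypothetical allowlist: operators assumed to depend only on comparison
// order, so they cannot tell apart two values that compare equal.
const SAFE_OPERATORS = new Set(["$eq", "$gt", "$gte", "$lt", "$lte", "$in"]);

// Returns true only if the predicate on the group key uses allowlisted
// operators exclusively. Anything unrecognized (such as $type or $regex)
// blocks the pushdown by default.
function canPushDown(predicate) {
  // Regex shorthand, e.g. {_id: /abc/}, is not plain equality.
  if (predicate instanceof RegExp) return false;

  const isPlainObject =
    predicate !== null &&
    typeof predicate === "object" &&
    !Array.isArray(predicate);

  // Shorthand equality against a scalar or array, e.g. {_id: 5}.
  if (!isPlainObject) return true;

  const keys = Object.keys(predicate);

  // No operator keys: treated here as equality against a literal subdocument.
  if (!keys.some((k) => k.startsWith("$"))) return true;

  return keys.every((k) => SAFE_OPERATORS.has(k));
}

console.log(canPushDown({ $gte: 5 }));        // allowlisted comparison
console.log(canPushDown({ $type: "long" }));  // not on the allowlist
```

Because the check defaults to rejecting unknown operators, a predicate added in the future stays out of the optimization until someone explicitly adds it to the list.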
Other things to consider:
- Custom collations affect what "compare equal" means. Which predicates can distinguish values that are collation-equal? Maybe $regex would be one.
- Does this interact with SERVER-73253, which extended this optimization to support dotted paths?
is caused by
- SERVER-34741 Move $match in front of $group if condition is on group key (Closed)
related to
- SERVER-73241 Better path tracking when $$ROOT is used in a $group accumulator (In Code Review)
- SERVER-73253 Better path tracking when renaming nested/compound grouping fields (Closed)