-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization
-
ALL
-
-
(copied to CRM)
-
17
Currently we freely reorder predicates for rewrite and optimization for the sake of performance. We could provide a way to define the order certain predicates are run in so any side effects from them (such as errors being produced) are predictable
Here is a description of a situation that shows our current behavior:
There is an inconsistency in how we are handling an error for a specific query involving $expr for when we use an index vs when we do a collection scan. The inconsistency stems from two behaviors of our query system:
1. A match stage can be pushed down to hide an error
Suppose there are two $match stages, (a) one with an expression that can throw an error for a specific input and (b) the other $match stage filters out those specific error-causing inputs. When (a) is before (b) but there is an index that matches the query for (b), the query system will push that stage to the IXSCAN and run (b) before (a) which will suppress that error.
Example:
const docs = [ {num: 0}, {num: 1}, ]; const agg = [ {$match: {$expr: {$divide: [5, '$num']}}}, // (a) throws when $num is 0. {$match: {num: {$ne: 0}}}, // (b) filters out 0. ];
The aggregation above will return
{num: 0}instead of throwing the divide by zero error, because the $ne stage is pushed down to index scan:
"winningPlan" : { "stage" : "FETCH", "filter" : { "$expr" : { "$divide" : [ { "$const" : 5 }, "$num" ] } }, "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "num" : 1 }, "indexName" : "num_1", "isMultiKey" : false, "multiKeyPaths" : { "num" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "num" : [ "[MinKey, 0.0)", "(0.0, MaxKey]" ] } } }
2. Collection scans can form canonicalize queries to throw error
For collection scan, the opposite of the case above can happen where when (b) is placed before (a), but (a) is an $or containing a $expr that throws. In this scenario, the query system will absorb the two match stages into one and put the two predicates in an $and with a specific order. When this happens, (a) will be reordered to be placed before (b) and thus the query doing a collection scan will throw an error.
Given these two behaviors, the results for the following queries are inconsistent between an index scan and collection scan. Index scan will succeed while a collection scan will throw a divide by zero error.
const docs = [ {num: 0}, {num: 1}, ]; const agg1 = [ {$match: {num: {$ne: 0}}}, // collection scan moves this match down. {$match: {$or: [ {$expr: {$divide: [5, '$num']}}, // (a) throws when $num is 0. {a: {$exists: true}}, ]}}, ]; const agg2 = [ {$match: {$or: [ {$expr: {$divide: [5, '$num']}}, // (a) throws when $num is 0. {a: {$exists: true}}, ]}}, {$match: {num: {$ne: 0}}}, // index scan moves this match up. ];
The explain outputs for both of two queries will be the following:
// Collection scan "winningPlan" : { "stage" : "COLLSCAN", "filter" : { "$and" : [ { "$or" : [ { "b" : { "$exists" : true } }, { "$expr" : { "$divide" : [ { "$const" : 5 }, "$num" ] } } ] }, { "$nor" : [ { "num" : { "$eq" : 0 } } ] } ] }, "direction" : "forward" } // Index scan "winningPlan" : { "stage" : "FETCH", "filter" : { "$or" : [ { "b" : { "$exists" : true } }, { "$expr" : { "$divide" : [ { "$const" : 5 }, "$num" ] } } ] }, "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "num" : 1 }, "indexName" : "num_1", "isMultiKey" : false, "multiKeyPaths" : { "num" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "num" : [ "[MinKey, 0.0)", "(0.0, MaxKey]" ] } } }
This inconsistency also occurs for an $and expression in the same conditions.
- is depended on by
-
SERVER-37432 Re-enable aggregation_wildcard_fuzzer after addressing known failures
- Closed
- is duplicated by
-
SERVER-45231 Provide a way to cause a well-defined order of evaluation for predicates
- Closed
- is related to
-
SERVER-45123 Queries should not fail with type error when $type provides the correct filter
- Closed
-
SERVER-59146 Enable push down of $match with $expr
- Backlog
-
SERVER-45308 Alphabetical order of field names used in an $or clause drives evaluation order and thus affects performance
- Closed
- related to
-
SERVER-52619 Performance Regression on 4.4 compared to 4.2 on Aggregate query
- Closed
-
SERVER-45364 Query Planner should estimate cost of each predicate
- Backlog
-
SERVER-45231 Provide a way to cause a well-defined order of evaluation for predicates
- Closed