A rooted OR query with n branches gets turned into a plan with n index scans, even if each scan is over the same index. For example, consider a collection with an index on a:1,b:1. This query will result in a plan with two index scans:
db.c.find({$or: [{a:1,b:1}, {a: 1, b: 2}]})
"stage" : "SUBPLAN", "inputStage" : { "stage" : "FETCH", "inputStage" : { "stage" : "OR", "inputStages" : [ { "stage" : "IXSCAN", "keyPattern" : { "a" : 1, "b" : 1 }, "indexName" : "a_1_b_1", "isMultiKey" : false, "multiKeyPaths" : { "a" : [ ], "b" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "a" : [ "[1.0, 1.0]" ], "b" : [ "[1.0, 1.0]" ] } }, { "stage" : "IXSCAN", "keyPattern" : { "a" : 1, "b" : 1 }, "indexName" : "a_1_b_1", "isMultiKey" : false, "multiKeyPaths" : { "a" : [ ], "b" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2, "direction" : "forward", "indexBounds" : { "a" : [ "[1.0, 1.0]" ], "b" : [ "[2.0, 2.0]" ] } } ] } }
The same logic applies even if there are more than two branches. We know of one instance where a customer was running queries of this form with 20 branches (and was getting 20 IX scans).
It seems like it would be a strict improvement to use a plan which does a single index scan, instead of one per branch. Today, it's possible to "force" this behavior in some cases by using $in. For example:
db.c.find({a: 1, b: {$in: [1,2]}})
This query would result in a plan which does a scan over the a:1,b:1 index and uses bounds a:[1,1], b:[1,2].
Using $in is not always possible, however. For example, the below query could not be re-written to use $in because doing so would change its meaning:
db.c.find({$or: [{a:1,b:1}, {a: 2, b: 2}]})
Using $in:
db.c.find({a: {$in: [1,2]}, b: {$in: [1,2]}})
The second query would return documents where a is 1 but b is 2, whereas the first query would not.
- related to
-
SERVER-96053 Merge index scans on identical index if differing bounds for a single field
- Backlog
-
SERVER-31360 MatchExpression::getOptimizer() for $or/$and should collapse equivalent clauses
- Closed