-
Type: Improvement
-
Resolution: Gone away
-
Priority: Major - P3
-
None
-
Affects Version/s: 4.0.6
-
Component/s: Index Maintenance, Querying
-
Query Optimization
In the following schema:
{ "arr" : [ "a", "b", "c" ], "field" : 1, "field_2" : 1 }
with the following index:
{ "arr" : 1, "field" : 1, "field_2" : 1 }
If I use this query:
db.coll
.find({
$or: [
{arr: 'a'},
{arr: []}
]
})
.sort({field: 1})
.explain()
The inputStages are separated to 3 stages - arr: [a, a], arr: [[], []], arr: [undefined, undefined].
This is a good scenario since these input stages are followed by a "SORT_MERGE" stage
BUT, if I use the following query:
db.coll
.find({
$or: [
{arr: 'a'},
{arr: []}
],
field: 1
})
.sort({field_2: 1})
.explain()
There are only 2 input stages - arr: [a, a], arr: [[undefined, undefined], [[] , []]] .
This results that an additional stage needs to happen in order to FETCH the empty array docs and then it cannot use the SORT_MERGE stage.
In large collection, this causes the following error:
"Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit
Even though there is an index for this query.
I would expect the second scenario to perform like the first one - separating the input stages to 3 stages and to use the SORT_MERGE function.
- is related to
-
SERVER-24518 MERGE_SORT_STAGE can be used more aggressively when OR_STAGE index sort orders match
- Backlog
-
SERVER-19972 Passing empty array to $in should result in an error
- Closed