-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.4.8, 2.6.0-rc0
-
Component/s: Querying
-
Query Optimization
-
ALL
-
(copied to CRM)
Currently we rank plans by work()'ing them some number of times and seeing who produces the most results.
If there is a plan with a blocking stage, it will therefore almost always be ranked lower than plans without a blocking stage. This is because plans without a blocking stage immediately start producing results. Although the plan with the blocking stage doesn't appear to be making progress immediately, it may end up looking much better / hitting EOF much faster if we were to run the plans to completion.
Blocking stages for which this is an issue:
- SORT
- AND_HASH
Here's one way to reproduce the issue, for the blocking SORT stage:
> t = db.t test.t > t.drop() true > t.ensureIndex({a: 1}) WriteResult({ "nInserted" : 1 }) > t.ensureIndex({b: 1}) WriteResult({ "nInserted" : 1 }) > for (var i = 0; i < 10000; i++) { t.save({a: 1}); } WriteResult({ "nInserted" : 1 }) > for (var i = 0; i < 150; i++) { t.save({b: 1}); } WriteResult({ "nInserted" : 1 }) > t.find({b: 1}).sort({a: 1}).explain() { "cursor" : "BtreeCursor a_1", "isMultiKey" : false, "n" : 150, "nscannedObjects" : 10150, "nscanned" : 10150, "nscannedObjectsAllPlans" : 10347, "nscannedAllPlans" : 10348, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 85, "nChunkSkips" : 0, "millis" : 139, "indexBounds" : { "a" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ] }, "server" : "localhost:27017", "filterSet" : false }
The query ends up using index {a: 1}, which involves scanning 10,000 docs and fetching them from disk if they're not in memory. We would have liked to use index {b: 1}, which would involve fetching only 150 docs and then sorting them in memory.
- is duplicated by
-
SERVER-13866 Wrong index is being picked
- Closed
-
SERVER-42733 Query not using index intersection
- Closed
- is related to
-
SERVER-13831 Find with $gte/$gt/$lte/$lt on Date field and sorting on another field leads to unnecessary in-memory sort without using index
- Closed
-
SERVER-20619 Statistics-based query optimization
- Backlog