-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 4.2.1
-
Component/s: Aggregation Framework, Index Maintenance
-
None
I could be wrong, but it seems to me that when running aggregation pipelines, indexes aren't used as well as for find queries with sort/limit. Example:
I have documents like this:
_id: ObjectId(...), id: "someid", status: "published", version: 123123123, foo: "bar"
Then there's an index for
{id: 1, status: 1, version: 1}
If I run
myCollection.find({id:'someid', status:'published'}).sort({version: -1}).limit(1).explain()
...I get the following inputStage indicating that this is done without document fetching.
"inputStage" : { "stage" : "SORT", "sortPattern" : { "id" : 1.0, "status" : 1.0, "version" : -1.0 }, "limitAmount" : 1.0, "inputStage" : { "stage" : "SORT_KEY_GENERATOR", "inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "id" : 1.0, "status" : 1.0, "version" : 1.0 }, "indexName" : "contentId_1_status_1_version_1", "isMultiKey" : false, "multiKeyPaths" : \{ "id" : [ ], "status" : [ ], "version" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2.0, "direction" : "forward", "indexBounds" : { "id" : [ "[\"someid\", \"someid\"]" ], "status" : [ "[\"published\", \"published\"]" ], "version" : [ "[MinKey, MaxKey]" ] } } } }
However, If I do the same with aggregation pipeline. I.e. something like
myCollection.explain().aggregate([ {$match: {id: 'content12', status:'published'}}, {$sort: {id: 1, status: 1, version: -1} }, {$limit: 1}])
....then it seems like it's fetching more. I get
"inputStage" : { "stage" : "IXSCAN", "keyPattern" : { "contentId" : 1.0, "status" : 1.0, "version" : 1.0 }, "indexName" : "contentId_1_status_1_version_1", "isMultiKey" : false, "multiKeyPaths" : \{ "contentId" : [ ], "status" : [ ], "version" : [ ] }, "isUnique" : false, "isSparse" : false, "isPartial" : false, "indexVersion" : 2.0, "direction" : "forward", "indexBounds" : { "contentId" : [ "[\"content12\", \"content12\"]" ], "status" : [ "[\"published\", \"published\"]" ], "version" : [ "[MinKey, MaxKey]" ] } }
This seems like a miss. Shouldn't an aggregation pipeline like that be able to use the same index handling?
- duplicates
-
SERVER-7568 Aggregation framework favors non-blocking sorts
- Closed
-
SERVER-36723 $limit should push down into the PlanStage layer, rather than coalescing with DocumentSourceCursor
- Closed
- is related to
-
SERVER-7568 Aggregation framework favors non-blocking sorts
- Closed
-
SERVER-36723 $limit should push down into the PlanStage layer, rather than coalescing with DocumentSourceCursor
- Closed