-
Type: Question
-
Resolution: Duplicate
-
Priority: Minor - P4
-
None
-
Affects Version/s: 3.4.22
-
Component/s: Aggregation Framework, Index Maintenance, Performance
-
None
I am not sure if this user error, a design choice (or flaw), or a bug. We encountered an issue where, after an index was applied, an aggregation query (which was previously running fine; slow-ish, but fine), spiked CPU to 100% and became very, very slow. We estimate an approximate 10x to 20x slow-down.
Note that we were able to modify the query to cause a new query plan to be generated which resolved the issue.
The query in question is:
db.GPSLocations.aggregate([ { $match: { organisation:
, date: { $gte: new Date(1560181081336) } } }, { $sort:
}, { $group: { _id: "$ownerId", last:
} } ]);
This query gets the last known GPS location for a user.
{ "stages":[ { "$cursor":{ "query":{ "organisation":{ "$in":[ "MFgsJF5SmL6hxaYcE", "MFgsJF5SmL6hxaYcE" ] }, "date":{ "$gte":ISODate("2019-06-10T15:38:01.336Z") } }, "fields":{ "date":1, "ownerId":1, "_id":1 }, "queryPlanner":{ "plannerVersion":1, "namespace":"bustle-supply-uat.GPSLocations", "indexFilterSet":false, "parsedQuery":{ "$and":[ { "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, { "date":{ "$gte":ISODate("2019-06-10T15:38:01.336Z") } } ] }, "winningPlan":{ "stage":"FETCH", "filter":{ "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } }, "inputStage":{ "stage":"IXSCAN", "keyPattern":{ "organisation":1 }, "indexName":"organisation_1", "isMultiKey":false, "multiKeyPaths":{ "organisation":[ ] }, "isUnique":false, "isSparse":false, "isPartial":false, "indexVersion":2, "direction":"forward", "indexBounds":{ "organisation":[ "[\"MFgsJF5SmL6hxaYcE\", \"MFgsJF5SmL6hxaYcE\"]" ] } } }, "rejectedPlans":[ { "stage":"FETCH", "filter":{ "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, "inputStage":{ "stage":"IXSCAN", "keyPattern":{ "date":1 }, "indexName":"date_1", "isMultiKey":false, "multiKeyPaths":{ "date":[ ] }, "isUnique":false, "isSparse":false, "isPartial":false, "indexVersion":2, "direction":"forward", "indexBounds":{ "date":[ "[new Date(1560181081336), new Date(9223372036854775807)]" ] } } } ] } } }, { "$sort":{ "sortKey":{ "ownerId":1, "date":1 } } }, { "$group":{ "_id":"$ownerId", "last":{ "$last":"$_id" } } } ], "ok":1 }
And the post-index query plan:
{ "stages":[ { "$cursor":{ "query":{ "organisation":{ "$in":[ "MFgsJF5SmL6hxaYcE", "MFgsJF5SmL6hxaYcE" ] }, "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } }, "sort":{ "ownerId":1, "date":1 }, "fields":{ "ownerId":1, "_id":1 }, "queryPlanner":{ "plannerVersion":1, "namespace":"bustle-supply-uat.GPSLocations", "indexFilterSet":false, "parsedQuery":{ "$and":[ { "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, { "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } } ] }, "winningPlan":{ "stage":"FETCH", "filter":{ "$and":[ { "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, { "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } } ] }, "inputStage":{ "stage":"IXSCAN", "keyPattern":{ "ownerId":1, "date":1 }, "indexName":"gpsLocationsByOwnerAndDate", "isMultiKey":false, "multiKeyPaths":{ "ownerId":[ ], "date":[ ] }, "isUnique":false, "isSparse":false, "isPartial":false, "indexVersion":2, "direction":"forward", "indexBounds":{ "ownerId":[ "[MinKey, MaxKey]" ], "date":[ "[MinKey, MaxKey]" ] } } }, "rejectedPlans":[ ] } } }, { "$group":{ "_id":"$ownerId", "last":{ "$last":"$_id" } } } ], "ok":1 }
The "offending" operation appears to be the sort in the post-index case. I noted that it included an ownerId property, which is not previous filtered by in the query plan.
Modifying the query to remove the ownerId from the sort resolves the issue;
db.GPSLocations.aggregate([ { $match: { organisation: { $in: [ "MFgsJF5SmL6hxaYcE", "MFgsJF5SmL6hxaYcE" ] }, date: { $gte: new Date(1560181081336) } } }, { $sort: { date: 1 } }, { $group: { _id: "$ownerId", last: { $last: "$_id" } } } ]);
Which renders this query plan:
{ "stages":[ { "$cursor":{ "query":{ "organisation":{ "$in":[ "MFgsJF5SmL6hxaYcE", "MFgsJF5SmL6hxaYcE" ] }, "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } }, "sort":{ "date":1 }, "fields":{ "ownerId":1, "_id":1 }, "queryPlanner":{ "plannerVersion":1, "namespace":"bustle-supply-uat.GPSLocations", "indexFilterSet":false, "parsedQuery":{ "$and":[ { "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, { "date":{ "$gte": ISODate("2019-06-10T15:38:01.336Z") } } ] }, "winningPlan":{ "stage":"FETCH", "inputStage":{ "stage":"IXSCAN", "keyPattern":{ "organisation":1, "date":1 }, "indexName":"organisation_1_date_1", "isMultiKey":false, "multiKeyPaths":{ "organisation":[ ], "date":[ ] }, "isUnique":false, "isSparse":false, "isPartial":false, "indexVersion":2, "direction":"forward", "indexBounds":{ "organisation":[ "[\"MFgsJF5SmL6hxaYcE\", \"MFgsJF5SmL6hxaYcE\"]" ], "date":[ "[new Date(1560181081336), new Date(9223372036854775807)]" ] } } }, "rejectedPlans":[ { "stage":"FETCH", "filter":{ "organisation":{ "$eq":"MFgsJF5SmL6hxaYcE" } }, "inputStage":{ "stage":"IXSCAN", "keyPattern":{ "date":1 }, "indexName":"date_1", "isMultiKey":false, "multiKeyPaths":{ "date":[ ] }, "isUnique":false, "isSparse":false, "isPartial":false, "indexVersion":2, "direction":"forward", "indexBounds":{ "date":[ "[new Date(1560181081336), new Date(9223372036854775807)]" ] } } } ] } } }, { "$group":{ "_id":"$ownerId", "last":{ "$last":"$_id" } } } ], "ok":1 }
So why questions are:
- Given that the "pre-index" case work well, why does the "post-index" case not? I understand that MongoDB runs candidate query plans; should it not be able to determine that the post-index query plan is inferior to the first?
- Alternatively, is there something that we don't understand about the MongoDB aggregation pipeline and how it comes up with query plans? Do we need to be (more) careful in applying indexes to certain query shapes?
- If this is something that we did, is there documentation that can tell us why MongoDB acted this way in this case?
Many thanks!
Ben
- duplicates
-
SERVER-7568 Aggregation framework favors non-blocking sorts
- Closed