ISSUE SUMMARY
A negation (like $ne, $not, ...) inside an $elemMatch operator can under certain circumstances cause the query planner to create the wrong index bounds. This can lead to wrong or missing results. It can also potentially have a performance impact due to scanning many more documents than necessary.
USER IMPACT
The conditions to trigger this bug are narrow, but the impact can be severe, ranging from performance impact due to excessive document scanning to incorrect results of the query.
To trigger this bug, the query needs to have an $elemMatch operator containing an (implicit or explicit) $and of several predicates, with one or more negated predicates. Further, a predicate needs to target the leading field of a compound index, and a negated predicate needs to target an inner or trailing field of the index.
Example:
Consider the following query, assuming an existing compound index on {"my_array.foo": 1, "my_array.bar": 1}.
db.coll.find({ some_field: true, my_array: { $elemMatch: { foo: 1, // predicate on leading indexed field bar: { $ne: 5 } // negated predicate on trailing indexed field } } })
This query fulfills the conditions to trigger the bug.
WORKAROUNDS
None
AFFECTED VERSIONS
All production release versions from 2.6.0 to 2.6.3 are affected by this issue.
FIX VERSION
The fix is included in the 2.6.4 production release.
RESOLUTION DETAILS
For NOT children of an ELEMMATCH node, the index tags need to be retrieved from the child of the NOT node.
Original description
I'm running into the following error while quering one of my collections:
error: { "$err" : "Runner error: Overflow hashed AND stage buffered data usage of 33554441 bytes exceeds internal limit of 33554432 bytes", "code" : 17144 }
This started happening after I upgraded my system from 2.4.x to 2.6.3. It seems to be somewhat related to https://jira.mongodb.org/browse/SERVER-13611 and https://jira.mongodb.org/browse/SERVER-14142. However, I'm not doing any sorting, which seemed to be the trigger to the former, and my indexes aren't spares, which seems to be a trigger for the latter. I only seem to get the error when my collection is sufficiently large (~4.3 million docs), and I include the $ne clause on one of the indexed fields. Here is an example document (one of the smaller ones) from the collection:
{ "_id": ObjectId("5283c074e4b096b90b0e8adc"), "assetTypeId": "CURRENCY", "name": "Zimbabwe, Zimbabwe Dollars", "startDate": "1970-01-01", "expirationDate": "2999-12-31", "unitSize": "1", "country": "ZW", "exchangeListings": [ ], "identifiers": [ { "name": "ZWD", "symbology": "CURR", "startDate": "1970-01-01", "endDate": "2999-12-31", "assetId": ObjectId("5283c074e4b096b90b0e8adc"), "assetName": "Zimbabwe, Zimbabwe Dollars", "assetTypeId": "CURRENCY", "createdBy": "hal", "creationDate": "2013-11-13 12:09:56.290000000", "lastModifiedBy": "hal", "lastModifiedDate": "2013-11-13 12:09:56.290000000", "_id": ObjectId("5283c074e4b096b90b0e8add") } ], "createdBy": "hal", "creationDate": "2013-11-13 12:09:56.283000000", "lastModifiedBy": "hal", "lastModifiedDate": "2013-11-13 12:09:56.283000000" }
Here are the indexes I have on the collection:
[ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "hal_hal_app.assets", "name" : "_id_" }, { "v" : 1, "key" : { "identifiers.name" : 1, "identifiers.symbology" : 1 }, "ns" : "hal_hal_app.assets", "name" : "identifiers.name_1_identifiers.symbology_1" }, { "v" : 1, "key" : { "identifiers._id" : 1 }, "unique" : true, "ns" : "hal_hal_app.assets", "name" : "asset-identifier-id" } ]
And here is the query that causes the problem:
db.assets.find({$and: [{ assetTypeId : { $ne : "CURRENCY"}}, { identifiers : { $elemMatch : { $and : [ { symbology : { $ne : "INDEX"}} , { name : "29799X52"}]}}}]});
Note that neither of these queries cause the problem:
db.assets.find({ identifiers : { $elemMatch : { $and : [ { symbology : { $ne : "INDEX"}} , { name : "29799X52"}]}}}); db.assets.find({$and: [{ assetTypeId : { $ne : "CURRENCY"}}, { identifiers : { $elemMatch : { $and : [ { symbology : { $in : ["CURR", "NYSE"]}} , { name : "29799X52"}]}}}]});
And here's what I could find from the mongod.log that seem relevant:
2014-07-21T10:26:35.443-0500 [conn6039] ERROR: Runner error, stats: { "type" : "KEEP_MUTATIONS", "works" : 1721179, "yields" : 13449, "unyields" : 13449, "invalidates" : 0, "advanced" : 0, "needTime" : 1721178, "needFetch" : 0, "isEOF" : 0, "children" : [ { "type" : "FETCH", "works" : 1721179, "yields" : 13449, "unyields" : 13449, "invalidates" : 0, "advanced" : 0, "needTime" : 1721178, "needFetch" : 0, "isEOF" : 0, "alreadyHasObj" : 0, "forcedFetches" : 0, "matchTested" : 0, "children" : [ { "type" : "AND_HASH", "works" : 1721179, "yields" : 13449, "unyields" : 13449, "invalidates" : 0, "advanced" : 0, "needTime" : 1721177, "needFetch" : 0, "isEOF" : 0, "flaggedButPassed" : 0, "flaggedInProgress" : 0, "memUsage" : 33554441, "memLimit" : 33554432, "children" : [ { "type" : "IXSCAN", "works" : 1721177, "yields" : 13449, "unyields" : 13449, "invalidates" : 0, "advanced" : 857729, "needTime" : 863448, "needFetch" : 0, "isEOF" : 0, "keyPattern" : "{ identifiers.name: 1, identifiers.symbology: 1 }", "boundsVerbose" : "field #0['identifiers.name']: [MinKey, \"INDEX\"), (\"INDEX\", MaxKey], field #1['identifiers.symbology']: [MinKey, MaxKey]", "isMultiKey" : 1, "yieldMovedCursor" : 0, "dupsTested" : 1721177, "dupsDropped" : 863448, "seenInvalidated" : 0, "matchTested" : 0, "keysExamined" : 1721178, "children" : [] }, { "type" : "IXSCAN", "works" : 1, "yields" : 13449, "unyields" : 13449, "invalidates" : 0, "advanced" : 1, "needTime" : 0, "needFetch" : 0, "isEOF" : 1, "keyPattern" : "{ identifiers.name: 1, identifiers.symbology: 1 }", "boundsVerbose" : "field #0['identifiers.name']: [\"29799X52\", \"29799X52\"], field #1['identifiers.symbology']: [MinKey, MaxKey]", "isMultiKey" : 1, "yieldMovedCursor" : 0, "dupsTested" : 1, "dupsDropped" : 0, "seenInvalidated" : 0, "matchTested" : 0, "keysExamined" : 1, "children" : [] } ] } ] } ] } 2014-07-21T10:26:36.136-0500 [conn6039] assertion 17144 Runner error: Overflow hashed AND stage buffered data usage of 33554441 bytes exceeds internal limit of 33554432 bytes ns:hal_hal_app.assets query:{ $and: [ { assetTypeId: { $ne: "CURRENCY" } }, { identifiers: { $elemMatch: { $and: [ { symbology: { $ne: "INDEX" } }, { name: "29799X52" } ] } } } ] } 2014-07-21T10:26:36.136-0500 [conn6039] query hal_hal_app.assets query: { $and: [ { assetTypeId: { $ne: "CURRENCY" } }, { identifiers: { $elemMatch: { $and: [ { symbology: { $ne: "INDEX" } }, { name: "29799X52" } ] } } } ] } planSummary: IXSCAN { identifiers.name: 1, identifiers.symbology: 1 }, IXSCAN { identifiers.name: 1, identifiers.symbology: 1 } ntoreturn:0 keyUpdates:0 exception: Runner error: Overflow hashed AND stage buffered data usage of 33554441 bytes exceeds internal limit of 33554432 bytes code:17144 numYields:5 locks(micros) r:8563926 reslen:164 5038ms