-
Type: Bug
-
Resolution: Works as Designed
-
Priority: Minor - P4
-
None
-
Affects Version/s: 5.0.5
-
Component/s: None
-
Environment:archlinux , 128 GB RAM server
-
Storage Execution
Mongo appears to be returning duplicate documents for the same query, i.e. it returns more documents than there are unique {{_id}}s in the returned documents:
lobby-brain> count_iterated = 0; ids = {}
{}
lobby-brain> db.the_collection.find({
'a_boolean_key': true
}).forEach((el) => {
count_iterated += 1;
ids[el._id] = (ids[el._id]||0) + 1;
})
lobby-brain> count_iterated
278
lobby-brain> Object.keys(ids).length
251
That is, the number of unique _id returned is 251 – but there were 278 documents returned by the cursor.
Investigating further:
lobby-brain> ids
{
'60cb8cb92c909a974a96a430': 1,
'61114dea1a13c86146729f21': 1,
'6111513a1a13c861467d3dcf': 1,
...
'61114c491a13c861466d39cf': 2,
'61114bcc1a13c861466b9f8e': 2,
...
}
lobby-brain> db.the_collection.find({
_id: ObjectId("61114c491a13c861466d39cf")
}).forEach((el) => print("foo"));
foo
>
{{}}
That is, there aren't actually duplicate documents with the same _id -- it's just an issue with the .find().
I tried restarting the database, and rebuilding an index involving 'a_boolean_key', with the same results.
I've never seen this before and this seems impossible...
Version info:
Using MongoDB: 5.0.5
Using Mongosh: 1.0.4
{{}}
It is a stand-alone database, no replica set or sharding or anything like that.
Further Info
One thing to note is, there is a compound index with a_boolean_key as the first index, and a datetime field as the second. The boolean key is rarely updated on the database (~once/day), but the datetime field is frequently updated.
Maybe these updates are causing the duplicate return values?