We have seen a bug in WiredTiger where a checkpoint turns out to be corrupted after restarting from it: MongoDB validation fails with a key out-of-order error. Inspecting the corrupted database shows two internal pages referencing overlapping leaf pages.
The error message is:
{"t":{"$date":"2022-05-20T16:42:05.149+00:00"},"s":"I", "c":"COMMAND", "id":20514, "ctx":"conn2","msg":"CMD: validate","attr":{"namespace":"local.oplog.rs","background":false,"full":false,"enforceFastCount":false,"repair":false}}
{"t":{"$date":"2022-05-20T16:42:05.149+00:00"},"s":"I", "c":"INDEX", "id":20303, "ctx":"conn2","msg":"validating collection","attr":{"namespace":"local.oplog.rs","uuid":{"uuid":{"$uuid":"957ce2cb-8d6b-4f40-b637-1a3f2f3d5ec6"}}}}
{"t":{"$date":"2022-05-20T16:42:05.449+00:00"},"s":"E", "c":"STORAGE", "id":22406, "ctx":"conn2","msg":"WTCursor::next -- next was not greater than last which is a bug","attr":{"next":"7099859378722836930","last":"7099859378722837016"}}
{"t":{"$date":"2022-05-20T16:42:05.449+00:00"},"s":"F", "c":"ASSERT", "id":23081, "ctx":"conn2","msg":"Invariant failure","attr":{"expr":"!TestingProctor::instance().isEnabled()","msg":"next was not greater than last","file":"src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp","line":2231}}
{"t":{"$date":"2022-05-20T16:42:05.449+00:00"},"s":"F", "c":"ASSERT", "id":23082, "ctx":"conn2","msg":"\n\n***aborting after invariant() failure\n\n"}
{"t":{"$date":"2022-05-20T16:42:05.449+00:00"},"s":"F", "c":"CONTROL", "id":4757800, "ctx":"conn2","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
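To make the failing condition concrete, here is a minimal Python sketch of the kind of ordering check that the "next was not greater than last" message describes. It is not the MongoDB or WiredTiger code. The function name `first_out_of_order` and the two leaf-page lists are hypothetical, and the sketch assumes that "overlapping" means the leaf pages' key ranges overlap. Only the two keys reported in the log (`next` and `last`) are real values; the other keys are made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def first_out_of_order(keys: Iterable[int]) -> Optional[Tuple[int, int]]:
    """Return (last, next) for the first pair where next <= last, else None."""
    last = None
    for key in keys:
        if last is not None and key <= last:
            return (last, key)
        last = key
    return None


# Hypothetical leaf pages. Each page is sorted internally, but the second
# page's key range starts below the end of the first page's range.
# Only 7099859378722837016 and 7099859378722836930 come from the log above.
leaf_a = [7099859378722837014, 7099859378722837015, 7099859378722837016]
leaf_b = [7099859378722836930, 7099859378722836931]

# A forward scan visits leaf_a and then leaf_b, so the first key of leaf_b
# is compared against the last key of leaf_a.
violation = first_out_of_order(leaf_a + leaf_b)
if violation is not None:
    last, nxt = violation
    print(f"next {nxt} was not greater than last {last}")
```

Under that assumption, each page can look valid on its own while a forward scan across the page boundary still trips the check.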
Here's an example of the tree structure seen in the checkpoint: