- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 6.0.0-rc1, 5.3.0, 5.2.0, 5.1.0, 4.2.16, 4.0.27, 5.0.3, 4.4.9
- Component/s: None
- Replication
- Fully Compatible
- ALL
- v7.0, v6.0, v5.0
- Repl 2022-08-08, Repl 2022-08-22, Repl 2022-09-05, Repl 2023-07-24
- 152
On secondaries, inserts and updates to config.image_collection happen inside the transactions that perform the data write, but deletes are replicated explicitly. Thus, if a batch contains both a data write that upserts an image_collection document and a delete of that document, different threads can apply them out of order.
This can manifest in two ways:
- If this is a new image_collection document, that document will be leaked on secondaries that do the writes out of order.
- If this is an update to an existing image_collection document, an out-of-order write assertion will crash the process. No data is corrupted, and restarting the secondary will eventually succeed.
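The leak case can be illustrated with a toy model. This is a minimal sketch, not server code: a plain dict stands in for config.image_collection, documents are assumed to be keyed by a session id, and `apply_ops`, the op kinds, and `lsid-1` are hypothetical names chosen for illustration. The crash case (update to an existing document) is not modeled.

```python
# Toy model only: a dict stands in for config.image_collection on a secondary.
# apply_ops and the "upsert"/"delete" op kinds are hypothetical, not server APIs.

def apply_ops(ops, initial=None):
    """Apply (kind, session_id, payload) operations in the given order."""
    coll = dict(initial or {})
    for kind, session_id, payload in ops:
        if kind == "upsert":
            # Implicit write performed as part of the data write.
            coll[session_id] = payload
        elif kind == "delete":
            # Explicitly replicated delete; a no-op if the document is absent.
            coll.pop(session_id, None)
        else:
            raise ValueError(f"unknown op kind: {kind}")
    return coll

# Oplog order within one batch: the upsert first, then the delete.
batch = [
    ("upsert", "lsid-1", {"image": "image for the data write"}),
    ("delete", "lsid-1", None),
]

in_order = apply_ops(batch)                      # document is removed
out_of_order = apply_ops(list(reversed(batch)))  # document is left behind
```

Applied in oplog order, the collection ends up empty. Applied in reverse, the delete finds nothing and the later upsert leaves a document that no subsequent operation removes, which is the leak described above.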
Also note that tripping this bug requires an unlikely ingredient. Logical sessions on primaries are reaped after a configured amount of inactivity (minutes). This bug requires either:
- A secondary applying a batch that spans an entire reaping window.
- A client that stops using a logical session for long enough for it to be reaped, but then uses the LSID again right as it's being reaped.
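Both conditions amount to getting the session write and the reaping delete into the same applier batch. The timeline below is a rough sketch under stated assumptions: the 30-minute timeout, the batch boundaries, and the helper `in_same_batch` are illustrative values and names, not taken from this ticket.

```python
from datetime import datetime, timedelta

# Assumption: 30 minutes is used purely as an example inactivity timeout.
TIMEOUT = timedelta(minutes=30)

def in_same_batch(op_times, batch_start, batch_end):
    """True when every operation timestamp falls inside one batch window."""
    return all(batch_start <= t <= batch_end for t in op_times)

t0 = datetime(2022, 8, 1, 12, 0, 0)

# Case 1: the session is used once at t0 and reaped one timeout later.
upsert_at = t0
delete_at = t0 + TIMEOUT
# A short batch cannot contain both operations...
case1_short = in_same_batch([upsert_at, delete_at], t0, t0 + timedelta(seconds=1))
# ...but a batch spanning the whole reaping window can.
case1_long = in_same_batch([upsert_at, delete_at], t0, t0 + TIMEOUT)

# Case 2: an idle session is reaped, and the client reuses the LSID 100 ms later.
reap_at = t0 + TIMEOUT
reuse_at = reap_at + timedelta(milliseconds=100)
case2 = in_same_batch([reap_at, reuse_at], reap_at, reap_at + timedelta(seconds=1))
```

In this model only the long batch in case 1 and the near-simultaneous reuse in case 2 put both operations in one batch, which is why the bug is unlikely in practice.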
- is related to
  - SERVER-69497 Have internal_sessions_reaping_basic.js oplog application use batches of size 1 (Closed)
- related to
  - SERVER-80791 Potential data consistency issue with implicitly replicated collections (Closed)
  - SERVER-81423 Prevent the fuzzer from generating writes to config.image_collection / ban user writes to config.image_collection (Closed)