-
Type: Bug
-
Resolution: Won't Do
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Catalog and Routing
-
ALL
-
CAR Team 2024-09-30, CAR Team 2024-10-14, CAR Team 2024-10-28
While attempting to use acquisitions in SERVER-94825, we discovered that secondaries may mistakenly overwrite the fast count.
The interleaving for this to happen is as follows:
- Oplog replication applies a batch containing an insert on collA, then pauses before publishing the new lastApplied timestamp.
- Validate acquires a MODE_X lock on collA and preallocates a snapshot at lastApplied. At this point, lastApplied still holds the old value, even if there are no oplog holes.
- Oplog replication publishes the new lastApplied.
- Validate finishes and overwrites the fast count with stale values.
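The interleaving can be replayed deterministically with a toy model. This is only a sketch, not server code: `ToySecondary`, `apply_insert`, `publish_last_applied`, `count_at`, and `replay` are hypothetical names, and the fast count is modelled as a plain integer.

```python
class ToySecondary:
    """Hypothetical stand-in for a secondary; not MongoDB code."""

    def __init__(self):
        self.last_applied = 0   # last *published* lastApplied timestamp
        self.docs = {0: 0}      # timestamp -> document count visible at that timestamp
        self.fast_count = 0     # modelled fast count for collA

    def apply_insert(self, ts):
        # Apply an insert at `ts` without publishing lastApplied yet.
        self.docs[ts] = self.docs[max(self.docs)] + 1
        self.fast_count += 1

    def publish_last_applied(self, ts):
        self.last_applied = ts

    def count_at(self, ts):
        # Number of documents visible to a snapshot taken at `ts`.
        visible = max(t for t in self.docs if t <= ts)
        return self.docs[visible]


def replay():
    node = ToySecondary()
    node.apply_insert(ts=1)            # 1. batch applied, lastApplied not yet published
    snapshot_ts = node.last_applied    # 2. validate preallocates a snapshot at the old lastApplied
    node.publish_last_applied(1)       # 3. replication publishes the new lastApplied
    node.fast_count = node.count_at(snapshot_ts)  # 4. validate overwrites the fast count
    return node.fast_count, node.count_at(node.last_applied)
```

In this model `replay()` returns a fast count of 0 while 1 document is visible at the published lastApplied, which is the stale overwrite described in the steps above.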
The key point is that acquisitions that modify the collection in some form specify the operation type as kWrite. If an operation specifies that type, we should never change the read source, because writes should always happen at the latest available snapshot.
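A minimal sketch of that rule, under stated assumptions: only kWrite comes from this ticket; `OperationType`, `ReadSource`, and `choose_read_source` are hypothetical names for illustration, and the read-side branch (secondaries reading at lastApplied) is an assumed simplification rather than the actual acquisition logic.

```python
from enum import Enum


class OperationType(Enum):
    READ = "kRead"    # assumed counterpart to kWrite, for illustration
    WRITE = "kWrite"  # operation type named in this ticket


class ReadSource(Enum):
    LATEST = "latest"              # hypothetical: read at the latest available snapshot
    LAST_APPLIED = "lastApplied"   # hypothetical: read at the published lastApplied


def choose_read_source(op_type, current, is_secondary):
    """Return the read source an acquisition should use (illustrative only)."""
    if op_type is OperationType.WRITE:
        # Writes must happen at the latest snapshot, so leave the read source untouched.
        return current
    if is_secondary:
        # Assumed read-side behaviour: pin reads on a secondary to lastApplied.
        return ReadSource.LAST_APPLIED
    return current
```

With this guard, a validate-style acquisition declared as a write on a secondary keeps its current read source instead of being pinned to a possibly outdated lastApplied.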
- fixes
-
SERVER-94825 Uncommitted fast count updates leak out to other operations
- Closed
- is depended on by
-
SERVER-87119 Introduce the collection acquisition logic from the shard-role api into the validate command
- Backlog