- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 4.1.6
- Component/s: Replication, Storage
- None
- Fully Compatible
- Repl 2019-02-11, Repl 2019-02-25, Repl 2019-03-11, Repl 2019-03-25
- 66
When enableMajorityReadConcern is false, we disable journaling of replicated tables and rely on stable checkpoints and the oplog for crash recovery. In this configuration, stable checkpoints are not guaranteed to be behind the majority commit point, so we still use the rollbackViaRefetch algorithm. As a result, it is possible to create a stable checkpoint whose collection data contains two documents with the same _id. Consider the following sequence. Assume all operations are done on the same collection, and that no checkpoints are taken other than the one explicitly forced after rollback:
- Insert a document {_id:1} at timestamp T=1, with RecordId=1. Let this write majority commit.
- Delete document {_id:1} at timestamp T=2 and assume this write doesn't majority commit.
- Enter rollback, where the delete operation at T=2 is the only op necessary to roll back.
- To undo the delete operation, rollback refetches the document and inserts a new document {_id:1} with RecordId=2.
- Complete rollback and set the stable timestamp to T=1, the rollback common point.
- Force a new stable checkpoint to be taken. This checkpoint, call it C1, is taken at timestamp T=1.
- Shut down uncleanly.
- Start up and recover from the most recent stable checkpoint, C1.
Because the checkpoint was taken at T=1, it does not include the delete at T=2, so the storage engine includes the document at RecordId(1) in the checkpoint. This document duplicates the document at RecordId(2), so after we crash and recover from this checkpoint, the collection contains two documents with the same _id.
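The sequence above can be reproduced with a small toy model. This is only a sketch, not server or WiredTiger code: ToyRecordStore, Version, and checkpoint() are hypothetical names, and the model assumes that the write rollback performs to re-insert the refetched document carries no timestamp and is therefore visible in a checkpoint at any stable timestamp.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Version:
    doc: Optional[dict]  # None models a tombstone (delete)
    ts: Optional[int]    # None models an untimestamped write (assumption)


class ToyRecordStore:
    """Hypothetical stand-in for a timestamped record store."""

    def __init__(self) -> None:
        self.history: Dict[int, List[Version]] = {}
        self.next_record_id = 1

    def insert(self, doc: dict, ts: Optional[int] = None) -> int:
        rid = self.next_record_id
        self.next_record_id += 1
        self.history[rid] = [Version(dict(doc), ts)]
        return rid

    def delete(self, rid: int, ts: Optional[int] = None) -> None:
        self.history[rid].append(Version(None, ts))

    def checkpoint(self, stable_ts: int) -> Dict[int, dict]:
        """Return the data a stable checkpoint at stable_ts would contain."""
        snapshot = {}
        for rid, versions in self.history.items():
            # Timestamped writes newer than stable_ts are excluded;
            # untimestamped writes are always included (assumption).
            visible = [v for v in versions if v.ts is None or v.ts <= stable_ts]
            if visible and visible[-1].doc is not None:
                snapshot[rid] = visible[-1].doc
        return snapshot


store = ToyRecordStore()
rid1 = store.insert({"_id": 1}, ts=1)   # majority-committed insert at T=1
store.delete(rid1, ts=2)                # delete at T=2, not majority committed
rid2 = store.insert({"_id": 1})         # rollback re-inserts the refetched doc
c1 = store.checkpoint(stable_ts=1)      # forced stable checkpoint C1 at T=1

# Recovering from C1 yields both RecordId(1) and RecordId(2) with _id 1.
ids_in_c1 = [doc["_id"] for doc in c1.values()]
print(sorted(c1.keys()), ids_in_c1)
```

In this model the duplicate appears because the checkpoint filters the T=2 delete but not the re-insert, which mirrors the explanation above.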
is related to:
- SERVER-42366 When EMRC=false we may set the stable timestamp ahead during rollback after forcing it back to the common point (Closed)
- SERVER-43356 May fail to recover after a rollbackViaRefetch if sync source no longer has required opTime (Closed)
- SERVER-45010 Clean shutdown after rollbackViaRefetch with eMRC=false can cause us to incorrectly overwrite unstable checkpoints (Closed)
- SERVER-47219 Correct downgrade_after_rollback_via_refetch to not binary downgrade on crash (Closed)
- SERVER-48082 WT clean shutdown should do a quick exit before shouldDowngrade() check if the node is still not safe to take stable checkpoints. (Closed)
- SERVER-48518 Rollback via refetch (EMRC = false) can make readers to see the rolled back data even after the rollback node catches up to primary. (Closed)
- SERVER-45181 Rollback via refetch should set initial data timestamp to max(minvalid, local oplog's top) on rollback success. (Closed)

related to:
- SERVER-37897 Disable table logging for data files when enableMajorityReadConcern=false (Closed)