- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- ALL
- v4.4, v4.2, v4.0, v3.6
- Repl 2020-06-29, Repl 2020-07-13, Repl 2020-08-10
- 96
The problems below occur only when enableMajorityReadConcern is false (EMRC=false).
Problem 1: A find command on a secondary can return rolled-back records.
Assume the rollback node is node A and the current primary is node B. The common point is TS(10), and node B's lastApplied opTime is TS(13).
1. Assume the following oplog entries on node A will be rolled back:
- Insert {_id:1} to collection foo at TS(11).
- Delete {_id:1} from collection foo at TS(15).
Since {_id:1} is not present on node B, the rollback via refetch algorithm effectively does nothing for the {_id:1} document on node A.
2. Before rollback, node A's lastApplied opTime is TS(15). When the rollback completes, node A's lastApplied opTime is set to TS(10), the common point.
3. After transitioning from the rollback state to secondary, node A catches up to the primary, so its lastApplied opTime advances to TS(13).
Now, if a user runs a find command on collection 'foo' on node A (secondary), the read source of that find command is set to kNoOverlap (the default read source for secondaries). The readTimestamp for that find command is therefore TS(13) = min(lastApplied opTime, allDurable). At TS(13) the insert at TS(11) is visible but the delete at TS(15) is not, so the find command can return the rolled-back record {_id:1}.
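The visibility problem in Problem 1 can be sketched with a toy multi-version store. This is only an illustration, not the server's storage engine: `VersionedStore`, its methods, and the value `all_durable = 13` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedStore:
    """Toy multi-version store: each key keeps a (timestamp, value) history."""
    history: dict = field(default_factory=dict)

    def write(self, key, ts, value):
        # value=None marks a delete (tombstone).
        self.history.setdefault(key, []).append((ts, value))

    def read(self, key, read_ts):
        # Return the newest version whose timestamp is <= read_ts.
        visible = [(ts, v) for ts, v in self.history.get(key, []) if ts <= read_ts]
        if not visible:
            return None
        return max(visible, key=lambda pair: pair[0])[1]


node_a = VersionedStore()
node_a.write(1, ts=11, value={"_id": 1})  # insert that is later rolled back
node_a.write(1, ts=15, value=None)        # delete that is later rolled back

last_applied = 13  # node A after catching up to node B
all_durable = 13   # assumed value for this sketch
read_ts = min(last_applied, all_durable)

stale = node_a.read(1, read_ts)  # sees the TS(11) insert, not the TS(15) delete
latest = node_a.read(1, 15)      # sees the delete
print(read_ts, stale, latest)
```

Reading at TS(13) returns the document, while reading at TS(15) would not, which matches the scenario described above.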
Problem 2: A find command on a secondary can also return records with a duplicated _id.
Assume the rollback node is node A and the current primary is node B. The common point is TS(10), and node B's lastApplied opTime is TS(13).
1. Let's say that at TS(5) the user inserted a document {_id:1} into collection foo with RecordId=1.
2. Now, assume the following oplog entry on node A will be rolled back:
- Delete {_id:1} from collection foo at TS(15).
To roll back the delete operation, the rollback via refetch algorithm refetches the {_id:1} document from node B and inserts it into node A with RecordId=2. This insert is a non-timestamped write (equivalent to TS(0)).
3. Before rollback, node A's lastApplied opTime is TS(15). When the rollback completes, node A's lastApplied opTime is set to TS(10), the common point.
4. After transitioning from the rollback state to secondary, node A catches up to the primary, so its lastApplied opTime advances to TS(13).
Now, if a user runs a find command on collection 'foo' on node A (secondary), the read source will be kNoOverlap and the read timestamp will be TS(13) = min(lastApplied opTime, allDurable). Because the delete at TS(15) is not visible at TS(13), the find command can return two {_id:1} documents: RecordId=1 (written at TS(5)) and RecordId=2 (the non-timestamped write made during rollback).
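A minimal sketch of the Problem 2 scenario, assuming a toy record model in which each record carries an insert timestamp and an optional delete timestamp. The function `visible_records` and the tuple layout are hypothetical and exist only for this illustration.

```python
def visible_records(records, read_ts):
    """Return (record_id, doc) pairs visible at read_ts in a toy model.

    Each record is (record_id, doc, insert_ts, delete_ts); delete_ts may be None.
    """
    out = []
    for record_id, doc, insert_ts, delete_ts in records:
        inserted = insert_ts <= read_ts
        deleted = delete_ts is not None and delete_ts <= read_ts
        if inserted and not deleted:
            out.append((record_id, doc))
    return out


records = [
    (1, {"_id": 1}, 5, 15),   # original insert at TS(5), delete at TS(15)
    (2, {"_id": 1}, 0, None), # refetched copy, non-timestamped (treated as TS(0))
]

at_13 = visible_records(records, 13)  # both RecordIds are visible
at_15 = visible_records(records, 15)  # only the refetched copy remains
print(at_13)
print(at_15)
```

At TS(13) the sketch returns both RecordId=1 and RecordId=2 for the same _id, while at TS(15) only RecordId=2 is visible.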
I have provided repro steps that demonstrate both issues. Note that both issues cause a mismatch between the cursor count (itcount) and the fast count, which was the symptom seen in the build failures.
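The count mismatch can be illustrated with a rough model. It assumes, for the sake of the sketch, that the fast count behaves like a running counter over all writes regardless of read timestamp, while the cursor count only sees writes at or before the read timestamp. The helper names and the event lists are hypothetical.

```python
def fast_count(events):
    """Assumed model: running counter over every write, ignoring timestamps."""
    return sum(delta for _, delta in events)


def cursor_count(events, read_ts):
    """Assumed model: count only writes visible at read_ts."""
    return sum(delta for ts, delta in events if ts <= read_ts)


# Each event is (timestamp, change in number of documents).
problem_1 = [(11, +1), (15, -1)]          # insert, then delete
problem_2 = [(5, +1), (15, -1), (0, +1)]  # insert, delete, refetched insert at TS(0)

read_ts = 13
for name, events in (("problem 1", problem_1), ("problem 2", problem_2)):
    print(name, "fast:", fast_count(events), "cursor:", cursor_count(events, read_ts))
```

Under these assumptions the two counts disagree in both scenarios when reading at TS(13).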
Root cause:
The main problem with EMRC=false is that it is not safe to read on a secondary at a timestamp earlier than the timestamp of the local node's top of oplog prior to entering rollback.
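The condition in the root cause can be written as a simple predicate. This is a sketch of the rule as stated, not the fix that was implemented; the function name and parameters are hypothetical.

```python
def is_safe_secondary_read(read_ts: int, pre_rollback_top_of_oplog: int) -> bool:
    """Hypothetical check: a read is safe only at or past the node's
    top-of-oplog timestamp from before it entered rollback."""
    return read_ts >= pre_rollback_top_of_oplog


pre_rollback_top = 15  # node A's lastApplied opTime before rollback

unsafe = is_safe_secondary_read(13, pre_rollback_top)  # the read in both problems
safe = is_safe_secondary_read(15, pre_rollback_top)
print(unsafe, safe)
```

In both problems the read happens at TS(13), which is below the pre-rollback top of oplog at TS(15), so the predicate flags it as unsafe.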
causes
- SERVER-50183 Copy _awaitPrimaryAppliedSurpassesRollbackApplied function from RollbackTest to RollbackTestDeluxe (Closed)
depends on
- SERVER-47844 Update _setStableTimestampForStorage to set the stable timestamp without using the stable optime candidates set when EMRC=true (Closed)
is related to
- SERVER-48603 Rollback via refetch can result in out of order timestamps (Closed)
related to
- SERVER-46721 Step up may cause reads at PIT with holes after yielding (Closed)
- SERVER-38341 Remove Parallel Batch Writer Mutex (Closed)
- SERVER-38925 Rollback via refetch can cause _id duplication when enableMajorityReadConcern:false (Closed)
- SERVER-47866 Secondary readers do not need to reacquire PBWM lock if there are catalog conflicts (Closed)