- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- Fully Compatible
- ALL
- Repl 2020-11-16
After SERVER-51246, the future returned by TenantOplogApplier::getNotificationForOpTime() can resolve with a null recipient timestamp. As a result, the tenant migration recipient can falsely indicate to the donor that the data is majority committed on the recipient replica set. Consider the following case.
1) The tenant oplog applier applies a first batch containing one entry {_id:1} at donor TS(1) for tenant 'foo'. This results in a recipient write at, say, TS(11). So _lastBatchCompletedOpTimes.donorOpTime is TS(1) and _lastBatchCompletedOpTimes.recipientOpTime is TS(11).
2) The tenant oplog applier applies a second batch containing one resume token no-op entry {op: n, ..} at donor TS(2). This results in no recipient writes. So _lastBatchCompletedOpTimes.donorOpTime is TS(2) and _lastBatchCompletedOpTimes.recipientOpTime is TS(0), i.e. null.
3) Now assume the recipient receives a "recipientSyncData" command with returnAfterReachingDonorTimestamp set to TS(2). This calls Instance::waitUntilTimestampIsMajorityCommitted(), which waits on the null recipient timestamp returned in the future result.
As a result, we do not ensure that all data at or before donor TS(2) is majority committed on the recipient replica set.
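The three steps above can be modelled with a small standalone sketch. It does not use the server's real types: OpTimePair, Batch, completeBatchAsReported and completeBatchCarryForward are hypothetical names invented for illustration, and timestamps are plain integers with 0 standing in for a null timestamp. The carry-forward variant is only one possible remedy, shown as an assumption; the ticket does not describe the actual fix.

```cpp
#include <cstdint>

// Hypothetical stand-in for the (donorOpTime, recipientOpTime) pair.
// A value of 0 models a null timestamp.
struct OpTimePair {
    std::uint64_t donorTs;
    std::uint64_t recipientTs;
};

// Hypothetical summary of one applied batch.
struct Batch {
    std::uint64_t lastDonorTs;           // donor timestamp of the last entry in the batch
    std::uint64_t lastRecipientWriteTs;  // 0 if the batch produced no recipient writes
};

// Behaviour as described in the ticket: both fields are overwritten,
// so a batch with only no-op entries leaves a null recipient timestamp.
inline OpTimePair completeBatchAsReported(OpTimePair /*last*/, const Batch& b) {
    return OpTimePair{b.lastDonorTs, b.lastRecipientWriteTs};
}

// Assumed remedy (not confirmed by the ticket): keep the previous
// recipient timestamp when the batch wrote nothing on the recipient.
inline OpTimePair completeBatchCarryForward(OpTimePair last, const Batch& b) {
    const std::uint64_t recipientTs =
        b.lastRecipientWriteTs != 0 ? b.lastRecipientWriteTs : last.recipientTs;
    return OpTimePair{b.lastDonorTs, recipientTs};
}
```

Feeding Batch{1, 11} and then Batch{2, 0} through completeBatchAsReported leaves the recipient timestamp at 0, which matches step 2. The carry-forward variant leaves it at 11, so a wait on donor TS(2) would still cover the recipient write at TS(11).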
- related to: SERVER-51246 Write a noop into the oplog buffer after each batch to ensure tenant applier reaches stop timestamp (Closed)