The PersistentTaskStore class used by the ReshardingOplogApplier defaults to WriteConcerns::kMajorityWriteConcernShardingTimeout when no write concern argument is provided. This {w: "majority", wtimeout: 60000} write concern can cause resharding to fail with an operation-fatal error when the write concern is not satisfied within the timeout. Furthermore, the ReshardingOplogApplier does not need its current batch to become majority-committed before it processes the next batch, because a new primary of the recipient shard resumes resharding batch application from wherever its own local state left off.
The ReshardingOplogApplier should instead use a write concern such as {w: 1}.
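
The resume argument above can be illustrated with a small standalone simulation. This is only a sketch and not server code: `NodeState`, `applyNextBatch`, and `runWithFailover` are hypothetical names, and the model assumes that applied entries and the persisted progress marker always replicate together, so a secondary never sees one without the other. Under that assumption, a new primary that is missing batches acknowledged only locally on the old primary still ends up having applied every donor entry exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one recipient-shard member's local state.
struct NodeState {
    std::vector<int> applied;  // donor oplog entries applied locally, in order
    std::size_t progress = 0;  // persisted index of the next entry to apply
};

// Applies one batch and persists progress in the same local write,
// which is what a {w: 1} acknowledgement would cover.
// Returns false when there is nothing left to apply.
bool applyNextBatch(NodeState& node,
                    const std::vector<int>& donorOplog,
                    std::size_t batchSize) {
    if (batchSize == 0 || node.progress >= donorOplog.size()) {
        return false;
    }
    const std::size_t end = std::min(node.progress + batchSize, donorOplog.size());
    for (std::size_t i = node.progress; i < end; ++i) {
        node.applied.push_back(donorOplog[i]);
    }
    node.progress = end;
    // Assumed invariant: data and progress never diverge on a single node.
    assert(node.applied.size() == node.progress);
    return true;
}

// The old primary applies up to `batchesOnOldPrimary` batches, but only the
// first `batchesReplicated` of them reach the secondary before a failover.
// The secondary then steps up and resumes from its own local progress.
std::vector<int> runWithFailover(const std::vector<int>& donorOplog,
                                 std::size_t batchSize,
                                 std::size_t batchesOnOldPrimary,
                                 std::size_t batchesReplicated) {
    NodeState oldPrimary;
    NodeState secondary;
    for (std::size_t b = 0; b < batchesOnOldPrimary; ++b) {
        if (!applyNextBatch(oldPrimary, donorOplog, batchSize)) {
            break;
        }
        if (b < batchesReplicated) {
            secondary = oldPrimary;  // data and progress replicate together
        }
    }
    NodeState newPrimary = secondary;  // failover: unreplicated batches are lost
    while (applyNextBatch(newPrimary, donorOplog, batchSize)) {
    }
    return newPrimary.applied;
}
```

In this model the batches that were never replicated are simply re-applied by the new primary, which is why waiting for majority commit between batches adds latency and a timeout risk without changing the final result.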
- is caused by: SERVER-53915 Persist total number of oplog entries applied in ReshardingOplogApplier (Closed)
- is related to: SERVER-61052 Resharding Donor & Recipient's Coordinator Doc Updates Can Time Out Waiting for Replication on Coordinator Doc, Leading to Fatal Assertion (Closed)