Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.3.0-rc0, 7.0.5, 6.0.13, 5.0.24
Affects Version/s: 5.0.0, 6.0.0, 7.0.0, 7.1.0
Component/s: Sharding
Labels: sharding-nyc-subteam3
Assigned Teams: Cluster Scalability
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v7.2, v7.0, v6.0, v5.0
Sprint: Cluster Scalability 2023-12-25
Story Points: 2
Confidence Status: None
Work Order: 0
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

The PersistentTaskStore class called by the ReshardingOplogApplier uses WriteConcerns::kMajorityWriteConcernShardingTimeout when no argument is provided. This {w: "majority", wtimeout: 60000} write concern can cause resharding to fail with an operation-fatal error when the write concern is not satisfied within the timeout. Furthermore, the ReshardingOplogApplier does not need its current batch to become majority-committed before it processes the next batch, because a new primary of the recipient shard resumes resharding batch application from wherever its local state left off.

The ReshardingOplogApplier should instead use a weaker write concern, for example {w: 1}.
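The difference between the two write concerns can be illustrated with a toy model. This is a minimal Python sketch, not the server's C++ code: the names WriteConcern, persist_progress, run_resharding, MAJORITY_WITH_TIMEOUT and LOCAL_ACK are hypothetical, and replication delay is simulated by a per-batch "time until majority acknowledgement" number rather than by a real replica set.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class WriteConcern:
    """Toy stand-in for a write concern; not a real driver or server type."""
    w: Union[int, str]
    wtimeout_ms: Optional[int] = None  # None means "no timeout" in this model


# Mirrors the {w: "majority", wtimeout: 60000} default described in the ticket.
MAJORITY_WITH_TIMEOUT = WriteConcern(w="majority", wtimeout_ms=60_000)
# The alternative the ticket suggests as an example.
LOCAL_ACK = WriteConcern(w=1)


class WriteConcernTimeout(Exception):
    """Raised when the simulated majority acknowledgement arrives too late."""


def persist_progress(
    majority_ack_ms: int,
    write_concern: WriteConcern = MAJORITY_WITH_TIMEOUT,
) -> None:
    """Simulate persisting applier progress.

    The local write always succeeds in this model. Only a majority write
    concern waits for replication, so only it can time out.
    """
    timeout = write_concern.wtimeout_ms
    if write_concern.w == "majority" and timeout is not None and majority_ack_ms > timeout:
        raise WriteConcernTimeout(
            f"majority ack took {majority_ack_ms} ms, wtimeout is {timeout} ms"
        )


def run_resharding(
    majority_ack_latencies_ms: List[int],
    write_concern: Optional[WriteConcern] = None,
) -> Tuple[str, int]:
    """Apply one batch per latency entry; return (status, batches applied)."""
    applied = 0
    for latency in majority_ack_latencies_ms:
        try:
            if write_concern is None:
                # No argument provided: the majority default kicks in.
                persist_progress(latency)
            else:
                persist_progress(latency, write_concern)
        except WriteConcernTimeout:
            # In this model a timeout is fatal to the whole operation.
            return ("failed", applied)
        applied += 1
    return ("ok", applied)
```

With three batches whose simulated majority acknowledgement takes 10 ms, 90000 ms and 20 ms, run_resharding([10, 90_000, 20]) returns ("failed", 1) under the default, while run_resharding([10, 90_000, 20], LOCAL_ACK) returns ("ok", 3), because the {w: 1} path never waits on replication.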

is caused by

SERVER-53915 Persist total number of oplog entries applied in ReshardingOplogApplier

Closed

is related to

SERVER-61052 Resharding Donor & Recipient's Coordinator Doc Updates Can Time Out Waiting for Replication on Coordinator Doc, Leading to Fatal Assertion

Closed

Assignee: Wenqin Ye

Reporter: Max Hirschhorn

Participants: Githook User, Max Hirschhorn, Wenqin Ye

Votes: 0

Watchers: 5

Created: Nov 06 2023 08:15:12 PM UTC

Updated: Jan 25 2024 04:28:29 PM UTC

Resolved: Dec 14 2023 08:03:08 PM UTC

Confidence Status Last Update: 08/Dec/23 7:46 PM
