Currently, chunk splits, whether manual or initiated by the auto-splitter, acquire the collection distributed lock. This is problematic for two reasons:
- Even if only a single shard is imbalanced and the balancer is migrating its chunks, the chunk splitter will be unable to acquire the distributed lock and will repeatedly fail
- With or without migrations in progress, the dist lock acquisition happens only after we have performed the splitVector scan to determine the split points, which means a good portion of these scans could end up being wasted. This is somewhat mitigated by the fact that the dist lock is taken with the default timeout of 5 seconds, but since we retry with a 500 ms back-off, there is still a chance of wasting scans on a sufficiently loaded system (see the sketch after this list)
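Below is a minimal C++ sketch of the current ordering, assuming hypothetical helpers runSplitVectorScan(), tryAcquireCollectionDistLock(), and commitSplit() (stub names for illustration only, not the actual server internals). It shows why the scan's work is thrown away whenever the lock cannot be acquired within the 5-second window:

{code:cpp}
#include <chrono>
#include <optional>
#include <string>
#include <thread>
#include <vector>

struct SplitPoint {
    std::string keyValue;
};
struct DistLockHandle {};

// Hypothetical helpers with stub bodies so the sketch is self-contained;
// the real server uses its own splitVector command and dist lock manager.
std::vector<SplitPoint> runSplitVectorScan(const std::string& /*nss*/) {
    return {SplitPoint{"x: 42"}};  // pretend the scan found one split point
}
std::optional<DistLockHandle> tryAcquireCollectionDistLock(const std::string& /*nss*/) {
    return std::nullopt;  // pretend a migration is holding the lock
}
void commitSplit(const std::string& /*nss*/,
                 const std::vector<SplitPoint>& /*points*/,
                 const DistLockHandle& /*lock*/) {}

// Current (problematic) ordering: the split-point scan runs *before* the
// distributed lock is taken, so if a migration holds the lock for the
// whole 5-second wait, the scan's work is thrown away.
void splitChunkCurrentOrdering(const std::string& nss) {
    // 1. Expensive scan to compute the split points.
    const auto splitPoints = runSplitVectorScan(nss);

    // 2. Only now is the collection dist lock attempted: default timeout
    //    of 5 seconds, retrying with a 500 ms back-off between attempts.
    const auto deadline =
        std::chrono::steady_clock::now() + std::chrono::seconds(5);
    while (std::chrono::steady_clock::now() < deadline) {
        if (const auto lock = tryAcquireCollectionDistLock(nss)) {
            commitSplit(nss, splitPoints, *lock);
            return;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
    }
    // Lock never acquired within the timeout: the splitVector scan was wasted.
}
{code}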
This ticket is to figure out how to remove the dist lock acquisition from splits without causing an impact in the reverse direction, i.e., without splits causing the much more expensive moves to start failing because the chunk being moved got split.
We could achieve this by having the split logic skip size tracking and/or splitting for chunks which are currently being moved, as sketched below.
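A minimal sketch of that idea, assuming a hypothetical ActiveMigrationsTracker (the real server would hook into its existing migration-tracking machinery rather than a new class):

{code:cpp}
#include <mutex>
#include <set>
#include <string>

// Hypothetical in-process tracker of chunks with in-flight migrations; the
// server's actual mechanism would differ in detail.
class ActiveMigrationsTracker {
public:
    bool isChunkBeingMoved(const std::string& chunkId) const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _migratingChunks.count(chunkId) > 0;
    }
    void registerMigration(const std::string& chunkId) {
        std::lock_guard<std::mutex> lk(_mutex);
        _migratingChunks.insert(chunkId);
    }
    void unregisterMigration(const std::string& chunkId) {
        std::lock_guard<std::mutex> lk(_mutex);
        _migratingChunks.erase(chunkId);
    }

private:
    mutable std::mutex _mutex;
    std::set<std::string> _migratingChunks;
};

// Split path without the collection dist lock: a chunk that is currently
// being moved is simply skipped, so a split can never invalidate the much
// more expensive migration that is already in progress.
void maybeSplitChunk(const ActiveMigrationsTracker& tracker,
                     const std::string& chunkId) {
    if (tracker.isChunkBeingMoved(chunkId)) {
        return;  // skip size tracking / splitting for this chunk
    }
    // ... proceed with the splitVector scan and the split commit ...
}
{code}

A chunk skipped this way would presumably just be reconsidered on a later pass of the auto-splitter, once the migration has committed or aborted.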
- depends on: SERVER-56779 Do not use the collection distributed lock for chunk merges (Closed)
- is depended on by: SERVER-57032 Acquire only local distLock for Sharding DDL coordinators (Closed)
- is duplicated by: SERVER-25359 Create collection-specific ResourceMutex map to allow split/merge/move chunk on different collections to proceed in parallel (Closed)