The auto-merger currently runs on a secondary thread, concurrently with the balancer thread. Its behavior can be summarized as follows:
- (1) while the balancer is enabled:
  - (2) while there are <collection, shard> with mergeable chunks (mergeability requirements documented in DOCS-15976):
    - (3) for each <collection, shard> discovered by (2):
      - (4) squash together mergeable chunks
      - (5) sleep for 15 seconds
  - (6) sleep for 1 hour
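The loop above can be sketched roughly as follows. This is only an illustration, not the server's implementation: the function and parameter names (`auto_merger_loop`, `find_mergeable`, `merge_chunks`, and so on) are hypothetical, and the nesting assumes that steps (4) and (5) run once per <collection, shard> pair while step (6) runs once no mergeable chunks remain.

```python
from typing import Callable, List, Tuple

THROTTLE_SECS = 15        # step (5): pause after each merge request
INTERVAL_SECS = 60 * 60   # step (6): pause once nothing is left to merge

Target = Tuple[str, str]  # hypothetical (collection, shard) pair


def auto_merger_loop(
    balancer_enabled: Callable[[], bool],
    find_mergeable: Callable[[], List[Target]],
    merge_chunks: Callable[[str, str], None],
    sleep: Callable[[float], None],
) -> None:
    """Hypothetical sketch of steps (1)-(6); all callbacks are injected."""
    while balancer_enabled():                      # (1)
        targets = find_mergeable()
        while targets:                             # (2)
            for collection, shard in targets:      # (3)
                merge_chunks(collection, shard)    # (4)
                sleep(THROTTLE_SECS)               # (5)
            targets = find_mergeable()             # re-check (2)
        sleep(INTERVAL_SECS)                       # (6)
```

Injecting `sleep` and the discovery callback keeps the sketch testable without real timers or a cluster.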
As part of a balancing round, the balancer splits chunks according to the configured zones so that they can then be moved off. Since a split does not imply an ownership change, two or more split chunks are always mergeable as long as they have resided on the same shard for at least the history window (defined in DOCS-15976).
The conflict between the balancer splitting chunks for zoning and the auto-merger squashing together mergeable chunks was considered acceptable based on the following reasoning:
- The auto-merger may merge chunks that belong to different zones but currently reside on the same shard
- However, the auto-merger then "goes to sleep" for 1 hour
- This leaves enough time for the balancer to split again and keep moving data (avoiding future merges)
It turns out that, because splits are extremely slow when there are several hundred zones, there is a perfect interleaving that leads to the following continuous conflict between the balancer and the auto-merger:
- (A) The balancer starts splitting chunks
- (B) The auto-merger discovers mergeable chunks due to (2)
- (C) Due to (4), the auto-merger squashes together chunks that were just split because of (A)
- (D) The auto-merger sleeps for 15 seconds due to (5) while (A) is still running, then discovers new mergeable chunks due to (2)
- (E) The balancer finishes (A), but some of the split chunks have already been merged back
- Back to (A), repeat
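The interleaving (A)-(E) can be reproduced with a small toy model. This is a sketch under strong simplifying assumptions, not a measurement: the function `simulate_split_pass` and its parameters are hypothetical, all chunks live on one shard, every split chunk is treated as immediately mergeable, and each split takes a fixed number of seconds.

```python
from typing import Tuple


def simulate_split_pass(
    zone_boundaries: int,
    split_secs: int,
    merge_every_secs: int = 15,
) -> Tuple[int, int]:
    """Toy model of one balancer split pass racing the auto-merger.

    Returns (splits still present when the pass ends, splits merged back).
    """
    split_points = set()
    merged_back = 0
    next_boundary = 0
    end = zone_boundaries * split_secs
    for now in range(1, end + 1):
        if now % split_secs == 0:
            # (A) the balancer completes one more split
            split_points.add(next_boundary)
            next_boundary += 1
        if now % merge_every_secs == 0:
            # (B)-(D) the auto-merger wakes up and merges everything it finds
            merged_back += len(split_points)
            split_points.clear()
    # (E) the pass is over; only the most recent splits survive
    return len(split_points), merged_back
```

With 100 zone boundaries and 2 seconds per split, `simulate_split_pass(100, 2)` returns `(3, 97)`: only the splits made after the last merger wake-up survive. If the merger slept for the whole pass instead (`merge_every_secs=3600`), all 100 splits would survive.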

Is caused by: SERVER-74872 Auto-merger must keep on issuing requests as long as there are mergeable chunks (Closed)