-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Catalog and Routing
-
2
Choosing the correct strategy to initialize chunks to the shards is important to have an even distribution, i.e. all shards have the same amount of chunks.
Today’s createFirstChunks for initial split policies with zones will first create one (or some) chunk per zone and fill the gaps not contained in the zone’s domain. The current implementation distributes the chunks already created starting from the MinKey to the MaxKey, using a round-robin strategy. Because of zones can enforce chunks to be placed into specific shards, we could end up having some shards with more chunks than others.
An example of that situation could be:
Cluster with three shards (Shard0, Shard1, Shard2):
- presplitHashedZones = true
- zoneA for Shard0 with {-10, 0}
- zoneB for Shard0 with {0, MaxKey}
- numInitialChunks = default (a.k.a 2)
Chunk Distribution:
- Shard0: chunkA{MinKey, -10}, chunkB{-10, 0}, chunkC{0, MaxKey},
- Shard1: none
- Shard2: none
I propose improving this algorithm with first distributing all chunks associated with a zone, and then fill the gaps according to the number of chunks per shard.
- related to
-
SERVER-76405 Reduce the initial minimum number of chunks to 1 per shard for empty collections with hashed shard key
- Closed