- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Sharding
- Catalog and Routing
- 3
The goal of resharding's lastOpEndingChunkImbalance serverStatus metric (see also SERVER-55430) is to assess whether resharding chose a good data distribution. If the balancer immediately starts incrementally moving data after resharding has repartitioned all of the data, then the data distribution chosen by resharding was very likely suboptimal.
The current lastOpEndingChunkImbalance serverStatus metric is calculated as [nChunks owned by shard with most chunks] - [nChunks owned by shard with fewest chunks]. Because chunk splits will no longer occur automatically as data is inserted into the temporary resharding collection, the number of chunks will be unchanged from the start of the resharding operation, and the calculated difference will always be either 0 or 1. However, as of PM-2323, the balancer will use data size rather than the number of chunks to balance a sharded collection. This means the balancer may still incrementally move data even when the difference in the number of chunks is 0 or 1. We should change the calculation for the lastOpEndingChunkImbalance serverStatus metric to use the same calculation, or a similar one, that the balancer uses to decide whether to incrementally move data between shards.
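To illustrate the gap described above, here is a minimal sketch in Python (not the server's C++ code) comparing the chunk-count difference with a data-size difference. The function names, the shard names, the per-shard byte counts, and the `threshold_bytes` value are all hypothetical and chosen for illustration only; the balancer's actual decision logic and threshold may differ.

```python
from typing import Dict

MIB = 1024 * 1024


def chunk_count_imbalance(chunks_per_shard: Dict[str, int]) -> int:
    """Current metric: most chunks on any shard minus fewest chunks on any shard."""
    counts = list(chunks_per_shard.values())
    return max(counts) - min(counts)


def data_size_imbalance(bytes_per_shard: Dict[str, int]) -> int:
    """Hypothetical alternative: largest minus smallest per-shard data size, in bytes."""
    sizes = list(bytes_per_shard.values())
    return max(sizes) - min(sizes)


def balancer_would_move_data(bytes_per_shard: Dict[str, int], threshold_bytes: int) -> bool:
    """Assumed stand-in for the balancer's check: move data once the size gap
    reaches a threshold. The real balancer's rule is not reproduced here."""
    return data_size_imbalance(bytes_per_shard) >= threshold_bytes


if __name__ == "__main__":
    # Two shards with the same number of chunks but very different data sizes.
    chunks = {"shardA": 2, "shardB": 2}
    sizes = {"shardA": 900 * MIB, "shardB": 100 * MIB}
    threshold = 384 * MIB  # illustrative value only

    print("chunk-count imbalance:", chunk_count_imbalance(chunks))
    print("data-size imbalance (bytes):", data_size_imbalance(sizes))
    print("balancer would move data:", balancer_would_move_data(sizes, threshold))
```

In this example the chunk-count metric reports no imbalance, while the size-based check indicates the balancer would still move data, which is the case the metric is meant to detect.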
- is related to
- SERVER-55430 Record metrics about whether a collection is rebalanced after resharding op finishes
- Closed