If a chunk containing more than 250000 documents, but totalling less than half the configured chunk size, gets selected for a moveChunk operation, the balancer will mark the chunk as jumbo regardless of the shard key range available or the chunk's actual total data size.
The issue can occur repeatedly in the same collection and could end up marking all chunks as jumbo. The resulting data distribution is highly uneven.
EDIT: I don't think it can mark every chunk as jumbo.
The issue is most likely to occur wherever large numbers of small documents are in use. The requirements are:
- Documents that average less than 268 bytes in size (explained below), or less than 134 bytes for 3.0 and earlier.
- A shard key with a wide range of values.
- More than one mongos; the more there are, the more likely the issue is to occur.
- Inserts to the same sharded collection arriving through multiple mongos (other operations don't matter).
- Balancer enabled.
The average document size threshold can be calculated for 3.2 or later as:
chunkSizeMB * 1024 * 1024 / 250000
For 3.0 and earlier the average document size threshold formula is:
chunkSizeMB * 1024 * 1024 / 2 / 250000
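For example, with the default 64 MB chunk size these work out to the figures quoted above:

64 * 1024 * 1024 / 250000 ≈ 268 bytes (3.2 or later)
64 * 1024 * 1024 / 2 / 250000 ≈ 134 bytes (3.0 and earlier)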
Note that raising chunkSizeMB also raises the average document size threshold below which the issue might manifest.
In these conditions, whenever the balancer issues a moveChunk to bring the cluster into balance (for example, after some other split, or after a new shard is added for capacity), there is a chance of selecting a chunk whose document count exceeds 250000, because the separate mongos processes did not collectively auto-split it before that limit was reached. 250000 is the hard-coded document count limit above which a move is rejected as "chunkTooBig". The balancer responds to such a failure by issuing a split request. The split returns a single split point (because the chunk is still under half the chunk size), at which point SERVER-14052 kicks in and declares the split failed too. The chunk is then marked as jumbo despite being tiny in size and potentially covering a huge shard key range.
Note that in the repro steps, setting numInitialChunks to 1 is not required, although doing so makes the issue occur much sooner because of the initial imbalance.
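A minimal sketch of that kind of repro from the mongo shell, with hypothetical names (the test.tiny collection and the k field); the insert loop would be run concurrently against two or more mongos processes:

// One-time setup through any mongos: a wide-range shard key over tiny documents.
sh.enableSharding("test")
sh.shardCollection("test.tiny", { k: 1 })

// Run concurrently through each mongos: insert large numbers of documents
// whose average size is well under the ~268 byte threshold.
var coll = db.getSiblingDB("test").tiny;
for (var batch = 0; batch < 50; batch++) {
    var docs = [];
    for (var i = 0; i < 10000; i++) {
        docs.push({ k: Math.floor(Math.random() * 1000000000) });
    }
    coll.insert(docs);
}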
Workaround
Users can lower the chunksize setting and then clear the jumbo flag on all affected chunks so that they are re-evaluated for splitting.
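A sketch of the workaround from a mongo shell connected to a mongos (the test.tiny namespace is hypothetical; stop the balancer and back up the config database before editing config.chunks by hand):

// Lower the cluster-wide chunk size (value in MB).
var configDB = db.getSiblingDB("config");
configDB.settings.save({ _id: "chunksize", value: 32 });

// Clear the jumbo flag on the affected chunks so the balancer
// re-evaluates them for splitting.
configDB.chunks.update(
    { ns: "test.tiny", jumbo: true },
    { $unset: { jumbo: "" } },
    { multi: true }
);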
- is duplicated by:
  - SERVER-20140 shard balancer fails to split chunks with more than 250000 docs (Closed)
  - SERVER-24270 "chunk too big to move" and "chunk not full enough to trigger auto-split" (Closed)
- is related to:
  - SERVER-11701 Allow user to force moveChunk of chunks larger than the 25000 byte limit (Closed)
- related to:
  - SERVER-30572 Support configurable 'jumbo' chunk threshold (Closed)