- Type: Improvement
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
Currently, there is no good way to count orphaned documents in a sharded cluster.
The current approach is to run
db.collection.find({ShardKey:{$gte: MinKey, $lte: MaxKey}},{ShardKey:1,_id:0}).itcount()
and compare the result to the sum of the document counts taken on each shard individually.
- The query requires a full index scan and streams the entire result set to the shell. This can take significant time to complete on a large sharded cluster.
- On a busy system the numbers can be off significantly, because inserts and deletes interleave with the count.
- It requires multiple queries against multiple hosts, so the counts are taken at different times and can disagree.
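For illustration, the comparison described above can be sketched as a small helper. This is only a sketch: `estimateOrphans`, `routedCount`, and `perShardCounts` are hypothetical names, not part of any MongoDB API. It assumes the counts were already gathered separately (the `itcount()` result from the query above, and one count per shard), and the sample numbers are made up.

```javascript
// Hypothetical helper, not a MongoDB command: estimates orphaned documents
// from counts that were gathered separately.
//   routedCount    - result of the shard-key-range itcount() shown above
//   perShardCounts - map of shard name -> document count taken on that shard
function estimateOrphans(routedCount, perShardCounts) {
  // Sum the counts taken on each shard individually.
  const shardTotal = Object.values(perShardCounts).reduce((sum, n) => sum + n, 0);
  // Assumption: no writes happened between the counts, so any excess of the
  // per-shard total over the routed count is attributed to orphans.
  return { shardTotal, orphans: shardTotal - routedCount };
}

// Made-up sample numbers, for illustration only.
const result = estimateOrphans(1000, { shard0: 620, shard1: 410 });
console.log(result.orphans); // 30
```

Because the inputs come from separate queries, the estimate inherits every caveat listed above; on a busy cluster it may even come out negative.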
Since the logic to clean up orphans already exists, a similar command (or a parameter to the existing cleanupOrphaned) that counts them would be useful for determining the potential impact of orphans on a given sharded cluster.
- duplicates SERVER-17013 Add 'dry run' mode for cleanupOrphaned (Closed)