During the post-processing phase of a map-reduce run, when a shard pulls the documents for the chunks (of the output collection) that it owns from the other shard(s), those documents are not deleted from the source shard(s). This can leave a large number of orphan documents, which greatly increases the storage size of the output collection.
When documents are migrated across shards during post-processing, they should be removed from the source shard.
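The difference between the current and the expected behavior can be illustrated with a toy in-memory model. This is only a sketch, not MongoDB code: shards are modeled as plain dictionaries keyed by shard-key value, chunk ownership is a predicate, and the names `pull_chunk` and `count_orphans` are hypothetical.

```python
from typing import Callable, Dict, Hashable

# Assumption: a shard is modeled as a dict from shard-key value to document.
Shard = Dict[Hashable, dict]


def pull_chunk(dest: Shard, source: Shard, owns: Callable[[Hashable], bool],
               delete_from_source: bool) -> int:
    """Copy every document in `source` whose key `dest` owns; return the count."""
    moved = [key for key in source if owns(key)]
    for key in moved:
        dest[key] = source[key]
        if delete_from_source:
            # Expected behavior: the source gives up documents it no longer owns.
            del source[key]
    return len(moved)


def count_orphans(shard: Shard, owns: Callable[[Hashable], bool]) -> int:
    """Count documents stored on a shard for chunks that the shard does not own."""
    return sum(1 for key in shard if not owns(key))


def owns_a(key: int) -> bool:
    return key < 50      # hypothetical: shard A owns keys below 50


def owns_b(key: int) -> bool:
    return key >= 50     # hypothetical: shard B owns keys 50 and above


# Reported behavior: documents stay on the source shard and become orphans.
shard_a = {k: {"_id": k} for k in range(0, 100, 2)}
shard_b: Shard = {}
pull_chunk(shard_b, shard_a, owns_b, delete_from_source=False)
orphans_without_delete = count_orphans(shard_a, owns_a)

# Expected behavior: the source shard removes what it handed over.
shard_a = {k: {"_id": k} for k in range(0, 100, 2)}
shard_b = {}
pull_chunk(shard_b, shard_a, owns_b, delete_from_source=True)
orphans_with_delete = count_orphans(shard_a, owns_a)
```

In this model the first run leaves every pulled document behind on shard A as an orphan, while the second run leaves none.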
- is related to: SERVER-14324 MapReduce does not respect existing shard key on output:sharded (Closed)