- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 2.6.5, 2.7.8
- Environment: Ubuntu 14.10, MongoDB packages from 10gen, PyMongo 2.7.1
- Sharding
- Linux
When outputting from a map reduce job into a sharded output collection which features a hashed index on the _id field, no output is produced. The _id field is also the sharding key, so this issue
Extensive testing shows that this happens only for the first map-reduce job that is ever run on a MongoDB cluster. That job fails to produce output and, in the process, the name of the output collection appears to become 'cursed' somehow: any subsequent map-reduce job fails if that same output collection name is used, even if the collection is re-created, the entire database is dropped and re-created, or a different database is used. The name of the output collection can never be used again. Only when the output goes into a collection with a different name does the exact same map-reduce job, processing the exact same data, succeed.
The problem emerges on sharded clusters only, and only when the output collection uses a hashed index.
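For reference, the setup described above can be written down as the two command documents involved. This is only a sketch under assumptions: the namespace `testdb.mr_out`, the source collection `events`, and the JavaScript bodies are hypothetical, and the `replace` output action is a guess, since the report does not say which action was used. In PyMongo the JavaScript strings would typically be wrapped in `bson.code.Code` and the documents passed to `db.command(...)`.

```python
from collections import OrderedDict


def shard_collection_cmd(namespace):
    """Build a shardCollection admin command using a hashed _id shard key."""
    # The command name must be the first key, hence OrderedDict.
    return OrderedDict([
        ("shardCollection", namespace),
        ("key", {"_id": "hashed"}),
    ])


def map_reduce_cmd(source, map_js, reduce_js, out_collection):
    """Build a mapReduce command that writes to a sharded output collection."""
    return OrderedDict([
        ("mapReduce", source),
        ("map", map_js),
        ("reduce", reduce_js),
        # "replace" is an assumption; "merge" or "reduce" are the other actions.
        ("out", OrderedDict([("replace", out_collection), ("sharded", True)])),
    ])


# Hypothetical names, for illustration only.
shard_cmd = shard_collection_cmd("testdb.mr_out")
mr_cmd = map_reduce_cmd(
    "events",
    "function () { emit(this._id, 1); }",
    "function (key, values) { return Array.sum(values); }",
    "mr_out",
)
```
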
It is possible to work around this problem by running a dummy map-reduce job on newly set up MongoDB clusters, using an output collection that will never be used in regular operations.
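A minimal sketch of such a warm-up step follows. It assumes `db` is any object exposing a `command(document)` method, as PyMongo's `Database` does. The `mr_warmup_` prefix, the helper names, the trivial JavaScript bodies, and the choice to make the dummy output sharded are all hypothetical. The report does not state which properties the dummy job needs in order to absorb the failure, so treat this as a starting point, not a verified fix.

```python
import uuid
from collections import OrderedDict

# Trivial map and reduce bodies (hypothetical); any valid functions should do.
WARMUP_MAP = "function () { emit(this._id, 1); }"
WARMUP_REDUCE = "function (key, values) { return Array.sum(values); }"


def warmup_output_name(prefix="mr_warmup_"):
    """Return a throwaway collection name that regular jobs will never use."""
    return prefix + uuid.uuid4().hex


def run_warmup(db, source_collection):
    """Run one dummy map-reduce job so the first-run failure hits a throwaway name."""
    out_name = warmup_output_name()
    cmd = OrderedDict([
        ("mapReduce", source_collection),
        ("map", WARMUP_MAP),
        ("reduce", WARMUP_REDUCE),
        # Assumption: the dummy job uses the same sharded output mode as real jobs.
        ("out", OrderedDict([("replace", out_name), ("sharded", True)])),
    ])
    db.command(cmd)
    return out_name
```

With PyMongo this might be called once, right after cluster setup, as `run_warmup(client["testdb"], "events")`, where both names are placeholders.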
- is duplicated by
  - SERVER-14324 MapReduce does not respect existing shard key on output:sharded (Closed)
- related to
  - SERVER-43467 Complete TODO listed in SERVER-16605 (Closed)