It's possible for concurrent, sharded mapReduces to fail with DEAD plan executors when there's a collision in temporary namespaces across multiple mongos processes.
This bug is intermittently triggered by the concurrency suite.
This seems to be the sequence of events:
1 - A mongos process issues a drop command, on all shards, on a tmp.mrs namespace after finishing the mapReduce.shardedfinish command (in cluster_map_reduce_cmd.cpp).
2 - At the same time, another mongos process tries to initialize a ParallelSortClusteredCursor on the very same tmp.mrs namespace as part of another mapReduce.shardedfinish command.
3 - The drop invalidates cursors on the tmp.mrs namespace, which leads to a DEAD plan executor and a failed mapReduce command.
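The sequence above can be sketched as a small, deterministic simulation. This is only an illustrative model, not the server's code: the `Shard` and `Cursor` classes and the `simulate_collision` helper are hypothetical, and the temporary-namespace format (collection name, then a timestamp in seconds, then a job counter) is an assumption inferred from the log line `tmp.mrs.coll1_1440026655_43`. If each mongos process keeps its own counter, two processes can produce the same name within the same second.

```python
from dataclasses import dataclass, field


def temp_namespace(db, coll, epoch_seconds, job_number):
    # Assumed naming scheme, inferred from the log line in this ticket:
    # <db>.tmp.mrs.<coll>_<epoch seconds>_<job counter>
    return f"{db}.tmp.mrs.{coll}_{epoch_seconds}_{job_number}"


@dataclass
class Cursor:
    """Hypothetical stand-in for a cursor opened on one shard."""
    ns: str
    dead: bool = False

    def next_batch(self):
        if self.dead:
            raise RuntimeError(f"Executor error: DEAD, ns:{self.ns}")
        return []


@dataclass
class Shard:
    """Hypothetical stand-in for a shard; tracks open cursors per namespace."""
    cursors: dict = field(default_factory=dict)

    def open_cursor(self, ns):
        cursor = Cursor(ns)
        self.cursors.setdefault(ns, []).append(cursor)
        return cursor

    def drop(self, ns):
        # Dropping a namespace invalidates every cursor open on it.
        for cursor in self.cursors.pop(ns, []):
            cursor.dead = True


def simulate_collision(job_a=43, job_b=43, now=1440026655):
    """Return (names_collided, error_message_or_None) for two mongos processes."""
    shard = Shard()
    ns_a = temp_namespace("db1", "coll1", now, job_a)  # mongos A
    ns_b = temp_namespace("db1", "coll1", now, job_b)  # mongos B
    cursor_b = shard.open_cursor(ns_b)  # step 2: B starts reading its temp data
    shard.drop(ns_a)                    # step 1: A cleans up "its" namespace
    try:
        cursor_b.next_batch()           # step 3: B's cursor may now be dead
    except RuntimeError as exc:
        return ns_a == ns_b, str(exc)
    return ns_a == ns_b, None
```

With equal counters the two names match and the second process's read fails; with different counters (for example `simulate_collision(job_b=44)`) the drop touches a different namespace and the read succeeds.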
Relevant log lines:
I COMMAND [conn26] CMD: drop db1.tmp.mrs.coll1_1440026655_43
E QUERY [conn30] Plan executor error during find: DEAD, stats: { stage: "FETCH", nReturned: 0, executionTimeMillisEstimate: 0, works: 0, advanced: 0, needTime: 0, needYield: 0, saveState: 1, restoreState: 0, isEOF: 0, invalidates: 0, docsExamined: 0, alreadyHasObj: 0, inputStage: { stage: "IXSCAN", nReturned: 0, executionTimeMillisEstimate: 0, works: 0, advanced: 0, needTime: 0, needYield: 0, saveState: 1, restoreState: 0, isEOF: 0, invalidates: 0, keyPattern: { _id: 1 }, indexName: "_id_", isMultiKey: false, isUnique: true, isSparse: false, isPartial: false, indexVersion: 1, direction: "forward", indexBounds: { _id: [ "[MinKey, MaxKey]" ] }, keysExamined: 0, dupsTested: 0, dupsDropped: 0, seenInvalidated: 0 } }
I QUERY [conn30] assertion 17144 Executor error: OperationFailed Operation aborted because: all indexes on collection dropped ns:db1.tmp.mrs.coll1_1440026655_43 query:{ query: {}, orderby: { _id: 1 } }
Test output:
Error: map reduce failed:{ "ok" : 0, "errmsg" : "MR post processing failed: { ok: 0.0, errmsg: \"could not initialize cursor across all shards because : Executor error: OperationFailed Operation aborted because: all indexes on collection dropped @...\", code: 14827 }" }
Related to: SERVER-34539 Re-enable sharded mapReduce concurrency testing and only use a single mongos (Closed)