An error in a map-reduce job crashes the secondary servers and prevents them from starting again. I know which error in my map function causes the job to fail, but it shouldn't leave my MongoDB instance in an unrecoverable state. The primary is up and running, but it has stepped down to secondary, since it is the only member of the replica set still running.
The map function uses a list (array) as the key, which is not supported. The unique index constraint is enforced on the last element of the list, which is not unique. Once I change the key to a dictionary or a concatenated string, it works just fine.
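For illustration, here is a minimal sketch of the concatenated-string workaround. My actual map function is not shown here, so the field names (`date`, `store`, `item`, `weight`) and the `|` separator are assumptions, and `emit` is a local stand-in for the one MongoDB provides inside map-reduce so the sketch can be run on its own.

```javascript
// Stand-in for the emit() that MongoDB supplies inside a map-reduce job;
// it only collects the emitted pairs so this sketch runs under plain Node.js.
var emitted = [];
function emit(key, value) {
  emitted.push({ _id: key, value: value });
}

// Map function: builds ONE string key instead of emitting an array.
// Field names are hypothetical; substitute the real ones.
function mapFn() {
  var key = [this.date, this.store, this.item].join("|");
  emit(key, { Count: 1, TotalWeight: this.weight });
}

// Reduce function: sums the counters for all values that share a key.
function reduceFn(key, values) {
  var out = { Count: 0, TotalWeight: 0 };
  values.forEach(function (v) {
    out.Count += v.Count;
    out.TotalWeight += v.TotalWeight;
  });
  return out;
}

// Local check with made-up documents that share the same last key component.
var docs = [
  { date: "20111028", store: "0088", item: "009020", weight: 3 },
  { date: "20111028", store: "0099", item: "009020", weight: 4 }
];
docs.forEach(function (d) { mapFn.call(d); });
```

With an array key, both made-up documents above would collide on the last element ("009020"); with the joined string they get distinct `_id` values. Emitting an embedded document such as `{ date: ..., store: ..., item: ... }` as the key is the other variant that worked for me.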
Every time I try to start a secondary server, it hits the same "duplicate key error index" error and crashes. I had to wipe the secondaries and let MongoDB do a clean sync, which caused significant downtime.
This looks like a MongoDB bug. I am running a 3-member replica set environment with 4 shards. All 4 shard servers on both secondaries crashed with the same error.
Any help is greatly appreciated. If there is a way to recover from the current state, please let me know as well.
Thanks!
2014-10-07T00:28:32.159+0000 [conn18913] end connection 172.31.15.135:55897 (9 connections now open)
2014-10-07T00:28:32.159+0000 [initandlisten] connection accepted from 172.31.15.135:55905 #18915 (10 connections now open)
2014-10-07T00:28:32.160+0000 [conn18915] authenticate db: local { authenticate: 1, nonce: "xxx", user: "__system", key: "xxx" }
2014-10-07T00:28:40.150+0000 [repl writer worker 1] ERROR: writer worker caught exception: :: caused by :: 11000 insertDocument :: caused by :: 11000 E11000 duplicate key error index: ModelDatabase.tmp.mr.RawData_0.$_id_ dup key: { : "009020" } on: { ts: Timestamp 1412641720000|2, h: -267785287631189678, v: 2, op: "i", ns: "ModelDatabase.tmp.mr.RawData_0", o: { _id: [ "20111028", "0088", "009020" ], value: { Count: 6.0, TotalWeight: 7.0 } } }
2014-10-07T00:28:40.150+0000 [repl writer worker 1] Fatal Assertion 16360
2014-10-07T00:28:40.150+0000 [repl writer worker 1]
Related to: SERVER-16308 Emitting arrays as ids in the map() function of the MR framework should not be allowed (Backlog)