- Type: Bug
- Resolution: Incomplete
- Priority: Major - P3
- Affects Version/s: 2.0.2
- Component/s: None
- Environment: uname -a
  Linux test-mongo2-us.internal.net 2.6.18-194.el5 #1 SMP Fri Apr 2 14:58:14 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
- Linux
I have a simple 3-shard cluster (1 config server, 1 mongos) set up for testing, and I am throwing a high load of reads and writes at a collection sharded on _id (a random integer between 1 and 10M). After about an hour of testing, test-mongo2-us (the master) crashed with the error below, possibly while splitting a chunk or rebalancing, judging by the timestamps on the log entries.
Tue Feb 7 00:01:21 [conn71] received splitChunk request: { splitChunk: "testdb.user", keyPattern: { _id: 1.0 }, min: { _id: 6252355 }, max: { _id: 6965549 }, from: "test-mongo2-us:27117", splitKeys: [ { _id: 6585511 } ], shardId: "testdb.user-_id_6252355", configdb: "test-mongo1-us:27019" }
Tue Feb 7 00:01:21 [conn71] created new distributed lock for testdb.user on test-mongo1-us:27019 ( lock timeout : 900000, ping interval : 30000, process : 0 )
Tue Feb 7 00:01:21 [conn73] command admin.$cmd command: { splitChunk: "testdb.user", keyPattern: { _id: 1.0 }, min: { _id: 6252355 }, max: { _id: 6965549 }, from: "test-mongo2-us:27117", splitKeys: [ { _id: 6585515 } ], shardId: "testdb.user-_id_6252355", configdb: "test-mongo1-us:27019" } ntoreturn:1 reslen:351 555ms
Tue Feb 7 00:01:21 [conn68] received splitChunk request: { splitChunk: "testdb.user", keyPattern: { _id: 1.0 }, min: { _id: 6252355 }, max: { _id: 6965549 }, from: "test-mongo2-us:27117", splitKeys: [ { _id: 6585511 } ], shardId: "testdb.user-_id_6252355", configdb: "test-mongo1-us:27019" }
Tue Feb 7 00:01:21 [conn68] created new distributed lock for testdb.user on test-mongo1-us:27019 ( lock timeout : 900000, ping interval : 30000, process : 0 )
Tue Feb 7 00:01:21 [conn23] could not acquire lock 'testdb.user/test-mongo2-us.web.blizzard.net:27117:1328559662:781691710' (another update won)
Tue Feb 7 00:01:21 [conn23] distributed lock 'testdb.user/test-mongo2-us.web.blizzard.net:27117:1328559662:781691710' was not acquired.
Tue Feb 7 00:01:21 [conn23] command admin.$cmd command: { splitChunk: "testdb.user", keyPattern: { _id: 1.0 }, min: { _id: 6252355 }, max: { _id: 6965549 }, from: "test-mongo2-us:27117", splitKeys: [ { _id: 6585531 } ], shardId: "testdb.user-_id_6252355", configdb: "test-mongo1-us:27019" } ntoreturn:1 reslen:351 653ms
Tue Feb 7 00:01:21 Invalid access at address: 0xb9bd3c
Tue Feb 7 00:01:21 Got signal: 7 (Bus error).
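For anyone trying to reproduce this: the actual test harness is not included in this report, so the following is only a minimal sketch of a comparable workload, assuming a mixed read/write load against random integer _id values between 1 and 10M as described above. The names `generate_ops`, `run_workload`, and `InMemoryCollection` are hypothetical, and the 50/50 write ratio and 64-byte payload are guesses. `run_workload` only assumes an object with `replace_one()` and `find_one()` methods, which a current PyMongo `Collection` provides; the in-memory stand-in lets the sketch run without a server.

```python
import random

# _id range stated in the report: a random integer between 1 and 10M.
ID_MIN, ID_MAX = 1, 10_000_000


def generate_ops(count, write_ratio=0.5, seed=None):
    """Yield (kind, _id) pairs, where kind is "write" or "read".

    write_ratio is an assumption; the report only says "high load of
    reads and writes".
    """
    rng = random.Random(seed)
    for _ in range(count):
        kind = "write" if rng.random() < write_ratio else "read"
        yield kind, rng.randint(ID_MIN, ID_MAX)


def run_workload(collection, ops):
    """Apply ops to any object exposing replace_one() and find_one()."""
    counts = {"write": 0, "read": 0}
    for kind, doc_id in ops:
        if kind == "write":
            # Upsert so a repeated _id overwrites the document instead of
            # raising a duplicate-key error.
            collection.replace_one(
                {"_id": doc_id},
                {"_id": doc_id, "payload": "x" * 64},  # payload size is a guess
                upsert=True,
            )
        else:
            collection.find_one({"_id": doc_id})
        counts[kind] += 1
    return counts


class InMemoryCollection:
    """Hypothetical stand-in for a real collection (no server needed)."""

    def __init__(self):
        self.docs = {}

    def replace_one(self, flt, doc, upsert=False):
        if upsert or flt["_id"] in self.docs:
            self.docs[flt["_id"]] = doc

    def find_one(self, flt):
        return self.docs.get(flt["_id"])


if __name__ == "__main__":
    print(run_workload(InMemoryCollection(), generate_ops(1000, seed=1)))
```

To point this at the sharded cluster, pass a real collection object obtained through mongos in place of `InMemoryCollection()` and run several copies in parallel; how many are needed to trigger concurrent splitChunk requests like the ones logged above is unknown.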