- Type: Bug
- Resolution: Duplicate
- Priority: Critical - P2
- None
- Affects Version/s: 2.2.0
- Component/s: Sharding
- None
- ALL
Sharded 4 node replica set with priorities 2, 1, 0, 0
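For readers reproducing the environment, the priority layout above can be sketched as a replica set configuration document. This is only an illustration: the set name col03 and the hosts server01c03 and server02c03 appear in the log below, but the other two host names and the mapping of priorities to hosts are assumptions, and the helper build_replset_config is hypothetical.

```python
# Hypothetical sketch: builds (but does not apply) a replica set
# configuration document with member priorities 2, 1, 0, 0.

def build_replset_config(set_name, hosts, priorities):
    """Return a replica set config dict with one member per host."""
    if len(hosts) != len(priorities):
        raise ValueError("hosts and priorities must have the same length")
    return {
        "_id": set_name,
        "members": [
            {"_id": i, "host": host, "priority": priority}
            for i, (host, priority) in enumerate(zip(hosts, priorities))
        ],
    }


# server03c03 and server04c03 are assumed names; which host holds
# priority 2 is also an assumption, not stated in the ticket.
config = build_replset_config(
    "col03",
    [
        "server01c03:27017",
        "server02c03:27017",
        "server03c03:27017",
        "server04c03:27017",
    ],
    [2, 1, 0, 0],
)

for member in config["members"]:
    print(member["host"], "priority", member["priority"])
```

A document of this shape is what the replSetInitiate and replSetReconfig commands accept; members with priority 0 can never become primary, so failover in this layout only moves between the first two members.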
A transient failover from the priority 2 node to the priority 1 node and back to the priority 2 node causes mongos to segfault.
https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/NeeB86n9-JU
More nodes just collapsed, on a different column set this time. Here is the mongos log with logLevel: 2 turned on:
Tue Sep 11 14:03:29 [conn1510] warning: splitChunk failed - cmd: { splitChunk: "catalog.feed_data_changelog", keyPattern: { retid: 1.0, feedid: 1.0, uniqid: 1.0, version: 1.0 }, min: { retid: 13712, feedid: 2669, uniqid: "287e1d9af8a592cfe0a80aa7b80df7ba", version: 1 }, max: { retid: 13712, feedid: 2669, uniqid: "4cf79bccacf4e71d2e73607677fea6e5", version: 1 }, from: "col03", splitKeys: [ { retid: 13712, feedid: 2669, uniqid: "38623d70f41a1a433b27e0f187219eb0", version: 3 } ], shardId: "catalog.feed_data_changelog-retid_13712feedid_2669uniqid_"287e1d9af8a592cfe0a80aa7b80df7ba"version_1", configdb: "servercfg1:27019,servercfg2:27019,servercfg3:27019" } result: { who: { _id: "catalog.feed_data_changelog", process: "server01c03:27017:1346968636:1431697445", state: 2, ts: ObjectId('504f71768fb2ef42a78a0c25'), when: new Date(1347383670849), who: "server01c03:27017:1346968636:1431697445:conn62819:2040432535", why: "migrate-{ retid: 8941, feedid: 8005, uniqid: "9cd5f39b3cfaeb79e8925eef346e68d0", version: 32 }" }, errmsg: "the collection's metadata lock is taken", ok: 0.0 }
Tue Sep 11 14:03:29 [conn1510] ChunkManager: time to load chunks for catalog.feed_data_changelog: 5ms sequenceNumber: 191 version: 227|3||504aa7463a46fa0144cf6f5e based on: 227|3||504aa7463a46fa0144cf6f5e
Tue Sep 11 14:03:29 [conn1510] warning: chunk manager reload forced for collection 'catalog.feed_data_changelog', config version is 227|3||504aa7463a46fa0144cf6f5e
Tue Sep 11 14:04:19 [conn1510] end connection 127.0.0.1:52194 (2 connections now open)
Tue Sep 11 14:04:20 [mongosMain] connection accepted from 127.0.0.1:52353 #1515 (3 connections now open)
Tue Sep 11 14:04:35 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server01c03:27017
Tue Sep 11 14:04:45 [ReplicaSetMonitorWatcher] Primary for replica set col03 changed to server02c03:27017
Tue Sep 11 14:04:49 [WriteBackListener-server02c03:27017] DBClientCursor::init call() failed
Tue Sep 11 14:04:49 [WriteBackListener-server02c03:27017] WriteBackListener exception : DBClientBase::findN: transport error: server02c03:27017 ns: admin.$cmd query: { writebacklisten: ObjectId('504e77846a941ccd587623c8') }
Tue Sep 11 14:04:49 [conn1515] got not master for: server02c03:27017
Received signal 11
Backtrace: 0x8386d5 0x3bd7a302d0 0x2aaaab2bad80
/usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0x8386d5]
/lib64/libc.so.6[0x3bd7a302d0]
[0x2aaaab2bad80]
- duplicates SERVER-7061 mongos can use invalid ptr to master conn when setShardVersion fails (Closed)