- Type: Bug
- Resolution: Cannot Reproduce
- Priority: Major - P3
- None
- Affects Version/s: 2.6.1
- Component/s: Sharding
- None
- ALL
One of our mongod replica set members, which is part of a cluster consisting of 3 shards, went down due to a segmentation fault:
2014-06-29T22:14:06.767+0200 [conn199059] ntoskip:0 ntoreturn:1
2014-06-29T22:14:06.767+0200 [conn199059] stale version detected during query over offerStore.$cmd : { $err: "[offerStore.offer] shard version not ok in Client::Context: version mismatch detected for offerStore.offer, stored major version 20016 does not match ...", code: 13388, ns: "offerStore.offer", vReceived: Timestamp 20015000|85, vReceivedEpoch: ObjectId('538f1c07b86632c2d721e203'), vWanted: Timestamp 20016000|0, vWantedEpoch: ObjectId('538f1c07b86632c2d721e203') }
2014-06-29T22:14:06.767+0200 [conn198333] end connection 172.16.65.202:43434 (1166 connections now open)
2014-06-29T22:14:06.767+0200 [conn199059] end connection 172.16.65.202:43728 (1165 connections now open)
2014-06-29T22:14:06.812+0200 [conn199123] moveChunk migrate commit accepted by TO-shard: { active: false, ns: "offerStore.offer", from: "offerStoreUK/s128:27017,s137:27017,s227:27017", min: { _id: 99144222 }, max: { _id: 129281657 }, shardKeyPattern: { _id: 1.0 }, state: "done", counts: { cloned: 2843, clonedBytes: 3969903, catchup: 0, steady: 0 }, ok: 1.0 }
2014-06-29T22:14:06.812+0200 [conn199123] moveChunk updating self version to: 20016|1||538f1c07b86632c2d721e203 through { _id: 129281657 } -> { _id: 131845582 } for collection 'offerStore.offer'
2014-06-29T22:14:06.812+0200 [conn199123] SyncClusterConnection connecting to [sx350:20019]
2014-06-29T22:14:06.814+0200 [conn199123] SyncClusterConnection connecting to [sx351:20019]
2014-06-29T22:14:06.816+0200 [conn199123] SyncClusterConnection connecting to [sx352:20019]
2014-06-29T22:14:07.147+0200 [conn199123] about to log metadata event: { _id: "s128-2014-06-29T20:14:07-53b0738fe4db6482ab714a67", server: "s128", clientAddr: "172.16.65.202:43756", time: new Date(1404072847147), what: "moveChunk.commit", ns: "offerStore.offer", details: { min: { _id: 99144222 }, max: { _id: 129281657 }, from: "offerStoreUK", to: "offerStoreUK3", cloned: 2843, clonedBytes: 3969903, catchup: 0, steady: 0 } }
2014-06-29T22:14:07.337+0200 [conn201544] end connection 172.16.64.98:54303 (1164 connections now open)
2014-06-29T22:14:07.338+0200 [initandlisten] connection accepted from 172.16.64.98:54305 #201548 (1165 connections now open)
2014-06-29T22:14:07.339+0200 [conn201548] authenticate db: local { authenticate: 1, nonce: "xxx", user: "__system", key: "xxx" }
2014-06-29T22:14:07.350+0200 [conn199123] MigrateFromStatus::done About to acquire global write lock to exit critical section
2014-06-29T22:14:07.350+0200 [conn199123] MigrateFromStatus::done Global lock acquired
2014-06-29T22:14:07.361+0200 [conn199123] doing delete inline for cleanup of chunk data
2014-06-29T22:14:07.361+0200 [conn199123] SEVERE: Invalid access at address: 0
2014-06-29T22:14:07.460+0200 [conn199123] SEVERE: Got signal: 11 (Segmentation fault). Backtrace:0x11c0e91 0x11c026e 0x11c035f 0x7f6a68197030 0xdd7996 0xdd96cc 0xdd9c1a 0xfd21a3 0xa1e85a 0xa1f8ce 0xa21086 0xd4dae7 0xb97322 0xb99902 0x76b6af 0x117720b 0x7f6a6818eb50 0x7f6a675320ed
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11c0e91]
/usr/bin/mongod() [0x11c026e]
/usr/bin/mongod() [0x11c035f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf030) [0x7f6a68197030]
/usr/bin/mongod(_ZNK5mongo12RangeDeleter11NSMinMaxCmpclEPKNS0_8NSMinMaxES4_+0x26) [0xdd7996]
/usr/bin/mongod(_ZNK5mongo12RangeDeleter17canEnqueue_inlockERKNS_10StringDataERKNS_7BSONObjES6_PSs+0x1fc) [0xdd96cc]
/usr/bin/mongod(_ZN5mongo12RangeDeleter9deleteNowERKSsRKNS_7BSONObjES5_S5_bPSs+0x22a) [0xdd9c1a]
/usr/bin/mongod(_ZN5mongo16MoveChunkCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0xea73) [0xfd21a3]
/usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa1e85a]
/usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xd5e) [0xa1f8ce]
/usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa21086]
/usr/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x2307) [0xd4dae7]
/usr/bin/mongod() [0xb97322]
/usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x442) [0xb99902]
/usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x76b6af]
/usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4fb) [0x117720b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f6a6818eb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6a675320ed]
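The frames in the backtrace above follow the `module(symbol+offset) [address]` layout, with mangled C++ symbol names. For anyone triaging this, a minimal Python sketch for pulling the symbols out of such a log might look like the following. The names `FRAME_RE` and `parse_frames` are made up for illustration and are not part of any MongoDB tooling; the sample text is copied from three frames of the log above.

```python
import re

# Hypothetical helper, not MongoDB tooling. Matches frames such as
#   /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11c0e91]
# The symbol and the offset may each be absent, as in "/usr/bin/mongod() [0x11c026e]".
FRAME_RE = re.compile(
    r"(?P<module>/\S+?)"                 # path of the binary or shared library
    r"\((?P<symbol>[^()+\s]*)"           # mangled symbol, possibly empty
    r"(?:\+(?P<offset>0x[0-9a-f]+))?\)"  # optional offset into the symbol
    r"\s+\[(?P<address>0x[0-9a-f]+)\]"   # absolute return address
)


def parse_frames(text):
    """Return one dict per backtrace frame found in text, in order of appearance."""
    return [m.groupdict() for m in FRAME_RE.finditer(text)]


sample = (
    "/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11c0e91] "
    "/usr/bin/mongod() [0x11c026e] "
    "/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6a675320ed]"
)

frames = parse_frames(sample)
for frame in frames:
    # Print "<unknown>" for frames that carry no symbol name.
    print(frame["address"], frame["symbol"] or "<unknown>")
```

The extracted symbols can then be passed to a demangler such as binutils' `c++filt`. The first non-signal-handler mongod frame, `_ZNK5mongo12RangeDeleter11NSMinMaxCmpclEPKNS0_8NSMinMaxES4_`, is the mangled form of `mongo::RangeDeleter::NSMinMaxCmp::operator()`, reached from `RangeDeleter::canEnqueue_inlock` via `RangeDeleter::deleteNow` and `MoveChunkCommand::run`.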
We are running mongodb-linux-x86_64-2.6.1.
It might be related to this issue:
https://jira.mongodb.org/browse/SERVER-14261
- is related to
- SERVER-14261 stepdown during migration range delete can abort mongod
- Closed