-
Type: Bug
-
Resolution: Duplicate
-
Priority: Blocker - P1
-
None
-
Affects Version/s: 2.2.0
-
Component/s: Replication
-
Environment:Linux 3.0.0-12-server #20-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
-
Fully Compatible
-
Linux
We had a 4-node replica set crashed, all nodes at the same time.
Here are some log fragments showing the segfault:
Node 1:
Wed Sep 26 10:59:06 [conn29622] query datamodel.datamodel query: { _id: "C83A35-c83a35220710" } ntoreturn:1 idhack:1 keyUpdates:0 locks(micros) r:220254 reslen:256219 220ms
Wed Sep 26 10:59:08 [rsHealthPoll] replSet member ACS-PROV2:27017 is up
Wed Sep 26 10:59:08 [rsHealthPoll] replSet member ACS-PROV2:27017 is now in state SECONDARY
Wed Sep 26 10:59:08 [rsSync] info DFM::findAll(): extent 2d:4070b000 was empty, skipping ahead. ns:local.replset.minvalid
Wed Sep 26 10:59:09 [rsHealthPoll] replSet member ACS-PROV4:27017 is up
Wed Sep 26 10:59:09 [rsHealthPoll] replSet member ACS-PROV4:27017 is now in state SECONDARY
Wed Sep 26 10:59:09 [rsMgr] replSet warning caught unexpected exception in electSelf()
Wed Sep 26 10:59:09 Invalid access at address: 0x7f6a0b42e360 from thread:
Wed Sep 26 10:59:09 Got signal: 11 (Segmentation fault).
Wed Sep 26 10:59:09 Backtrace:
0xade6e1 0x5582d9 0x558862 0x7fc5508c7060 0x7f6a0b42e360
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9]
/usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x558862]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10060) [0x7fc5508c7060]
[0x7f6a0b42e360]
Node 2:
Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46414 #35439 (42 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46413 #35440 (43 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57056 #35441 (44 connections now open) Wed Sep 26 10:59:13 [conn35441] recv(): message len 0 is too large0 Wed Sep 26 10:59:13 [conn35441] end connection 10.220.40.26:57056 (43 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57057 #35442 (44 connections now open) Wed Sep 26 10:59:13 [conn35414] recv(): message len 0 is too large0 Wed Sep 26 10:59:13 [conn35414] end connection 10.220.40.26:57054 (43 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57058 #35443 (44 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57062 #35444 (45 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57065 #35445 (46 connections now open) Wed Sep 26 10:59:13 [conn35444] Assertion: 10334:Invalid BSONObj size: 0 (0x00000000) first element: <98>^U^GG: ?type=112 0xade6e1 0x8036eb 0x80386c 0x571bc7 0x571f4c 0x7a2f42 0xb5b9b9 0xb5d052 0x56fa52 0x5dbf11 0x7fef22966efc 0x7fef21cf889d Wed Sep 26 10:59:13 [conn35445] recv(): message len 0 is too large0 Wed Sep 26 10:59:13 [conn35445] end connection 10.220.40.26:57065 (45 connections now open) /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1] /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x9b) [0x8036eb] /usr/bin/mongod() [0x80386c] /usr/bin/mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x497) [0x571bc7] /usr/bin/mongod(_ZN5mongo7BSONObjC1EPKc+0x3c) [0x571f4c] /usr/bin/mongod(_ZN5mongo12QueryMessageC1ERNS_9DbMessageE+0xb2) [0x7a2f42] /usr/bin/mongod() [0xb5b9b9] /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0xb5d052] /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x82) [0x56fa52] /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0x5dbf11] /lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc) [0x7fef22966efc] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fef21cf889d] Wed Sep 26 10:59:13 [conn35444] AssertionException handling request, closing client connection: 10334 Invalid BSONObj size: 0 (0x00000000) first element: <98>^U^GG: ?type=112 Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.25:47291 #35446 (45 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.25:47292 #35447 (46 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46415 #35448 (47 connections now open) Wed Sep 26 10:59:13 [conn35448] Assertion: 10334:Invalid BSONObj size: 809775172 (0x44304430) first element: 0022938C388C: ?type=45 0xade6e1 0x8036eb 0x80386c 0x571bc7 0x571f4c 0x7a2f42 0xb5b9b9 0xb5d052 0x56fa52 0x5dbf11 0x7fef22966efc 0x7fef21cf889d Wed Sep 26 10:59:13 [conn35437] recv(): message len 0 is too large0 Wed Sep 26 10:59:13 [conn35437] end connection 10.220.40.23:46410 (46 connections now open) Wed Sep 26 10:59:13 [conn35419] end connection 10.220.40.23:46387 (45 connections now open) Wed Sep 26 10:59:13 [conn35418] end connection 10.220.40.23:46386 (44 connections now open) /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1] /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x9b) [0x8036eb] /usr/bin/mongod() [0x80386c] /usr/bin/mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x497) [0x571bc7] /usr/bin/mongod(_ZN5mongo7BSONObjC1EPKc+0x3c) [0x571f4c] /usr/bin/mongod(_ZN5mongo12QueryMessageC1ERNS_9DbMessageE+0xb2) [0x7a2f42] /usr/bin/mongod() [0xb5b9b9] /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0xb5d052] /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x82) [0x56fa52] /usr/bin/mongod(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x411) [0x5dbf11] /lib/x86_64-linux-gnu/libpthread.so.0(+0x7efc) [0x7fef22966efc] /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fef21cf889d] Wed Sep 26 10:59:13 [conn35448] AssertionException handling request, closing client connection: 10334 Invalid BSONObj size: 809775172 (0x44304430) first element: 0022938C388C: ?type=45 Wed Sep 26 10:59:13 [conn35423] recv(): message len 0 is too large0 Wed Sep 26 10:59:13 [conn35423] end connection 10.220.40.23:46394 (42 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46426 #35449 (43 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46433 #35450 (44 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46432 #35451 (45 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46440 #35452 (46 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46441 #35453 (47 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46445 #35454 (48 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46446 #35455 (49 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46447 #35456 (50 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46448 #35457 (51 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.23:46449 #35458 (52 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.25:47294 #35459 (53 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57066 #35460 (54 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57068 #35461 (55 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57070 #35462 (56 connections now open) Wed Sep 26 10:59:13 [initandlisten] connection accepted from 10.220.40.26:57071 #35463 (57 connections now open) Wed Sep 26 10:59:13 Invalid access at address: 0x7f93fb9d2000 from thread: conn35459 Wed Sep 26 10:59:13 Got signal: 11 (Segmentation fault). Wed Sep 26 10:59:13 Backtrace: 0xade6e1 0x5582d9 0x558862 0x7fef2296f060 0x7fef21d5b056 0x610243 0x93070b 0x93d90b 0x944669 0x944f72 0x9456c7 0x880697 0x8810c6 0x849cfa 0x84b258 0x84b41b 0x6b046c 0x6b3d38 0xb5ba7d 0xb5d052 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1] /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9] /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x558862] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10060) [0x7fef2296f060] /lib/x86_64-linux-gnu/libc.so.6(+0x144056) [0x7fef21d5b056] /usr/bin/mongod(_ZN5mongo20compareElementValuesERKNS_11BSONElementES2_+0x313) [0x610243] /usr/bin/mongod(_ZN5mongo10FieldRange9intersectERKS0_b+0x12b) [0x93070b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet19intersectMatchFieldEPKcRKNS_11BSONElementEbb+0x5b) [0x93d90b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet16handleMatchFieldERKNS_11BSONElementEb+0x1b9) [0x944669] /usr/bin/mongod(_ZN5mongo13FieldRangeSet4initEb+0xa2) [0x944f72] /usr/bin/mongod(_ZN5mongo13FieldRangeSetC1EPKcRKNS_7BSONObjEbb+0x147) [0x9456c7] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4initERKNS_7BSONObjES3_S3_+0x87) [0x880697] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4makeEPKcRKNS_7BSONObjES5_RKN5boost10shared_ptrIKNS_11ParsedQueryEEES5_NS_18QueryPlanGenerator18RecordedPlanPolicyES5_S5_+0x76) [0x8810c6] /usr/bin/mongod(_ZN5mongo15CursorGenerator19setMultiPlanScannerEv+0xea) [0x849cfa] /usr/bin/mongod(_ZN5mongo15CursorGenerator8generateEv+0x98) [0x84b258] /usr/bin/mongod(_ZN5mongo25NamespaceDetailsTransient9getCursorEPKcRKNS_7BSONObjES5_RKNS_24QueryPlanSelectionPolicyEPbRKN5boost10shared_ptrIKNS_11ParsedQueryEEEbPNS_16QueryPlanSummaryE+0x4b) [0x84b41b] /usr/bin/mongod(_ZN5mongo23queryWithQueryOptimizerEiRKSsRKNS_7BSONObjERNS_5CurOpES4_S4_RKN5boost10shared_ptrINS_11ParsedQueryEEES4_RKNS_17ShardChunkVersionERNS7_10scoped_ptrINS_25PageFaultRetryableSectionEEERNSG_INS_19NoPageFaultsAllowedEEERNS_7MessageE+0xfc) [0x6b046c] /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x16b8) [0x6b3d38] /usr/bin/mongod() [0xb5ba7d] /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0xb5d052]
Node 3:
Wed Sep 26 10:59:24 [initandlisten] connection accepted from 10.220.40.23:40409 #40515 (269 connections now open) Wed Sep 26 10:59:24 [initandlisten] connection accepted from 10.220.40.26:52395 #40516 (270 connections now open) Wed Sep 26 10:59:24 [conn39893] end connection 10.220.40.23:39959 (269 connections now open) Wed Sep 26 10:59:24 [conn40022] end connection 10.220.40.23:40038 (268 connections now open) Wed Sep 26 10:59:24 [conn40424] end connection 10.220.40.26:52385 (267 connections now open) Wed Sep 26 10:59:24 [initandlisten] connection accepted from 10.220.40.23:40410 #40517 (268 connections now open) Wed Sep 26 10:59:24 Invalid access at address: 0x7edd6b5ac000 from thread: conn39940 Wed Sep 26 10:59:24 Got signal: 11 (Segmentation fault). Wed Sep 26 10:59:24 Backtrace: 0xade6e1 0x5582d9 0x558862 0x7f37fcddf060 0x7f37fc1cb048 0x610243 0x93070b 0x93d90b 0x944669 0x944f72 0x9456c7 0x880697 0x8810c6 0x849cfa 0x84b258 0x84b41b 0xc07af3 0xc0cddd 0xb546af 0xb5dd61 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1] /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9] /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x558862] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10060) [0x7f37fcddf060] /lib/x86_64-linux-gnu/libc.so.6(+0x144048) [0x7f37fc1cb048] /usr/bin/mongod(_ZN5mongo20compareElementValuesERKNS_11BSONElementES2_+0x313) [0x610243] /usr/bin/mongod(_ZN5mongo10FieldRange9intersectERKS0_b+0x12b) [0x93070b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet19intersectMatchFieldEPKcRKNS_11BSONElementEbb+0x5b) [0x93d90b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet16handleMatchFieldERKNS_11BSONElementEb+0x1b9) [0x944669] /usr/bin/mongod(_ZN5mongo13FieldRangeSet4initEb+0xa2) [0x944f72] /usr/bin/mongod(_ZN5mongo13FieldRangeSetC1EPKcRKNS_7BSONObjEbb+0x147) [0x9456c7] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4initERKNS_7BSONObjES3_S3_+0x87) [0x880697] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4makeEPKcRKNS_7BSONObjES5_RKN5boost10shared_ptrIKNS_11ParsedQueryEEES5_NS_18QueryPlanGenerator18RecordedPlanPolicyES5_S5_+0x76) [0x8810c6] /usr/bin/mongod(_ZN5mongo15CursorGenerator19setMultiPlanScannerEv+0xea) [0x849cfa] /usr/bin/mongod(_ZN5mongo15CursorGenerator8generateEv+0x98) [0x84b258] /usr/bin/mongod(_ZN5mongo25NamespaceDetailsTransient9getCursorEPKcRKNS_7BSONObjES5_RKNS_24QueryPlanSelectionPolicyEPbRKN5boost10shared_ptrIKNS_11ParsedQueryEEEbPNS_16QueryPlanSummaryE+0x4b) [0x84b41b] /usr/bin/mongod(_ZN5mongo14_updateObjectsEbPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEPNS_11RemoveSaverEbRKNS_24QueryPlanSelectionPolicyE+0x4b3) [0xc07af3] /usr/bin/mongod(_ZN5mongo13updateObjectsEPKcRKNS_7BSONObjES4_bbbRNS_7OpDebugEbRKNS_24QueryPlanSelectionPolicyE+0x12d) [0xc0cddd] /usr/bin/mongod(_ZN5mongo14receivedUpdateERNS_7MessageERNS_5CurOpE+0x4cf) [0xb546af] /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x10b1) [0xb5dd61]
Node 4:
Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48214 #34542 (29 connections now open) Wed Sep 26 10:59:17 [conn34542] recv(): message len 0 is too large0 Wed Sep 26 10:59:17 [conn34542] end connection 10.220.40.26:48214 (27 connections now open) Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48215 #34543 (28 connections now open) Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48216 #34544 (29 connections now open) Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48217 #34545 (30 connections now open) Wed Sep 26 10:59:17 [conn34544] recv(): message len 4 is too large4 Wed Sep 26 10:59:17 [conn34544] end connection 10.220.40.26:48216 (29 connections now open) Wed Sep 26 10:59:17 [conn34543] Assertion: 10306:Client Error: Next object larger than space left in message Wed Sep 26 10:59:17 [conn34545] recv(): message len 4 is too large4 0xade6e1 0x8036eb 0x7a3364 0xb5b9b9 0xb5d052 0x56fa52 0x5dbf11 0x7f74470f7efc 0x7f744648989d Wed Sep 26 10:59:17 [conn34545] end connection 10.220.40.26:48217 (28 connections now open) Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48218 #34546 (29 connections now open) Wed Sep 26 10:59:17 [initandlisten] connection accepted from 10.220.40.26:48219 #34547 (30 connections now open) Wed Sep 26 10:59:17 [conn34546] recv(): message len 4 is too large4 Wed Sep 26 10:59:17 [conn34546] end connection 10.220.40.26:48218 (29 connections now open) Wed Sep 26 10:59:17 [conn34547] recv(): message len 0 is too large0 Wed Sep 26 10:59:17 [conn34547] end connection 10.220.40.26:48219 (29 connections now open) Wed Sep 26 10:59:17 Backtrace: 0xade6e1 0x5582d9 0x558862 0x7f7447100060 0x7f74464ec02e 0x610243 0x93070b 0x93d90b 0x944669 0x944f72 0x9456c7 0x880697 0x8810c6 0x849cfa 0x84b258 0x84b41b 0x6b046c 0x6b3d38 0xb5ba7d 0xb5d052 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xade6e1] /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x5582d9] /usr/bin/mongod(_ZN5mongo24abruptQuitWithAddrSignalEiP7siginfoPv+0x262) [0x558862] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10060) [0x7f7447100060] /lib/x86_64-linux-gnu/libc.so.6(+0x14402e) [0x7f74464ec02e] /usr/bin/mongod(_ZN5mongo20compareElementValuesERKNS_11BSONElementES2_+0x313) [0x610243] /usr/bin/mongod(_ZN5mongo10FieldRange9intersectERKS0_b+0x12b) [0x93070b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet19intersectMatchFieldEPKcRKNS_11BSONElementEbb+0x5b) [0x93d90b] /usr/bin/mongod(_ZN5mongo13FieldRangeSet16handleMatchFieldERKNS_11BSONElementEb+0x1b9) [0x944669] /usr/bin/mongod(_ZN5mongo13FieldRangeSet4initEb+0xa2) [0x944f72] /usr/bin/mongod(_ZN5mongo13FieldRangeSetC1EPKcRKNS_7BSONObjEbb+0x147) [0x9456c7] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4initERKNS_7BSONObjES3_S3_+0x87) [0x880697] /usr/bin/mongod(_ZN5mongo16MultiPlanScanner4makeEPKcRKNS_7BSONObjES5_RKN5boost10shared_ptrIKNS_11ParsedQueryEEES5_NS_18QueryPlanGenerator18RecordedPlanPolicyES5_S5_+0x76) [0x8810c6] /usr/bin/mongod(_ZN5mongo15CursorGenerator19setMultiPlanScannerEv+0xea) [0x849cfa] /usr/bin/mongod(_ZN5mongo15CursorGenerator8generateEv+0x98) [0x84b258] /usr/bin/mongod(_ZN5mongo25NamespaceDetailsTransient9getCursorEPKcRKNS_7BSONObjES5_RKNS_24QueryPlanSelectionPolicyEPbRKN5boost10shared_ptrIKNS_11ParsedQueryEEEbPNS_16QueryPlanSummaryE+0x4b) [0x84b41b] /usr/bin/mongod(_ZN5mongo23queryWithQueryOptimizerEiRKSsRKNS_7BSONObjERNS_5CurOpES4_S4_RKN5boost10shared_ptrINS_11ParsedQueryEEES4_RKNS_17ShardChunkVersionERNS7_10scoped_ptrINS_25PageFaultRetryableSectionEEERNSG_INS_19NoPageFaultsAllowedEEERNS_7MessageE+0xfc) [0x6b046c] /usr/bin/mongod(_ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x16b8) [0x6b3d38] /usr/bin/mongod() [0xb5ba7d] /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x3a2) [0xb5d052]
- duplicates
-
JAVA-660 Concurrency issue can cause corrupted messages to be sent to the server
- Closed