-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.4.5
-
Component/s: None
-
Environment: 3-node replica set running in Amazon AWS on R3.2XLarge instances on Ubuntu 14.04.5 LTS
-
ALL
We've had multiple non-primary nodes crash with the error:
Invalid access at address: 0x78
Got signal: 11 (Segmentation fault).
So far, 2 secondaries and 2 initial syncs have crashed with the above error. I captured 2 of the crashes:
Initial Sync crash:
2017-07-08T12:34:32.219+0000 I NETWORK [conn38648] received client metadata from 10.0.8.200:34396 conn38648: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Ubuntu 14.04 trusty", architecture: "x86_64", version: "3.13.0-121-generic" }, platform: "CPython 3.4.3.final.0", application: { name: "monitor: (pipeline.production.pazien, pid 10143)" } }
2017-07-08T12:34:32.223+0000 I NETWORK [thread1] connection accepted from 10.0.8.200:34400 #38649 (222 connections now open)
2017-07-08T12:34:32.241+0000 I - [conn38648] end connection 10.0.8.200:34396 (222 connections now open)
2017-07-08T12:34:32.243+0000 I NETWORK [conn38649] received client metadata from 10.0.8.200:34400 conn38649: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Ubuntu 14.04 trusty", architecture: "x86_64", version: "3.13.0-121-generic" }, platform: "CPython 3.4.3.final.0", application: { name: "monitor: (pipeline.production.pazien, pid 10143)" } }
2017-07-08T12:34:33.958+0000 I - [repl writer worker 10] pazien.processor.settlements collection clone progress: 238795326/1265553643 18% (documents copied)
2017-07-08T12:34:39.872+0000 I - [conn38647] end connection 10.0.8.200:34388 (221 connections now open)
2017-07-08T12:34:39.872+0000 I - [conn38649] end connection 10.0.8.200:34400 (221 connections now open)
2017-07-08T12:34:42.414+0000 I - [conn38639] end connection 10.0.8.200:34014 (219 connections now open)
2017-07-08T12:34:42.414+0000 I - [conn38637] end connection 10.0.8.200:34001 (218 connections now open)
2017-07-08T12:34:55.773+0000 F - [InitialSyncInserters-pazien.processor.settlements0] Invalid access at address: 0x78
2017-07-08T12:34:55.828+0000 F - [InitialSyncInserters-pazien.processor.settlements0] Got signal: 11 (Segmentation fault).
0x7f0f0324ccb1 0x7f0f0324bec9 0x7f0f0324c536 0x7f0f00945330 0x7f0f009407b0 0x7f0f03b8102b 0x7f0f03b85395 0x7f0f03bd66aa 0x7f0f03bd12f3 0x7f0f03bd265a 0x7f0f03c2f42b 0x7f0f02f568a2 0x7f0f02f56bb0 0x7f0f02f4ac48 0x7f0f02f4b076 0x7f0f026bfa62 0x7f0f02bed483 0x7f0f02bed2f5 0x7f0f02d3f6c1 0x7f0f02d3f8c9 0x7f0f02d40a2f 0x7f0f031c670c 0x7f0f031c71bc 0x7f0f031c7ba6 0x7f0f03cc3130 0x7f0f0093d184 0x7f0f00669ffd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7F0F01CDB000","o":"1571CB1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F0F01CDB000","o":"1570EC9"},{"b":"7F0F01CDB000","o":"1571536"},{"b":"7F0F00935000","o":"10330"},{"b":"7F0F00935000","o":"B7B0","s":"__pthread_mutex_unlock"},{"b":"7F0F01CDB000","o":"1EA602B"},{"b":"7F0F01CDB000","o":"1EAA395","s":"__wt_split_multi"},{"b":"7F0F01CDB000","o":"1EFB6AA","s":"__wt_evict"},{"b":"7F0F01CDB000","o":"1EF62F3"},{"b":"7F0F01CDB000","o":"1EF765A","s":"__wt_cache_eviction_worker"},{"b":"7F0F01CDB000","o":"1F5442B"},{"b":"7F0F01CDB000","o":"127B8A2","s":"_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE"},{"b":"7F0F01CDB000","o":"127BBB0","s":"_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE"},{"b":"7F0F01CDB000","o":"126FC48","s":"_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEm"},{"b":"7F0F01CDB000","o":"1270076","s":"_ZN5mongo21WiredTigerRecordStore12insertRecordEPNS_16OperationContextEPKcib"},{"b":"7F0F01CDB000","o":"9E4A62","s":"_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextERKNS_7BSONObjERKSt6vectorIPNS_15MultiIndexBlockESaIS8_EEb"},{"b":"7F0F01CDB000","o":"F12483"},{"b":"7F0F01CDB000","o":"F122F5"},{"b":"7F0F01CDB000","o":"10646C1"},{"b":"7F0F01CDB000","o":"10648C9"},{"b":"7F0F01CDB000","o":"1065A2F","s":"_ZN5mongo4repl10TaskRunner9_runTasksEv"},{"b":"7F0F01CDB000","o":"14EB70C","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE"},{"b":"7F0F01CDB000","o":"14EC1BC","s":"_ZN5mongo10ThreadPool13_consumeTasksEv"},{"b":"7F0F01CDB000","o":"14ECBA6","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"7F0F01CDB000","o":"1FE8130"},{"b":"7F0F00935000","o":"8184"},{"b":"7F0F0056C000","o":"FDFFD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.5", "gitVersion" : "520b8f3092c48d934f0cd78ab5f40fe594f96863", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-121-generic", "version" : "#170-Ubuntu SMP Wed Jun 14 09:04:33 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7F0F01CDB000", "elfType" : 3, "buildId" : "4CA0F472716BD03B90F2DCC8460A6D07F8994AD4" }, { "b" : "7FFD7D4F3000", "elfType" : 3, "buildId" : "1065C8F862FD32864124951EF77EC6AB63637C5A" }, { "b" : "7F0F01857000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "48A664AE6B0B4918A3EB0156C6364C4F084232FD" }, { "b" : "7F0F0147B000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "6B8997EA892A7FF37AC8CAA8F239D595251889BB" }, { "b" : "7F0F01273000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "AC72654C6338205F30190061C0D781CB0039B793" }, { "b" : "7F0F0106F000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "EED41ABB999C74882F001C53979CC820ED15BA82" }, { "b" : "7F0F00D69000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "8F0318B9CC6FD523C2587A15C5447ABBB8CD813D" }, { "b" : "7F0F00B53000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7F0F00935000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "F48E96A1F4A549776CA4167095AD7527720D4B0E" }, { "b" : "7F0F0056C000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "3217CA3A53A930C7BB1E5C83789D09B30B0F3B39" }, { "b" : "7F0F01AB6000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "37AFDBB933B8409476E845DF5FB11BC77CBCEEE6" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f0f0324ccb1]
mongod(+0x1570EC9) [0x7f0f0324bec9]
mongod(+0x1571536) [0x7f0f0324c536]
libpthread.so.0(+0x10330) [0x7f0f00945330]
libpthread.so.0(__pthread_mutex_unlock+0x0) [0x7f0f009407b0]
mongod(+0x1EA602B) [0x7f0f03b8102b]
mongod(__wt_split_multi+0xA5) [0x7f0f03b85395]
mongod(__wt_evict+0x92A) [0x7f0f03bd66aa]
mongod(+0x1EF62F3) [0x7f0f03bd12f3]
mongod(__wt_cache_eviction_worker+0x47A) [0x7f0f03bd265a]
mongod(+0x1F5442B) [0x7f0f03c2f42b]
mongod(_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE+0x52) [0x7f0f02f568a2]
mongod(_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE+0x90) [0x7f0f02f56bb0]
mongod(_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEm+0xB8) [0x7f0f02f4ac48]
mongod(_ZN5mongo21WiredTigerRecordStore12insertRecordEPNS_16OperationContextEPKcib+0x46) [0x7f0f02f4b076]
mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextERKNS_7BSONObjERKSt6vectorIPNS_15MultiIndexBlockESaIS8_EEb+0x102) [0x7f0f026bfa62]
mongod(+0xF12483) [0x7f0f02bed483]
mongod(+0xF122F5) [0x7f0f02bed2f5]
mongod(+0x10646C1) [0x7f0f02d3f6c1]
mongod(+0x10648C9) [0x7f0f02d3f8c9]
mongod(_ZN5mongo4repl10TaskRunner9_runTasksEv+0xAF) [0x7f0f02d40a2f]
mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x14C) [0x7f0f031c670c]
mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xBC) [0x7f0f031c71bc]
mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x96) [0x7f0f031c7ba6]
mongod(+0x1FE8130) [0x7f0f03cc3130]
libpthread.so.0(+0x8184) [0x7f0f0093d184]
libc.so.6(clone+0x6D) [0x7f0f00669ffd]
----- END BACKTRACE -----
Secondary crash:
2017-06-23T20:10:42.964+0000 F - [thread2] Invalid access at address: 0x78
2017-06-23T20:10:43.011+0000 F - [thread2] Got signal: 11 (Segmentation fault).
0x7fe640585cb1 0x7fe640584ec9 0x7fe640585536 0x7fe63dc7e330 0x7fe63dc797b0 0x7fe640eba02b 0x7fe640ebe395 0x7fe640f0f6aa 0x7fe640f0a2f3 0x7fe640f0a687 0x7fe640f0c033 0x7fe640f75f36 0x7fe63dc76184 0x7fe63d9a2ffd
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"7FE63F014000","o":"1571CB1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7FE63F014000","o":"1570EC9"},{"b":"7FE63F014000","o":"1571536"},{"b":"7FE63DC6E000","o":"10330"},{"b":"7FE63DC6E000","o":"B7B0","s":"__pthread_mutex_unlock"},{"b":"7FE63F014000","o":"1EA602B"},{"b":"7FE63F014000","o":"1EAA395","s":"__wt_split_multi"},{"b":"7FE63F014000","o":"1EFB6AA","s":"__wt_evict"},{"b":"7FE63F014000","o":"1EF62F3"},{"b":"7FE63F014000","o":"1EF6687"},{"b":"7FE63F014000","o":"1EF8033","s":"__wt_evict_thread_run"},{"b":"7FE63F014000","o":"1F61F36","s":"__wt_thread_run"},{"b":"7FE63DC6E000","o":"8184"},{"b":"7FE63D8A5000","o":"FDFFD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.5", "gitVersion" : "520b8f3092c48d934f0cd78ab5f40fe594f96863", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-121-generic", "version" : "#170-Ubuntu SMP Wed Jun 14 09:04:33 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7FE63F014000", "elfType" : 3, "buildId" : "4CA0F472716BD03B90F2DCC8460A6D07F8994AD4" }, { "b" : "7FFE5D918000", "elfType" : 3, "buildId" : "1065C8F862FD32864124951EF77EC6AB63637C5A" }, { "b" : "7FE63EB90000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "48A664AE6B0B4918A3EB0156C6364C4F084232FD" }, { "b" : "7FE63E7B4000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "6B8997EA892A7FF37AC8CAA8F239D595251889BB" }, { "b" : "7FE63E5AC000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "AC72654C6338205F30190061C0D781CB0039B793" }, { "b" : "7FE63E3A8000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "EED41ABB999C74882F001C53979CC820ED15BA82" }, { "b" : "7FE63E0A2000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "8F0318B9CC6FD523C2587A15C5447ABBB8CD813D" }, { "b" : "7FE63DE8C000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7FE63DC6E000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "F48E96A1F4A549776CA4167095AD7527720D4B0E" }, { "b" : "7FE63D8A5000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "3217CA3A53A930C7BB1E5C83789D09B30B0F3B39" }, { "b" : "7FE63EDEF000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "37AFDBB933B8409476E845DF5FB11BC77CBCEEE6" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7fe640585cb1]
mongod(+0x1570EC9) [0x7fe640584ec9]
mongod(+0x1571536) [0x7fe640585536]
libpthread.so.0(+0x10330) [0x7fe63dc7e330]
libpthread.so.0(__pthread_mutex_unlock+0x0) [0x7fe63dc797b0]
mongod(+0x1EA602B) [0x7fe640eba02b]
mongod(__wt_split_multi+0xA5) [0x7fe640ebe395]
mongod(__wt_evict+0x92A) [0x7fe640f0f6aa]
mongod(+0x1EF62F3) [0x7fe640f0a2f3]
mongod(+0x1EF6687) [0x7fe640f0a687]
mongod(__wt_evict_thread_run+0xD3) [0x7fe640f0c033]
mongod(__wt_thread_run+0x16) [0x7fe640f75f36]
libpthread.so.0(+0x8184) [0x7fe63dc76184]
libc.so.6(clone+0x6D) [0x7fe63d9a2ffd]
----- END BACKTRACE -----
At first I thought it might be a bad volume, but after trying multiple new volumes and seeing this happen 4 times, I don't believe it is a hardware problem.
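Both captured backtraces pass through `__wt_split_multi` and `__wt_evict` at the same offsets, which suggests the two crashes share a code path rather than being unrelated failures. The sketch below is only an illustration of how the frames can be cross-checked: it takes two frames copied from each backtrace JSON above and adds the load base (`b`) to the offset (`o`) to reproduce the absolute addresses in the symbolized frame list. The helper name `resolve` is hypothetical and not part of any MongoDB tool.

```python
import json

# Two frames copied from the initial-sync backtrace JSON in this report.
INITIAL_SYNC_FRAMES = (
    '[{"b":"7F0F01CDB000","o":"1EAA395","s":"__wt_split_multi"},'
    '{"b":"7F0F01CDB000","o":"1EFB6AA","s":"__wt_evict"}]'
)
# The same two frames copied from the secondary-crash backtrace JSON.
SECONDARY_FRAMES = (
    '[{"b":"7FE63F014000","o":"1EAA395","s":"__wt_split_multi"},'
    '{"b":"7FE63F014000","o":"1EFB6AA","s":"__wt_evict"}]'
)


def resolve(frames_json):
    """Hypothetical helper: return (symbol, offset, absolute address) per frame."""
    resolved = []
    for frame in json.loads(frames_json):
        # Absolute address = load base of the binary + offset within it.
        address = int(frame["b"], 16) + int(frame["o"], 16)
        symbol = frame.get("s", "<no symbol>")
        resolved.append((symbol, frame["o"], hex(address)))
    return resolved


initial_sync = resolve(INITIAL_SYNC_FRAMES)
secondary = resolve(SECONDARY_FRAMES)

# Frames whose symbol and offset are identical in both crashes.
shared = [
    (a[0], a[1])
    for a, b in zip(initial_sync, secondary)
    if a[0] == b[0] and a[1] == b[1]
]

for symbol, offset, address in initial_sync + secondary:
    print(symbol, offset, address)
```

The offsets match across the two crashes while the absolute addresses differ, which is what one would expect when the same build is loaded at a different base address in each process.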
- duplicates
-
SERVER-29850 Access violation due to a bug in internal page splitting in WiredTiger
- Closed