Core Server / SERVER-30064

Multiple nodes crashed with Invalid access at address: 0x78 | Got signal: 11 (Segmentation fault).

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.4.5
    • Component/s: None
    • Environment:
      3-node replica set running in Amazon AWS on r3.2xlarge instances on Ubuntu 14.04.5 LTS
    • ALL

      We've had multiple non-primary nodes crash with the error:

      Invalid access at address: 0x78
      Got signal: 11 (Segmentation fault).

      So far, 2 secondaries and 2 initial syncs have crashed with the above error. I captured 2 of the crashes:

      Initial Sync crash:

      2017-07-08T12:34:32.219+0000 I NETWORK  [conn38648] received client metadata from 10.0.8.200:34396 conn38648: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Ubuntu 14.04 trusty", architecture: "x86_64", version: "3.13.0-121-generic" }, platform: "CPython 3.4.3.final.0", application: { name: "monitor: (pipeline.production.pazien, pid 10143)" } }
      2017-07-08T12:34:32.223+0000 I NETWORK  [thread1] connection accepted from 10.0.8.200:34400 #38649 (222 connections now open)
      2017-07-08T12:34:32.241+0000 I -        [conn38648] end connection 10.0.8.200:34396 (222 connections now open)
      2017-07-08T12:34:32.243+0000 I NETWORK  [conn38649] received client metadata from 10.0.8.200:34400 conn38649: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Ubuntu 14.04 trusty", architecture: "x86_64", version: "3.13.0-121-generic" }, platform: "CPython 3.4.3.final.0", application: { name: "monitor: (pipeline.production.pazien, pid 10143)" } }
      2017-07-08T12:34:33.958+0000 I -        [repl writer worker 10]   pazien.processor.settlements collection clone progress: 238795326/1265553643 18% (documents copied)
      2017-07-08T12:34:39.872+0000 I -        [conn38647] end connection 10.0.8.200:34388 (221 connections now open)
      2017-07-08T12:34:39.872+0000 I -        [conn38649] end connection 10.0.8.200:34400 (221 connections now open)
      2017-07-08T12:34:42.414+0000 I -        [conn38639] end connection 10.0.8.200:34014 (219 connections now open)
      2017-07-08T12:34:42.414+0000 I -        [conn38637] end connection 10.0.8.200:34001 (218 connections now open)
      2017-07-08T12:34:55.773+0000 F -        [InitialSyncInserters-pazien.processor.settlements0] Invalid access at address: 0x78
      2017-07-08T12:34:55.828+0000 F -        [InitialSyncInserters-pazien.processor.settlements0] Got signal: 11 (Segmentation fault).
      
       0x7f0f0324ccb1 0x7f0f0324bec9 0x7f0f0324c536 0x7f0f00945330 0x7f0f009407b0 0x7f0f03b8102b 0x7f0f03b85395 0x7f0f03bd66aa 0x7f0f03bd12f3 0x7f0f03bd265a 0x7f0f03c2f42b 0x7f0f02f568a2 0x7f0f02f56bb0 0x7f0f02f4ac48 0x7f0f02f4b076 0x7f0f026bfa62 0x7f0f02bed483 0x7f0f02bed2f5 0x7f0f02d3f6c1 0x7f0f02d3f8c9 0x7f0f02d40a2f 0x7f0f031c670c 0x7f0f031c71bc 0x7f0f031c7ba6 0x7f0f03cc3130 0x7f0f0093d184 0x7f0f00669ffd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"7F0F01CDB000","o":"1571CB1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7F0F01CDB000","o":"1570EC9"},{"b":"7F0F01CDB000","o":"1571536"},{"b":"7F0F00935000","o":"10330"},{"b":"7F0F00935000","o":"B7B0","s":"__pthread_mutex_unlock"},{"b":"7F0F01CDB000","o":"1EA602B"},{"b":"7F0F01CDB000","o":"1EAA395","s":"__wt_split_multi"},{"b":"7F0F01CDB000","o":"1EFB6AA","s":"__wt_evict"},{"b":"7F0F01CDB000","o":"1EF62F3"},{"b":"7F0F01CDB000","o":"1EF765A","s":"__wt_cache_eviction_worker"},{"b":"7F0F01CDB000","o":"1F5442B"},{"b":"7F0F01CDB000","o":"127B8A2","s":"_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE"},{"b":"7F0F01CDB000","o":"127BBB0","s":"_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE"},{"b":"7F0F01CDB000","o":"126FC48","s":"_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEm"},{"b":"7F0F01CDB000","o":"1270076","s":"_ZN5mongo21WiredTigerRecordStore12insertRecordEPNS_16OperationContextEPKcib"},{"b":"7F0F01CDB000","o":"9E4A62","s":"_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextERKNS_7BSONObjERKSt6vectorIPNS_15MultiIndexBlockESaIS8_EEb"},{"b":"7F0F01CDB000","o":"F12483"},{"b":"7F0F01CDB000","o":"F122F5"},{"b":"7F0F01CDB000","o":"10646C1"},{"b":"7F0F01CDB000","o":"10648C9"},{"b":"7F0F01CDB000","o":"1065A2F","s":"_ZN5mongo4repl10TaskRunner9_runTasksEv"},{"b":"7F0F01CDB000","o":"14EB70C","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE"},{"b":"7F0F01CDB000","o":"14EC1BC","s":"_ZN5mongo10ThreadPool13_consumeTasksEv"},{"b":"7F0F01CDB000","o":"14ECBA6","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"},{"b":"7F0F01CDB000","o":"1FE8130"},{"b":"7F0F00935000","o":"8184"},{"b":"7F0F0056C000","o":"FDFFD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.5", "gitVersion" : "520b8f3092c48d934f0cd78ab5f40fe594f96863", "compiledModules" : [], "uname" : { 
"sysname" : "Linux", "release" : "3.13.0-121-generic", "version" : "#170-Ubuntu SMP Wed Jun 14 09:04:33 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7F0F01CDB000", "elfType" : 3, "buildId" : "4CA0F472716BD03B90F2DCC8460A6D07F8994AD4" }, { "b" : "7FFD7D4F3000", "elfType" : 3, "buildId" : "1065C8F862FD32864124951EF77EC6AB63637C5A" }, { "b" : "7F0F01857000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "48A664AE6B0B4918A3EB0156C6364C4F084232FD" }, { "b" : "7F0F0147B000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "6B8997EA892A7FF37AC8CAA8F239D595251889BB" }, { "b" : "7F0F01273000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "AC72654C6338205F30190061C0D781CB0039B793" }, { "b" : "7F0F0106F000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "EED41ABB999C74882F001C53979CC820ED15BA82" }, { "b" : "7F0F00D69000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "8F0318B9CC6FD523C2587A15C5447ABBB8CD813D" }, { "b" : "7F0F00B53000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7F0F00935000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "F48E96A1F4A549776CA4167095AD7527720D4B0E" }, { "b" : "7F0F0056C000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "3217CA3A53A930C7BB1E5C83789D09B30B0F3B39" }, { "b" : "7F0F01AB6000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "37AFDBB933B8409476E845DF5FB11BC77CBCEEE6" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7f0f0324ccb1]
       mongod(+0x1570EC9) [0x7f0f0324bec9]
       mongod(+0x1571536) [0x7f0f0324c536]
       libpthread.so.0(+0x10330) [0x7f0f00945330]
       libpthread.so.0(__pthread_mutex_unlock+0x0) [0x7f0f009407b0]
       mongod(+0x1EA602B) [0x7f0f03b8102b]
       mongod(__wt_split_multi+0xA5) [0x7f0f03b85395]
       mongod(__wt_evict+0x92A) [0x7f0f03bd66aa]
       mongod(+0x1EF62F3) [0x7f0f03bd12f3]
       mongod(__wt_cache_eviction_worker+0x47A) [0x7f0f03bd265a]
       mongod(+0x1F5442B) [0x7f0f03c2f42b]
       mongod(_ZN5mongo22WiredTigerRecoveryUnit8_txnOpenEPNS_16OperationContextE+0x52) [0x7f0f02f568a2]
       mongod(_ZN5mongo16WiredTigerCursorC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEmbPNS_16OperationContextE+0x90) [0x7f0f02f56bb0]
       mongod(_ZN5mongo21WiredTigerRecordStore14_insertRecordsEPNS_16OperationContextEPNS_6RecordEm+0xB8) [0x7f0f02f4ac48]
       mongod(_ZN5mongo21WiredTigerRecordStore12insertRecordEPNS_16OperationContextEPKcib+0x46) [0x7f0f02f4b076]
       mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextERKNS_7BSONObjERKSt6vectorIPNS_15MultiIndexBlockESaIS8_EEb+0x102) [0x7f0f026bfa62]
       mongod(+0xF12483) [0x7f0f02bed483]
       mongod(+0xF122F5) [0x7f0f02bed2f5]
       mongod(+0x10646C1) [0x7f0f02d3f6c1]
       mongod(+0x10648C9) [0x7f0f02d3f8c9]
       mongod(_ZN5mongo4repl10TaskRunner9_runTasksEv+0xAF) [0x7f0f02d40a2f]
       mongod(_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockISt5mutexE+0x14C) [0x7f0f031c670c]
       mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0xBC) [0x7f0f031c71bc]
       mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x96) [0x7f0f031c7ba6]
       mongod(+0x1FE8130) [0x7f0f03cc3130]
       libpthread.so.0(+0x8184) [0x7f0f0093d184]
       libc.so.6(clone+0x6D) [0x7f0f00669ffd]
      -----  END BACKTRACE  -----
      
      
      Secondary crash:
      2017-06-23T20:10:42.964+0000 F -        [thread2] Invalid access at address: 0x78
      2017-06-23T20:10:43.011+0000 F -        [thread2] Got signal: 11 (Segmentation fault).
      
       0x7fe640585cb1 0x7fe640584ec9 0x7fe640585536 0x7fe63dc7e330 0x7fe63dc797b0 0x7fe640eba02b 0x7fe640ebe395 0x7fe640f0f6aa 0x7fe640f0a2f3 0x7fe640f0a687 0x7fe640f0c033 0x7fe640f75f36 0x7fe63dc76184 0x7fe63d9a2ffd
      ----- BEGIN BACKTRACE -----
      {"backtrace":[{"b":"7FE63F014000","o":"1571CB1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"7FE63F014000","o":"1570EC9"},{"b":"7FE63F014000","o":"1571536"},{"b":"7FE63DC6E000","o":"10330"},{"b":"7FE63DC6E000","o":"B7B0","s":"__pthread_mutex_unlock"},{"b":"7FE63F014000","o":"1EA602B"},{"b":"7FE63F014000","o":"1EAA395","s":"__wt_split_multi"},{"b":"7FE63F014000","o":"1EFB6AA","s":"__wt_evict"},{"b":"7FE63F014000","o":"1EF62F3"},{"b":"7FE63F014000","o":"1EF6687"},{"b":"7FE63F014000","o":"1EF8033","s":"__wt_evict_thread_run"},{"b":"7FE63F014000","o":"1F61F36","s":"__wt_thread_run"},{"b":"7FE63DC6E000","o":"8184"},{"b":"7FE63D8A5000","o":"FDFFD","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.5", "gitVersion" : "520b8f3092c48d934f0cd78ab5f40fe594f96863", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.13.0-121-generic", "version" : "#170-Ubuntu SMP Wed Jun 14 09:04:33 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "7FE63F014000", "elfType" : 3, "buildId" : "4CA0F472716BD03B90F2DCC8460A6D07F8994AD4" }, { "b" : "7FFE5D918000", "elfType" : 3, "buildId" : "1065C8F862FD32864124951EF77EC6AB63637C5A" }, { "b" : "7FE63EB90000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "48A664AE6B0B4918A3EB0156C6364C4F084232FD" }, { "b" : "7FE63E7B4000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "6B8997EA892A7FF37AC8CAA8F239D595251889BB" }, { "b" : "7FE63E5AC000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "AC72654C6338205F30190061C0D781CB0039B793" }, { "b" : "7FE63E3A8000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "EED41ABB999C74882F001C53979CC820ED15BA82" }, { "b" : "7FE63E0A2000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "8F0318B9CC6FD523C2587A15C5447ABBB8CD813D" }, { "b" : "7FE63DE8C000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : 
"36311B4457710AE5578C4BF00791DED7359DBB92" }, { "b" : "7FE63DC6E000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "F48E96A1F4A549776CA4167095AD7527720D4B0E" }, { "b" : "7FE63D8A5000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "3217CA3A53A930C7BB1E5C83789D09B30B0F3B39" }, { "b" : "7FE63EDEF000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "37AFDBB933B8409476E845DF5FB11BC77CBCEEE6" } ] }}
       mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x7fe640585cb1]
       mongod(+0x1570EC9) [0x7fe640584ec9]
       mongod(+0x1571536) [0x7fe640585536]
       libpthread.so.0(+0x10330) [0x7fe63dc7e330]
       libpthread.so.0(__pthread_mutex_unlock+0x0) [0x7fe63dc797b0]
       mongod(+0x1EA602B) [0x7fe640eba02b]
       mongod(__wt_split_multi+0xA5) [0x7fe640ebe395]
       mongod(__wt_evict+0x92A) [0x7fe640f0f6aa]
       mongod(+0x1EF62F3) [0x7fe640f0a2f3]
       mongod(+0x1EF6687) [0x7fe640f0a687]
       mongod(__wt_evict_thread_run+0xD3) [0x7fe640f0c033]
       mongod(__wt_thread_run+0x16) [0x7fe640f75f36]
       libpthread.so.0(+0x8184) [0x7fe63dc76184]
       libc.so.6(clone+0x6D) [0x7fe63d9a2ffd]
      -----  END BACKTRACE  -----
      

      At first I thought it might have been a bad volume, but after trying multiple new volumes and seeing this happen 4 times, I don't believe it is a hardware problem.
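
      The two backtraces above can be cross-checked with a short script. The sketch below is only an illustration, not part of the original report: it copies the top frames from each backtrace JSON blob by hand, recomputes each absolute address as base + offset, and counts how many leading frames share the same offset. Comparing raw offsets is only meaningful here because both somap sections list the same buildId for mongod and for libpthread. The names INITIAL_SYNC, SECONDARY, absolute and common_prefix are made up for this example.

```python
# Top frames copied by hand from the two backtrace JSON blobs above,
# starting at __pthread_mutex_unlock: (base, offset, symbol or None).
INITIAL_SYNC = [
    ("7F0F00935000", "B7B0", "__pthread_mutex_unlock"),
    ("7F0F01CDB000", "1EA602B", None),
    ("7F0F01CDB000", "1EAA395", "__wt_split_multi"),
    ("7F0F01CDB000", "1EFB6AA", "__wt_evict"),
    ("7F0F01CDB000", "1EF62F3", None),
    ("7F0F01CDB000", "1EF765A", "__wt_cache_eviction_worker"),
]
SECONDARY = [
    ("7FE63DC6E000", "B7B0", "__pthread_mutex_unlock"),
    ("7FE63F014000", "1EA602B", None),
    ("7FE63F014000", "1EAA395", "__wt_split_multi"),
    ("7FE63F014000", "1EFB6AA", "__wt_evict"),
    ("7FE63F014000", "1EF62F3", None),
    ("7FE63F014000", "1EF6687", None),
]


def absolute(frame):
    """Return the runtime address of a frame: load base plus offset."""
    base, offset, _symbol = frame
    return int(base, 16) + int(offset, 16)


def common_prefix(a, b):
    """Return the leading (offset, symbol) pairs whose offsets match in both traces."""
    shared = []
    for (_, off_a, sym_a), (_, off_b, _) in zip(a, b):
        if off_a != off_b:
            break
        shared.append((off_a, sym_a))
    return shared


if __name__ == "__main__":
    for frame in INITIAL_SYNC:
        print(hex(absolute(frame)), frame[2] or "(no symbol)")
    shared = common_prefix(INITIAL_SYNC, SECONDARY)
    print(len(shared), "leading frames share the same offsets")
```

      With the frames listed here, the recomputed addresses agree with the bracketed addresses in the symbolized traces, and the two crashes only diverge below the __wt_evict call path: one is reached from an application thread via __wt_cache_eviction_worker, the other from a dedicated eviction thread.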

            Assignee:
            kelsey.schubert@mongodb.com Kelsey Schubert
            Reporter:
            marcusahle Marcus Ahle
            Votes:
            0
            Watchers:
            7
