  1. Core Server
  2. SERVER-62198

Fix Shutdown error with Progress Monitor

    • Sharding NYC
    • Fully Compatible
    • ALL
    • v4.4
    • Sharding 2021-12-27, Sharding 2022-01-10, Sharding 2022-01-24, Sharding 2022-02-07
    • 19

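      The progress monitor can access invalid memory during shutdown. The log below is from a patch build: the "Health checks progress monitor" thread hits a segmentation fault in FaultManager::getConfig(), called from ProgressMonitor::progressMonitorCheck, immediately after the test logs "Clean up test resources".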

      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.487Z W  ASIO     22601   [thread1] "No TransportLayer configured during NetworkInterface startup"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.491Z I  HEALTH   5936503 [thread1] "Fault manager changed state ","attr":{"state":"StartupCheck"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.491Z W  ASIO     22601   [thread1] "No TransportLayer configured during NetworkInterface startup"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.495Z I  HEALTH   5936503 [thread1] "Fault manager changed state ","attr":{"state":"StartupCheck"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.495Z I  HEALTH   5936601 [thread1] "Shutting down periodic health checks"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.495Z D1 HEALTH   6136801 [thread1] "Done shutting down periodic health checks"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.495Z I  ASIO     22582   [thread1] "Killing all outstanding egress activity."
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.495Z W  ASIO     22601   [thread1] "No TransportLayer configured during NetworkInterface startup"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.499Z I  HEALTH   5936503 [thread1] "Fault manager changed state ","attr":{"state":"StartupCheck"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.499Z I  HEALTH   5936601 [thread1] "Shutting down periodic health checks"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.499Z D1 HEALTH   6136801 [thread1] "Done shutting down periodic health checks"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.499Z I  ASIO     22582   [thread1] "Killing all outstanding egress activity."
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.499Z D1 HEALTH   5956701 [thread1] "Instantiated health observers","attr":{"managerState":"StartupCheck","observersCount":1}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z D1 HEALTH   5936504 [thread1] "Fault manager recieved health check result","attr":{"state":"StartupCheck","result":{"type":1,"description":"resolved","severity":0},"passed":true}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z I  HEALTH   5936502 [thread1] "The fault manager initial health checks have completed","attr":{"state":"Ok"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z I  HEALTH   5936503 [thread1] "Fault manager changed state ","attr":{"state":"Ok"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z D1 HEALTH   5936504 [thread1] "Fault manager recieved health check result","attr":{"state":"Ok","result":{"type":1,"description":"error","severity":1},"passed":false}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z I  TEST     6007905 [thread1] "Clean up test resources"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z F  CONTROL  4757800 [Health checks progress monitor] "Writing fatal message","attr":{"message":"Invalid access at address: 0x5633662f5230"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z F  CONTROL  4757800 [Health checks progress monitor] "Writing fatal message","attr":{"message":"Got signal: 11 (Segmentation fault).\n"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z I  ASIO     22582   [FaultManagerTest] "Killing all outstanding egress activity."
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.500Z W  ASIO     22601   [thread1] "No TransportLayer configured during NetworkInterface startup"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.504Z I  HEALTH   5936503 [thread1] "Fault manager changed state ","attr":{"state":"StartupCheck"}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.504Z I  HEALTH   5936601 [thread1] "Shutting down periodic health checks"
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31380   [Health checks progress monitor] "BACKTRACE","attr":{"bt":{"backtrace":[{"a":"7F141D973AC2","b":"7F141D787000","o":"1ECAC2","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.360","s+":"202"},{"a":"7F141D976029","b":"7F141D787000","o":"1EF029","s":"_ZN5mongo15printStackTraceEv","s+":"29"},{"a":"7F141D96F643","b":"7F141D787000","o":"1E8643","s":"abruptQuitWithAddrSignal","s+":"F3"},{"a":"7F141B8F4D80","b":"7F141B8E2000","o":"12D80","s":"funlockfile","s+":"50"},{"a":"7F141E2F5574","b":"7F141E2BB000","o":"3A574","s":"_ZNK5mongo14process_health12FaultManager9getConfigEv","s+":"1C4"},{"a":"7F141E31B5C4","b":"7F141E2BB000","o":"605C4","s":"_ZN5mongo14process_health15ProgressMonitor20progressMonitorCheckESt8functionIFvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE","s+":"EB4"},{"a":"7F141E31C310","b":"7F141E2BB000","o":"61310","s":"_ZN5mongo14process_health15ProgressMonitor20_progressMonitorLoopEv","s+":"F0"},{"a":"7F141E31C52C","b":"7F141E2BB000","o":"6152C","s":"_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN5mongo4stdx6threadC4IZNS3_14process_health15ProgressMonitorC4EPNS7_12FaultManagerEPNS3_14ServiceContextESt8functionIFvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEUlvE_JELi0EEET_DpOT0_EUlvE_EEEEE6_M_runEv","s+":"5C"},{"a":"7F141DC32AAF","b":"7F141DC21000","o":"11AAF","s":"execute_native_thread_routine","s+":"F"},{"a":"7F141B8EA2DE","b":"7F141B8E2000","o":"82DE","s":"start_thread","s+":"FE"},{"a":"7F141B618A63","b":"7F141B51C000","o":"FCA63","s":"clone","s+":"43"}],"processInfo":{"mongodbVersion":"unknown","gitVersion":"none","compiledModules":["unknown"],"uname":{"sysname":"Linux","release":"4.18.0-80.1.2.el8_0.x86_64","version":"#1 SMP Sun Apr 28 09:21:22 UTC 2019","machine":"x86_64"},"somap":[{"b":"7F141E2BB000","path":"/data/mci/ab122ce74c68dd7aa22d5574b08e6e3a/src/build/install/bin/../lib/libfault_manager.so","elfType":3,"buildId":"D11BAEAA8F2041055E5822E8A0D8499C644AFD07"},{"b":"7F141DC21000","path":"/data/mci/ab122ce74c68dd7aa22d5574b08e6e3a/src/build/install/bin/../lib/libthread_pool.so","elfType":3,"buildId":"0D1E3DC3FA81AE08F392CB1F56D68195CA011A9C"},{"b":"7F141D787000","path":"/data/mci/ab122ce74c68dd7aa22d5574b08e6e3a/src/build/install/bin/../lib/libbase.so","elfType":3,"buildId":"B8699F620DFEFF34E1AA092DEEBFAB59873103A9"},{"b":"7F141B8E2000","path":"/lib64/libpthread.so.0","elfType":3,"buildId":"5326B8728FA01B7149DAC943100F1405533E76CE"},{"b":"7F141B51C000","path":"/lib64/libc.so.6","elfType":3,"buildId":"0598B7D6A05E64AE676133CF6331AF5578888AD0"}]}}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141D973AC2","b":"7F141D787000","o":"1ECAC2","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.360","s+":"202"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141D976029","b":"7F141D787000","o":"1EF029","s":"_ZN5mongo15printStackTraceEv","s+":"29"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141D96F643","b":"7F141D787000","o":"1E8643","s":"abruptQuitWithAddrSignal","s+":"F3"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141B8F4D80","b":"7F141B8E2000","o":"12D80","s":"funlockfile","s+":"50"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141E2F5574","b":"7F141E2BB000","o":"3A574","s":"_ZNK5mongo14process_health12FaultManager9getConfigEv","s+":"1C4"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141E31B5C4","b":"7F141E2BB000","o":"605C4","s":"_ZN5mongo14process_health15ProgressMonitor20progressMonitorCheckESt8functionIFvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEE","s+":"EB4"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141E31C310","b":"7F141E2BB000","o":"61310","s":"_ZN5mongo14process_health15ProgressMonitor20_progressMonitorLoopEv","s+":"F0"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141E31C52C","b":"7F141E2BB000","o":"6152C","s":"_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN5mongo4stdx6threadC4IZNS3_14process_health15ProgressMonitorC4EPNS7_12FaultManagerEPNS3_14ServiceContextESt8functionIFvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEEUlvE_JELi0EEET_DpOT0_EUlvE_EEEEE6_M_runEv","s+":"5C"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141DC32AAF","b":"7F141DC21000","o":"11AAF","s":"execute_native_thread_routine","s+":"F"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141B8EA2DE","b":"7F141B8E2000","o":"82DE","s":"start_thread","s+":"FE"}}
      [cpp_unit_test:fault_base_classes_test] | 2021-12-20T21:27:28.515Z I  CONTROL  31445   [Health checks progress monitor] "Frame","attr":{"frame":{"a":"7F141B618A63","b":"7F141B51C000","o":"FCA63","s":"clone","s+":"43"}}
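
      The frames above demangle to mongo::process_health::FaultManager::getConfig() const, called from mongo::process_health::ProgressMonitor::progressMonitorCheck(std::function<void(std::string)>) on the "Health checks progress monitor" thread. One plausible reading is that the monitor thread is still running while the object it reads from is being torn down. The sketch below illustrates that general pattern and one common remedy: stop and join the background thread before destroying what it reads. The names Manager, Monitor and runAndShutDown are hypothetical stand-ins for illustration. This is not the MongoDB code and not necessarily the fix applied for this ticket.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the object the monitor thread reads from.
class Manager {
public:
    int getConfig() const { return _config; }

private:
    int _config = 42;
};

// Hypothetical monitor that polls a Manager from a background thread.
class Monitor {
public:
    explicit Monitor(const Manager* manager) : _manager(manager) {
        _thread = std::thread([this] { _loop(); });
    }

    // Requests a stop and waits for the loop to exit. Once this returns,
    // the background thread no longer dereferences _manager.
    void join() {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _stop = true;
        }
        _cv.notify_all();
        if (_thread.joinable()) {
            _thread.join();
        }
    }

    ~Monitor() { join(); }

    int checks() const { return _checks.load(); }

private:
    void _loop() {
        std::unique_lock<std::mutex> lk(_mutex);
        while (!_stop) {
            lk.unlock();
            // This read is only valid while *_manager is alive.
            (void)_manager->getConfig();
            ++_checks;
            lk.lock();
            // Sleep until the next check, waking early if a stop is requested.
            _cv.wait_for(lk, std::chrono::milliseconds(1), [this] { return _stop; });
        }
    }

    const Manager* _manager;
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _stop = false;
    std::atomic<int> _checks{0};
    std::thread _thread;
};

// Shuts down in a safe order and returns how many checks ran.
int runAndShutDown() {
    auto manager = std::make_unique<Manager>();
    Monitor monitor(manager.get());
    while (monitor.checks() == 0) {
        std::this_thread::yield();  // wait for at least one check
    }
    monitor.join();   // stop the reader first
    manager.reset();  // only then destroy what it reads
    return monitor.checks();
}
```

      Swapping the last two steps in runAndShutDown (resetting the manager before joining the monitor) would recreate the kind of invalid access reported here, because the loop could call getConfig() on a destroyed object.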
      

            Assignee:
            backlog-server-sharding-nyc [DO NOT USE] Backlog - Sharding NYC
            Reporter:
            lamont.nelson@mongodb.com Lamont Nelson
            Votes:
            0
            Watchers:
            3