-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Sharding
-
ALL
-
Sharding 2019-01-28, Sharding 2019-02-11, Sharding 2019-02-25, Sharding 2019-03-11, Sharding 2019-03-25, Sharding 2019-04-08
-
21
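This line (reached from ReplicationCoordinatorExternalStateImpl::shardingOnStepDownHook() in the backtrace below) tries to pause a PeriodicRunner task. If that happens after this line in shutdown, which stops the PeriodicRunner, the following invariant trips: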
[js_test:auth] 2018-12-17T22:08:02.168-0500 d20027| 2018-12-17T17:08:02.168-0500 I ASIO [Replication] Dropping all pooled connections to redbeard:20029 due to HostUnreachable: Error connecting to redbeard:20029 (127.0.0.1:20029) :: caused by :: Connection refused
[js_test:auth] 2018-12-17T22:08:02.168-0500 d20027| 2018-12-17T17:08:02.168-0500 I REPL_HB [replexec-0] Error in heartbeat (requestId: 984) to redbeard:20029, response status: HostUnreachable: Error connecting to redbeard:20029 (127.0.0.1:20029) :: caused by :: Connection refused
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I REPL [replexec-1] can't see a majority of the set, relinquishing primary
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I REPL [replexec-1] Stepping down from primary in response to heartbeat
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I REPL [replexec-1] transition to SECONDARY from PRIMARY
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I NETWORK [replexec-1] Skip closing connection for connection # 43
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I SHARDING [replexec-1] The ChunkSplitter has stopped and will no longer run new autosplit tasks. Any autosplit tasks that have already started will be allowed to finish.
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 F - [replexec-1] Invariant failure _execStatus == PeriodicJobImpl::ExecutionStatus::RUNNING src/mongo/util/periodic_runner_impl.cpp 143
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 I NETWORK [conn1] end connection 127.0.0.1:47664 (1 connection now open)
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| 2018-12-17T17:08:02.180-0500 F - [replexec-1]
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027|
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027| ***aborting after invariant() failure
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027|
[js_test:auth] 2018-12-17T22:08:02.180-0500 d20027|
[js_test:auth] 2018-12-17T22:08:02.181-0500 d20027| 2018-12-17T17:08:02.180-0500 F - [replexec-1] Got signal: 6 (Aborted).
[js_test:auth] 2018-12-17T22:08:02.181-0500 d20027| 0x7f93dcfda36a 0x7f93dcfd9c6e 0x7f93dcfd9d0f 0x7f93db02c3c0 0x7f93dae8dd7f 0x7f93dae78672 0x7f93dcf3616f 0x7f93df743740 0x7f93df744841 0x7f93e02702e4 0x7f93e01bfc71 0x7f93e01d3e23 0x7f93df1127b5 0x7f93df112eae 0x7f93df1ffbef 0x7f93df2003d0 0x7f93df200d65 0x7f93db113063 0x7f93db021a9d 0x7f93daf51b23
[js_test:auth] 2018-12-17T22:08:02.181-0500 d20027| ----- BEGIN BACKTRACE -----
SNIP
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libbase.so(mongo::printStackTrace(std::basic_ostream<char, std::char_traits<char> >&)+0x3A) [0x7f93dcfda36a]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libbase.so(+0x176C6E) [0x7f93dcfd9c6e]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libbase.so(+0x176D0F) [0x7f93dcfd9d0f]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libpthread.so.0(+0x123C0) [0x7f93db02c3c0]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libc.so.6(gsignal+0x10F) [0x7f93dae8dd7f]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libc.so.6(abort+0x125) [0x7f93dae78672]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libbase.so(mongo::invariantFailedWithMsg(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, unsigned int)+0x0) [0x7f93dcf3616f]
[js_test:auth] 2018-12-17T22:08:02.182-0500 d20027| libperiodic_runner_impl.so(+0x4740) [0x7f93df743740]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libperiodic_runner_impl.so(mongo::PeriodicRunnerImpl::PeriodicJobHandleImpl::pause()+0x81) [0x7f93df744841]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libserveronly_repl.so(mongo::repl::ReplicationCoordinatorExternalStateImpl::shardingOnStepDownHook()+0xC4) [0x7f93e02702e4]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| librepl_coordinator_impl.so(mongo::repl::ReplicationCoordinatorImpl::_performPostMemberStateUpdateAction(mongo::repl::ReplicationCoordinatorImpl::PostMemberStateUpdateAction)+0x201) [0x7f93e01bfc71]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| librepl_coordinator_impl.so(mongo::repl::ReplicationCoordinatorImpl::_stepDownFinish(mongo::executor::TaskExecutor::CallbackArgs const&, mongo::executor::TaskExecutor::EventHandle const&)+0x183) [0x7f93e01d3e23]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libthread_pool_task_executor.so(mongo::executor::ThreadPoolTaskExecutor::runCallback(std::shared_ptr<mongo::executor::ThreadPoolTaskExecutor::CallbackState>)+0x175) [0x7f93df1127b5]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libthread_pool_task_executor.so(+0xCEAE) [0x7f93df112eae]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libthread_pool.so(mongo::ThreadPool::_doOneTask(std::unique_lock<std::mutex>*)+0x15F) [0x7f93df1ffbef]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libthread_pool.so(mongo::ThreadPool::_consumeTasks()+0xA0) [0x7f93df2003d0]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libthread_pool.so(mongo::ThreadPool::_workerThreadBody(mongo::ThreadPool*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x95) [0x7f93df200d65]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libstdc++.so.6(+0xBC063) [0x7f93db113063]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libpthread.so.0(+0x7A9D) [0x7f93db021a9d]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| libc.so.6(clone+0x43) [0x7f93daf51b23]
[js_test:auth] 2018-12-17T22:08:02.183-0500 d20027| ----- END BACKTRACE -----
I think this was likely introduced by this commit.
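To make the ordering problem concrete, here is a minimal, self-contained sketch of the race described above. It is not the MongoDB implementation and not the fix that was eventually applied: ToyPeriodicJob, tryPause() and stepDownAfterShutdownDemo() are hypothetical names invented for illustration. The only detail taken from the report is that pause() requires the job to be in the RUNNING state; the other state names are assumptions. The toy replaces the hard invariant with a checked transition so that a late pause after shutdown becomes a no-op instead of an abort.

```cpp
#include <cassert>
#include <mutex>

// Toy stand-in for a periodic job. Hypothetical; NOT mongo::PeriodicRunnerImpl.
class ToyPeriodicJob {
public:
    // RUNNING comes from the invariant in the log; the other states are assumed.
    enum class ExecutionStatus { NOT_SCHEDULED, RUNNING, PAUSED, CANCELED };

    void start() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_execStatus == ExecutionStatus::NOT_SCHEDULED) {
            _execStatus = ExecutionStatus::RUNNING;
        }
    }

    // What the runner's shutdown path would do to each job in this toy model.
    void stop() {
        std::lock_guard<std::mutex> lk(_mutex);
        _execStatus = ExecutionStatus::CANCELED;
    }

    // Tolerant pause: returns false instead of aborting when the job is not
    // RUNNING (for example, because shutdown already canceled it).
    bool tryPause() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_execStatus != ExecutionStatus::RUNNING) {
            return false;
        }
        _execStatus = ExecutionStatus::PAUSED;
        return true;
    }

    ExecutionStatus status() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _execStatus;
    }

private:
    mutable std::mutex _mutex;
    ExecutionStatus _execStatus = ExecutionStatus::NOT_SCHEDULED;
};

// Replays the problematic ordering from this ticket: shutdown stops the job
// first, then the step-down hook tries to pause it.
inline bool stepDownAfterShutdownDemo() {
    ToyPeriodicJob job;
    job.start();
    job.stop();                          // shutdown wins the race
    const bool paused = job.tryPause();  // step-down hook arrives late
    assert(!paused);
    return job.status() == ToyPeriodicJob::ExecutionStatus::CANCELED;
}
```

In this sketch a caller such as a step-down hook would check the return value of tryPause() rather than assume the job is still running. Whether tolerating the late call or fixing the shutdown ordering is the right approach for the real code is a separate design question; the linked SERVER-39936 addresses shutdown ordering.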
- duplicates
-
SERVER-39936 Use PeriodicRunner handles to simplify shutdown ordering
- Closed
- is duplicated by
-
SERVER-40174 PeriodicBalancerConfigRefresher is not thread-safe since it puts its PeriodicJobs on the ServiceContext's PeriodicRunner
- Closed