- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 4.0.12
- Component/s: Networking
- None
- Fully Compatible
- ALL
- Service Arch 2019-09-23, Service Arch 2019-10-07, Service Arch 2019-10-21, Service Arch 2019-11-04
Pre-4.2 versions of mongod and mongos have several serious races in global shutdown. The principal problem is that a number of background executors used for networking are held on user threads by bare pointer. During global shutdown, we kill all user operations (which begins unwinding user stacks) while also shutting down and destroying those global objects. If those user threads unwind after the global executors are destroyed, this can lead to use-after-free errors as well as scheduling errors.
These problems tend to be rare because we terminate the process via _Exit, which doesn't usually join user threads. As a result, we tend to see them only in the presence of many user threads, where the shutdown thread gets descheduled at particular points in execution.
Note that this problem isn't present in 4.2 and later, as we've replaced the executor machinery so that executors are held by shared_ptr and callbacks can see whether they're being run on an executor that has been shut down.
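For illustration only, here is a minimal sketch of the ownership pattern described above. It is not the actual MongoDB code: SketchExecutor, SketchResult, and runShutdownSketch are hypothetical names, and tasks run inline rather than on a background thread. It assumes each user thread holds its own shared_ptr copy, so the executor outlives the global reference, and that schedule() rejects work once shutdown has begun.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a networking executor (not MongoDB's real class).
class SketchExecutor {
public:
    // Runs the task inline and returns true, or returns false after shutdown.
    bool schedule(const std::function<void()>& task) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_shutdown) {
            return false;  // caller can observe that the executor is shut down
        }
        task();
        return true;
    }

    void shutdown() {
        std::lock_guard<std::mutex> lk(_mutex);
        _shutdown = true;
    }

private:
    std::mutex _mutex;
    bool _shutdown = false;
};

struct SketchResult {
    int ran;                     // tasks that ran before shutdown
    int rejected;                // tasks refused after shutdown
    bool aliveAfterGlobalReset;  // executor still alive after the global ref is dropped
};

// Simulates user threads that unwind after the "global" executor reference is gone.
SketchResult runShutdownSketch(int numUserThreads) {
    assert(numUserThreads >= 0);
    auto global = std::make_shared<SketchExecutor>();
    std::weak_ptr<SketchExecutor> observer = global;
    std::atomic<int> ran{0}, rejected{0}, ready{0};
    std::atomic<bool> shutdownDone{false};

    std::vector<std::thread> users;
    for (int i = 0; i < numUserThreads; ++i) {
        // Each user thread owns a shared_ptr copy instead of a bare pointer.
        users.emplace_back([exec = global, &ran, &rejected, &ready, &shutdownDone] {
            exec->schedule([&ran] { ++ran; });  // normal operation
            ++ready;
            while (!shutdownDone.load()) {
                std::this_thread::yield();
            }
            // "Unwinding" after global shutdown: still safe to touch exec.
            if (!exec->schedule([&ran] { ++ran; })) {
                ++rejected;
            }
        });
    }

    while (ready.load() < numUserThreads) {
        std::this_thread::yield();
    }
    global->shutdown();
    global.reset();  // drop the global reference; user copies keep the object alive
    const bool alive = !observer.expired();
    shutdownDone.store(true);

    for (auto& t : users) {
        t.join();
    }
    return SketchResult{ran.load(), rejected.load(), alive};
}
```

With a bare pointer in place of the shared_ptr copy, the second schedule() call would touch a destroyed object, which is the use-after-free described in this ticket.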
- is duplicated by
- SERVER-43379 "Invariant failure _sessions.empty()" on mongos shutdown
- Closed