SERVER-70094 added code to synchronize range deletion with stepdowns. Specifically, it stores the executor of the range deletion thread so that it can be joined when the service is stopped.
This has an unintended consequence: if a stepdown command acquires the RSTL lock before the RangeDeleterService thread does, the stepdown gets stuck while stopping the service, because it waits for the range deleter service executor while the range deleter service thread is itself waiting for the RSTL lock.
The result is a deadlock: the thread holding the RSTL lock waits for an executor that can finish only after it acquires that same lock.
To solve this, we could capture the operation context in addition to the executor, and cancel it before waiting for the executor.
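
The ordering problem and the proposed fix can be illustrated with a small standalone model. This is only a sketch in Python, not the server code: `simulate_stepdown`, `rstl`, `cancelled`, and `range_deleter_task` are hypothetical names, a plain `threading.Lock` stands in for the RSTL lock, and a `threading.Event` stands in for cancelling the captured operation context. The join timeout exists only so the demo can detect the hang; the real join would block indefinitely.

```python
import threading


def simulate_stepdown(cancel_before_join: bool, join_timeout: float = 0.5) -> str:
    """Model the stepdown vs. range deleter ordering (hypothetical sketch)."""
    rstl = threading.Lock()        # stand-in for the RSTL lock
    cancelled = threading.Event()  # stand-in for cancelling the operation context
    outcome = []

    def range_deleter_task():
        # Interruptible lock wait: keep trying the lock until cancelled.
        while not cancelled.is_set():
            if rstl.acquire(timeout=0.01):
                rstl.release()
                outcome.append("acquired")
                return
        outcome.append("interrupted")

    with rstl:  # the stepdown thread wins the race for the lock
        worker = threading.Thread(target=range_deleter_task, daemon=True)
        worker.start()
        if cancel_before_join:
            cancelled.set()  # proposed fix: cancel first, then wait
        worker.join(timeout=join_timeout)
        stuck = worker.is_alive()
        if stuck:
            # Demo-only cleanup so the worker exits; the real code would hang here.
            cancelled.set()
            worker.join()
    return "stuck" if stuck else outcome[0]


if __name__ == "__main__":
    print("without cancel:", simulate_stepdown(cancel_before_join=False))
    print("with cancel:   ", simulate_stepdown(cancel_before_join=True))
```

In this model, joining without cancelling leaves the worker blocked on the lock for as long as the stepdown thread holds it, while cancelling first lets the worker abandon its lock wait so the join can return.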
- is caused by
  - SERVER-70094 Synchronize shutdown with resuming of range deletions (Closed)
- related to
  - SERVER-60161 Deadlock between config server stepdown and _configsvrRenameCollectionMetadata command (Closed)
  - SERVER-70864 Get rid of fine grained scoped range deleter lock (Closed)