- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Storage
- ALL
- Execution Team 2020-05-04
_buildIndex() is the method that performs the collection scan, drain, and commit phases of an index build. The drain and commit phases take stronger lock modes (the collection lock in S and X, respectively). On the master branch, we always run _buildIndex() through the index build coordinator. This means _buildIndex() runs on a spawned thread (an internal/system operation), and such threads are currently not killable by the state transition (step down) thread. This can result in a 3-way deadlock where:
1) IndexBuildsCoordinatorMongod-X (internal thread) is blocked on a prepare conflict while holding the RSTL in IX mode.
2) Step down enqueues an RSTL request in X mode and is blocked behind the IndexBuildsCoordinatorMongod-X thread.
3) The commitTransaction cmd is waiting to acquire the RSTL in IX mode but is blocked behind the step down thread.
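The three steps above form a cycle in the wait-for graph. The following is a minimal sketch of that cycle, not the server's lock manager: `FifoLock`, `find_cycle`, and the owner names are hypothetical, and the prepare-conflict edge is added by hand as an assumption, since it is not a lock wait.

```python
from collections import deque

# Compatibility of the two lock modes involved (intent-exclusive, exclusive).
COMPATIBLE = {("IX", "IX"): True, ("IX", "X"): False,
              ("X", "IX"): False, ("X", "X"): False}


class FifoLock:
    """Toy lock with a FIFO wait queue (hypothetical, for illustration only).

    A request is granted only if it is compatible with every current holder
    and no other request is already queued ahead of it.
    """

    def __init__(self):
        self.holders = []        # list of (owner, mode)
        self.waiters = deque()   # queued (owner, mode), oldest first

    def request(self, owner, mode):
        ok = all(COMPATIBLE[(held, mode)] for _, held in self.holders)
        if ok and not self.waiters:
            self.holders.append((owner, mode))
            return True
        self.waiters.append((owner, mode))
        return False

    def blockers(self, owner):
        """Owners a queued request waits on: conflicting holders and
        conflicting requests queued ahead of it."""
        ahead = []
        for waiter, mode in self.waiters:
            if waiter == owner:
                blocked_by = {h for h, hm in self.holders
                              if not COMPATIBLE[(hm, mode)]}
                blocked_by |= {a for a, am in ahead
                               if not COMPATIBLE[(am, mode)]}
                return blocked_by
            ahead.append((waiter, mode))
        return set()


def find_cycle(graph):
    """Depth-first search that returns the first cycle found, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in sorted(graph.get(node, ())):
            found = visit(nxt, path + [node])
            if found:
                return found
        return None

    for start in sorted(graph):
        found = visit(start, [])
        if found:
            return found
    return None


rstl = FifoLock()
builder_granted = rstl.request("index_builder", "IX")   # 1) holds RSTL in IX
stepdown_granted = rstl.request("step_down", "X")       # 2) queued behind the builder
commit_granted = rstl.request("commit_txn", "IX")       # 3) queued behind step down

waits_for = {
    # Assumption: the prepare conflict resolves only once the prepared
    # transaction commits or aborts, so the builder waits on commit_txn.
    "index_builder": {"commit_txn"},
    "step_down": rstl.blockers("step_down"),
    "commit_txn": rstl.blockers("commit_txn"),
}
cycle = find_cycle(waits_for)
print(cycle)
```

In this toy model the IX request from `commit_txn` is compatible with the builder's IX hold, but it still queues because the step down X request is ahead of it. That FIFO ordering is what closes the cycle.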
Note that the step down thread marks the main thread (the user connection thread that runs the "createIndexes" cmd) as killed, because the main thread previously acquired the RSTL in IX mode. Usually, when the main thread gets interrupted by a state transition, it kills the spawned IndexBuildsCoordinatorMongod-X thread, but NOT via the opCtx channel. So there is no way to interrupt the internal thread (i.e., IndexBuildsCoordinatorMongod-X) while it is waiting for the lock.
It seems that even on MongoDB 4.2 we will hit the 3-way deadlock if we set the server startup parameter enableIndexBuildsCoordinatorForCreateIndexesCommand to true. When enableIndexBuildsCoordinatorForCreateIndexesCommand is false, we run the drain and commit phases of the index build on the main thread (the user connection thread that runs the "createIndexes" cmd), which is always interruptible by the step down thread.
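The difference between a killable and a non-killable waiter can be sketched as follows. This is a toy model under stated assumptions: `ToyOpCtx`, `Interrupted`, and `simulate` are hypothetical names and do not mirror the server's OperationContext API. The prepare conflict is modeled as a predicate that never becomes true, and a short timeout stands in for "blocked forever".

```python
import threading


class Interrupted(Exception):
    """Raised in the waiting thread when its (toy) operation is killed."""


class ToyOpCtx:
    """Hypothetical stand-in for an operation context with a kill flag."""

    def __init__(self, killable):
        self.killable = killable
        self._cv = threading.Condition()
        self._killed = False

    def mark_killed(self):
        # Step down can only interrupt operations that are marked killable.
        if not self.killable:
            return False
        with self._cv:
            self._killed = True
            self._cv.notify_all()
        return True

    def wait_until(self, predicate, timeout):
        # Interruptible wait: wakes up on kill, on the predicate, or on timeout.
        with self._cv:
            self._cv.wait_for(lambda: self._killed or predicate(), timeout)
            if self._killed:
                raise Interrupted()
            return predicate()


def simulate(killable):
    """Run a waiter blocked on a conflict that never resolves, then try to kill it."""
    opctx = ToyOpCtx(killable)
    outcome = []

    def waiter():
        try:
            resolved = opctx.wait_until(lambda: False, timeout=0.2)
            outcome.append("resolved" if resolved else "still blocked")
        except Interrupted:
            outcome.append("interrupted")

    thread = threading.Thread(target=waiter)
    thread.start()
    opctx.mark_killed()   # what the step down thread attempts
    thread.join()
    return outcome[0]


main_thread_result = simulate(killable=True)       # user connection thread
internal_thread_result = simulate(killable=False)  # spawned internal thread
print(main_thread_result, internal_thread_result)
```

In the killable case the waiter unwinds and could release its locks, letting step down proceed. In the non-killable case the kill request is ignored and the waiter stays blocked, which matches the deadlock described in this ticket.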
Notes: We acquire the collection lock in a stronger mode in order to commit or abort (X) and to drain the side table writes (S). As a result, this can lead to deadlocks involving prepared transactions, step down, and the indexBuildsCoordinator.
- depends on
  - SERVER-44791 Abort index builds by interrupting the OperationContext of the builder thread (Closed)
- duplicates
  - SERVER-46989 Index builds should hold RSTL to prevent replication state changes after deciding to commit or abort (Closed)
- is depended on by
  - SERVER-43216 Invariant internal operations that acquire strong locks are marked killable (Closed)
- is related to
  - SERVER-46704 Two phase index build can violate locking ordering and can lead to deadlocks. (Closed)
  - SERVER-71191 Deadlock between index build setup, prepared transaction, and stepdown (Closed)
  - SERVER-71198 Assert that unkillable operations that take X collection locks do not hold the RSTL (Backlog)
- related to
  - SERVER-42621 3 way deadlock can happen between hybrid index build, prepared transactions and stepdown thread. (Closed)
  - SERVER-78662 Deadlock with index build, step down, prepared transaction, and MODE_IS coll lock (Closed)