-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Replication
-
Fully Compatible
-
Repl 2023-03-06, Repl 2023-03-20, Repl 2023-05-01
-
135
There are currently around 30 non-test calls to setSystemOperationKillableByStepdown(). Every time we introduce a new thread, there's a non-obvious requirement to call that function.
Failing to do so results in the process crashing if the operation hits a prepare conflict. This is a rare occurrence, which means we risk not catching crashing bugs in testing. In addition to the visual clutter, the API makes it easy for developers to create new internal threads that are unkillable when they shouldn't be.
It seems that only a few system operations actually need to be unkillable, and the vast majority of threads should be killable.
We should consider changing the default so that system operations are always killable, and have the limited set of special operations explicitly opt in to being unkillable.
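To make the proposed default concrete, here is a minimal toy sketch in C++. It is not the server code: ToyOperation, makeSystemOperation(), markUnkillableByStepdown() and stepDown() are hypothetical names invented for illustration, and the real OperationContext and stepdown machinery are far more involved. The only point is that new system operations start out killable and the rare special case opts out explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an operation context. NOT the real mongo::OperationContext.
struct ToyOperation {
    std::string name;
    bool killableByStepdown;
    bool killed;
};

// Proposed default (hypothetical helper): every new system operation is
// killable by stepdown, so no extra call is needed when adding a thread.
ToyOperation makeSystemOperation(std::string name) {
    return ToyOperation{std::move(name), /*killableByStepdown=*/true, /*killed=*/false};
}

// Hypothetical explicit opt-out for the few operations that must survive
// a stepdown. Must be called before the operation has been killed.
void markUnkillableByStepdown(ToyOperation& op) {
    assert(!op.killed);
    op.killableByStepdown = false;
}

// Simulates a stepdown: kills every killable operation and returns the
// number of operations that were left running.
std::size_t stepDown(std::vector<ToyOperation>& ops) {
    std::size_t survivors = 0;
    for (auto& op : ops) {
        if (op.killableByStepdown) {
            op.killed = true;
        } else {
            ++survivors;
        }
    }
    return survivors;
}
```

With this shape, forgetting to call anything leaves a thread killable, which is the safe outcome described above; only the opt-out sites need review.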
- causes
-
SERVER-75352 Make OplogBatcher's ReplBatcher thread unkillable
- Closed
- is depended on by
-
SERVER-74658 revisit if thread marked as unkillable is okay to be killable for sharding related
- Open
-
SERVER-74659 revisit if thread marked as unkillable is okay to be killable for service architecture related
- Open
-
SERVER-74656 revisit if thread marked as unkillable is okay to be killable for replication related
- Closed
-
SERVER-74657 revisit if thread marked as unkillable is okay to be killable for storage execution related
- Closed
-
SERVER-74660 revisit if thread marked as unkillable is okay to be killable for security related
- Closed
-
SERVER-74661 revisit if thread marked as unkillable is okay to be killable for serverless related
- Closed
-
SERVER-74662 Query work to revisit if threads currently marked as "unkillable"(meaning non-interruptible) should instead be interruptable
- Closed
-
SERVER-74953 Explore avoiding stepdowns during the early phases of index build setup
- Closed
- is related to
-
SERVER-60161 Deadlock between config server stepdown and _configsvrRenameCollectionMetadata command
- Closed
- related to
-
SERVER-43174 Designate the MigrationDestinationManager's migrateThread as system operation killable
- Closed
-
SERVER-58143 shardsvrDropCollectionParticipant should be killable on stepdown
- Closed
-
SERVER-58775 Mark ConfigsvrSetAllowMigrationsCommand's opCtx as killable on stepdown
- Closed
-
SERVER-59635 Mark ConfigSvrMoveChunkCommand as interruptible on stepdown
- Closed
-
SERVER-60521 Deadlock on stepup due to moveChunk command running uninterrupted on secondary
- Closed
-
SERVER-79026 Failing to cancel the JournalFlusher thread might lead to 3-way deadlock
- Closed