- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Server Programmability
- Service Arch 2022-07-11, Service Arch 2022-11-28, Service Arch 2023-01-09, Service Arch 2023-01-23, Service Arch 2023-02-06, Service Arch 2023-02-20, Service Arch 2023-03-06, Service Arch 2023-03-20, Service Arch 2023-04-03, Service Arch 2023-04-17, Service Arch 2023-05-01, Service Arch 2023-05-15, Service Arch 2023-05-29, Service Arch 2023-06-12, Service Arch 2023-06-26
The original PrimaryOnlyService left it up to individual instances to insert and delete their state documents, but it has an op observer that removes an instance from the in-memory map when its state document is deleted. This means an instance can end up in a "detached" state: it continues executing logic after deleting its state doc, but it is no longer in the in-memory map. This, in combination with the way services are interrupted on stepdown and shutdown, has led to oddities such as commands that wait on an instance's completion future having to call OperationContext::setAlwaysInterruptAtStepDownOrUp. (If they didn't, the following could happen: the state doc is deleted while the node is primary; the instance is removed from the in-memory map; the node then steps down, and the instance is not interrupted because it is no longer in the map, but its further work is canceled because the scoped executor is shut down. The instance's completion promise is therefore never fulfilled, and the command waits forever.)
This ticket is to instead remove the instance from the in-memory map only after the instance's run() has completed, and to remove the calls to OperationContext::setAlwaysInterruptAtStepDownOrUp.
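The difference between the two removal policies can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyService`, `Instance`, `RemovalPolicy`, and `waiterWokenAfterStepDown` are hypothetical names invented for illustration and are not the real PrimaryOnlyService API, and a plain `bool` stands in for the completion promise.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical toy model; none of these names are real MongoDB classes.
enum class RemovalPolicy {
    OnStateDocDelete,  // original: op observer drops the instance on state doc delete
    AfterRunCompletes  // proposed: drop the instance only after run() has completed
};

struct Instance {
    // Stands in for the completion promise; false means "never fulfilled".
    bool completionFulfilled = false;
};

class ToyService {
public:
    explicit ToyService(RemovalPolicy policy) : _policy(policy) {}

    std::shared_ptr<Instance> start(const std::string& id) {
        auto instance = std::make_shared<Instance>();
        _instances[id] = instance;
        return instance;
    }

    // The instance deletes its own state document partway through run().
    void onStateDocDeleted(const std::string& id) {
        assert(_instances.count(id) == 1);
        if (_policy == RemovalPolicy::OnStateDocDelete) {
            _instances.erase(id);  // the instance is now "detached"
        }
    }

    // Stepdown interrupts only the instances still tracked in the map.
    // In this model, interruption is what fulfills the promise (with an error).
    void stepDown() {
        for (auto& entry : _instances) {
            entry.second->completionFulfilled = true;
        }
        _instances.clear();
    }

private:
    RemovalPolicy _policy;
    std::map<std::string, std::shared_ptr<Instance>> _instances;
};

// Replays the sequence from the description: the state doc is deleted while
// the node is primary, then the node steps down before run() finishes.
// Returns whether a command waiting on the completion future would be woken.
bool waiterWokenAfterStepDown(RemovalPolicy policy) {
    ToyService service(policy);
    auto instance = service.start("instance-1");
    service.onStateDocDeleted("instance-1");
    service.stepDown();  // remaining work is canceled under either policy
    return instance->completionFulfilled;
}
```

In this model, `waiterWokenAfterStepDown(RemovalPolicy::OnStateDocDelete)` returns `false` (the waiter hangs unless it is interrupted by some other means), while `waiterWokenAfterStepDown(RemovalPolicy::AfterRunCompletes)` returns `true`, which is why the extra interruption calls would no longer be needed.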
- depends on
  - SERVER-69236 Remove TTL index on tenant migration donor state document namespace (Backlog)
  - SERVER-65236 Make tenant migration donor delete its state doc in its run method (Closed)
  - SERVER-67372 Make tenant migration recipient delete its state document in its run method (Closed)
  - SERVER-67373 Make split donor delete its state document in its run method rather than using TTL (Closed)
  - SERVER-69235 Remove TTL index on tenant migration recipient state document namespace (Closed)
- is related to
  - SERVER-56390 Failed to construct ShardingDDLCoordinators do not get released (Closed)
- related to
  - SERVER-51650 Primary-Only Service's _rebuildCV should be notified even if stepdown happens quickly after stepup (Closed)
  - SERVER-69835 Add functionality for PrimaryOnlyService for cleaning up instances (Backlog)
  - SERVER-66351 Audit uses of OperationContext::setAlwaysInterruptAtStepDownOrUp (Open)
  - SERVER-65478 Fix race condition when removing tenant migration blockers in shard split (Closed)