- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- v4.0, v3.6
- Repl 2018-06-04, Repl 2018-06-18, Repl 2018-07-02
The replSetStepDown command waits until a majority of nodes have caught up and one of them is an eligible candidate, but that condition is only signaled when heartbeat responses are processed, which adds delay to the handoff.
The easiest but least efficient fix is to signal the condition variable whenever we update the last applied optime. A better solution is to replace the condition variable with a waiter in _replicationWaiterList, as in _awaitReplication_inlock(). A third option is to call _awaitReplication_inlock() directly, which might not be desirable because the condition the stepdown command waits on (a caught-up majority plus an eligible candidate, as specified in the config) is slightly different from w: majority.
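The second option can be illustrated with a toy model. The sketch below is not MongoDB code: `ToyCoordinator`, `Waiter`, `update_last_applied`, and `wait_for_stepdown` are hypothetical names, optimes are plain integers, and the node layout is assumed to be one primary plus named secondaries. It only shows the idea of re-checking a waiter's predicate on every last-applied update rather than waiting for the next heartbeat response.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Waiter:
    """Hypothetical waiter: a predicate plus an event set once it holds."""
    predicate: Callable[[], bool]
    done: threading.Event = field(default_factory=threading.Event)


class ToyCoordinator:
    """Toy stand-in for a replication coordinator (assumption, not MongoDB)."""

    def __init__(self, primary_optime: int, electable: Dict[str, bool]):
        self._lock = threading.Lock()
        self._primary_optime = primary_optime
        self._electable = electable  # secondary name -> may become primary
        self._applied = {name: 0 for name in electable}
        self._waiters: List[Waiter] = []

    def _stepdown_ready(self) -> bool:
        # Caller must hold self._lock.
        caught_up = [n for n, t in self._applied.items()
                     if t >= self._primary_optime]
        total_nodes = len(self._applied) + 1   # secondaries plus the primary
        majority = total_nodes // 2 + 1
        has_majority = len(caught_up) + 1 >= majority  # primary counts itself
        has_candidate = any(self._electable[n] for n in caught_up)
        return has_majority and has_candidate

    def update_last_applied(self, node: str, optime: int) -> None:
        with self._lock:
            self._applied[node] = max(self._applied[node], optime)
            # Re-check every waiter now, instead of on the next heartbeat.
            still_waiting = []
            for waiter in self._waiters:
                if waiter.predicate():
                    waiter.done.set()
                else:
                    still_waiting.append(waiter)
            self._waiters = still_waiting

    def wait_for_stepdown(self, timeout: float) -> bool:
        waiter = Waiter(self._stepdown_ready)
        with self._lock:
            if self._stepdown_ready():
                return True
            self._waiters.append(waiter)
        if waiter.done.wait(timeout):
            return True
        with self._lock:
            # Timed out: drop the waiter so it is not signaled later.
            if waiter in self._waiters:
                self._waiters.remove(waiter)
        return waiter.done.is_set()
```

In this model a caller would block in `wait_for_stepdown(timeout)` while the path that records a secondary's progress calls `update_last_applied(node, optime)`. Because the predicate checks both the majority and the electable-candidate requirement, a caught-up majority with no electable member does not wake the waiter.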
- is related to: SERVER-53612 StepDown hangs until timeout if all nodes are caught up but none is immediately electable (Closed)
- related to: SERVER-35623 Send a replSetStepUp command to an eligible candidate on stepdown (Closed)