-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: 3.4.18, 3.6.10, 4.0.5, 4.1.7
-
Component/s: Replication
-
None
-
Fully Compatible
-
v4.2, v4.0
-
Repl 2020-02-10, Repl 2020-02-24, Repl 2020-03-09
After a replica set node wins an election and transitions to PRIMARY state, it enters drain mode. In this mode, it applies any oplog operations still left in its buffer from its time as a secondary. While in drain mode, a node is in PRIMARY state but cannot yet accept writes, i.e. it reports isMaster:false. When the drain process completes, the oplog application logic in SyncTail signals the ReplicationCoordinator.

If there are no operations to apply in drain mode, the newly elected primary should be able to complete drain mode immediately and begin accepting writes. In practice this can take a second or more because of a hard-coded 1 second timeout in the oplog application loop. This is wasted downtime: the primary could be accepting writes but is waiting for the timeout to trigger, which limits how quickly a node can step up and begin accepting writes.

We should consider making this timeout configurable via an external parameter, or hard-coding it to something lower, e.g. 100 milliseconds. Perhaps the ReplicationCoordinator could also signal the oplog application loop when it transitions to PRIMARY, letting it know it can check right away whether drain mode can complete.
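
To illustrate the difference between waiting out the timeout and being signaled, here is a minimal toy model in Python. It is not the server's C++ code: the class `DrainLoopSketch`, its methods, and `measure_step_up_latency` are hypothetical names invented for this sketch, and the buffer is assumed to be empty so that drain mode can complete as soon as the loop notices it.

```python
import threading
import time


class DrainLoopSketch:
    """Toy model of an oplog application loop idling on an empty buffer."""

    def __init__(self, poll_timeout_secs=1.0):
        self._cond = threading.Condition()
        self._poll_timeout_secs = poll_timeout_secs
        self._draining = False
        self.waiting = threading.Event()         # set once the loop is about to block
        self.drain_complete = threading.Event()  # set when drain mode finishes

    def run(self):
        with self._cond:
            while True:
                if self._draining:
                    # Assumption: the buffer is empty, so there is nothing
                    # to apply and drain mode can complete right away.
                    self.drain_complete.set()
                    return
                self.waiting.set()
                # Block until signaled or until the poll timeout expires.
                self._cond.wait(timeout=self._poll_timeout_secs)

    def enter_drain_mode(self, signal_loop):
        """Models the transition to PRIMARY, optionally waking the loop."""
        with self._cond:
            self._draining = True
            if signal_loop:
                self._cond.notify_all()


def measure_step_up_latency(poll_timeout_secs, signal_loop):
    """Seconds between entering drain mode and the loop completing it."""
    loop = DrainLoopSketch(poll_timeout_secs)
    worker = threading.Thread(target=loop.run, daemon=True)
    worker.start()
    loop.waiting.wait()  # make sure the loop has reached its wait
    start = time.monotonic()
    loop.enter_drain_mode(signal_loop)
    loop.drain_complete.wait(timeout=poll_timeout_secs + 1.0)
    worker.join(timeout=1.0)
    return time.monotonic() - start
```

In this model, calling `measure_step_up_latency(1.0, signal_loop=False)` leaves the loop blocked until its timeout expires, while `signal_loop=True` wakes it as soon as drain mode begins. Lowering `poll_timeout_secs` bounds the delay; signaling removes the dependence on the timeout altogether.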
- is related to
-
SERVER-42219 Oplog buffer not always empty when primary exits drain mode
- Closed
- related to
-
SERVER-27342 Do not block unnecessarily on connecting to mongod or finishing initiate in ReplSetTest and ShardingTest
- Closed