- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Replication
- Fully Compatible
- ALL
- v8.0, v7.0
- Repl 2024-05-13
When an operation waits for write concern, it calls awaitReplication, which inserts a new 'waiter' into a std::multimap called _replicationWaiterList. Waiters are sorted by opTime. When a primary advances its commit point, it checks whether any write concern waiters in the map can be satisfied by the new opTime. It iterates through the map until it reaches a waiter whose opTime is greater than the new opTime, at which point the check ends.
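The waiter list and its wake-up scan can be sketched roughly as follows. This is a simplified single-threaded model, not the server's code: `OpTime` is reduced to an integer, the write concern is reduced to a required node count, and the names `Waiter`, `WaiterList`, `add`, and `wakeReady` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in: the real OpTime carries a timestamp and a term.
using OpTime = std::uint64_t;

// Hypothetical, simplified waiter: the write concern is just a node count.
struct Waiter {
    int id;             // illustrative identifier for the waiting request
    int nodesRequired;  // e.g. 2 for a majority of 3 nodes, 3 for w:3
};

class WaiterList {
public:
    // Mirrors the insertion done on behalf of a waiting request.
    void add(OpTime opTime, Waiter waiter) { _waiters.emplace(opTime, waiter); }

    // Visits waiters in opTime order and stops at the first waiter whose
    // opTime is greater than newOpTime. Satisfied waiters are erased and
    // reported; unsatisfied ones stay in the map.
    std::vector<int> wakeReady(OpTime newOpTime, int nodesAcked) {
        assert(nodesAcked >= 0);
        std::vector<int> woken;
        for (auto it = _waiters.begin();
             it != _waiters.end() && it->first <= newOpTime;) {
            if (it->second.nodesRequired <= nodesAcked) {
                woken.push_back(it->second.id);
                it = _waiters.erase(it);
            } else {
                ++it;
            }
        }
        return woken;
    }

    std::size_t size() const { return _waiters.size(); }

private:
    std::multimap<OpTime, Waiter> _waiters;  // ordered by opTime
};
```

Because the map is ordered by opTime, the scan can stop early at the first waiter beyond the new opTime, but it still has to visit every earlier waiter that remains unsatisfied.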
Waiters are only ever removed from the list once the write concern is satisfied or the function called on the waiter returns an error. Even if a request with write concern times out (the future hits its deadline), the waiter remains in the list until it is satisfied.
There are cases where unsatisfiable write concern waiters remain in the waiter list for an extended period of time, so every call to _wakeReadyWaiters must iterate through a large number of them. This iteration happens under the replication coordinator mutex, slowing down any operation that is waiting on the mutex.
Consider a write concern greater than w:majority. In a 3-node replica set with 1 node down, a write with w:3 times out with its write concern unsatisfied, but its waiter remains in the list until the third node is brought back up. Any new majority write that moves the primary's timestamp forward must then iterate through a list containing the timed-out write concern waiters.
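To make the cost concrete, the scenario can be modelled with a small simulation. This is only a sketch under simplifying assumptions: `OpTime` is an integer, each waiter is just its required node count, and the function names `wakePassVisits` and `visitsForNextMajorityWrite` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the server's OpTime.
using OpTime = std::uint64_t;

// One wake-up pass: returns how many waiters were visited. Waiters whose
// required node count is met are erased; the rest stay in the map.
std::size_t wakePassVisits(std::multimap<OpTime, int>& waiters,
                           OpTime committed, int nodesAcked) {
    std::size_t visited = 0;
    for (auto it = waiters.begin();
         it != waiters.end() && it->first <= committed;) {
        ++visited;
        if (it->second <= nodesAcked) {
            it = waiters.erase(it);
        } else {
            ++it;
        }
    }
    return visited;
}

// Models a 3-node set with one node down (only 2 nodes can acknowledge).
// `staleWrites` earlier w:3 writes have timed out but left their waiters
// behind; one new majority (2-node) write then advances the commit point.
std::size_t visitsForNextMajorityWrite(std::size_t staleWrites) {
    std::multimap<OpTime, int> waiters;
    OpTime t = 0;
    for (std::size_t i = 0; i < staleWrites; ++i) {
        waiters.emplace(++t, 3);  // w:3, cannot be satisfied right now
    }
    waiters.emplace(++t, 2);      // the new majority write
    return wakePassVisits(waiters, t, 2);
}
```

In this model, a single majority write after 1000 timed-out w:3 writes visits 1001 waiters in one pass, and in the real server that pass would run while holding the replication coordinator mutex.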
This ticket should investigate and fix this issue. A possible solution is to remove a waiter from the list when its future's deadline is exceeded.
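One way the proposed removal could look is sketched below. It is an illustration of the idea only, not the fix that was implemented: the class and method names are hypothetical, the model is single-threaded (the real list is protected by the replication coordinator mutex), and it assumes `removeOnTimeout` is only called for a waiter that is still in the list.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the server's OpTime.
using OpTime = std::uint64_t;

class WaiterList {
public:
    // multimap iterators stay valid until their own element is erased,
    // so the iterator can serve as a handle for later removal.
    using Handle = std::multimap<OpTime, int>::iterator;

    Handle add(OpTime opTime, int nodesRequired) {
        return _waiters.emplace(opTime, nodesRequired);
    }

    // Assumed to be called by the waiting request when its deadline passes.
    // The handle must refer to a waiter that has not already been erased
    // by wakeReady; a real implementation would need to guard against that.
    void removeOnTimeout(Handle handle) { _waiters.erase(handle); }

    // Wake-up pass; returns the number of waiters visited.
    std::size_t wakeReady(OpTime committed, int nodesAcked) {
        std::size_t visited = 0;
        for (auto it = _waiters.begin();
             it != _waiters.end() && it->first <= committed;) {
            ++visited;
            if (it->second <= nodesAcked) {
                it = _waiters.erase(it);
            } else {
                ++it;
            }
        }
        return visited;
    }

    std::size_t size() const { return _waiters.size(); }

private:
    std::multimap<OpTime, int> _waiters;
};
```

With timed-out waiters erased as soon as their deadline passes, a later wake-up pass only visits waiters that some request is still actively waiting on.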
- duplicates: SERVER-89185 Fix replication waiting during chunk migration catch-up cloning phase (Backlog)
- related to: SERVER-90213 Minimize the number of _doneWaitingForReplication_inlock calls needed to wake up writeConcern waiters (Closed)