Core Server / SERVER-89242

Unsatisfied write concern waiters can grow the replication waiter list unbounded

    • Replication
    • Fully Compatible
    • ALL
    • v8.0, v7.0
    • Repl 2024-05-13

      When an operation waits for write concern, replication calls awaitReplication, which inserts a new 'waiter' into _replicationWaiterList, a std::multimap. Waiters are sorted by opTime. When a primary advances its commit point, it checks whether any write concern waiters in the map can be satisfied by the new opTime. It iterates through the map until it reaches a waiter whose opTime is greater than the new opTime, at which point the check ends.
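
      The scan described above can be modeled roughly as follows. This is a minimal sketch, not the server's code: OpTime, Waiter, WaiterList, and wakeReady are hypothetical stand-ins, and "satisfied" is reduced to comparing a required node count against the number of nodes that have replicated up to the new opTime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-ins for illustration; not the server's real types.
using OpTime = std::uint64_t;

struct Waiter {
    explicit Waiter(int n) : nodesRequired(n) {}
    int nodesRequired;       // e.g. 2 for a majority of 3 nodes, 3 for w:3
    bool satisfied = false;  // set when the scan finds the waiter satisfiable
};

class WaiterList {
public:
    // Waiters are kept sorted by the opTime they are waiting on.
    void add(OpTime opTime, Waiter* waiter) {
        _waiters.emplace(opTime, waiter);
    }

    // Visits every waiter whose opTime is <= newOpTime and removes the ones
    // that are satisfied. Stops at the first waiter with a greater opTime.
    // Returns how many waiters were visited.
    std::size_t wakeReady(OpTime newOpTime, int nodesAtOpTime) {
        std::size_t visited = 0;
        auto it = _waiters.begin();
        while (it != _waiters.end() && it->first <= newOpTime) {
            ++visited;
            if (it->second->nodesRequired <= nodesAtOpTime) {
                it->second->satisfied = true;
                it = _waiters.erase(it);  // satisfied waiters leave the list
            } else {
                ++it;  // unsatisfied waiters stay and are rescanned next time
            }
        }
        return visited;
    }

    std::size_t size() const { return _waiters.size(); }

private:
    std::multimap<OpTime, Waiter*> _waiters;
};
```

      In this model the early exit helps only with waiters at higher opTimes. An unsatisfied waiter at a low opTime is revisited on every later scan.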

      Waiters are removed from the list only once their write concern is satisfied or the function called on the waiter returns an error. Even if a request with a write concern times out (the future hits its deadline), the waiter remains in the list until it is satisfied.

      There are cases where waiters with unsatisfiable write concerns stay in the list for an extended period of time, so every call to _wakeReadyWaiters must iterate through a large number of waiters. This iteration happens while holding the replication coordinator mutex, which slows down any operation waiting on that mutex.

      Consider a write concern value greater than w:majority. In a 3-node replica set with 1 node down, writes with w:3 time out with an unsatisfied write concern. Each such waiter remains in the list until the third node is brought back up. Any new majority write that moves the primary's timestamp forward must iterate through the list, including all of the timed-out write concern waiters.
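
      To make the cost concrete, the scenario can be simulated with a simplified model. Everything here is an assumption for illustration: WaiterMap stores only the number of nodes each waiter requires, and scanWaiters and simulateStaleW3Writes are hypothetical names, not server functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

using OpTime = std::uint64_t;
// Hypothetical simplification: the mapped value is the number of nodes
// the waiter's write concern requires.
using WaiterMap = std::multimap<OpTime, int>;

// Visits every waiter at or below `committed`, erasing the satisfied ones.
// Returns the number of waiters visited.
std::size_t scanWaiters(WaiterMap& waiters, OpTime committed, int nodesUp) {
    std::size_t visited = 0;
    auto it = waiters.begin();
    while (it != waiters.end() && it->first <= committed) {
        ++visited;
        it = (it->second <= nodesUp) ? waiters.erase(it) : std::next(it);
    }
    return visited;
}

// Issues `writes` w:3 writes against a 3-node set with one node down, then
// one majority write. Returns how many waiters the final scan had to visit.
std::size_t simulateStaleW3Writes(std::size_t writes) {
    WaiterMap waiters;
    const int nodesUp = 2;  // 3-node replica set, 1 node down
    OpTime opTime = 0;
    for (std::size_t i = 0; i < writes; ++i) {
        // The client eventually times out, but the waiter is never removed.
        waiters.emplace(++opTime, 3);
        scanWaiters(waiters, opTime, nodesUp);
    }
    waiters.emplace(++opTime, 2);  // a new majority write
    return scanWaiters(waiters, opTime, nodesUp);
}
```

      In this model, simulateStaleW3Writes(1000) visits 1001 waiters to satisfy a single majority write: the 1000 timed-out w:3 waiters plus the one that can actually be satisfied.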

      This ticket should investigate this issue and fix it. A possible solution is to remove a waiter from the list if the future deadline is exceeded here.
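
      One way to picture the suggested approach is sketched below. It is not the fix that was actually implemented. WaiterList, Handle, and removeOnTimeout are hypothetical names; the idea is that add returns the multimap iterator so the timed-out caller can erase its own entry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

using OpTime = std::uint64_t;

// Hypothetical model of the waiter list; the mapped value is the number of
// nodes the waiter's write concern requires.
class WaiterList {
public:
    using Map = std::multimap<OpTime, int>;
    using Handle = Map::iterator;

    // Returns a handle so the caller can remove its own waiter later.
    // std::multimap iterators stay valid while other elements are
    // inserted or erased.
    Handle add(OpTime opTime, int nodesRequired) {
        return _waiters.emplace(opTime, nodesRequired);
    }

    // Called when the waiter's deadline passes. Assumption: the waiter is
    // still in the list. Real code would need to check this under the same
    // mutex that protects the scan, since a satisfied waiter is already erased.
    void removeOnTimeout(Handle handle) {
        _waiters.erase(handle);
    }

    // Same scan as described in the ticket: visit waiters at or below
    // `committed`, erase the satisfied ones, return the number visited.
    std::size_t wakeReady(OpTime committed, int nodesUp) {
        std::size_t visited = 0;
        auto it = _waiters.begin();
        while (it != _waiters.end() && it->first <= committed) {
            ++visited;
            if (it->second <= nodesUp) {
                it = _waiters.erase(it);
            } else {
                ++it;
            }
        }
        return visited;
    }

    std::size_t size() const { return _waiters.size(); }

private:
    Map _waiters;
};
```

      With timed-out waiters erased this way, the list in this model holds only waiters that some caller is still waiting on, so a later majority write scans a short list again.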

            Assignee: Lingzhi Deng (lingzhi.deng@mongodb.com)
            Reporter: Ali Mir (ali.mir@mongodb.com)
            Votes: 0
            Watchers: 22
