High contention for ReplicationCoordinatorImpl::_mutex in w:majority workloads

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Replication
    • ALL
    • 3

      bartle reports high contention on the replication coordinator mutex (ReplicationCoordinatorImpl::_mutex) in heavy insert workloads that use w:majority writes. The contention leads to low CPU utilization, because the workload bottlenecks on a synthetic resource (the mutex) rather than on hardware. This is most problematic on deployments with many cores, but it can be a problem even on 16-core machines, as he mentions in a comment on another ticket.

      One possible mitigation is to shorten the critical section held under the mutex in setMyLastAppliedOpTimeForward, and particularly in _wakeReadyWaiters_inlock. Finer-grained locking around waiters might be another.
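
      As a rough illustration of the first idea, the sketch below collects ready waiters while holding the shared mutex and signals them only after releasing it. It is not the actual ReplicationCoordinatorImpl code: Coordinator, Waiter, addWaiter, setLastAppliedForward, and the integer OpTime are hypothetical names and simplifications chosen for this example.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for an optime; a plain integer keeps the sketch short.
using OpTime = std::uint64_t;

// One blocked writer. Each waiter has its own mutex and condition variable,
// so signalling it does not require the coordinator-wide mutex.
struct Waiter {
    explicit Waiter(OpTime t) : target(t) {}

    void wait() {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return ready; });
    }

    void signal() {
        {
            std::lock_guard<std::mutex> lk(m);
            ready = true;
        }
        cv.notify_one();
    }

    const OpTime target;
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
};

// Hypothetical, heavily simplified coordinator.
class Coordinator {
public:
    std::shared_ptr<Waiter> addWaiter(OpTime target) {
        auto w = std::make_shared<Waiter>(target);
        bool alreadySatisfied = false;
        {
            std::lock_guard<std::mutex> lk(_mutex);
            alreadySatisfied = target <= _lastApplied;
            if (!alreadySatisfied) {
                _waiters.push_back(w);
            }
        }
        if (alreadySatisfied) {
            w->signal();  // signalled outside the shared mutex
        }
        return w;
    }

    // Advances the last applied optime and returns how many waiters were woken.
    std::size_t setLastAppliedForward(OpTime t) {
        std::vector<std::shared_ptr<Waiter>> ready;
        {
            // Critical section: update state and detach the ready waiters only.
            std::lock_guard<std::mutex> lk(_mutex);
            if (t <= _lastApplied) {
                return 0;
            }
            _lastApplied = t;
            auto firstReady = std::partition(
                _waiters.begin(), _waiters.end(),
                [t](const std::shared_ptr<Waiter>& w) { return w->target > t; });
            ready.assign(std::make_move_iterator(firstReady),
                         std::make_move_iterator(_waiters.end()));
            _waiters.erase(firstReady, _waiters.end());
        }
        // Wake-ups happen after the shared mutex has been released.
        for (auto& w : ready) {
            w->signal();
        }
        return ready.size();
    }

private:
    std::mutex _mutex;
    OpTime _lastApplied = 0;
    std::vector<std::shared_ptr<Waiter>> _waiters;
};
```

      In this sketch the shared mutex covers only the optime update and the list partition; the per-waiter signalling, which grows with the number of blocked writers, runs without it. Whether this helps in practice depends on where the real critical section spends its time.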

            Assignee:
            [DO NOT USE] Backlog - Replication Team
            Reporter:
            Andy Schwerin
            Votes:
            0
            Watchers:
            22

              Created:
              Updated:
              Resolved: