  1. Core Server
  2. SERVER-97201

Add a metric in FTDC for replication coordinator mutex wait time

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Replication
    • Repl 2025-03-31, Repl 2025-04-14

      Right now we approximate contention on the replication coordinator mutex by looking at the ping time, since that ends up taking the mutex. We should add a metric that tracks this directly. We should be able to time how long it takes to acquire the mutex. I'm not sure whether the timing itself could add enough load on the mutex to make the situation worse for customers.
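
As a rough illustration of timing the acquisition, here is a minimal sketch that uses a plain std::mutex as a stand-in for the replication coordinator mutex. MutexWaitStats and TimedLock are hypothetical names, not existing server types, and the sketch does not show how the counters would be wired into FTDC. It tries the lock first and only reads the clock when that attempt fails, which is one possible way to keep the overhead low on the uncontended path.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <mutex>

// Hypothetical counters that a periodic collector could sample.
struct MutexWaitStats {
    std::atomic<std::uint64_t> acquisitions{0};
    std::atomic<std::uint64_t> contendedAcquisitions{0};
    std::atomic<std::uint64_t> totalWaitMicros{0};
    std::atomic<std::uint64_t> maxWaitMicros{0};
};

// Hypothetical RAII guard: holds `m` for its lifetime and records how long
// the acquisition blocked.
class TimedLock {
public:
    TimedLock(std::mutex& m, MutexWaitStats& stats) : _lock(m, std::try_to_lock) {
        if (!_lock.owns_lock()) {
            // Slow path: only a failed try_lock pays for the two clock reads.
            const auto start = std::chrono::steady_clock::now();
            _lock.lock();
            const auto waited = std::chrono::duration_cast<std::chrono::microseconds>(
                std::chrono::steady_clock::now() - start);
            const auto micros = static_cast<std::uint64_t>(waited.count());
            stats.contendedAcquisitions.fetch_add(1, std::memory_order_relaxed);
            stats.totalWaitMicros.fetch_add(micros, std::memory_order_relaxed);
            // Raise the recorded maximum if this wait was longer.
            auto prev = stats.maxWaitMicros.load(std::memory_order_relaxed);
            while (prev < micros &&
                   !stats.maxWaitMicros.compare_exchange_weak(
                       prev, micros, std::memory_order_relaxed)) {
            }
        }
        assert(_lock.owns_lock());
        stats.acquisitions.fetch_add(1, std::memory_order_relaxed);
    }

private:
    std::unique_lock<std::mutex> _lock;
};
```

The counter updates happen while the mutex is held, so they lengthen the critical section slightly. Whether that cost is acceptable on the real mutex is the open question above and would need measuring.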

      At the very least, we could add a command that just takes the mutex and releases it, so that we can time how long that command takes. We don't have to call the command as part of collecting FTDC, but we could call it manually on clusters where we suspect mutex contention.
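
The take-and-release idea could look roughly like the sketch below. probeMutexWait is a hypothetical helper, std::mutex again stands in for the replication coordinator mutex, and no command registration is shown. It measures the whole take-and-release round trip, which matches timing how long the command took.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical probe: acquires `m`, releases it immediately, and returns the
// elapsed time for the whole take-and-release round trip.
inline std::chrono::microseconds probeMutexWait(std::mutex& m) {
    const auto start = std::chrono::steady_clock::now();
    {
        // Held only for the duration of this empty scope.
        std::lock_guard<std::mutex> lk(m);
    }
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    // steady_clock is monotonic, so the elapsed time cannot be negative.
    assert(elapsed.count() >= 0);
    return elapsed;
}
```

A single probe only samples contention at one instant, so calling it several times and looking at the spread would probably be more informative than one reading.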

            Assignee:
            evelyn.wu@mongodb.com Evelyn Wu
            Reporter:
            samy.lanka@mongodb.com Samyukta Lanka
            Votes:
            0
            Watchers:
            11

              Created:
              Updated: