Uninterruptible lock guard in election causes FCBIS to hang

XMLWordPrintableJSON

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 6.0.3, 6.2.0-rc0
    • Affects Version/s: None
    • Component/s: Replication
    • None
    • Fully Compatible
    • ALL
    • v6.1, v6.0
    • Repl 2022-10-03, Repl 2022-10-17
    • None
    • 3
    • None
    • None
    • None
    • None
    • None
    • None

      In order to avoid interrupting ourselves due to our own stepdown, we use uninterruptible locks when writing lastVote in an election. Unfortunately if we've taken the global lock in X mode for FCBIS storage change, this leads to a deadlock – we're trying to acquire a write lock on an uninterruptible opCtx while also attempting to kill the opCtx so we can change storage and release the lock.

      The quick fix might be to not acquire the uninterruptible lock guard during STARTUP2, but we should definitely add a test for this; since initial sync nodes are usually non-voting we don't have coverage for an election during it.

            Assignee:
            Matthew Russotto
            Reporter:
            Matthew Russotto
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: