Reclaiming oplog may block stepup/stepdown

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Replication
    • ALL
    • 3
    • TBD

      The OplogCapMaintainerThread deletes oplog entries by calling reclaimOplog() while holding the global lock and the RSTL (replication state transition lock). When there is a lot of oplog to reclaim, this call can run for more than 30 seconds. A stepup or stepdown that cannot obtain the RSTL during that time crashes.

      A possible fix is to make reclaimOplog() interruptible, to take the global lock without the RSTL if that is safe, or both.
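
      The "interruptible" option can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's code: reclaim_oplog, rstl, interrupted, and batch_size are hypothetical names, a plain Python lock stands in for the RSTL, and a list stands in for the oplog. The idea is to delete in small batches, take the lock per batch rather than for the whole reclaim, and check an interrupt flag between batches.

```python
import threading


def reclaim_oplog(entries, rstl, interrupted, batch_size=100):
    """Toy model of an interruptible reclaim (hypothetical, not MongoDB code).

    entries:     list standing in for the oplog, oldest entries first
    rstl:        threading.Lock standing in for the RSTL
    interrupted: threading.Event set by a pending stepup/stepdown
    Returns the number of entries removed before finishing or being interrupted.
    """
    removed = 0
    while entries:
        # Check between batches so a state transition never waits for
        # more than one batch of work.
        if interrupted.is_set():
            break
        # Hold the lock for a single batch only, not the whole reclaim.
        with rstl:
            batch = min(batch_size, len(entries))
            del entries[:batch]
            removed += batch
    return removed
```

      In this model, a state transition would call interrupted.set() and then rstl.acquire(timeout=30). It waits for at most one batch instead of the full reclaim. Whether the real reclaimOplog() can safely stop partway is the open question raised above.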

            Assignee:
            Unassigned
            Reporter:
            Matthew Russotto
            Votes:
            0
            Watchers:
            8

              Created:
              Updated: