getCoordinatorDoc May Fail If Called From Retry Loop Which Deletes It


    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.2.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Cluster Scalability
    • Fully Compatible
    • ALL
    • ClusterScalability Mar31-Apr14
    • 200
    • None
    • 3

      As seen in BF-37193, the call to getCoordinatorDoc added by SERVER-100421 may throw when it is ultimately invoked from a retry loop, because the state document has already been deleted. This is possible if a transient error occurs after the coordinator document was deleted, triggering a retry. In the case seen in BF-37193, the transient error was caused by an operation being killed due to stepdown, but the retry could progress far enough between that event and the stepdown token being cancelled to fail to find the coordinator document, causing a fatal error.
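
      The failure sequence described above can be illustrated with a minimal Python sketch. It is not the server code: StateStore, make_cleanup_step, run_with_retry, and both exception types are hypothetical names invented for this example, and the tolerate_missing option is only one plausible way to handle the missing document, not necessarily the fix that was applied for this ticket.

```python
class DocumentNotFound(Exception):
    """Raised when the coordinator state document is missing."""


class TransientError(Exception):
    """Stands in for a retryable error, e.g. an operation killed by stepdown."""


class StateStore:
    """Hypothetical in-memory stand-in for the coordinator state collection."""

    def __init__(self):
        self.docs = {"coordinator": {"state": "done"}}

    def get_coordinator_doc(self):
        # Strict lookup: a missing document is reported as an error.
        try:
            return self.docs["coordinator"]
        except KeyError:
            raise DocumentNotFound("coordinator document not found") from None

    def delete_coordinator_doc(self):
        self.docs.pop("coordinator", None)


def make_cleanup_step(store, tolerate_missing):
    """Build a step that reads the document, deletes it, then fails once."""
    failures_left = [1]  # the first attempt is interrupted after the delete

    def step():
        try:
            store.get_coordinator_doc()
        except DocumentNotFound:
            if tolerate_missing:
                # A previous attempt already deleted the document.
                return "already cleaned up"
            raise
        store.delete_coordinator_doc()
        if failures_left[0] > 0:
            failures_left[0] -= 1
            raise TransientError("interrupted after delete")
        return "cleaned up"

    return step


def run_with_retry(step, max_attempts=3):
    """Re-run the step on transient errors; any other exception propagates."""
    for _ in range(max_attempts):
        try:
            return step()
        except TransientError:
            continue
    raise RuntimeError("retries exhausted")
```

      With tolerate_missing=False, the second attempt raises DocumentNotFound, which mirrors the fatal error in this report. With tolerate_missing=True, the retry treats the missing document as evidence that the earlier attempt already completed the delete.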

            Assignee:
            Brett Nawrocki
            Reporter:
            Brett Nawrocki
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: