  Core Server / SERVER-44607

Rollback of an interrupted setFCV cmd can result in the in-memory serverGlobalParams.featureCompatibility diverging from what's written on disk

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • ALL
    • v4.4, v4.2
    • Execution Team 2020-01-13, Execution Team 2020-01-27, Execution Team 2019-12-30, Execution Team 2020-03-09, Execution Team 2020-03-23
    • 23

      I recommend reloading the feature compatibility version from disk, parsing it, and resetting the in-memory serverGlobalParams.featureCompatibility value after a repl rollback has finished.

      Scenario:
      1) setFCV does its first write, setting "downgrading to 4.2"; this write is majority committed.
      2) setFCV does its second write, setting "fully downgraded to 4.2"; this write is not majority committed because of an InterruptedDueToReplStateChange error.
      3) Repl rollback undoes the second write, so the FCV document on disk is back to "downgrading to 4.2".
      4) The in-memory serverGlobalParams.featureCompatibility value is still set to "fully downgraded to 4.2".
      5) A new setFCV(4.2) cmd comes in, sees that serverGlobalParams.featureCompatibility is "fully downgraded to 4.2", and exits early.
      6) The node can now be restarted, at which point it loads "downgrading to 4.2" from disk into serverGlobalParams.featureCompatibility.
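
      The scenario and the recommended fix can be sketched as a toy model in Python. This is not MongoDB server code: ToyNode, its methods, the reload_after_rollback flag, and the starting state "fully upgraded to 4.4" are all hypothetical names and assumptions chosen for illustration. The in_memory attribute merely stands in for serverGlobalParams.featureCompatibility.

```python
# Toy model only: every name here is hypothetical, not part of the server.

DOWNGRADING = "downgrading to 4.2"
DOWNGRADED = "fully downgraded to 4.2"


class ToyNode:
    def __init__(self, reload_after_rollback):
        self.reload_after_rollback = reload_after_rollback
        # Assumed starting state; the ticket does not state it.
        self.committed = "fully upgraded to 4.4"  # majority-committed FCV document
        self.on_disk = self.committed             # latest local write
        self.in_memory = self.on_disk             # stand-in for serverGlobalParams.featureCompatibility

    def write_fcv(self, state, majority_committed):
        # A write updates both the local disk copy and the in-memory value.
        self.on_disk = state
        self.in_memory = state
        if majority_committed:
            self.committed = state

    def rollback(self):
        # Undo local writes that were never majority committed.
        self.on_disk = self.committed
        if self.reload_after_rollback:
            # Recommended fix: refresh the in-memory value from disk.
            self.in_memory = self.on_disk

    def set_fcv_42(self):
        # Returns False when the command exits early, True when it does work.
        if self.in_memory == DOWNGRADED:
            return False
        self.write_fcv(DOWNGRADING, majority_committed=True)
        self.write_fcv(DOWNGRADED, majority_committed=True)
        return True

    def restart(self):
        # On startup the in-memory value is loaded from disk.
        self.in_memory = self.on_disk


def run_scenario(reload_after_rollback):
    node = ToyNode(reload_after_rollback)
    node.write_fcv(DOWNGRADING, majority_committed=True)   # step 1
    node.write_fcv(DOWNGRADED, majority_committed=False)   # step 2, interrupted
    node.rollback()                                        # step 3
    after_rollback = (node.on_disk, node.in_memory)        # step 4
    did_work = node.set_fcv_42()                           # step 5
    node.restart()                                         # step 6
    return after_rollback, did_work, node.in_memory
```

      Under these assumptions, run_scenario(False) leaves disk and memory disagreeing after the rollback, the retried setFCV exits early, and the restart loads "downgrading to 4.2". With run_scenario(True), the two values agree after the rollback and the retried setFCV completes the downgrade.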

      See the associated test failure for further details of how this happened in a test.

      This should be backported to at least v4.2, where there's a test failure as well. I haven't checked earlier versions for the issue, but it seems likely to be present there too.

            Assignee:
            dianna.hohensee@mongodb.com Dianna Hohensee (Inactive)
            Reporter:
            dianna.hohensee@mongodb.com Dianna Hohensee (Inactive)
            Votes:
            0
            Watchers:
            6
