Core Server / SERVER-53894

setFCV transitions to fully upgraded/downgraded too early on command retry

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.9.0
    • Affects Version/s: 4.9.0
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Repl 2021-02-08

      Currently, when a setFCV command fails, the FCV can be left in one of the intermediate "upgrading"/"downgrading" states. This should be safe because setFCV is expected to be idempotent: a user can simply run setFCV again to complete the upgrade or downgrade.

      In SERVER-51474, we refactored and simplified a lot of the FCV code. This included adding an upgradeFeatureCompatibilityVersionDocument function that updates the FCV document to the next version if the requested version is a viable transition and differs from the current version. The setFCV command calls this function twice: once to transition from downgraded -> upgrading, and again to transition from upgrading -> upgraded.
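
The one-step transition behavior described above can be modeled roughly as follows. This is a minimal Python sketch, not the server's C++ code: the FCV enum, the NEXT_ON_UPGRADE table, and the snake_case function names are hypothetical stand-ins, and only the upgrade path is modeled.

```python
from enum import Enum


class FCV(Enum):
    DOWNGRADED = "downgraded"
    UPGRADING = "upgrading"
    UPGRADED = "upgraded"


# Assumed upgrade path: each state maps to the next state toward "upgraded".
NEXT_ON_UPGRADE = {
    FCV.DOWNGRADED: FCV.UPGRADING,
    FCV.UPGRADING: FCV.UPGRADED,
}


def upgrade_fcv_document(current: FCV, requested: FCV) -> FCV:
    """Stand-in for upgradeFeatureCompatibilityVersionDocument.

    Advances the FCV by one step when the requested version differs from
    the current one; otherwise leaves it unchanged.
    """
    if current == requested:
        return current
    return NEXT_ON_UPGRADE[current]


def set_fcv(current: FCV, requested: FCV) -> list:
    """Model of setFCV calling the transition function twice."""
    states = [current]
    current = upgrade_fcv_document(current, requested)  # downgraded -> upgrading
    states.append(current)
    current = upgrade_fcv_document(current, requested)  # upgrading -> upgraded
    states.append(current)
    return states


states = set_fcv(FCV.DOWNGRADED, FCV.UPGRADED)
```

Note that the transition function only looks at the stored FCV, so it cannot tell a fresh setFCV call from a retry.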

      This is problematic because we often add upgrade/downgrade logic in the middle of a setFCV call. Examples are the reconfig we have to do after transitioning to the "upgrading" state in the safe reconfig project, and the logic added as part of SERVER-50423. In the following scenario, that logic never runs:
      1. Call setFCV(upgradeVersion). upgradeFeatureCompatibilityVersionDocument(upgradeVersion) is called and sets the FCV to upgrading. setFCV fails and returns an error before it can complete.
      2. Call setFCV(upgradeVersion) again. upgradeFeatureCompatibilityVersionDocument(upgradeVersion) now transitions the FCV from upgrading to upgraded. The node fails again before setFCV completes.
      3. The node is now fully upgraded, but it never completed the additional upgrade/downgrade behavior that is part of the setFCV command.
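
The three steps above can be reproduced with a small simulation. This is a sketch under stated assumptions, not server code: the Node class, the upgrade_work_done flag (standing in for any mid-command upgrade logic such as a reconfig), and the fail_midway switch are all hypothetical.

```python
class SetFCVFailed(Exception):
    """Raised when the modeled setFCV command fails partway through."""


class Node:
    def __init__(self):
        self.fcv = "downgraded"
        self.upgrade_work_done = False

    def _advance(self, target):
        # Models upgradeFeatureCompatibilityVersionDocument: move one step
        # toward `target`, or do nothing if the FCV already equals it.
        if self.fcv == target:
            return
        self.fcv = "upgrading" if self.fcv == "downgraded" else target

    def set_fcv(self, target, fail_midway=False):
        self._advance(target)  # first call to the transition function
        if fail_midway:
            raise SetFCVFailed("failed before the upgrade work ran")
        self.upgrade_work_done = True  # hypothetical mid-command upgrade work
        self._advance(target)  # second call to the transition function


def run_scenario():
    """Run setFCV twice, failing midway both times (steps 1 and 2)."""
    node = Node()
    history = []
    for _ in range(2):
        try:
            node.set_fcv("upgraded", fail_midway=True)
        except SetFCVFailed:
            pass
        history.append(node.fcv)
    return node, history


node, history = run_scenario()
```

In this model, the second attempt's first transition call is what moves the FCV to "upgraded", even though the work between the two calls has never run.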

      Ultimately, we should not enter the fully upgraded/downgraded FCV until the command has succeeded (and all upgrade/downgrade behavior has been performed).
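
One way to satisfy this invariant is sketched below. It is an illustration only and is not taken from the actual fix for this ticket: the Node class, the upgrade_work_done flag, and the fail_midway switch are hypothetical. The idea is that a retry that finds the FCV already in the transitional state does not advance it, and the final transition happens only after the mid-command work has run.

```python
class SetFCVFailed(Exception):
    """Raised when the modeled setFCV command fails partway through."""


class Node:
    def __init__(self):
        self.fcv = "downgraded"
        self.upgrade_work_done = False

    def set_fcv(self, target="upgraded", fail_midway=False):
        if self.fcv == target:
            return  # already fully upgraded, which implies the work ran
        if self.fcv == "downgraded":
            self.fcv = "upgrading"  # enter the transitional state once
        # On a retry the FCV is already "upgrading", so nothing advances here.
        if fail_midway:
            raise SetFCVFailed("failed before the upgrade work ran")
        self.upgrade_work_done = True  # hypothetical work; must be idempotent
        self.fcv = target  # final transition is the last step of the command


def retry_until_success(node, failures):
    """Fail `failures` times, then succeed; record the state after each try."""
    history = []
    for attempt in range(failures + 1):
        try:
            node.set_fcv(fail_midway=attempt < failures)
        except SetFCVFailed:
            pass
        history.append((node.fcv, node.upgrade_work_done))
    return history


history = retry_until_success(Node(), failures=2)
```

With this ordering, any number of failed attempts leaves the node in "upgrading", and "upgraded" is only ever observed together with completed upgrade work.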

            Assignee:
            ali.mir@mongodb.com Ali Mir
            Reporter:
            jason.chan@mongodb.com Jason Chan
            Votes:
            0
            Watchers:
            4
