-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
ALL
-
v4.4, v4.2
-
Execution Team 2020-01-13, Execution Team 2020-01-27, Execution Team 2019-12-30, Execution Team 2020-03-09, Execution Team 2020-03-23
-
23
After a replication rollback has finished, I recommend reloading the feature compatibility version (FCV) document from disk, parsing it, and resetting the in-memory serverGlobalParams.featureCompatibility value to match.
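A minimal sketch of that recommendation, written as a toy Python model rather than server code. The class ToyNode, its fields, and the hook name on_rollback_complete are hypothetical names chosen for illustration; they are not MongoDB APIs. The only point is that the cached value is overwritten from the persisted value once rollback is done.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical stand-in for a mongod; not real server code."""

    on_disk_fcv: str    # stands in for the persisted FCV document
    in_memory_fcv: str  # stands in for serverGlobalParams.featureCompatibility

    def on_rollback_complete(self) -> None:
        # Rollback may have reverted the FCV document, so re-read the
        # persisted value and overwrite the cache instead of trusting it.
        self.in_memory_fcv = self.on_disk_fcv


# State right after a rollback that undid the final setFCV write.
node = ToyNode(
    on_disk_fcv="downgrading to 4.2",
    in_memory_fcv="fully downgraded to 4.2",
)
node.on_rollback_complete()
print(node.in_memory_fcv)
```

With the hook in place, the in-memory value follows the on-disk value, so a later setFCV(4.2) would no longer see a stale "fully downgraded to 4.2".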
Scenario:
1) setFCV performs its first write, setting the FCV to "downgrading to 4.2"; this write is majority committed.
2) setFCV performs its second write, setting the FCV to "fully downgraded to 4.2"; this write is not majority committed because of an InterruptedDueToReplStateChange error.
3) Replication rollback undoes the second write, so the FCV document on disk is back to "downgrading to 4.2".
4) The in-memory serverGlobalParams.featureCompatibility value is not touched by the rollback and is still "fully downgraded to 4.2".
5) A new setFCV(4.2) command arrives, sees that serverGlobalParams.featureCompatibility is "fully downgraded to 4.2", and exits early.
6) If the node is now restarted, it loads "downgrading to 4.2" from disk into serverGlobalParams.featureCompatibility, even though the setFCV(4.2) command in step 5 treated the downgrade as complete.
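The scenario can be replayed with a small Python simulation. This is a sketch under assumptions, not server code: ToyNode and all of its methods are hypothetical, the starting value "4.4" is assumed (the ticket does not state it), and rollback is modeled simply as reverting the on-disk value to the last majority-committed one.

```python
DOWNGRADING = "downgrading to 4.2"
DOWNGRADED = "fully downgraded to 4.2"


class ToyNode:
    """Hypothetical model of one node's FCV state; not a MongoDB API."""

    def __init__(self) -> None:
        self.on_disk_fcv = "4.4"             # assumed starting state
        self.in_memory_fcv = "4.4"           # cached copy of the FCV
        self.majority_committed_fcv = "4.4"  # last durable value

    def write_fcv(self, value: str, majority_committed: bool) -> None:
        # Every FCV write updates disk and the in-memory cache together.
        self.on_disk_fcv = value
        self.in_memory_fcv = value
        if majority_committed:
            self.majority_committed_fcv = value

    def rollback(self) -> None:
        # Models the bug: only the on-disk value is reverted.
        self.on_disk_fcv = self.majority_committed_fcv

    def set_fcv_42(self) -> str:
        # Early exit is decided from the cached value alone.
        if self.in_memory_fcv == DOWNGRADED:
            return "early exit"
        self.write_fcv(DOWNGRADING, majority_committed=True)
        self.write_fcv(DOWNGRADED, majority_committed=True)
        return "ran"

    def restart(self) -> None:
        # On startup the cache is rebuilt from disk.
        self.in_memory_fcv = self.on_disk_fcv


node = ToyNode()
node.write_fcv(DOWNGRADING, majority_committed=True)   # step 1
node.write_fcv(DOWNGRADED, majority_committed=False)   # step 2
node.rollback()                                        # step 3
stale_value = node.in_memory_fcv                       # step 4
result = node.set_fcv_42()                             # step 5
node.restart()                                         # step 6
print(stale_value, "|", result, "|", node.in_memory_fcv)
```

In this model the retry in step 5 does nothing, and the restart in step 6 leaves the node in "downgrading to 4.2". Refreshing in_memory_fcv from on_disk_fcv at the end of rollback() would make step 5 run the downgrade again instead of exiting early.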
See the associated test failure for further details of how this happened in a test.
This should be backported to at least v4.2, where there is a test failure as well. I haven't checked whether earlier versions are affected, but it seems likely.
- duplicates
-
SERVER-46758 setFCV can be interrupted before an FCV change is majority committed and rollback the FCV without running the setFCV server logic
- Closed