SERVER-65184 (Core Server)

Avoid concurrent election and stepdown in downgrade_default_write_concern_majority.js

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 5.0.9
    • Affects Version/s: None
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • Repl 2022-04-04, Repl 2022-04-18, Repl 2022-05-02
    • 44

      As part of downgrading the cluster, the test stops the config server mongod. Shutdown includes stepping the node down before collection validation. However, an election can run concurrently with the stepdown: in the log below, the node wins an election and transitions to PRIMARY about 30 ms before it receives replSetStepDown. The killOp thread for that state transition then kills the stepDown command with InterruptedDueToReplStateChange:

      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I ELECTION 21450 [ReplCoord-9] "Election succeeded, assuming primary role","attr":{"term":2}
      [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.425+00:00 I REPL 21358 [ReplCoord-9] "Replica set state transition","attr":{"newState":"PRIMARY","oldState":"SECONDARY"}
      ...
       [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.456+00:00 I COMMAND 21579 [conn94] "Attempting to step down in response to replSetStepDown command"
       ...
       [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.487+00:00 I REPL 21343 [RstlKillOpThread] "Starting to kill user operations"
       [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21344 [RstlKillOpThread] "Stopped killing user operations"
       [js_test:downgrade_default_write_concern_majority] c20781| 2022-03-28T15:18:37.490+00:00 I REPL 21340 [RstlKillOpThread] "State transition ops metrics","attr":{"metrics":{"lastStateTransition":"stepUp","userOpsKilled":1,"userOpsRunning":4}}
      

      One way to fix this is to set the cluster's secondaries to votes: 0, since this test is not meant to exercise election behavior. An alternative is to add InterruptedDueToReplStateChange at https://github.com/10gen/mongo/blob/1cc143da4077560d714d99471b8006c0dec5f66a/jstests/libs/override_methods/validate_collections_on_shutdown.js#L87 in validate_collections_on_shutdown.js.
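
      A rough sketch of both options in plain JavaScript, for illustration only. The helper names makeSecondariesNonVoting and stepDownIgnoringStateChange are hypothetical and do not exist in the jstests libraries, and the sketch assumes the first member is the intended primary. It relies on two facts: a member with votes: 0 must also have priority: 0, and InterruptedDueToReplStateChange has error code 11602.

```javascript
// Option 1 (hypothetical helper): return a copy of a replica set config in
// which every member except the intended primary is non-voting. A member
// with votes: 0 must also have priority: 0, so both fields are set.
function makeSecondariesNonVoting(config, primaryIndex = 0) {
    const members = config.members.map((member, i) =>
        i === primaryIndex ? {...member} : {...member, votes: 0, priority: 0});
    return {...config, members};
}

// Error code of InterruptedDueToReplStateChange.
const kInterruptedDueToReplStateChange = 11602;

// Option 2 (hypothetical helper): run a caller-supplied stepdown function and
// tolerate only InterruptedDueToReplStateChange. Returns true if the stepdown
// completed, false if it was interrupted by a concurrent state transition.
// Any other error is rethrown unchanged.
function stepDownIgnoringStateChange(stepDownFn) {
    try {
        stepDownFn();
        return true;
    } catch (e) {
        if (e.code === kInterruptedDueToReplStateChange) {
            return false;
        }
        throw e;
    }
}
```

      With non-voting secondaries no other node can call an election, so the race cannot occur. Tolerating the error instead leaves the topology unchanged but accepts that the stepdown may not have run.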

            Assignee: Jason Chan (jason.chan@mongodb.com)
            Reporter: Jason Chan (jason.chan@mongodb.com)
            Votes: 0
            Watchers: 2
