  1. Core Server
  2. SERVER-31461

Resmoke stepdown hook should deal with NotMaster errors

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Testing Infrastructure
    • TIG 2017-10-23, TIG 2017-11-13

      Right now, the stepdown thread's main loop doesn't appear to wait for a new primary to be elected before sending another replSetStepDown command. As a result, it can send replSetStepDown to a server that isn't a primary and receive a NotMaster error in response. Here's an example of a patch build where this happens (search for "not primary"):

      https://evergreen.mongodb.com/task_log_raw/mongodb_mongo_master_windows_64_2k8_ssl_jstestfuzz_concurrent_sharded_continuous_stepdown_patch_9e72a50f1ede62ab9f5899cf8f10dd93ca0c45d1_59d7e4d5e3c3312e74002b4d_17_10_06_20_18_01/0?type=T&text=true

      I think the StepDownThread should catch these NotMaster errors and ignore them, just as it already does with "connection failure" errors.
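
      A minimal sketch of that first option, not resmoke's actual code. The exception classes and the try_step_down helper are hypothetical stand-ins so the example runs without a driver; a real hook would catch whatever errors its MongoDB driver raises for these two cases.

```python
class ConnectionFailure(Exception):
    """Hypothetical stand-in for a driver's connection-failure error."""


class NotMasterError(Exception):
    """Hypothetical stand-in for the error a non-primary returns."""


def try_step_down(run_step_down):
    """Make one step-down attempt and report what happened.

    `run_step_down` is any zero-argument callable that sends
    replSetStepDown to the node currently believed to be primary.
    """
    try:
        run_step_down()
    except ConnectionFailure:
        # Per the description above, the hook already ignores these.
        return "connection failure ignored"
    except NotMasterError:
        # The target was not primary when the command arrived, so
        # there is nothing to step down; skip this round.
        return "not primary ignored"
    return "stepped down"
```

      Returning a status string instead of silently swallowing the error keeps the skipped rounds visible in the hook's log output.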

      Another solution would be for the thread to wait until a primary is elected before stepping a node down.
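
      That alternative could be sketched as a polling helper, again under assumptions: wait_for_primary and the find_primary callable are hypothetical names, and find_primary is assumed to return the current primary's address, or None while an election is still in progress.

```python
import time


def wait_for_primary(find_primary, timeout_secs=30.0, poll_interval_secs=0.1):
    """Poll until `find_primary` reports a primary, then return it.

    Raises TimeoutError if no primary shows up within `timeout_secs`.
    """
    deadline = time.monotonic() + timeout_secs
    while True:
        primary = find_primary()
        if primary is not None:
            return primary
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no primary elected within %.1f seconds" % timeout_secs)
        time.sleep(poll_interval_secs)
```

      The stepdown loop would call this before each replSetStepDown. A NotMaster error is still possible if the primary changes between the check and the command, so this narrows the race rather than removing it.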

            Assignee:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Reporter:
            ian.boros@mongodb.com Ian Boros
            Votes:
            0
            Watchers:
            5
