Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-2975

Replica set master failure detection

    • Type: Icon: Improvement Improvement
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 1.8.1
    • Component/s: Replication
    • None
    • Environment:
      linux, ec2, ebs

      During the massive EC2 fail earlier this morning, the master of one of our replica set was impacted, not responding to the clients still connected without closing the connections. The other members of the set did not pick the failure up, and it was not possible to send a "stepdown" command to it. (As the computer was not answering ssh, we did a remote reboot to force the replica set on its two other feet).

      • the replica failure detection should be less optimistic
      • It should be possible to trigger election from a secondary in such a situation

            Assignee:
            kristina Kristina Chodorow (Inactive)
            Reporter:
            kali Mathieu Poumeyrol
            Votes:
            2 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: