Core Server / SERVER-28857

Strange election on network failure

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 3.4.3
    • Component/s: Replication
    • ALL

      The replica set in question has 3 nodes (see rs.conf() below).
      t1 IP address is 10.3.1.12
      t2 IP address is 10.3.1.13
      t3 IP address is 10.3.1.16

      After a transient network failure on the secondary t3 (its switch ports were disabled and then re-enabled), t3 became primary, causing rollbacks on the previous primary (t1) and the other secondary (t2). All writes are done with w:majority, so rollbacks on two of the three nodes are unexpected. Logs from all three machines are attached.

      rs.conf()

      {
              "_id" : "driveFS-temp-1",
              "version" : 4,
              "protocolVersion" : NumberLong(1),
              "writeConcernMajorityJournalDefault" : false,
              "members" : [
                      {
                              "_id" : 0,
                              "host" : "t1.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 1,
                              "host" : "t2.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      },
                      {
                              "_id" : 2,
                              "host" : "t3.s1.fs.drive.bru:27231",
                              "arbiterOnly" : false,
                              "buildIndexes" : true,
                              "hidden" : false,
                              "priority" : 1,
                              "tags" : {
      
                              },
                              "slaveDelay" : NumberLong(0),
                              "votes" : 1
                      }
              ],
              "settings" : {
                      "chainingAllowed" : true,
                      "heartbeatIntervalMillis" : 2000,
                      "heartbeatTimeoutSecs" : 10,
                      "electionTimeoutMillis" : 5000,
                      "catchUpTimeoutMillis" : 2000,
                      "getLastErrorModes" : {
      
                      },
                      "getLastErrorDefaults" : {
                              "w" : 1,
                              "wtimeout" : 0
                      },
                      "replicaSetId" : ObjectId("58c9657b40aba377920b23f2")
              }
      }
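
      For reference, the vote arithmetic implied by this configuration can be sketched as follows. This is a minimal illustration in plain JavaScript, assuming the usual strict-majority rule (more than half of the voting members); majorityInfo is a hypothetical helper, not a mongo shell function, and the conf object is a trimmed copy of the output above that keeps only the fields the helper reads.

```javascript
// Hypothetical helper (not part of the mongo shell): derives vote counts
// from an object shaped like the output of rs.conf().
function majorityInfo(conf) {
  // Only members with votes > 0 take part in elections.
  const voters = conf.members.filter((m) => m.votes > 0);
  const votingMembers = voters.length;
  // Assumption: a strict majority of voting members is required.
  const majority = Math.floor(votingMembers / 2) + 1;
  return {
    votingMembers,
    majority,
    // How many voting members can be unreachable while a majority remains.
    faultTolerance: votingMembers - majority,
  };
}

// Trimmed copy of the configuration above.
const conf = {
  _id: "driveFS-temp-1",
  members: [
    { _id: 0, host: "t1.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 1, host: "t2.s1.fs.drive.bru:27231", votes: 1 },
    { _id: 2, host: "t3.s1.fs.drive.bru:27231", votes: 1 },
  ],
};

const info = majorityInfo(conf);
console.log(info);
```

      Under that assumption, a majority here is 2 of 3 voting members: a candidate such as t3 needs its own vote plus one more to be elected, and a w:majority write needs acknowledgement from 2 members.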
      

        1. t1.log.gz
          16 kB
        2. t2.log.gz
          15 kB
        3. t3.log.gz
          16 kB

            Assignee:
            kelsey.schubert@mongodb.com Kelsey Schubert
            Reporter:
            onyxmaster Aristarkh Zagorodnikov
            Votes:
            1
            Watchers:
            7
