Core Server / SERVER-10982

Replica set may not fail over when primary is not responsive

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • None
    • Affects Version/s: 2.4.6
    • Component/s: Replication
    • Environment:
      Linux mongo0 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux
    • Replication
    • Linux
    • Steps to reproduce:
      Not tested under controlled circumstances: set up a replica set, store the MongoDB data on a separate device on the primary, then make that device unresponsive (but keep it mounted).

      We had a hardware issue with our Mongo replica set primary. The exact reason is still unknown, but it appears that I/O commands to its SSD (which holds all MongoDB data but not the operating system or the MongoDB installation itself) did not return.

      dmesg output (full output is attached):
      [2195482.937229] INFO: task mongod:2731 blocked for more than 120 seconds.
      [2195482.937416] mongod D ffff88063fc13780 0 2731 1 0x00000000
      [2195482.937421] ffff88033147d1e0 0000000000000086 ffff880600000000 ffff880333239590
      [2195482.937426] 0000000000013780 ffff8803324adfd8 ffff8803324adfd8 ffff88033147d1e0
      [2195482.937432] ffffffff8101360a 00000001810660a1 ffff8803316822f0 ffff88063fc13fd0
      [.....]

      MongoDB's log file does not show anything out of the ordinary.

      Result:
      The replica set's heartbeat thought that our primary was fine, but it was not actually doing any work (all it did was wait for a broken disk). As a result, connections piled up and our entire application stalled. As soon as I manually shut down MongoDB on that machine, the failover happened as it should (although the Java driver didn't recover properly after that, but that's a separate issue).

      Now, I'm not even sure if this is a valid bug report, but I think there is some room for improvement in the replica set's heartbeat code. I can imagine various situations in which a machine responds to heartbeats but is not actually doing any work: "swap to death" situations, all sorts of I/O issues (e.g. an NFS/iSCSI/whatever mounted file system with network problems), or hardware issues similar to the ones we had.
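To illustrate the kind of check suggested above, here is a minimal sketch of a liveness probe that only reports "healthy" if a small write to the data volume actually completes within a deadline. This is not MongoDB's heartbeat code; the function names `write_probe` and `is_storage_responsive` and the timeout values are hypothetical, chosen only for illustration.

```python
import os
import tempfile
import threading


def write_probe(directory):
    """Write and fsync a small temporary file in `directory`.

    Raises OSError if the write fails; blocks if the device hangs.
    """
    fd, path = tempfile.mkstemp(prefix=".io_probe_", dir=directory)
    try:
        os.write(fd, b"ping")
        os.fsync(fd)  # force the data out to the device, not just the page cache
    finally:
        os.close(fd)
        os.unlink(path)


def is_storage_responsive(probe, timeout_s):
    """Return True only if `probe()` finishes within `timeout_s` without an OSError.

    The probe runs in a daemon thread so that a hung I/O call blocks the
    worker thread rather than the caller.
    """
    result = {"ok": False}

    def run():
        try:
            probe()
            result["ok"] = True
        except OSError:
            result["ok"] = False

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    worker.join(timeout_s)
    # Still alive after the deadline means the probe is stuck: report unhealthy.
    return (not worker.is_alive()) and result["ok"]
```

A monitor could call `is_storage_responsive(lambda: write_probe(data_dir), 5.0)` periodically, where `data_dir` is assumed to be the directory holding the database files, and step the node down or alert when it returns False. Note that a thread stuck in uninterruptible I/O cannot be cancelled, so each failed probe may leave one blocked worker thread behind.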

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            dg@doodle.com David Gubler
            Votes:
            0
            Watchers:
            9
