Core Server / SERVER-95560

Majority secondaries can enter rollback state during a failover where the old primary is frozen momentarily

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: 4.4.9, 7.0.11
    • Component/s: None
    • None
    • Replication
    • ALL
    • Attached is the patch file with the JS test that reproduces this, along with the associated code changes. The test has reproduced the issue on every run.
    • Repl 2024-10-14, Repl 2024-10-28

      We have observed two cases of failover on our MongoDB setup running v4.4.9 in which a majority of the secondaries entered rollback state. Chaining is disabled on our setup. We then attempted to reproduce this scenario on v7.0 using JS tests and believe the bug still exists.

      Below is a rough sequence of events that can lead to rollback; the associated JS test is attached as a patch file. Note that we added sleeps to the source code to better simulate what we saw on our setup.

      1. The old primary freezes: its threads stop making progress.
      2. Meanwhile, write requests are issued to the old primary, and these get stuck too.
      3. An election is triggered because no progressing primary is visible, and a new primary wins it.
      4. During the catch-up phase on the new primary, the writes from (2) unfreeze on the old primary and make their way into its oplog.
      5. All secondaries sync these writes to their oplogs.
      6. The new primary exits the catch-up phase and declares itself ready to accept writes.
      7. The secondaries switch their sync source to the new primary, find that their oplogs have diverged, and enter rollback state for several minutes.
      8. During (7), the cluster is unavailable for reads and writes, which effectively renders it down.
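
The sequence above can be illustrated with a small toy model. This is only a sketch and not MongoDB's actual replication code: the names `OplogEntry`, `common_point`, `entries_to_roll_back` and `simulate_failover` are hypothetical, oplog entries are reduced to a (term, timestamp) pair, and the model assumes the new primary never fetches the late writes from the old primary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OplogEntry:
    term: int  # election term in which the entry was written
    ts: int    # simplified logical timestamp


def common_point(local: List[OplogEntry], source: List[OplogEntry]) -> int:
    """Return the index of the last entry shared by both oplogs, or -1."""
    idx = -1
    for i, (a, b) in enumerate(zip(local, source)):
        if a != b:
            break
        idx = i
    return idx


def entries_to_roll_back(local: List[OplogEntry],
                         source: List[OplogEntry]) -> List[OplogEntry]:
    """Entries in `local` after the common point are absent from `source`."""
    return local[common_point(local, source) + 1:]


def simulate_failover() -> List[OplogEntry]:
    # History replicated to every node before the old primary froze (term 1).
    shared = [OplogEntry(1, 1), OplogEntry(1, 2)]
    old_primary = list(shared)
    new_primary = list(shared)
    secondary = list(shared)

    # A new primary is elected in term 2 and enters catch-up.
    # The stuck writes then unfreeze on the old primary, still in term 1.
    old_primary += [OplogEntry(1, 3), OplogEntry(1, 4)]

    # The secondary is still syncing from the old primary and copies them.
    secondary = list(old_primary)

    # Assumption of this model: the new primary leaves catch-up without
    # those writes and starts writing in term 2.
    new_primary.append(OplogEntry(2, 5))

    # The secondary switches sync source and compares oplogs.
    return entries_to_roll_back(secondary, new_primary)


if __name__ == "__main__":
    print(simulate_failover())
```

In this model the secondary has to discard the two term-1 entries it copied from the old primary, which corresponds to the rollback described in the report.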

            Assignee:
            wenbin.zhu@mongodb.com Wenbin Zhu
            Reporter:
            preeti.murthy@gmail.com Preeti Murthy
            Votes:
            0
            Watchers:
            15

              Created:
              Updated:
              Resolved: