Core Server / SERVER-77748

movePrimary coordinator does not clear database metadata in case of stepdown

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.1.0-rc0, 7.0.0-rc4
    • Affects Version/s: 7.0.0-rc2
    • Component/s: None
    • None
    • Sharding EMEA
    • Fully Compatible
    • ALL
    • v7.0
    • Stepdown on the coordinator shard while the movePrimary coordinator is running, after completion of the kCommit phase and at the beginning of kExitCriticalSection.
    • Sprint: Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26
    • 113

      If a primary failover happens during a movePrimary operation, we could fail to clear database metadata on the original primary node of the coordinator shard, leading to possible data loss.

      As part of the movePrimary coordinator, database metadata on the primary node is explicitly cleared in the kCommit phase, while on secondary nodes it is cleared indirectly when the database recoverable critical section is exited in the kExitCriticalSection phase.

      If a step-down happens between these two phases and a new primary node is elected on the coordinator shard, the metadata on the new primary may never be cleared.

      Consider the following scenario:

      • kCommit
        • N1 (primary)   -> db metadata cleared
        • N2 (secondary) -> db metadata not cleared
      • kExitCriticalSection
        • N1 (secondary) -> db metadata cleared
        • N2 (primary)   -> db metadata not cleared
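
      The scenario above can be reproduced with a small toy model. This is only an illustrative sketch in Python, not the server's actual C++ implementation: the `Node` class and the functions `commit_phase`, `exit_critical_section_phase`, `step_down`, and `run_move_primary` are hypothetical names invented for this example. It assumes exactly the behavior described in this ticket: kCommit clears metadata only on whichever node is primary at that time, and kExitCriticalSection clears it only on nodes that are secondaries at that time.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a replica set member of the coordinator shard."""
    name: str
    is_primary: bool
    has_stale_db_metadata: bool = True  # True until the metadata is cleared


def commit_phase(nodes):
    # kCommit (as described in the ticket): explicit clear on the current primary only.
    for node in nodes:
        if node.is_primary:
            node.has_stale_db_metadata = False


def exit_critical_section_phase(nodes):
    # kExitCriticalSection: indirect clear on the current secondaries only.
    for node in nodes:
        if not node.is_primary:
            node.has_stale_db_metadata = False


def step_down(nodes, new_primary_name):
    # Model a failover: the named node becomes primary, all others secondary.
    for node in nodes:
        node.is_primary = node.name == new_primary_name


def run_move_primary(stepdown_between_phases):
    nodes = [Node("N1", is_primary=True), Node("N2", is_primary=False)]
    commit_phase(nodes)
    if stepdown_between_phases:
        step_down(nodes, "N2")
    exit_critical_section_phase(nodes)
    # Map node name -> whether stale database metadata is still present.
    return {node.name: node.has_stale_db_metadata for node in nodes}


if __name__ == "__main__":
    print("no stepdown:  ", run_move_primary(False))
    print("with stepdown:", run_move_primary(True))
```

      Under these assumptions, the run without a stepdown clears the metadata on both nodes, while the run with a stepdown between the two phases leaves N2 (the new primary) holding stale metadata, matching the table above.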

            Assignee:
            enrico.golfieri@mongodb.com Enrico Golfieri
            Reporter:
            tommaso.tocci@mongodb.com Tommaso Tocci
            Votes:
            0
            Watchers:
            6
