Core Server / SERVER-93707

ShardRegistry::scheduleReplicaSetUpdateOnConfigServerIfNeeded can write an incorrect config version

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • v8.0
    • CAR Team 2024-09-02
    • 0

      SERVER-21185 made the shard primary responsible for updating its corresponding connection string in config.shards on the CSRS.

      This job is started during both step-up and reconfig. The task's code does not use a single ReplSetConfig snapshot; instead, it fetches the connection string first and the config version separately afterwards. The config version is used to prevent overwriting newer concurrent updates. However, because the two values are not read from the same snapshot, an update job can read the connection string and then, by the time it fetches the config version, observe a newer version. This results in writing an old connection string with the new config version.
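      The difference between the two read patterns can be sketched in a few lines of Python. This is only an illustrative model, not the server code: ReplSetConfig here is a hypothetical frozen dataclass, Coordinator, torn_read, and snapshot_read are made-up names, and the between_reads hook exists only to inject a reconfig at the racy point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplSetConfig:
    """Hypothetical stand-in for the replica set config (not the server class)."""
    version: int
    members: tuple

    def conn_string(self, set_name="rs0"):
        return f"{set_name}/" + ",".join(self.members)


class Coordinator:
    """Hypothetical holder of the current config; reconfig swaps it wholesale."""

    def __init__(self, config):
        self._config = config

    def get_config(self):
        return self._config

    def reconfig(self, config):
        self._config = config


def torn_read(coord, between_reads=lambda: None):
    # Two separate reads: a reconfig can land between them, so the
    # connection string and the version may come from different configs.
    conn = coord.get_config().conn_string()
    between_reads()
    version = coord.get_config().version
    return conn, version


def snapshot_read(coord, between_reads=lambda: None):
    # One read: both values come from the same immutable config object,
    # so a concurrent reconfig cannot mix old and new state.
    cfg = coord.get_config()
    between_reads()
    return cfg.conn_string(), cfg.version
```

      With a reconfig injected between the reads, torn_read pairs the old connection string with the new version, while snapshot_read returns a consistent (old, old) pair that a version guard can then safely reject or supersede.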

      See the attached SERVER-93707.diff, which exercises the following sequence:

      1. The node steps up and runs scheduleReplicaSetUpdateOnConfigServerIfNeeded.
      2. The task launched by step-up starts.
        1. It reads the connection string.
        2. It pauses after reading the connection string.
      3. A reconfig removes a secondary node, triggering another scheduleReplicaSetUpdateOnConfigServerIfNeeded call.
      4. The task launched by the reconfig starts.
        1. It pauses before the update.
      5. The task launched by step-up continues.
        1. It resumes execution.
        2. It reads the config version.
        3. It updates config.shards with the connection string from before the node removal, but with the config version from after the node removal.
      6. The task launched by the reconfig continues.
        1. It tries to update config.shards with the connection string that reflects the node removal and the current config version.
        2. The update does nothing because the current config version is already in config.shards.
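      The interleaving above can be replayed deterministically against a toy model. The sketch below is an assumption-laden illustration rather than the contents of the attached diff: guarded_update and replay are hypothetical helpers, the shard document is a plain dict, the field names host and replSetConfigVersion are assumed, and the guard is assumed to be a strict "newer version wins" comparison.

```python
def conn_string(cfg, set_name="rs0"):
    """Build a connection string from a toy config dict."""
    return f"{set_name}/" + ",".join(cfg["members"])


def guarded_update(shard_doc, conn, version):
    """Apply the update only if `version` is newer than the stored one.

    Assumption: models the version guard described in the ticket; the real
    update is a write against config.shards on the CSRS.
    """
    if version > shard_doc["replSetConfigVersion"]:
        shard_doc["host"] = conn
        shard_doc["replSetConfigVersion"] = version
        return True
    return False


def replay():
    """Replay steps 1 to 6 in order and return the outcome."""
    shard_doc = {"_id": "shard0", "host": "", "replSetConfigVersion": 0}
    current = {"version": 1, "members": ["a:27017", "b:27017"]}

    # Steps 1-2: the step-up task reads the connection string, then pauses.
    stepup_conn = conn_string(current)

    # Step 3: a reconfig removes secondary b and bumps the version.
    current = {"version": 2, "members": ["a:27017"]}

    # Step 4: the reconfig task reads its values and pauses before the update.
    reconfig_conn = conn_string(current)
    reconfig_version = current["version"]

    # Step 5: the step-up task resumes and reads the (now newer) version.
    stepup_version = current["version"]
    stepup_applied = guarded_update(shard_doc, stepup_conn, stepup_version)

    # Step 6: the reconfig task's update is rejected by the version guard.
    reconfig_applied = guarded_update(shard_doc, reconfig_conn, reconfig_version)

    return shard_doc, stepup_applied, reconfig_applied
```

      In this model the stored document ends up with the pre-removal connection string paired with the post-removal version, and the correct update from the reconfig task is dropped, which matches the outcome described in steps 5 and 6.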

            Assignee:
            yuhong.zhang@mongodb.com Yuhong Zhang
            Reporter:
            yujin.kang@mongodb.com Yujin Kang Park