  1. Core Server
  2. SERVER-76848

[false alarm] $out does not ensure the node remains primary throughout the internal rename

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 7.1.0-rc0
    • Affects Version/s: 7.1.0-rc0, 6.0.6, 5.0.17, 4.4.21, 7.0.0-rc1
    • Component/s: None
    • Sharding EMEA
    • Fully Compatible
    • v7.0, v6.3, v6.0, v5.0, v4.4
    • Sharding EMEA 2023-07-10, Sharding EMEA 2023-07-24, QI 2023-05-15
    • 2

      The implementation of $out introduced a special internal rename command (InternalRenameIfOptionsAndIndexesMatchCmd). This command takes its own locks to avoid concurrent modifications, but the implementation has an error. On this line there is a call to assertIsPrimaryShardForDb, but nothing guarantees that this node will remain the primary through the entire execution of $out. The usual pattern to ensure the node remains the primary is:

      1. Wait for ShardingDDLCoordinator service recovery.
      2. Take database DDL lock to serialize with concurrent movePrimary operations that would change the db primary shard.
      3. Check if this shard is primary for the database.
      4. Acquire additional DDL locks if needed.
      5. Execute operation while holding the locks.
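      The five steps above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's implementation: ToyCluster, NotPrimaryShard, primary_only_ddl and move_primary are hypothetical names invented for illustration, and plain in-process locks stand in for the real DDL locks.

```python
import threading
from contextlib import ExitStack, contextmanager


class NotPrimaryShard(Exception):
    """Hypothetical error: the calling shard is not the db primary."""


class ToyCluster:
    """Toy stand-in for the sharding catalog and the DDL lock manager."""

    def __init__(self, primaries):
        self.primaries = dict(primaries)        # db name -> primary shard name
        self.recovered = threading.Event()      # set once "recovery" finishes
        self._db_locks = {db: threading.Lock() for db in primaries}

    def move_primary(self, db, to_shard):
        # movePrimary takes the same database DDL lock, so it cannot
        # interleave with an operation that is holding it.
        with self._db_locks[db]:
            self.primaries[db] = to_shard

    @contextmanager
    def primary_only_ddl(self, shard, db, extra_locks=()):
        # 1. Wait for the (toy) DDL coordinator service recovery.
        if not self.recovered.wait(timeout=5):
            raise TimeoutError("DDL coordinator service did not recover")
        with ExitStack() as stack:
            # 2. Take the database DDL lock to serialize with movePrimary.
            stack.enter_context(self._db_locks[db])
            # 3. Check primary ownership only after the lock is held.
            if self.primaries[db] != shard:
                raise NotPrimaryShard(f"{shard} is not primary for {db}")
            # 4. Acquire additional DDL locks if needed.
            for lock in extra_locks:
                stack.enter_context(lock)
            # 5. Run the caller's operation while all locks are held.
            yield


def demo():
    cluster = ToyCluster({"test": "shard0"})
    cluster.recovered.set()
    mover = threading.Thread(target=cluster.move_primary, args=("test", "shard1"))
    with cluster.primary_only_ddl("shard0", "test"):
        mover.start()
        mover.join(timeout=0.2)                 # movePrimary is blocked here
        blocked = mover.is_alive()
        primary_during = cluster.primaries["test"]
    mover.join()                                # proceeds once the lock is released
    return blocked, primary_during, cluster.primaries["test"]
```

      The point of the ordering is that the primary check in step 3 is only meaningful because the lock from step 2 is already held; checking first and locking afterwards would leave a window in which the primary could change.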

      However, there is an existing _shardsvrRenameCollection command that already has the correct locking mechanism and ensures this shard is the primary shard for the database. We should either use _shardsvrRenameCollection in $out, if that is possible, or fix $out to work with concurrent movePrimary commands. We will also need to expand our testing: the current tests don't allow $out to be run in suites that kill the primary node, and we should add movePrimary commands to the current concurrency test.

      This came up in SERVER-76626, during the investigation of a bug in which concurrent rename and shard collection commands were failing with $out writing to time-series collections.

      -------------

      [UPDATE - 8th of September 2023]: This is not a bug: movePrimary and the internal rename of $out are correctly serialized (here and here) through a check of the isMovePrimaryInProgress flag.
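
      As a rough illustration of flag-based serialization, the sketch below models a rename that refuses to run while a movePrimary is in progress. It is a toy model only: ToyDatabaseState, its methods, and the single mutex guarding the flag are assumptions made for this example. The ticket confirms only that the serialization relies on a check of isMovePrimaryInProgress, not how the flag is guarded in the server.

```python
import threading


class MovePrimaryInProgress(Exception):
    """Hypothetical error raised when a rename meets an active movePrimary."""


class ToyDatabaseState:
    """Toy per-database state; not the server's real class."""

    def __init__(self, collections):
        self._lock = threading.Lock()            # assumed guard for the flag
        self._move_primary_in_progress = False
        self.collections = dict(collections)     # collection name -> contents

    def begin_move_primary(self):
        with self._lock:
            self._move_primary_in_progress = True

    def end_move_primary(self):
        with self._lock:
            self._move_primary_in_progress = False

    def is_move_primary_in_progress(self):
        with self._lock:
            return self._move_primary_in_progress

    def internal_rename(self, source, target):
        # The flag check and the rename happen under the same lock, so a
        # movePrimary cannot start between the check and the rename.
        with self._lock:
            if self._move_primary_in_progress:
                raise MovePrimaryInProgress(f"cannot rename {source}")
            self.collections[target] = self.collections.pop(source)
```

      In this model a rename that races with movePrimary fails cleanly and can be retried by the caller, instead of running on a shard that is about to stop being the primary.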

            Assignee:
            silvia.surroca@mongodb.com Silvia Surroca
            Reporter:
            gil.alon@mongodb.com Gil Alon
            Votes:
            0
            Watchers:
            7
