  1. Core Server
  2. SERVER-67725

Check catalog consistency on shards as precondition for rename

    • Fully Compatible
    • v6.0, v5.0
    • Sharding EMEA 2022-07-25, Sharding EMEA 2022-08-08
    • 2

      To ensure the correctness of renameCollection for sharded collections (supported since v5.0), we introduced logic in the rename coordinator/participant that verifies UUIDs are aligned across all shards.
      If a catalog inconsistency is detected (namely, different UUIDs for the source/target collection on different shards), the rename operation hangs and spams the logs with a message intended to prompt the user to intervene manually.

      This is an example of the error emitted in the logs:

      {"t":{"$date":"2022-05-21T00:37:40.719Z"},"s":"E","c":"SHARDING","id":6372200,"ctx":"RenameCollectionParticipantService-223","msg":"Error executing rename collection participant. Going to be retried.","attr":{"fromNs":"foo.sourceColl","toNs":"foo.TargetColl","error":"CommandFailed: Source Collection foo.sourceColl UUID does not match provided uuid."}}
      

      Several users hit this error and were left with a stuck collection, not knowing how to fix the catalog inconsistency. The purpose of this ticket is to prevent ending up in this situation.

      A possible approach would be to broadcast a message to all shards in the checkPreconditions phase so that the operation fails early if an inconsistency is detected (e.g., run a listCollections filtered by ns on all shards).
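
      The comparison step of such a check can be sketched as follows. This is a minimal illustration only, not the server implementation: the function name find_uuid_mismatches and the input shape (a mapping from shard name to {namespace: UUID}, for example built from the info.uuid field of each shard's listCollections reply) are assumptions made for the example. It also assumes that a shard on which the collection is absent is not an inconsistency in itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_uuid_mismatches(
    namespaces: Iterable[str],
    uuids_by_shard: Dict[str, Dict[str, str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Return {ns: {uuid: [shards]}} for namespaces whose UUID differs across shards.

    Hypothetical helper: `uuids_by_shard` maps a shard name to the
    {namespace: uuid} pairs reported by that shard.
    """
    mismatches: Dict[str, Dict[str, List[str]]] = {}
    for ns in namespaces:
        shards_by_uuid: Dict[str, List[str]] = defaultdict(list)
        for shard, uuids in uuids_by_shard.items():
            uuid = uuids.get(ns)
            if uuid is not None:  # collection absent on this shard: nothing to compare
                shards_by_uuid[uuid].append(shard)
        # More than one distinct UUID for the same namespace is an inconsistency.
        if len(shards_by_uuid) > 1:
            mismatches[ns] = dict(shards_by_uuid)
    return mismatches


if __name__ == "__main__":
    reported = {
        "shard0": {"foo.sourceColl": "uuid-A"},
        "shard1": {"foo.sourceColl": "uuid-B"},  # diverging UUID
        "shard2": {},                            # collection not present
    }
    bad = find_uuid_mismatches(["foo.sourceColl", "foo.TargetColl"], reported)
    if bad:
        print("rename precondition failed:", bad)
```

      A non-empty result would be the signal to fail the rename up front instead of letting the participants retry indefinitely.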

      This would not fully prevent the hang: after the preconditions are checked and before the participants are instantiated, a direct client could still create the source/target collection with different UUIDs on other shards. However, the window for this bad interleaving would be short enough that the check should prevent about 99% of the hangs.

            Assignee:
            enrico.golfieri@mongodb.com Enrico Golfieri
            Reporter:
            pierlauro.sciarelli@mongodb.com Pierlauro Sciarelli