  1. Core Server
  2. SERVER-77685

Server can return CollectionUUIDMismatch with actual collection null if collection exists

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 7.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Execution
    • Fully Compatible
    • v7.0
    • Execution EMEA Team 2023-06-26, Execution EMEA Team 2023-07-10

      In mongosync, I am seeing that in some cases, on v7.0+ and in sharded clusters, Collection UUID errors like the following are returned:

      [j1:cl1:s1:prim] | 2023-05-31T20:56:32.038+00:00 D1 ASSERT   23074   [conn96] "User assertion","attr":
      {
          error: "CollectionUUIDMismatch{ db: \"test\", collectionUUID: UUID(\"86ac3d10-3d4b-4951-927c-156c87640af7\"), expectedCollection: \"use_cases\", actualCollection: null }: Collection UUID does not match that specified",
          file: "src/mongo/db/catalog/collection_uuid_mismatch.cpp",
          line: 70
      }
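
For triaging logs like the one above, a small parser can pull the fields out of the error text. This is a minimal sketch: the regular expression is derived only from the single sample shown here, so other server versions may format the message differently, and the names `MISMATCH_RE` and `parse_mismatch` are hypothetical helpers, not part of any MongoDB tool.

```python
import re

# Field layout taken from the one sample log line above (an assumption;
# other server versions may print the error differently).
MISMATCH_RE = re.compile(
    r'CollectionUUIDMismatch\{\s*'
    r'db:\s*"(?P<db>[^"]*)",\s*'
    r'collectionUUID:\s*UUID\("(?P<uuid>[0-9a-fA-F-]+)"\),\s*'
    r'expectedCollection:\s*"(?P<expected>[^"]*)",\s*'
    r'actualCollection:\s*(?:"(?P<actual>[^"]*)"|null)\s*\}'
)


def parse_mismatch(error_text):
    """Return the mismatch fields as a dict, or None if the text does not match."""
    # The log escapes inner quotes as \" so undo that before matching.
    match = MISMATCH_RE.search(error_text.replace('\\"', '"'))
    if match is None:
        return None
    return {
        "db": match.group("db"),
        "collectionUUID": match.group("uuid"),
        "expectedCollection": match.group("expected"),
        # None when the log prints actualCollection: null
        "actualCollection": match.group("actual"),
    }
```

Filtering parsed entries on `actualCollection is None` would isolate the suspicious cases described in this ticket from the mismatches that are expected after a rename.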
      

      In these errors the actualCollection field is null, which should only happen if the collection hasn't been created yet or has been dropped. However, the collection with this UUID should exist on the destination cluster, and nothing should be dropping the collection with that UUID.

      The rough order of events in the BF I am seeing is:

      1. the previous collection use_cases is dropped
      2. a temporary collection mongosync.tmp.<uuid> is created
      3. the temporary collection is renamed to use_cases
      4. mongosync tries to insert a document into the new collection use_cases, and the insert fails with the CollectionUUIDMismatch error

      Note that steps 2, 3 and 4 don't necessarily happen one after the other; because of the parallelism in mongosync, they can occur at different relative times. Because of step 3, we expect some CollectionUUIDMismatch errors, but in this scenario I expect the insert either to succeed or to fail with the actualCollection field set to mongosync.tmp.<uuid>, not null. As a result, mongosync does not apply some CRUD events (typically inserts), which leaves the source and destination clusters inconsistent.
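
The expected behaviour described above can be illustrated with a toy in-memory model. This is only a sketch of the semantics I expect, not the server's catalog code: `ToyCatalog`, its methods, and the exception attributes are hypothetical names invented for illustration.

```python
import uuid


class CollectionUUIDMismatch(Exception):
    """Toy stand-in for the server error; attribute names are hypothetical."""

    def __init__(self, db, collection_uuid, expected_collection, actual_collection):
        super().__init__(
            f"db={db} collectionUUID={collection_uuid} "
            f"expectedCollection={expected_collection} "
            f"actualCollection={actual_collection}"
        )
        self.db = db
        self.collection_uuid = collection_uuid
        self.expected_collection = expected_collection
        self.actual_collection = actual_collection


class ToyCatalog:
    """Single-database catalog mapping collection names to UUIDs (toy model)."""

    def __init__(self, db):
        self.db = db
        self.uuid_by_name = {}
        self.docs_by_uuid = {}

    def create(self, name):
        collection_uuid = uuid.uuid4()
        self.uuid_by_name[name] = collection_uuid
        self.docs_by_uuid[collection_uuid] = []
        return collection_uuid

    def drop(self, name):
        collection_uuid = self.uuid_by_name.pop(name)
        del self.docs_by_uuid[collection_uuid]

    def rename(self, old_name, new_name):
        # A rename keeps the UUID and only changes the name.
        self.uuid_by_name[new_name] = self.uuid_by_name.pop(old_name)

    def insert(self, name, expected_uuid, doc):
        if self.uuid_by_name.get(name) != expected_uuid:
            # Report whichever name currently owns the UUID, or None when
            # no collection in this database has it.
            actual = next(
                (n for n, u in self.uuid_by_name.items() if u == expected_uuid),
                None,
            )
            raise CollectionUUIDMismatch(self.db, expected_uuid, name, actual)
        self.docs_by_uuid[expected_uuid].append(doc)
```

In this model, an insert into use_cases that carries the temporary collection's UUID fails before the rename with the actual collection set to the temporary name, and succeeds after the rename. A null actual collection appears only when no collection holds that UUID, which is why the null in the log above is unexpected.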

      The logs are here: https://parsley.mongodb.com/resmoke/98b363f8d81a6e1ffbe259c78c5b19f9/all?bookmarks=0,1089885,1090430,1090446,1090603,1090721,1090805,1091107,1193110&highlights=CollectionUUIDMismatch.%2Ause_cases,use_cases&shareLine=1090721

      This is blocking mongosync v7.0 support, and is causing some BFs on our waterfall (because we run some integration tests on latest).

            Assignee:
            jordi.olivares-provencio@mongodb.com Jordi Olivares Provencio
            Reporter:
            rohan.sharan@mongodb.com Rohan Sharan