Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-33956

A sequence of rename and create collections that do not arrive at the correct end state

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 3.6.5, 3.7.4
    • Affects Version/s: None
    • Component/s: Replication
    • None
    • Fully Compatible
    • ALL
    • Hide
      var ops = [{ "ts" : Timestamp(1520987616, 1), "t" : NumberLong(1), "h" : NumberLong("4979632704283227657"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:36.010Z"), "o" : { "create" : "bar", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.bar" } } },
      	   { "ts" : Timestamp(1520987616, 1), "t" : NumberLong(1), "h" : NumberLong("4979632704283227657"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:36.010Z"), "o" : { "create" : "shardedColl", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.shardedColl" } } },
      	   { "ts" : Timestamp(1520987617, 29), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa"), "wall" : ISODate("2018-03-14T00:33:37.571Z"),
      	     "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false,
      		     "dropTarget" : false } },
      	   { "ts" : Timestamp(1520987618,  3), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:38.073Z"),
      	     "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false,
      		     "dropTarget" : true } },
      	   { "ts" : Timestamp(1520987621, 2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:41.532Z"),
      	     "o" : { "create" : "foo", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.foo" } } },
      	   { "ts" : Timestamp(1520987622,  2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:42.103Z"),
      	     "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false,
      		     "dropTarget" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa") } },
      	   { "ts" : Timestamp(1520987622,  9), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:42.437Z"),
      	     "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false,
      		     "dropTarget" : true } }]
      db.adminCommand({applyOps: ops})
      
      Show
      var ops = [{ "ts" : Timestamp(1520987616, 1), "t" : NumberLong(1), "h" : NumberLong("4979632704283227657"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:36.010Z"), "o" : { "create" : "bar", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.bar" } } }, { "ts" : Timestamp(1520987616, 1), "t" : NumberLong(1), "h" : NumberLong("4979632704283227657"), "v" : 2, "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:36.010Z"), "o" : { "create" : "shardedColl", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.shardedColl" } } }, { "ts" : Timestamp(1520987617, 29), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa"), "wall" : ISODate("2018-03-14T00:33:37.571Z"), "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false, "dropTarget" : false } }, { "ts" : Timestamp(1520987618, 3), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:38.073Z"), "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false, "dropTarget" : true } }, { "ts" : Timestamp(1520987621, 2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:41.532Z"), "o" : { "create" : "foo", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.foo" } } }, { "ts" : Timestamp(1520987622, 2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:42.103Z"), "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false, "dropTarget" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa") } }, { "ts" : Timestamp(1520987622, 9), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:42.437Z"), "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false, "dropTarget" : true } }] db.adminCommand({applyOps: ops})
    • Repl 2018-04-09
    • 4

      Consider a node initial syncing from a primary. The initial syncing node, before the oplog application phase has the following collection -> UUID mappings:

      test.foo -> Does not exist
      test.bar -> UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0")
      test.shardedColl -> UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb")
      

      It also happens to be that this is the correct goal state.

      Now consider the sequence of oplog entries that are played as part of the oplog application step of initial sync:

      { "ts" : Timestamp(1520987617, 29), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa"), "wall" : ISODate("2018-03-14T00:33:37.571Z"),
        "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false,
      	  "dropTarget" : true } }
      
      { "ts" : Timestamp(1520987618,  3), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:38.073Z"),
        "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false,
      	  "dropTarget" : true } }
      
      { "ts" : Timestamp(1520987621, 2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:41.532Z"),
        "o" : { "create" : "foo", "idIndex" : { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "test.foo" } } }
      
      { "ts" : Timestamp(1520987622,  2), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb"), "wall" : ISODate("2018-03-14T00:33:42.103Z"),
        "o" : { "renameCollection" : "test.bar", "to" : "test.shardedColl", "stayTemp" : false,
      	  "dropTarget" : UUID("78f0578a-1ecf-4ba3-af46-bfc13ebb46fa") } }
      
      { "ts" : Timestamp(1520987622,  9), "op" : "c", "ns" : "test.$cmd", "ui" : UUID("7b447a86-5a12-42bc-916f-edcbfcb71cc0"), "wall" : ISODate("2018-03-14T00:33:42.437Z"),
        "o" : { "renameCollection" : "test.foo", "to" : "test.bar", "stayTemp" : false,
      	  "dropTarget" : true } }
      

      Those operations result in the initial syncing node to only have test.shardedColl -> UUID("16ce21b8-320c-4bfc-af8c-05cae93bdadb") while missing the expected test.bar collection.

      I can make statements about behavior change that fix this specific sequence, but I can't speak to their correctness globally.

      I do have two observations that might illuminate what's going wrong.

      1. judah.schvimer notes that when collections have UUIDs, a replicated renameCollection's dropTarget should never be true. It should be changed to false if the target did not exist, or it should be the UUID of the collection that was dropped.
      2. Perhaps more subtly, in this sequence (which is true for using applyOps, as well as what was observed on the initial syncing node), the creation of test.foo fails because the UUID already exists from the collection that was dropped as the target of the previous rename.
      2018-03-16T19:13:59.519-0400 I COMMAND  [conn1] CMD: create test.foo - existing collection with conflicting UUID 7b447a86-5a12-42bc-916f-edcbfcb71cc0 is in a drop-pending state: test.system.drop.1521242039i3t1.bar
      

      One clarification about the reproduction script. The first two "oplog entries" are simply to recreate the state of the initial syncing node before it began oplog application. Had a primary produced that sequence of oplog entries, including the initial creates would be a very different problem.

            Assignee:
            benety.goh@mongodb.com Benety Goh
            Reporter:
            daniel.gottlieb@mongodb.com Daniel Gottlieb (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: