Core Server / SERVER-48209

Migration coordinator step-up recovery is not robust to intervening shard key refinement

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding

      During migration coordinator recovery, if no decision was persisted in the migration coordinator recovery document, the coordinator may need to determine whether the chunk migration was committed, i.e. whether the chunk still belongs to this shard. keyBelongsToMe() calls into the ChunkManager's keyBelongsToShard(), passing its own shard id. That function then calls findIntersectingChunk() to get the ChunkInfo, and confirms that the returned chunk does indeed contain the shard key.

      However, if there is an election on the shard near the end of the migration, and the shard key is concurrently refined (after the migration succeeded, but before the new primary processes the migration recovery document), then migration coordinator recovery will hit an invariant failure and abort the server.

      Affected timeline:

      1. Migration is committed. Collection has shard key { a: 1 }.
      2. Shard server primary steps down.
      3. Shard server is unable to persist migration commit decision to config.migrationCoordinators.
      4. New shard server primary steps up.
      5. On step-up, the new shard server primary commences MigrationCoordinatorStepupRecovery.
      6. Intervening refineCollectionShardKey starts and completes (on the configsvr). Shard key is now { a: 1, b: 1 }.
      7. Migration coordinator recovery processes the recovery doc, which contains the old shard key of { a: 1 }.
      8. Since keyBelongsToShard() expects the schema of the given shard key value to match the schema of the bounds in the ChunkMap, and does not attempt to extend it, the intersecting chunk appears not to contain the key that was used to find it.
        {"t":{"$date":"2020-05-12T07:09:23.720+00:00"},"s":"F",  "c":"-",        "id":23079,   "ctx":"MigrationCoordinatorStepupRecovery","msg":"Invariant failure","attr":{"expr":"chunkInfo->containsKey(shardKey)","file":"src/mongo/s/chunk_manager.cpp","line":301}}
        {"t":{"$date":"2020-05-12T07:09:23.720+00:00"},"s":"F",  "c":"-",        "id":23080,   "ctx":"MigrationCoordinatorStepupRecovery","msg":"\n\n***aborting after invariant() failure\n\n"}
        {"t":{"$date":"2020-05-12T07:09:23.720+00:00"},"s":"F",  "c":"CONTROL",  "id":4757800, "ctx":"MigrationCoordinatorStepupRecovery","msg":"{message}","attr":{"message":"Got signal: 6 (Aborted).\n"}}
        

      Presumably one possible solution is to make ChunkInfo::containsKey() handle being passed shard key values that are a prefix of the current shard key. Another might be to update MigrationCoordinatorStepupRecovery to notice when the shard key definition in the refreshedMetadata differs from the shard key value in the recovery doc and, in the case where doc.getRange() is a prefix of the actual collection shard key, extend it before passing it to keyBelongsToMe().

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            kevin.pulo@mongodb.com Kevin Pulo
            Votes:
            0
            Watchers:
            1

              Created:
              Updated:
              Resolved: