Core Server / SERVER-45752

opCtx interruption during migration critical section commit triggers fassert in FCV 4.2

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.3.4
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Sharding 2020-02-10
    • 0

      As part of the critical section in a migration, the donor shard sends _configsvrCommitChunkMigration to the config server to complete the migration. If the command fails and the donor shard is in FCV 4.2, the donor tries to determine the migration's outcome by performing a write on the config server to obtain the latest configOpTime. If this recovery fails, the donor will fassert.

      The same operation context is used both to send the commit and to run the recovery operations, so if it is interrupted (e.g. by a killOp command), the commit and the recovery will both fail, leading to a crash.

      Notably, in FCV >= 4.4 the donor shard instead recovers from a failed commit by repeatedly sending _configsvrEnsureChunkVersionIsGreaterThan. This path also reuses the commit's operation context, but it checks for interrupt, so on interruption the migration aborts instead of crashing.
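
      The difference between the two recovery paths can be illustrated with a small, self-contained model. This is only a sketch and is not the server's C++ code: OpCtx, send_to_config_server, commit_fcv42, commit_fcv44, the "recoveryWrite" command name, and the retry limit are all hypothetical names and values made up for illustration. The only detail taken from the description above is that both paths reuse the interrupted operation context.

```python
class Interrupted(Exception):
    """Raised when an operation context has been killed (e.g. by killOp)."""


class FatalAssertion(Exception):
    """Stands in for fassert, which would terminate the real process."""


class OpCtx:
    """Hypothetical stand-in for an operation context."""

    def __init__(self):
        self.killed = False

    def kill(self):
        self.killed = True

    def check_for_interrupt(self):
        if self.killed:
            raise Interrupted("operation was interrupted")


def send_to_config_server(op_ctx, command):
    # In this model, any remote call made on an interrupted context fails.
    op_ctx.check_for_interrupt()
    return {"ok": 1, "command": command}


def commit_fcv42(op_ctx):
    """Model of the FCV 4.2 path: a failed recovery is fatal."""
    try:
        send_to_config_server(op_ctx, "_configsvrCommitChunkMigration")
        return "committed"
    except Interrupted:
        pass  # commit failed; fall through to recovery
    try:
        # The recovery write reuses the same, already interrupted, context.
        send_to_config_server(op_ctx, "recoveryWrite")  # hypothetical name
    except Interrupted as exc:
        raise FatalAssertion("recovery failed after commit failure") from exc
    return "recovered"


def commit_fcv44(op_ctx, max_attempts=3):
    """Model of the FCV >= 4.4 path: interruption aborts the migration."""
    try:
        send_to_config_server(op_ctx, "_configsvrCommitChunkMigration")
        return "committed"
    except Interrupted:
        pass  # commit failed; fall through to recovery
    try:
        for _ in range(max_attempts):
            # Checking for interrupt turns a kill into a clean abort.
            op_ctx.check_for_interrupt()
            send_to_config_server(
                op_ctx, "_configsvrEnsureChunkVersionIsGreaterThan")
            return "recovered"
    except Interrupted:
        return "aborted"
    return "aborted"
```

      In this model, calling kill() on the context before the commit makes commit_fcv42 raise FatalAssertion, while commit_fcv44 returns "aborted".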

            Assignee: Esha Maharishi (Inactive) <esha.maharishi@mongodb.com>
            Reporter: Jack Mulrow <jack.mulrow@mongodb.com>
            Votes: 0
            Watchers: 2
