- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 5.0.5, 5.1.1
- Component/s: None
- Sharding EMEA 2021-12-27
On one cluster, 4 migration coordinator documents were found on a single shard, which caused this invariant to be hit on step-up.
The documents all belonged to migrations for different namespaces, and their states were:
- 2 aborted
- 1 committed
- 1 without decision
The range deletions appear to have been handled correctly on both the donor and the recipients:
- No range deletion documents for the aborted migrations (range deletion tasks already executed)
- Ready range deletion task on the donor for the committed migration
- Pending range deletions on donor/recipient for the migration without decision
Given the state of the "decided" migrations, we can conclude that:
- _abortMigrationOnDonorAndRecipient worked as expected.
- _commitMigrationOnDonorAndRecipient worked as expected.
It is therefore very likely that something went wrong right afterwards, during the call to forgetMigration, which did not remove the migration coordinator documents.
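To make the observed state concrete, here is a minimal sketch of a step-up recovery pass that tolerates any number of leftover coordinator documents instead of assuming at most one. It is not the server's implementation: the document shape (`nss`, `decision`), the function `plan_recovery`, the action strings and the namespaces `db.a` to `db.d` are all hypothetical, chosen only to mirror the 2 aborted / 1 committed / 1 undecided situation described above.

```python
from collections import Counter
from typing import Optional, TypedDict


class CoordinatorDoc(TypedDict):
    # Hypothetical, simplified shape; real documents carry more fields.
    nss: str
    decision: Optional[str]  # "committed", "aborted", or None if undecided


def plan_recovery(docs: list[CoordinatorDoc]) -> list[tuple[str, str]]:
    """Return one (namespace, action) pair per leftover coordinator document."""
    plan = []
    for doc in docs:
        decision = doc["decision"]
        if decision == "committed":
            action = "finish commit, then forget"
        elif decision == "aborted":
            action = "finish abort, then forget"
        elif decision is None:
            action = "recover decision first"
        else:
            raise ValueError(f"unexpected decision: {decision!r}")
        plan.append((doc["nss"], action))
    return plan


# Placeholder namespaces standing in for the four observed documents.
observed: list[CoordinatorDoc] = [
    {"nss": "db.a", "decision": "aborted"},
    {"nss": "db.b", "decision": "aborted"},
    {"nss": "db.c", "decision": "committed"},
    {"nss": "db.d", "decision": None},
]

plan = plan_recovery(observed)
summary = Counter(action for _, action in plan)
```

The point of the sketch is only that every document gets its own action; none of them is rejected because more than one exists.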
Related to:
- SERVER-62245 MigrationRecovery must not assume that only one migration needs to be recovered (Closed)
- SERVER-62243 Wait for vector clock document majority-commit without timeout (Closed)