- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 5.0.13, 6.0.2, 6.1.0-rc4, 6.2.0-rc0
- Component/s: None
- Fully Compatible
- ALL
- v6.2, v6.0, v5.0
- Sharding EMEA 2022-12-12, Sharding EMEA 2022-12-26, Sharding EMEA 2023-01-09
The donor of a chunk migration calls ShardingStateRecovery::endMetadataOp(), which persists a configOpTime that includes the migration commit. This ensures that, after a stepdown, the next primary node will see the effect of the commit performed by the previous primary when it reads from the config server.
The problem is that endMetadataOp() is not called after recovering a failed migration. If the donor hits an error during the commit (for example, a network error) and then steps down, there is no guarantee that the next primary node will install the correct filtering metadata, that is, metadata that reflects the last migration.
The proposed solution is to add a call to VectorClock::waitForDurableConfigTime() just before writing the commit decision to the migration coordinator document.
This call will be executed both when no error occurs during the commit and during migration recovery.
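The ordering argument can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server code: `ToyDonor`, `observeCommit`, `persistCommitDecision`, `stepUpFrom`, and `readsIncludeCommit` are hypothetical names, the config time is reduced to a plain integer, and the method named `waitForDurableConfigTime` here is a stand-in that does not reproduce the signature or behavior of the real VectorClock method.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy model only. None of these types mirror the real MongoDB classes.
struct ToyDonor {
    std::uint64_t inMemoryConfigTime = 0;  // lost on stepdown
    std::uint64_t durableConfigTime = 0;   // survives a stepdown
    bool commitDecisionWritten = false;    // stand-in for the coordinator document

    // The donor learns a config time at least as recent as the commit.
    void observeCommit(std::uint64_t commitTime) {
        inMemoryConfigTime = std::max(inMemoryConfigTime, commitTime);
    }

    // Hypothetical stand-in: make the in-memory config time durable.
    void waitForDurableConfigTime() {
        durableConfigTime = std::max(durableConfigTime, inMemoryConfigTime);
    }

    // Shared by the normal commit path and the recovery path in this model.
    // withFix = true applies the ordering proposed in the ticket.
    void persistCommitDecision(bool withFix) {
        if (withFix) {
            waitForDurableConfigTime();  // must happen before the decision write
        }
        commitDecisionWritten = true;
    }
};

// Simulates a stepdown: the new primary starts from durable state only.
ToyDonor stepUpFrom(const ToyDonor& previous) {
    ToyDonor next;
    next.durableConfigTime = previous.durableConfigTime;
    next.inMemoryConfigTime = previous.durableConfigTime;
    next.commitDecisionWritten = previous.commitDecisionWritten;
    return next;
}

// True if reads issued by this node are guaranteed to include the commit.
bool readsIncludeCommit(const ToyDonor& node, std::uint64_t commitTime) {
    return node.inMemoryConfigTime >= commitTime;
}
```

In this model, a donor that persists the decision with `withFix = true` hands its successor a config time that covers the commit, whichever path wrote the decision. With `withFix = false`, the successor can see a written decision while its config time is still older than the commit.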