- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 6.0.0, 7.0.0, 8.0.0-rc0, 7.3.0, 8.1.0-rc0
- Component/s: None
- Catalog and Routing
From the donor side of a chunk migration, before starting the core steps of the migration (clone, catch up, commit...), MigrationSourceManager needs to check or wait until several preconditions are met. However:
- Those precondition checks are done from the constructor. This is fragile because MigrationSourceManager contains a SharedPromise field, and an exception thrown from the constructor will not automatically fulfill it (as SharedPromise's contract requires). Note that we have fulfilled it explicitly since SERVER-92381.
- MigrationSourceManager's constructor also exposes the class's instance through the CollectionShardingRuntime in a way that makes it accessible to other concurrently running operations. Any code in the constructor that runs after the instance is exposed through the CollectionShardingRuntime needs to be careful about cleanup, because the destructor does not run when the constructor throws.
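The second point follows from a general C++ rule: if a constructor throws, the destructor of that object is never invoked. The following is a minimal sketch of the hazard, not the server code. `LeakyManager`, `registry`, and `destructorRuns` are hypothetical names invented for illustration, with the registry standing in for whatever exposes the instance to other operations.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>

// Hypothetical registry standing in for the runtime that exposes the instance.
std::set<const void*> registry;
int destructorRuns = 0;

struct LeakyManager {
    explicit LeakyManager(bool failCheck) {
        registry.insert(this);  // the instance is now visible to other code
        if (failCheck) {
            // Thrown after registration: ~LeakyManager() will NOT run.
            throw std::runtime_error("precondition failed");
        }
    }

    ~LeakyManager() {
        assert(registry.count(this) == 1);
        ++destructorRuns;
        registry.erase(this);
    }
};

// Returns true if a throwing constructor left a stale registry entry behind
// without the destructor ever running.
bool throwingConstructorLeavesStaleEntry() {
    registry.clear();
    destructorRuns = 0;
    try {
        LeakyManager manager(/*failCheck=*/true);
    } catch (const std::runtime_error&) {
        // The caller sees the error, but nothing unregistered the instance.
    }
    return destructorRuns == 0 && registry.size() == 1;
}
```

In this sketch the stale entry is only counted, never dereferenced. In a real system, anything reading the registry afterwards would be looking at an object that never finished construction.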
In SERVER-92381 we implemented an ad-hoc fix for a BrokenPromise error that could get propagated to other threads trying to abort the migration. However, to make the cleanup logic more dependable, we should:
- Move the precondition checks into a separate method. This ensures that MigrationSourceManager's constructor never throws and that other operations cannot interact with a partially constructed MigrationSourceManager.
- Create a new MigrationSourceManager::State between kCreated and kCloning for this code.
- In that new state, use the same cleanup pattern as other states (e.g. awaitToCatchUp), which ensures proper cleanup (final state `kDone`, error logs, etc.). We can get rid of SERVER-92381's ad-hoc cleanup logic while doing this.
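
The proposal above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual MigrationSourceManager: `MigrationSketch`, `checkPreconditions`, the state name `kCheckingPreconditions`, and the `_finish` helper are all hypothetical, and `std::promise`/`std::shared_future` stand in for the server's SharedPromise.

```cpp
#include <cassert>
#include <exception>
#include <future>
#include <stdexcept>

// Hypothetical stand-in for MigrationSourceManager; not the real server code.
class MigrationSketch {
public:
    enum class State { kCreated, kCheckingPreconditions, kCloning, kDone };

    // The constructor only wires up fields; no precondition checks run here.
    MigrationSketch() : _completion(_promise.get_future().share()) {}

    // Runs the checks that previously lived in the constructor.
    void checkPreconditions(bool preconditionsHold) {
        assert(_state == State::kCreated);
        _state = State::kCheckingPreconditions;
        try {
            if (!preconditionsHold) {
                throw std::runtime_error("precondition failed");
            }
            _state = State::kCloning;
        } catch (...) {
            // Same cleanup shape as any other state: finish, then rethrow.
            _finish(std::current_exception());
            throw;
        }
    }

    State state() const { return _state; }
    std::shared_future<void> completion() const { return _completion; }

private:
    void _finish(std::exception_ptr error) {
        _state = State::kDone;
        _promise.set_exception(error);  // waiters get the real error
    }

    std::promise<void> _promise;          // declared before _completion
    std::shared_future<void> _completion;
    State _state = State::kCreated;
};

// Returns true if a failed precondition ends in kDone and waiters observe the
// original error instead of a broken promise.
bool failedPreconditionIsCleanedUp() {
    MigrationSketch migration;
    std::shared_future<void> waiter = migration.completion();
    try {
        migration.checkPreconditions(false);
    } catch (const std::runtime_error&) {
    }
    if (migration.state() != MigrationSketch::State::kDone) return false;
    try {
        waiter.get();
    } catch (const std::runtime_error&) {
        return true;
    } catch (...) {
        return false;
    }
    return false;
}
```

Because the failure happens in a regular method on a fully constructed object, the destructor still runs and the promise is fulfilled on the same path as every other state. On some toolchains `std::promise` requires linking with `-pthread`.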
- is related to: SERVER-92381 Ensure MigrationSourceManager fulfills its promise when aborting in early stages (Closed)