- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Fully Compatible
Defragmentation script
Python defragmentation script improvements:
- Phase 1 is not idempotent: executing the same phase twice over-merges chunks, possibly producing chunks up to double the target chunk size.
- Phase 2 always splits chunks in half, without taking the actual chunk size into account.
- There is no way to throttle the migrations performed in phase 2.
- The progress bar is not accurate because it always counts all the chunks in the collection, even those that are not going to be migrated.
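The first two points can be illustrated with a minimal sketch. This is not the script's actual code: the function names `plan_merges` and `split_count` are hypothetical, chunk sizes are plain numbers in an arbitrary unit, and the sketch assumes all chunks are contiguous and on the same shard. Merging greedily without ever exceeding the target makes a second run a no-op, and deriving the number of splits from the actual size avoids blind halving.

```python
from math import ceil


def plan_merges(chunk_sizes, target_size):
    """Group consecutive chunks so that no merged group exceeds target_size.

    Hypothetical helper: a chunk that is already larger than the target
    stays alone in its own group.
    """
    groups = []
    current = []
    current_size = 0
    for size in chunk_sizes:
        # Close the current group if adding this chunk would overshoot.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


def split_count(chunk_size, target_size):
    """Number of pieces needed so that each piece is at most target_size."""
    return max(1, ceil(chunk_size / target_size))


# First run merges the two small leading chunks only.
first = plan_merges([10, 20, 40, 30, 50], target_size=64)
merged_sizes = [sum(group) for group in first]

# Second run on the merged layout finds nothing more to merge.
second = plan_merges(merged_sizes, target_size=64)
```

Under these assumptions the second pass returns one group per chunk, which is the idempotency property phase 1 currently lacks.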
Performance evaluation
To evaluate the impact of migrations after the defragmentation process, we decided to run a few migrations after phase 1 and measure the time taken by the catalog cache refreshes and, possibly, the impact on a synthetic read workload. To perform this evaluation, we will provide a new version of the defragmentation script with the following improvements:
- Allow each phase to be run separately, so that performance can be assessed after each phase.
- Expose a parameter to limit the number of migrations performed in phase 2, so that the script can also be used to easily trigger a specific number of migrations.
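One possible shape for these two options is sketched below. The flag names `--phases` and `--max-migrations` and the helper `limit_migrations` are assumptions made for illustration, not the script's real interface. The sketch also sizes the progress total from the migrations that will actually run rather than from every chunk in the collection.

```python
import argparse
from itertools import islice


def parse_args(argv=None):
    """Parse hypothetical command-line options for the defragmentation script."""
    parser = argparse.ArgumentParser(description="Defragmentation script (sketch)")
    # Assumed flag: choose which phases to run, e.g. "--phases 1" or "--phases 1 2".
    parser.add_argument("--phases", nargs="+", type=int, choices=[1, 2],
                        default=[1, 2])
    # Assumed flag: cap on phase 2 migrations; None means no limit.
    parser.add_argument("--max-migrations", type=int, default=None)
    return parser.parse_args(argv)


def limit_migrations(planned, max_migrations):
    """Return at most max_migrations items from the planned migrations."""
    if max_migrations is None:
        return list(planned)
    return list(islice(planned, max_migrations))


args = parse_args(["--phases", "2", "--max-migrations", "3"])
# Placeholder plan: integers stand in for real chunk migrations.
planned = range(10)
selected = limit_migrations(planned, args.max_migrations)
# The progress bar total reflects only the migrations that will run.
progress_total = len(selected)
```

Running phase 2 alone with a small limit would then trigger exactly that many migrations, which is what the evaluation above needs.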