- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Block Manager
- Storage Engines
- StorEng - Defined Pipeline
__bm_write calls __wt_capacity_throttle before calling __wt_block_write, to reserve time for the write operation. The problem is that the size used to calculate the reservation differs from the size actually written to disk. More specifically, __wt_capacity_throttle uses buf->size to calculate the reservation time, whereas __wt_block_write writes align_size bytes (buf->size rounded up to the nearest multiple of block->allocsize). The two values differ:
// below are the values of buf->size, align_size, block->allocsize printed in __bm_write
buf size is 28602, align size is 28672, block alloc size is 4096
buf size is 28663, align size is 28672, block alloc size is 4096
buf size is 16309, align size is 16384, block alloc size is 4096
buf size is 20485, align size is 24576, block alloc size is 4096
buf size is 211, align size is 4096, block alloc size is 4096
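The gap between the two sizes can be reproduced with a small standalone sketch. The helpers below (align_up, unthrottled_bytes, report) are hypothetical names used only for illustration and are not WiredTiger functions; the sketch assumes that align_size is simply buf->size rounded up to a multiple of block->allocsize, as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: round size up to the next multiple of allocsize. */
size_t
align_up(size_t size, size_t allocsize)
{
    assert(allocsize > 0);
    return ((size + allocsize - 1) / allocsize) * allocsize;
}

/*
 * Hypothetical helper: bytes that reach the disk but are not covered by a
 * reservation computed from the unaligned buffer size.
 */
size_t
unthrottled_bytes(size_t buf_size, size_t allocsize)
{
    return align_up(buf_size, allocsize) - buf_size;
}

/* Print one line in roughly the same shape as the values logged above. */
void
report(size_t buf_size, size_t allocsize)
{
    printf("buf size is %zu, align size is %zu, unaccounted bytes %zu\n",
        buf_size, align_up(buf_size, allocsize),
        unthrottled_bytes(buf_size, allocsize));
}
```

Under these assumptions, the 211-byte buffer from the last logged line would be written as a 4096-byte block, leaving 3885 bytes outside the reservation, so small writes are under-accounted by the largest proportion.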
One argument is that buf->size should already be aligned by the time we reach __bm_write; another suggestion is to use align_size to calculate the reservation time in __wt_capacity_throttle. However, a thorough investigation is required: __wt_capacity_throttle is used in multiple places, so either suggestion could turn out to be wrong.