- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Reconciliation
- Storage Engines
- 8
- StorEng - Defined Pipeline
The time aggregate calculates the following values:
    wt_timestamp_t newest_start_durable_ts; /* default value: WT_TS_NONE */
    wt_timestamp_t newest_stop_durable_ts;  /* default value: WT_TS_NONE */
    wt_timestamp_t oldest_start_ts;         /* default value: WT_TS_NONE */
    uint64_t newest_txn;                    /* default value: WT_TXN_NONE */
    wt_timestamp_t newest_stop_ts;          /* default value: WT_TS_MAX */
    uint64_t newest_stop_txn;               /* default value: WT_TXN_MAX */
    uint8_t prepare;
newest_start_durable_ts tracks the maximum start durable timestamp on the page.
newest_stop_durable_ts tracks the maximum stop durable timestamp on the page.
Rollback to stable uses both newest_start_durable_ts and newest_stop_durable_ts to decide whether a file or page contains any unstable data.
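The maximum tracking and the rollback-to-stable check described above can be sketched roughly as follows. This is a minimal illustration, not WiredTiger's code: the `sketch_*` names are hypothetical, the struct is reduced to the two durable timestamps, and the value 0 for WT_TS_NONE and the "newer than the stable timestamp means unstable" rule are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t wt_timestamp_t;
#define WT_TS_NONE ((wt_timestamp_t)0) /* assumed value for this sketch */

/* Hypothetical, reduced aggregate: only the two durable timestamps. */
struct sketch_time_aggregate {
    wt_timestamp_t newest_start_durable_ts;
    wt_timestamp_t newest_stop_durable_ts;
};

/* Reset both fields to their documented default, WT_TS_NONE. */
static void
sketch_ta_init(struct sketch_time_aggregate *ta)
{
    ta->newest_start_durable_ts = WT_TS_NONE;
    ta->newest_stop_durable_ts = WT_TS_NONE;
}

/* Fold one key's durable timestamps into the aggregate by keeping the maximum. */
static void
sketch_ta_update(struct sketch_time_aggregate *ta, wt_timestamp_t start_durable_ts,
  wt_timestamp_t stop_durable_ts)
{
    if (start_durable_ts > ta->newest_start_durable_ts)
        ta->newest_start_durable_ts = start_durable_ts;
    if (stop_durable_ts > ta->newest_stop_durable_ts)
        ta->newest_stop_durable_ts = stop_durable_ts;
}

/* Assumed rule: the page holds unstable data if either maximum is newer than stable_ts. */
static bool
sketch_ta_has_unstable(const struct sketch_time_aggregate *ta, wt_timestamp_t stable_ts)
{
    return (ta->newest_start_durable_ts > stable_ts || ta->newest_stop_durable_ts > stable_ts);
}
```

Because only the maxima are kept, a single comparison against the stable timestamp is enough to skip a page whose data is all stable, without visiting its individual keys.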
Checkpoint cleanup also uses newest_stop_durable_ts to remove an obsolete page once all the keys on the page have been removed. Checkpoint cleanup can only remove an entire page when it is obsolete; it cannot remove individual obsolete records on the page.
Because there is no page-level stop timestamp, checkpoint cleanup reads pages unnecessarily: it reads any page that has a newest_stop_durable_ts, yet it cannot do anything with the page unless the entire page has been removed. To avoid these unnecessary reads, newest_stop_durable_ts can be changed to track a page-level stop timestamp and renamed newest_page_stop_durable_ts. The regular stop timestamps would still be tracked by renaming newest_start_durable_ts to newest_durable_ts, which would then cover both start and stop durable timestamps.
With these time aggregate changes, checkpoint cleanup no longer reads into the cache pages that have not been entirely removed.
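One possible reading of the proposal is sketched below. It is an illustration under stated assumptions rather than the intended implementation: the `sketch_*` names and the per-key struct are hypothetical, WT_TS_NONE is assumed to be 0, and the rule that newest_page_stop_durable_ts is set only when every key on the page has a stop timestamp is an interpretation of "page-level stop timestamp".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t wt_timestamp_t;
#define WT_TS_NONE ((wt_timestamp_t)0) /* assumed value for this sketch */

/* Hypothetical per-key durable timestamps; stop is WT_TS_NONE while the key is live. */
struct sketch_key_window {
    wt_timestamp_t durable_start_ts;
    wt_timestamp_t durable_stop_ts;
};

/* Hypothetical reduced form of the proposed aggregate. */
struct sketch_proposed_ta {
    wt_timestamp_t newest_durable_ts;           /* max of start and stop durable timestamps */
    wt_timestamp_t newest_page_stop_durable_ts; /* set only if the whole page is removed */
};

static wt_timestamp_t
sketch_max(wt_timestamp_t a, wt_timestamp_t b)
{
    return (a > b ? a : b);
}

/* Build the proposed aggregate from the keys on one page. */
static struct sketch_proposed_ta
sketch_build_proposed(const struct sketch_key_window *keys, size_t nkeys)
{
    struct sketch_proposed_ta ta = {WT_TS_NONE, WT_TS_NONE};
    wt_timestamp_t newest_stop = WT_TS_NONE;
    bool all_removed = nkeys > 0;

    for (size_t i = 0; i < nkeys; i++) {
        ta.newest_durable_ts = sketch_max(ta.newest_durable_ts, keys[i].durable_start_ts);
        ta.newest_durable_ts = sketch_max(ta.newest_durable_ts, keys[i].durable_stop_ts);
        newest_stop = sketch_max(newest_stop, keys[i].durable_stop_ts);
        if (keys[i].durable_stop_ts == WT_TS_NONE)
            all_removed = false; /* at least one key is still live */
    }
    if (all_removed)
        ta.newest_page_stop_durable_ts = newest_stop;
    return (ta);
}

/* Current behavior as described in the ticket: any stop timestamp triggers a read. */
static bool
sketch_current_needs_read(wt_timestamp_t newest_stop_durable_ts)
{
    return (newest_stop_durable_ts != WT_TS_NONE);
}

/* Proposed behavior: read only pages whose every key has been removed. */
static bool
sketch_proposed_needs_read(const struct sketch_proposed_ta *ta)
{
    return (ta->newest_page_stop_durable_ts != WT_TS_NONE);
}
```

In this sketch, a page with one removed key and one live key still triggers a read under the current check but is skipped under the proposed one, while newest_durable_ts continues to carry the newest stop timestamp for rollback to stable.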