- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Compaction
- Storage Engines
- (copied to CRM)
- 5
- 2024-08-06 - Withholding Tax, 2024-09-03 Q3 Streams v1
- v8.0, v7.0, v6.0
Background
The compaction algorithm walks the btree, pulling internal pages into the cache. There are two places where we check the cache during this walk. First, we periodically check whether the cache is stuck, and if so we return EBUSY and halt compaction.
if (__wt_cache_stuck(session))
    WT_ERR(EBUSY);
Second, we check whether eviction needs to do work and, if so, "throttle" compact.
/*
 * Compact pulls pages into cache during the walk without checking whether the cache is
 * full. Check now to throttle compact to match eviction speed.
 */
WT_ERR(__wt_cache_eviction_check(session, false, false, NULL));
Problem
This cache eviction check appears to be redundant, and no throttling actually takes place. When compact enters __wt_cache_eviction_check, it always returns here:
/*
 * LSM sets the "ignore cache size" flag when holding the LSM tree lock, in that case, or when
 * holding the handle list, schema or table locks (which can block checkpoints and eviction),
 * don't block the thread for eviction.
 */
if (F_ISSET(session, WT_SESSION_IGNORE_CACHE_SIZE) ||
  FLD_ISSET(session->lock_flags,
    WT_SESSION_LOCKED_HANDLE_LIST | WT_SESSION_LOCKED_SCHEMA | WT_SESSION_LOCKED_TABLE))
    return (0);
without doing any eviction work.
This is because compact both holds the schema lock (WT_SESSION_LOCKED_SCHEMA) and has the WT_SESSION_IGNORE_CACHE_SIZE flag set. Either condition alone is enough to trigger the early return.
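The early return can be illustrated with a small standalone model. This is only a sketch: the `model_*` names, the struct, and the flag values are hypothetical stand-ins rather than WiredTiger's definitions, and the eviction work is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the WiredTiger flags; the real values differ. */
#define MODEL_IGNORE_CACHE_SIZE 0x1u
#define MODEL_LOCKED_HANDLE_LIST 0x1u
#define MODEL_LOCKED_SCHEMA 0x2u
#define MODEL_LOCKED_TABLE 0x4u

/* Simplified session: only the fields the early-return test reads. */
struct model_session {
    uint32_t flags;      /* Session flags, e.g. "ignore cache size". */
    uint32_t lock_flags; /* Locks currently held by the session. */
    int eviction_calls;  /* How often the model reached the eviction work. */
};

/* Model of the eviction check: return early when the session must not block. */
static int
model_eviction_check(struct model_session *s)
{
    assert(s != NULL);
    if ((s->flags & MODEL_IGNORE_CACHE_SIZE) != 0 ||
      (s->lock_flags &
        (MODEL_LOCKED_HANDLE_LIST | MODEL_LOCKED_SCHEMA | MODEL_LOCKED_TABLE)) != 0)
        return (0);

    ++s->eviction_calls; /* Placeholder for the real eviction work. */
    return (0);
}

/* Call the check n times, as a walk would, and report how often eviction work ran. */
static int
model_compact_walk(uint32_t flags, uint32_t lock_flags, int n)
{
    struct model_session s = {flags, lock_flags, 0};
    int i;

    for (i = 0; i < n; ++i)
        (void)model_eviction_check(&s);
    return (s.eviction_calls);
}
```

In this model, a walk with either the flag or the schema lock set never reaches the eviction work, which matches the behaviour described above for compact.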
Here's where we set the ignore cache size flag and the reason behind it:
/*
 * The compaction thread should not block when the cache is full: it is holding locks blocking
 * checkpoints and once the cache is full, it can spend a long time doing eviction.
 */
if (!F_ISSET(session, WT_SESSION_IGNORE_CACHE_SIZE)) {
    ignore_cache_size_set = true;
    F_SET(session, WT_SESSION_IGNORE_CACHE_SIZE);
}
The comment states that the compaction thread "is holding locks blocking checkpoints". There could be historical reasons for this; however, at this time, compact does not block checkpoints.
Acceptance Criteria
- Assess how __wt_cache_eviction_check should be used in compact and whether the call is still necessary.
- Decide whether compact should throttle in any way. If so, also:
  - Add a stat to indicate this
  - Add a python unit test
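If throttling is kept, the throttle decision and its statistic could look roughly like the sketch below. Everything here is an assumption made for illustration: the `model_*` names, the percentage-based policy, and the `compact_throttled` counter are hypothetical and are not existing WiredTiger code or statistics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical statistic; no such counter is named in this ticket. */
struct model_stats {
    uint64_t compact_throttled; /* Times compact was asked to pause. */
};

/*
 * Hypothetical policy: ask compact to pause once cache usage has reached the
 * eviction trigger, and count each time that happens.
 */
static bool
model_compact_should_throttle(
  unsigned cache_used_pct, unsigned trigger_pct, struct model_stats *stats)
{
    assert(stats != NULL);
    if (cache_used_pct < trigger_pct)
        return (false);

    ++stats->compact_throttled;
    return (true);
}
```

A unit test could then drive the cache above the trigger during compaction and assert that the counter is non-zero.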