This issue was observed in HELP-42841 and BF-28971: the reconciliation process can be significantly slowed by the eviction of history store pages.
The eviction process within WiredTiger is hindered by lock contention during in-memory splits. When multiple threads concurrently access and modify the same page, eviction may fail to acquire an exclusive lock on that page, leading to retries and prolonged reconciliation times. This degrades the overall performance of the system.
DIAGNOSIS
The longer eviction holds the page, the longer every operation waiting on that page stays blocked, and the performance of those operations suffers.
REMEDIATION AND WORKAROUNDS
Avoid Unnecessary Reconciliation Attempts
To avoid wasted work, reconciliation should not be attempted when a session is configured with reconciliation disabled. Without this check, the system may attempt to reconcile a page that requires writing content to the history store even though there is no chance of successfully evicting the history store page. This leads to unnecessary delays and resource consumption.
This issue is fixed in MongoDB 7.0, and the fix has been backported as far back as 5.0.
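The check described above can be illustrated with a small toy model. This is only a sketch, not WiredTiger code: the `Session` class, its `reconciliation_disabled` field, and the `maybe_reconcile` helper are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical flag: True when this session must not reconcile pages.
    reconciliation_disabled: bool = False
    # Counts reconciliation attempts so the effect of the check is visible.
    reconcile_attempts: int = 0


def maybe_reconcile(session: Session) -> bool:
    """Attempt reconciliation only if the session allows it.

    Returns True when an attempt was made, False when it was skipped.
    """
    if session.reconciliation_disabled:
        # Skip early: the attempt could not succeed, so it would only
        # add delay and consume resources.
        return False
    session.reconcile_attempts += 1
    return True


normal = Session()
restricted = Session(reconciliation_disabled=True)
print(maybe_reconcile(normal), normal.reconcile_attempts)
print(maybe_reconcile(restricted), restricted.reconcile_attempts)
```

In this model the restricted session never increments its attempt counter, which is the behavior the check is meant to guarantee.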
Original Description
As part of the page read mechanism, WiredTiger does not use a session that is performing reconciliation for eviction-related activities, except when the page is eligible for an in-memory split.
Eviction that performs an in-memory split can fail because it cannot get an exclusive lock on the page while other threads are using it in parallel. In this situation, the session trying to force eviction of the history store page can keep retrying for a long time, stalling the reconciliation process. This leads to a longer reconciliation of a single page as part of eviction.
The longer eviction holds the page, the longer every operation waiting on that page stays blocked, and the performance of those operations suffers.
To fix this problem:
- Reproduce the problem with a WT test case.
- Fix the problem by not retrying if the session is performing reconciliation, then verify the fix with the test.
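The intended behavior change can be sketched as a toy model. It is not the actual WiredTiger test case or fix: `Session.in_reconciliation`, `HistoryStorePage`, `force_evict`, and `max_retries` are hypothetical names, and page contention is simulated with a simple counter of attempts that will fail.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical flag: True while this session is reconciling a page.
    in_reconciliation: bool = False


@dataclass
class HistoryStorePage:
    # Simulated contention: how many upcoming eviction attempts will fail
    # because other threads are still using the page.
    busy_for_attempts: int = 0

    def try_evict(self) -> bool:
        if self.busy_for_attempts > 0:
            self.busy_for_attempts -= 1
            return False  # exclusive lock not obtained
        return True


def force_evict(session: Session, page: HistoryStorePage, max_retries: int = 100):
    """Try to evict the page; return (evicted, attempts_made)."""
    attempts = 0
    while True:
        attempts += 1
        if page.try_evict():
            return True, attempts
        # The fix being modeled: a reconciling session gives up after the
        # first failure instead of retrying and stalling reconciliation.
        if session.in_reconciliation or attempts > max_retries:
            return False, attempts


# A reconciling session stops after one failed attempt.
print(force_evict(Session(in_reconciliation=True), HistoryStorePage(busy_for_attempts=10)))
# Any other session keeps retrying until the page becomes available.
print(force_evict(Session(), HistoryStorePage(busy_for_attempts=10)))
```

Removing the `session.in_reconciliation` condition reproduces the original behavior in this model: the reconciling session would spend eleven attempts on the same page before making progress.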