-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Statistics
-
Storage Engines
-
5
-
2024-01-09 - I Grew Tired
-
v7.0, v6.0, v5.0
Here is a portion of the code from the reconciliation, which computes the time taken for various stages and stores in the connection statistics:
if (WT_CLOCKDIFF_SEC(session->reconcile_timeline.reconcile_finish, session->reconcile_timeline.reconcile_start) > conn->rec_maximum_milliseconds) conn->rec_maximum_milliseconds = WT_CLOCKDIFF_MS(session->reconcile_timeline.reconcile_finish, session->reconcile_timeline.reconcile_start);
WT-11922 fixed this as follows:
- if (WT_CLOCKDIFF_SEC(session->reconcile_timeline.reconcile_finish, + if (WT_CLOCKDIFF_MS(session->reconcile_timeline.reconcile_finish,
Earlier likely we were not storing the statistics most of the time.
Having fixed this, still doesn't explain abnormal values seen for this (and other eviction) statistics. eg:
In FTDC (attached), we saw an instance of rec_maximum_milliseconds value as 1.6 G for milliseconds. 1.6 G = 1.6 x 10^9 milliseconds = 18 days. This seems implausible.
Similarly, we saw 1.6 G milliseconds for maximum milliseconds spent in moving updates to the history store in a reconciliation.
Investigate how such a high value is being stored for this statistic. It would be good to review the eviction/reconciliation statistics timing code again.