- Type: Bug
- Resolution: Fixed
- Priority: Critical - P2
- Affects Version/s: None
- Component/s: None
- Storage Engines
- (copied to CRM)
- 8
- TheMoon-StorEng - 2023-09-19, NachoCheese - 2023-10-03, Joker - StorEng - 2023-10-17
- v7.1, v7.0, v6.0, v5.0, v4.4
In WT-7534, we investigated why FTDC stalls when a checkpoint occurs. Checkpointing and retrieving statistics both need a lock on the tables they work on, so they cannot run at the same time. A checkpoint takes the lock through WT_WITH_TABLE_READ_LOCK, while statistics processing takes WT_WITH_SCHEMA_LOCK and WT_WITH_TABLE_WRITE_LOCK in __wt_curstat_table_init.
Reproducing the issue:
The issue can be reproduced with the many-coll-test. To make the stalls more pronounced, add a sleep inside WT_WITH_TABLE_READ_LOCK where the checkpoint takes it. See this comment:
Scope of the ticket:
- Find a solution to avoid the stalls coming from:
  - __wt_schema_open_indices
    - Suggestion: before calling __wt_schema_open_indices, can we know whether there are any indices? Can we give the caller the option to skip them?
  - And __wt_schema_get_table
    - __wt_schema_open_indices
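As one possible shape for the suggestion above, the sketch below checks whether a table has any indices before opening them and lets the caller opt out through a flag. Everything here is hypothetical and only illustrates the idea: `struct table_sketch`, `OPEN_SKIP_INDICES`, `open_indices_sketch` and `open_table_for_stats` are not WiredTiger names or APIs, and the real code would need to decide where the index count comes from and which locks guard it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified table handle (assumption, not a WiredTiger struct). */
struct table_sketch {
    size_t nindices;     /* number of indices defined on the table */
    bool indices_open;   /* whether the indices have been opened already */
    unsigned open_calls; /* counts how often the expensive open path ran */
};

/* Hypothetical caller flag: do not open indices at all. */
#define OPEN_SKIP_INDICES 0x1u

/* Stand-in for the expensive step that needs the table write lock. */
static void
open_indices_sketch(struct table_sketch *t)
{
    t->open_calls++;
    t->indices_open = true;
}

/*
 * Open indices only when there is something to open and the caller has not
 * asked to skip them. Returns true if the indices are open on return.
 */
static bool
open_table_for_stats(struct table_sketch *t, unsigned flags)
{
    if (t->nindices == 0 || (flags & OPEN_SKIP_INDICES) != 0)
        return (false); /* nothing to do: the write lock is never needed */
    if (!t->indices_open)
        open_indices_sketch(t);
    return (true);
}
```

In this sketch, a caller such as the statistics path could pass `OPEN_SKIP_INDICES`, or rely on the `nindices == 0` check, and so avoid the write-locked section in the common case of tables without indices.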
Definition of done:
Agree on the best solution for the issue and create a new ticket to implement the solution.