Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: WT10.0.0, 4.9.0, 4.4.4
Affects Version/s: None
Component/s: None
Labels: None
Story Points: 8
Sprint: Storage - Ra 2020-11-30, Storage - Ra 2020-12-14, Storage - Ra 2020-12-28, Storage - Ra 2021-01-11, Storage - Ra 2021-01-25

While diagnosing the root cause of WT-6681, we observed very high cache usage coinciding with running checkpoints. In some instances, cache usage spiked to ~433% of the configured cache size. Our initial analysis shows that checkpointing non-history store (HS) pages can generate considerable HS content. Because the HS file is only reconciled at the end of the checkpoint and there is no cache size check when inserting new HS content, cache usage can spike during a checkpoint. A few points to work on for this ticket:

1 - What is the role of the WT_SESSION_IGNORE_CACHE_SIZE flag in this scenario?

2 - A heuristic that prioritises HS pages for eviction, described in WT-6681, helped bring cache usage down to ~135%. A valid question is why the existing heuristics designed to prioritise eviction for cache-dominating files didn't help.

3 - As of now, we never fail a checkpoint. But how do we manage cases where a checkpoint cannot continue because the cache is full?

4 - Can we evict HS pages while a checkpoint is running? If so, what are the restrictions (e.g., write gen)?

5 - Can we improve the urgent eviction mechanism for this scenario?
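
The spike mechanism from the description can be sketched with a toy model. This is only an illustration under stated assumptions, not WiredTiger code: the function `simulate_checkpoint`, its parameters, and all byte counts are hypothetical, and the trimming step stands in for whatever cache size check or eviction a real fix would use.

```python
def simulate_checkpoint(cache_size, data_pages, page_bytes,
                        hs_bytes_per_page, check_cache_on_hs_insert):
    """Toy model: return peak cache usage as a percentage of cache_size.

    All names and numbers are hypothetical; this does not call or mirror
    any real WiredTiger API.
    """
    data_bytes = data_pages * page_bytes  # resident non-HS content
    hs_bytes = 0                          # HS content accumulated in cache
    peak = data_bytes

    for _ in range(data_pages):
        # Assumption: reconciling each non-HS page during the checkpoint
        # inserts some amount of new content into the history store.
        hs_bytes += hs_bytes_per_page

        if check_cache_on_hs_insert and data_bytes + hs_bytes > cache_size:
            # Hypothetical check: trim HS content back to the cache limit.
            hs_bytes = max(0, cache_size - data_bytes)

        peak = max(peak, data_bytes + hs_bytes)

    return 100.0 * peak / cache_size


# Without a check, HS content piles up until the end of the checkpoint.
unchecked = simulate_checkpoint(1000, 8, 100, 400, False)
# With a check on every HS insert, usage stays at the configured size.
checked = simulate_checkpoint(1000, 8, 100, 400, True)
print(f"unchecked peak: {unchecked:.0f}%, checked peak: {checked:.0f}%")
```

With these made-up inputs the unchecked run peaks at 400% of the cache size and the checked run at 100%. The figures only show the shape of the problem and are not the measurements reported in this ticket.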

image-2020-11-18-13-58-44-858.png (110 kB, Nov 18 2020 02:58:47 AM UTC)
image-2020-11-23-11-40-14-538.png (272 kB, Nov 23 2020 12:40:16 AM UTC)
image-2020-11-23-17-26-16-721.png (227 kB, Nov 23 2020 06:26:18 AM UTC)
image-2020-11-24-12-49-26-839.png (402 kB, Nov 24 2020 01:49:34 AM UTC)
image-2020-12-18-17-05-11-998.png (341 kB, Dec 18 2020 06:05:18 AM UTC)

depends on

SERVER-53708 Excess memory usage during shutdown in durable history tests (Closed)

is related to

WT-6681 Rapid history store growth compared with lookaside in 4.2 (Closed)

related to

WT-7106 Increase how often delta encoding is used for history store records (Closed)
WT-7168 History store ignores cache size during update heavy workload (Closed)
WT-7190 Limit eviction of non-history store pages when checkpoint is operating on history store (Closed)
WT-7096 Improve the mechanism that collects cache usage stats for the history store (Backlog)

Assignee: Haseeb Bokhari (Inactive)
Reporter: Haseeb Bokhari (Inactive)
Votes: 0
Watchers: 14
Created: Nov 17 2020 05:47:57 AM UTC
Updated: Oct 29 2023 04:42:42 PM UTC
Resolved: Jan 22 2021 01:13:54 AM UTC
