  1. WiredTiger
  2. WT-11071

Fix the root cause of latency spikes in MongoDB 4.4

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Performance
    • Storage Engines
    • 5
    • Megabat - 2024-05-14, 2024-05-28 - FOLLOW ON SPRINT, 2024-06-11 - Dinosaurs go rawr

      Recently we have received many help tickets about latency spikes from customers upgrading from 4.2 to 4.4.

      We believe the root cause is the following sequence:

      • A checkpoint starts, and eviction on a table is blocked.
      • More writes hit the table, and its pages continue to grow.
      • The pages grow far beyond the configured maximum page size.
      • The checkpoint finishes, and forced eviction kicks in to evict these oversized pages. Because the pages are so large, evicting them takes longer, and reads and writes on them are blocked for longer, which causes the spikes.
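
      The sequence above can be illustrated with a toy model. This is only a sketch and not WiredTiger code: `ToyPage`, `simulate`, the tick-based write rate, and the assumption that eviction cost scales linearly with page size are all hypothetical, chosen to show why a longer checkpoint window would produce a longer stall.

```python
from dataclasses import dataclass


@dataclass
class ToyPage:
    """A toy in-memory page; sizes are in arbitrary units (hypothetical)."""
    size: int = 0


def simulate(checkpoint_ticks, writes_per_tick, max_page_size, evict_cost_per_unit=1):
    """Return (final_page_size, eviction_stall) for one checkpoint window.

    Assumptions (not taken from WiredTiger): eviction of the page is fully
    blocked while the checkpoint runs, and the stall caused by forced
    eviction is proportional to the page size.
    """
    page = ToyPage()
    for _ in range(checkpoint_ticks):
        # Eviction is blocked during the checkpoint, so the page only grows.
        page.size += writes_per_tick
    # Checkpoint finished: forced eviction runs only if the page is oversized.
    stall = page.size * evict_cost_per_unit if page.size > max_page_size else 0
    return page.size, stall


short_checkpoint = simulate(checkpoint_ticks=5, writes_per_tick=10, max_page_size=100)
long_checkpoint = simulate(checkpoint_ticks=50, writes_per_tick=10, max_page_size=100)
print(short_checkpoint, long_checkpoint)
```

      In this model the short checkpoint leaves the page under the limit and causes no stall, while the long checkpoint lets the page grow to five times the limit and stalls in proportion. The model does not explain why 4.4 behaves worse than 4.2; it only restates the suspected mechanism.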

      However, the same logic applies to 4.2 as well, so something in 4.4 must exacerbate it. For example, reconciliation may now take longer in 4.4 because of the history store, the I/O overhead of the time points we store to disk, or checkpoint cleanup overhead.

      We need to understand what is really driving this vicious cycle.

            Assignee:
            jie.chen@mongodb.com Jie Chen
            Reporter:
            chenhao.qu@mongodb.com Chenhao Qu
            Votes:
            3
            Watchers:
            19

              Created:
              Updated: