  1. WiredTiger
  2. WT-12609

Improve checkpoint cleanup and page eviction logic

    • Storage Engines
    • 3
    • 2024-03-19 - PacificOcean, 2024-04-02 - GreatMugshot, 나비 (nabi) - 2024-04-16
    • v7.0, v6.0, v5.0

      With WiredTiger's durable history, the WiredTiger checkpoint never reads any leaf pages into the cache. It only reads the internal pages into the cache to perform the checkpoint cleanup of obsolete pages.

      I can see two problems with the current checkpoint eviction and checkpoint cleanup logic.

      • The internal pages that checkpoint reads for checkpoint cleanup are likely to be evicted as soon as possible, because checkpoint reads them with WT_READ_WONT_NEED set and their read generation is therefore marked as not needed. As a result, the next checkpoint may have to read the same pages into the cache again.
              /* Read pages with history store entries and evict them asap. */
              LF_SET(WT_READ_WONT_NEED);
      
      • Although checkpoint never reads leaf pages into the cache, its eviction logic tries to evict any leaf page whose read generation is WT_READGEN_WONT_NEED. This unnecessarily slows down the checkpoint, because it spends time evicting pages that it did not read into the cache.
                  if (!is_internal &&
                    (page->read_gen == WT_READGEN_WONT_NEED ||
                      FLD_ISSET(conn->timing_stress_flags, WT_TIMING_STRESS_CHECKPOINT_EVICT_PAGE)) &&
                    !tried_eviction && F_ISSET(session->txn, WT_TXN_HAS_SNAPSHOT)) {
      

      The solutions to the above problems are:

      • Let normal eviction handle the internal pages read by the checkpoint.
      • Allow checkpoint to perform eviction only when the timing stress is configured (for testing purposes only).

            Assignee:
            haribabu.kommi@mongodb.com Haribabu Kommi
            Reporter:
            haribabu.kommi@mongodb.com Haribabu Kommi
            Votes:
            0
            Watchers:
            10

              Created:
              Updated:
              Resolved: