  1. WiredTiger
  2. WT-9035

Asynchronously roll back transactions due to cache pressure

    • Type: New Feature
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • 2023-05-16 Chook-n-Nuts Farm
    • Not Needed

      Summary
      Allow WiredTiger to roll back transactions that are blocking eviction without requiring those sessions to call into the storage engine API.

      Motivation

      MongoDB's use of sessions does not conform to the model required by WiredTiger. Specifically, we regularly use multiple sessions per thread (see SERVER-61116), and in the case of multi-document transactions, we allow operations to write a significant amount of uncommitted data and sit idle for extended periods of time (see SERVER-64982).

      The WiredTiger API currently assumes that all sessions make active calls into the API so that they may be rolled back if they are blocking eviction. The problem, in the case of SERVER-64982, is that MongoDB allows users to keep transactions idle for up to 1 minute. During that time the session makes no API calls, so WiredTiger cannot roll back a transaction that is preventing the entire system from making progress.
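
      To make the discussion concrete, here is a toy model of the cooperative rollback assumption described above. It is only a sketch: ToyEngine, ToySession, RollbackError and the byte counts are hypothetical names and numbers invented for illustration, and none of this is WiredTiger's API or its actual eviction logic. The point it shows is that a session marked for rollback only finds out on its next call into the engine, so an idle session keeps the cache under pressure.

```python
class RollbackError(Exception):
    """Hypothetical stand-in for a rollback error returned by the engine."""


class ToySession:
    def __init__(self, name):
        self.name = name
        self.in_txn = False
        self.dirty_bytes = 0
        self.marked_for_rollback = False

    def begin(self):
        self.in_txn = True
        self.dirty_bytes = 0
        self.marked_for_rollback = False

    def write(self, nbytes):
        # An API call is the only point where the session can learn
        # that the engine wants its transaction rolled back.
        if self.marked_for_rollback:
            raise RollbackError(self.name)
        self.dirty_bytes += nbytes

    def rollback(self):
        # Releases the uncommitted data this transaction was pinning.
        self.in_txn = False
        self.dirty_bytes = 0
        self.marked_for_rollback = False


class ToyEngine:
    def __init__(self, cache_limit):
        self.cache_limit = cache_limit
        self.sessions = []

    def open_session(self, name):
        session = ToySession(name)
        self.sessions.append(session)
        return session

    def pinned_bytes(self):
        return sum(s.dirty_bytes for s in self.sessions if s.in_txn)

    def under_pressure(self):
        return self.pinned_bytes() > self.cache_limit

    def request_rollback_of_largest(self):
        # The engine can only flag the victim; it cannot force the rollback.
        victim = max(self.sessions, key=lambda s: s.dirty_bytes)
        victim.marked_for_rollback = True
        return victim


def demo():
    engine = ToyEngine(cache_limit=1000)
    idle = engine.open_session("idle")
    other = engine.open_session("other")

    idle.begin()
    idle.write(900)      # large uncommitted write, then the session goes idle
    other.begin()
    other.write(200)     # pushes the cache over its limit

    victim = engine.request_rollback_of_largest()
    stuck = engine.under_pressure()   # still over the limit: victim has not called in

    try:
        idle.write(1)    # the idle session finally makes an API call
    except RollbackError:
        idle.rollback()

    relieved = not engine.under_pressure()
    return victim.name, stuck, relieved
```

      In this model, demo() leaves the cache over its limit for as long as the idle session stays away from the API. Asynchronous rollback, as proposed in this ticket, would amount to the engine calling the equivalent of rollback() on the victim itself rather than waiting for the victim's next call.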

      If the problem does occur, what are the consequences and how severe are they?
      Extended lack of availability in MongoDB.

      Is this issue urgent?
      This has been a problem since MongoDB 4.2, and no customers that I know of have been hit with this problem. As we move from single-tenant to multi-tenant, this becomes a more concerning attack vector.

      Acceptance Criteria (Definition of Done)
      It would be acceptable to close this if we decide it is not possible or feasible, and instead pursue other features such as supporting transactions larger than the cache.

      Suggested Solution
      My proposal is that WiredTiger roll back transactions without them actively making calls into the storage engine. I opened this to start a discussion, because I'm also curious about why this could be a bad idea.

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            5