  1. WiredTiger
  2. WT-3961

The all_committed timestamp should be less than any in-flight transaction

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.6.4, 3.7.3, WT3.1.0
    • Affects Version/s: None
    • Component/s: None
    • Sprint: Storage Non-NYC 2018-03-12

      Originally we thought it was fine to use min(largest committed timestamp, all active timestamps) for the all_committed timestamp. However, it would be more useful to return:

      min(largest committed timestamp, active timestamp - 1)
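      To make the difference concrete, here is a minimal sketch of the two rules. It assumes timestamps are plain integers and reads "active timestamp" as the oldest in-flight timestamp; the function names are hypothetical and are not WiredTiger API.

```python
from typing import Iterable


def all_committed_old(largest_committed: int, active: Iterable[int]) -> int:
    """Original rule: min(largest committed timestamp, all active timestamps)."""
    return min([largest_committed, *active])


def all_committed_new(largest_committed: int, active: Iterable[int]) -> int:
    """Proposed rule: the result is strictly less than every in-flight timestamp.

    Assumption: integer timestamps, so "one less" is simply ts - 1.
    """
    oldest_active = min(active, default=None)
    if oldest_active is None:
        # No in-flight transactions: nothing holds the value back.
        return largest_committed
    return min(largest_committed, oldest_active - 1)
```

      With a largest committed timestamp of 10 and in-flight transactions at 7 and 9, the original rule returns 7, which is still the timestamp of an uncommitted transaction, while the proposed rule returns 6.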

      This behavior will be useful on replica sets with one voting node. With such sets, the primary could set the majority commit level immediately after each write becomes durable, since a majority of 1 is 1. Unfortunately, because writes commit out of timestamp order, they can also become durable out of timestamp order. We need to set the majority commit level to the latest durable timestamp that has no uncommitted operations with timestamps less than it. The all_committed value can help provide this: in a thread loop, we can query the all_committed value X, wait for the log flush, and then mark X as the new majority commit level. Thereafter, we cannot commit any operation with a timestamp less than or equal to X.
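      The thread loop described above could be sketched roughly as follows. The three callables are hypothetical stand-ins for whatever the server uses to query all_committed, wait for the log flush, and publish the majority commit level; none of them are real WiredTiger or MongoDB functions, and the bounded round count exists only so the sketch terminates.

```python
from typing import Callable


def advance_majority_commit_level(
    query_all_committed: Callable[[], int],
    wait_for_log_flush: Callable[[], None],
    set_majority_commit_level: Callable[[int], None],
    rounds: int,
) -> None:
    """Run the query / flush / mark cycle a fixed number of times."""
    for _ in range(rounds):
        # Sample X first: nothing at or below X is still in flight.
        x = query_all_committed()
        # Wait for the log flush so that commits up to X are durable.
        wait_for_log_flush()
        # Only now is it safe to publish X as the majority commit level.
        set_majority_commit_level(x)
```

      The order of the steps is the point of the sketch: X is read before the flush, so the published level never runs ahead of what the flush has made durable.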

            Assignee: Michael Cahill (Inactive), michael.cahill@mongodb.com
            Reporter: Michael Cahill (Inactive), michael.cahill@mongodb.com
            Votes: 0
            Watchers: 4
