  1. Core Server
  2. SERVER-45880

Flow Control lag detection mechanism can overstate lag if there are oplog holes

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Replication
    • ALL
    • Execution Team 2020-07-27, Execution Team 2021-02-08, Execution Team 2021-02-22

      Flow Control uses the lastApplied wall clock time minus the lastCommitted wall clock time as a proxy for replication lag. This measure can overstate the lag when there are oplog holes: lastApplied can include operations that come after a hole, and secondaries cannot replicate those operations while the hole is still open.
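
      The overstatement can be illustrated with a toy model. The sketch below is not server code; the integer timestamps, the `BASE` wall clock origin, and the `lag_proxy` helper are all hypothetical, and it assumes the secondaries are fully caught up to everything they are able to read.

```python
from datetime import datetime, timedelta

BASE = datetime(2020, 1, 1, 12, 0, 0)  # arbitrary illustrative wall clock origin

# Toy primary oplog: logical timestamp -> wall clock time of the entry.
# Timestamp 3 has been reserved by a writer that has not committed yet,
# so it is an oplog hole. Entries 4 and 5 committed after it.
committed_entries = {t: BASE + timedelta(seconds=t) for t in (1, 2, 4, 5)}
hole = 3

def lag_proxy(last_applied_wall, last_committed_wall):
    """lastApplied wall clock time minus lastCommitted wall clock time."""
    return last_applied_wall - last_committed_wall

# On the primary, lastApplied is the newest committed entry, past the hole.
last_applied = max(committed_entries)
# Secondaries cannot read past the hole, so the best possible lastCommitted
# in this model is the newest entry before it.
last_committed = max(t for t in committed_entries if t < hole)

reported = lag_proxy(committed_entries[last_applied],
                     committed_entries[last_committed])
print(reported)  # 3 seconds of "lag", although no secondary is behind
```

      In this model the secondaries have replicated everything available to them, yet the proxy reports three seconds of lag.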

      One proposed fix to address this is to use the wall clock time associated with the all_durable timestamp or the oplog visibility point instead of the lastApplied wall clock time, since these points do not include operations after oplog holes.
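
      A minimal sketch of the proposed alternative, under the same toy assumptions: `no_holes_point` is a hypothetical stand-in for the all_durable timestamp or the oplog visibility point, computed here as the newest committed timestamp that precedes every open hole.

```python
from datetime import datetime, timedelta

BASE = datetime(2020, 1, 1, 12, 0, 0)  # arbitrary illustrative wall clock origin

def no_holes_point(committed, reserved):
    """Newest committed timestamp that precedes every still-reserved slot."""
    first_hole = min(reserved) if reserved else None
    candidates = [t for t in committed if first_hole is None or t < first_hole]
    return max(candidates) if candidates else None

def lag(committed, reserved, last_committed):
    """Wall clock lag measured from the no-holes point, not lastApplied."""
    point = no_holes_point(committed, reserved)
    return committed[point] - committed[last_committed]

# Timestamp 3 is an open hole; 4 and 5 have already committed.
committed = {t: BASE + timedelta(seconds=t) for t in (1, 2, 4, 5)}
lag_with_hole = lag(committed, reserved={3}, last_committed=2)

# Once the hole is filled, entries up to 5 become replicable and the
# measure reflects the work the secondaries actually have left to do.
committed[3] = BASE + timedelta(seconds=3)
lag_after_fill = lag(committed, reserved=set(), last_committed=2)

print(lag_with_hole, lag_after_fill)
```

      While the hole is open the measure stays at zero, and it jumps only once the operations after the hole are actually available to secondaries.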

      Any solution to this issue that involves changing the components of the lag detection mechanism should ensure that 1) a wall clock time is available for the proposed timestamp, and 2) the proposed timestamp is accessible in-memory and is kept up-to-date.
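
      One way to picture the two requirements is a small in-memory index from timestamp to wall clock time. The `WallTimeIndex` class below is purely hypothetical and is not an existing server component; it only sketches what "a wall clock time is available" and "kept up-to-date" could mean in practice.

```python
import bisect
from datetime import datetime, timedelta

class WallTimeIndex:
    """Hypothetical in-memory map from oplog timestamp to wall clock time."""

    def __init__(self):
        self._ts = []    # timestamps, kept sorted
        self._wall = []  # wall clock times, parallel to _ts

    def record(self, ts, wall):
        # Writers can commit out of timestamp order, so insert in sorted position.
        i = bisect.bisect_left(self._ts, ts)
        self._ts.insert(i, ts)
        self._wall.insert(i, wall)

    def wall_at_or_before(self, ts):
        # Requirement 1: resolve a timestamp to a wall clock time.
        i = bisect.bisect_right(self._ts, ts)
        return self._wall[i - 1] if i else None

    def prune_before(self, ts):
        # Requirement 2: stay small and current by dropping entries older than ts.
        i = bisect.bisect_left(self._ts, ts)
        del self._ts[:i]
        del self._wall[:i]

    def __len__(self):
        return len(self._ts)

BASE = datetime(2020, 1, 1, 12, 0, 0)  # arbitrary illustrative wall clock origin
index = WallTimeIndex()
for t in (1, 2, 5, 4):  # 4 commits after 5; 3 is still a hole
    index.record(t, BASE + timedelta(seconds=t))

wall_at_hole = index.wall_at_or_before(3)  # falls back to the entry at 2
index.prune_before(2)                      # e.g. once lastCommitted reaches 2
```

      A real implementation would also need synchronization between writers and the Flow Control reader, which this sketch leaves out.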

      SERVER-46114 represents another case for reconsidering whether lastApplied minus lastCommitted is the best measure for lag.

            Assignee:
            dianna.hohensee@mongodb.com Dianna Hohensee (Inactive)
            Reporter:
            maria.vankeulen@mongodb.com Maria van Keulen
            Votes:
            0
            Watchers:
            20
