  1. Core Server
  2. SERVER-90495

Support start or resume from deleted recordId on natural order scan

    • Query Execution
    • v8.0, v7.0
    • QE 2024-10-14, QE 2024-10-28, QE 2024-11-11, QE 2024-11-25

      Context:
      We (Atlas Search) plan to use a $natural order collection scan in mongot for logical initial sync, to address a frequent performance bottleneck seen by multiple customers in production (one example: HELP-55062). Today, mongot uses an aggregate command with a sort stage on the _id field so that initial sync can be resumed, either in normal flow or after a transient error. The issue arises when mongot rebuilds search indexes (e.g. new index, definition change, unrecoverable exception) and the layout of documents in WT does not correlate well with _id order. In such cases we see severe degradation in server performance due to disk latency and available IOPS. A natural order collection scan will solve the issue. However, it has one main drawback (see Problem below).
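
For readers unfamiliar with the current approach, here is a minimal sketch of an _id-ordered, resumable scan pipeline. The exact pipeline mongot issues is not shown in this ticket, so the stage layout, the helper name `build_id_ordered_pipeline`, and the `batch_size` parameter are assumptions for illustration only.

```python
def build_id_ordered_pipeline(last_seen_id=None, batch_size=1000):
    """Build a hypothetical aggregation pipeline that scans in _id order.

    Resuming means re-issuing the pipeline with the last _id that was
    processed; the $match stage skips everything at or before it.
    """
    pipeline = []
    if last_seen_id is not None:
        # Resume strictly after the last _id already processed.
        pipeline.append({"$match": {"_id": {"$gt": last_seen_id}}})
    # Sorting on _id is what makes the position resumable, but it also
    # forces reads in _id order rather than on-disk (natural) order.
    pipeline.append({"$sort": {"_id": 1}})
    pipeline.append({"$limit": batch_size})
    return pipeline
```

Because the resume point is a value comparison on _id, it keeps working even if the document with that _id is deleted between batches. That is the property the natural order scan currently lacks.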

      Problem:
      The Search logical initial-sync algorithm relies heavily on resume support to avoid buffering change-stream (oplog) updates in mongot during the collection scan. We do so by alternating between scanning the collection and catching up with the collection's change stream. The current implementation of the aggregate command is unable to start or resume a natural order scan if `$_resumeAfter` points to a deleted document / recordId. This limitation is a significant concern for us, as it could lead to a new set of production issues depending on the customer's workload.
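
The failure mode can be reproduced with a toy model. This is only a sketch: a sorted Python list stands in for the collection's live recordIds, and `scan_resume_after` and `RecordNotFound` are hypothetical names, not server code or server error types.

```python
from bisect import bisect_right


class RecordNotFound(Exception):
    """Hypothetical error: the resume recordId no longer exists."""


def scan_resume_after(record_ids, resume_after, limit):
    """Strict resume: the recordId in the resume token must still exist.

    record_ids is a sorted list of live recordIds. Returns (batch, token),
    where token is the last recordId returned in this batch.
    """
    if resume_after is None:
        start = 0
    else:
        idx = bisect_right(record_ids, resume_after)
        # The record just before idx must be exactly the resume point.
        if idx == 0 or record_ids[idx - 1] != resume_after:
            raise RecordNotFound(f"recordId {resume_after} was deleted")
        start = idx
    batch = record_ids[start:start + limit]
    token = batch[-1] if batch else resume_after
    return batch, token
```

In this model, scanning `[1, 2, 3, 4, 5]` with a limit of 2 returns a token of 2. If record 2 is deleted while the change stream is being caught up, the next call with that token raises instead of continuing at 3.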

      Ask:
      mongod should support starting or resuming $natural collection scans after a deleted recordId.

      UPDATE:
      Following HELP-59576, we distilled the ask to mongod:

      1. Provide $gt(e) / $lt(e) semantics for a recordId in an aggregation pipeline.
      2. Aggregation will (can) provide a resume token in conjunction with $gt(e) / $lt(e).
      3. The resume token provided in (2) can be passed to the $gt(e) / $lt(e) aggregation stage in (1) to restart a query. The resume must not fail if the passed recordId does not exist or was deleted (implicit).
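
The three points above can be sketched with a toy model of the requested behaviour. This is an illustration under assumptions, not a proposed server implementation: a sorted list stands in for the live recordIds, only the forward ($gt / $gte) direction is modelled, and `scan_from` is a hypothetical name.

```python
from bisect import bisect_left, bisect_right


def scan_from(record_ids, bound=None, inclusive=False, limit=2):
    """Range-style resume: start at the first live recordId past `bound`.

    inclusive=False models $gt, inclusive=True models $gte. The bound is
    only compared against, so it does not need to exist any more.
    Returns (batch, token), where token is the last recordId returned.
    """
    if bound is None:
        start = 0
    elif inclusive:
        start = bisect_left(record_ids, bound)   # first id >= bound
    else:
        start = bisect_right(record_ids, bound)  # first id > bound
    batch = record_ids[start:start + limit]
    token = batch[-1] if batch else bound
    return batch, token
```

With live recordIds `[1, 3, 4, 5]` (record 2 already deleted), `scan_from(live, 2)` continues at 3 rather than failing. A backward scan with $lt(e) would be the mirror image of this logic.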

            Assignee:
            adi.agrawal@mongodb.com Adi Agrawal
            Reporter:
            mor.levy@mongodb.com Mor Levy
