  Core Server / SERVER-75564

Avoid executing DocumentSourceInternalUnpackBucket::doOptimizeAt() twice for sharded time-series collections

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Integration
    • QI 2023-05-01, QI 2023-05-15, QI 2023-05-29, QI 2023-06-12, QI 2023-06-26, QI 2023-07-10, QI 2023-07-24, QI 2023-08-07, QI 2023-08-21, QI 2023-09-04, QI 2023-09-18, QI 2023-10-02, QI 2023-10-16, QI 2023-10-30, QI 2023-11-13, QI 2023-11-27, QI 2023-12-11, QI 2023-12-25, QI 2024-01-08, QI 2024-01-22, QI 2024-02-05

      For queries on sharded views, the pipeline is optimized twice: first when the query on the view definition is sent to the primary shard, and again when the query is sent on the base collection (after the kickback to mongos). For normal views this is fine, because pipeline optimizations are generally idempotent. However, DocumentSourceInternalUnpackBucket::doOptimizeAt() may not be idempotent. There are known issues (such as SERVER-60373) where this function generates duplicate stages. We should use this ticket to evaluate any other potential issues with running the function twice and address them.
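      To illustrate the concern, here is a minimal Python sketch of how a non-idempotent rewrite can duplicate stages when applied twice. It is a toy model only, not the server's C++ implementation: the functions push_down_naive and push_down_guarded, the sample pipeline, and the simplified "copy the $match in front of the unpack stage" rule are all hypothetical and chosen purely for illustration.

```python
# Toy model of a pipeline rewrite; NOT the real doOptimizeAt() logic.
# A pipeline is a list of single-key dicts, one per stage.
UNPACK = "$_internalUnpackBucket"


def push_down_naive(pipeline):
    """Hypothetical rewrite: copy a $match that directly follows the
    unpack stage in front of it. Applying it again adds another copy."""
    out = list(pipeline)
    for i, stage in enumerate(out):
        if UNPACK in stage and i + 1 < len(out) and "$match" in out[i + 1]:
            out.insert(i, {"$match": dict(out[i + 1]["$match"])})
            break
    return out


def push_down_guarded(pipeline):
    """Same hypothetical rewrite, but it skips the insertion when an
    identical stage already precedes the unpack stage."""
    out = list(pipeline)
    for i, stage in enumerate(out):
        if UNPACK in stage and i + 1 < len(out) and "$match" in out[i + 1]:
            pushed = {"$match": dict(out[i + 1]["$match"])}
            if i == 0 or out[i - 1] != pushed:
                out.insert(i, pushed)
            break
    return out


# Assumed example pipeline (field names are made up).
pipeline = [
    {UNPACK: {"timeField": "t", "metaField": "m"}},
    {"$match": {"m": 1}},
]

# First pass models optimization on the primary shard,
# second pass models optimization after the kickback to mongos.
naive_once = push_down_naive(pipeline)
naive_twice = push_down_naive(naive_once)

guarded_once = push_down_guarded(pipeline)
guarded_twice = push_down_guarded(guarded_once)

print(len(naive_once), len(naive_twice))      # stage counts after 1 and 2 passes
print(guarded_twice == guarded_once)          # whether the second pass changed anything
```

      In this sketch the naive rewrite grows the pipeline on every pass, while the guarded variant leaves an already-rewritten pipeline unchanged. Whether a guard of this kind, or avoiding the second optimization pass altogether, is the right fix for the real function is exactly what this ticket is meant to evaluate.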

            Assignee:
            backlog-query-integration [DO NOT USE] Backlog - Query Integration
            Reporter:
            naama.bareket@mongodb.com Naama Bareket
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: