Core Server / SERVER-56419

Push down $match past $setWindowFields when it keeps/drops whole partitions

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      We should consider swapping $match with $setWindowFields, so that the $match runs first. I think the swap is valid when both of these hold:

      • within each partition, the predicate is either always true or always false
      • the $match doesn't depend on any 'output' field
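One conservative way to test the two conditions above is sketched here. This is an illustration only, not server code: `canPushDownMatch` and its helpers are hypothetical names, and the check only accepts the simplest case (partitionBy given as field paths, and a $match made of scalar equality predicates on those same fields). It deliberately rejects anything else, which sidesteps subtleties such as missing versus null values.

```javascript
// Hypothetical helpers for illustration; none of these exist in the server.
const rootOf = (path) => path.split(".")[0];
const isFieldPath = (e) =>
  typeof e === "string" && e.startsWith("$") && !e.startsWith("$$");

// Returns true only when the $match is clearly constant within every partition
// and clearly independent of the window stage's 'output' fields.
function canPushDownMatch(windowSpec, matchPredicate) {
  // Accept partitionBy as one field path or an array of field paths.
  const keys = Array.isArray(windowSpec.partitionBy)
    ? windowSpec.partitionBy
    : [windowSpec.partitionBy];
  if (!keys.every(isFieldPath)) return false;

  const partitionFields = new Set(keys.map((k) => k.slice(1)));
  const outputRoots = new Set(Object.keys(windowSpec.output || {}).map(rootOf));

  // Every predicate must be {field: scalar}, on a partition field,
  // and must not touch anything the window stage writes.
  return Object.entries(matchPredicate).every(([field, value]) => {
    const isScalar = ["string", "number", "boolean"].includes(typeof value);
    return isScalar && partitionFields.has(field) && !outputRoots.has(rootOf(field));
  });
}
```

With the spec from this ticket, `canPushDownMatch(spec, {state: "NY"})` is true, while a predicate on `total` (an output field) or on `x` (not a partition field) is rejected.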

      For example, in this query:

      {$setWindowFields: {
          partitionBy: ["$state", "$city"],
          output: {total: {$sum: "$x"}},
      }},
      {$match: {state: "NY"}},
      

      Doing the $match first shouldn't change the result, because it drops whole partitions.
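The claim can be checked on a toy in-memory model. The sketch below does not run against MongoDB: `windowSum` is a hypothetical stand-in for a partitioned `{$sum: "$x"}` window, and the documents are made-up sample data. It compares both stage orders for a predicate on a partition field, and for a predicate on `x`, which splits partitions.

```javascript
// Toy model of {$setWindowFields: {partitionBy: [...], output: {total: {$sum: "$x"}}}}.
// Hypothetical helper for illustration only.
function windowSum(docs, partitionFields) {
  const keyOf = (d) => JSON.stringify(partitionFields.map((f) => d[f]));
  const totals = new Map();
  for (const d of docs) {
    totals.set(keyOf(d), (totals.get(keyOf(d)) || 0) + d.x);
  }
  // Attach each partition's total to every document in that partition.
  return docs.map((d) => ({ ...d, total: totals.get(keyOf(d)) }));
}

// Made-up sample data.
const docs = [
  { state: "NY", city: "Albany", x: 1 },
  { state: "NY", city: "Albany", x: 2 },
  { state: "NY", city: "Buffalo", x: 5 },
  { state: "CA", city: "Fresno", x: 7 },
];

// Predicate on a partition field: keeps or drops whole partitions.
const isNY = (d) => d.state === "NY";
const matchAfter = windowSum(docs, ["state", "city"]).filter(isNY);
const matchBefore = windowSum(docs.filter(isNY), ["state", "city"]);

// Predicate on x: removes only part of the (NY, Albany) partition.
const bigX = (d) => d.x >= 2;
const badAfter = windowSum(docs, ["state", "city"]).filter(bigX);
const badBefore = windowSum(docs.filter(bigX), ["state", "city"]);

console.log(JSON.stringify(matchAfter) === JSON.stringify(matchBefore)); // true
console.log(JSON.stringify(badAfter) === JSON.stringify(badBefore));     // false
```

The second comparison fails because filtering on `x` first changes the Albany total from 3 to 2, which is the case the first condition rules out.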

      However, this could be tricky given how we desugar $setWindowFields:

      {$set: {__tmp: ["$state", "$city"]}},
      {$sort: {__tmp: 1}},
      {$_internalSetWindowFields: {
          partitionBy: "$__tmp",
          output: {total: {$sum: "$x"}},
      }},
      {$unset: "__tmp"},
      {$match: {state: "NY"}},
      

      It will be hard for the optimizer to see the relationship between {state: "NY"} and partitionBy: "$__tmp". Some things that could help are:

      • a new analysis (functional dependency), e.g. to infer that the value of __tmp determines state
      • ability to $sort by expression, instead of a __tmp field
      • a way to defer desugaring $setWindowFields until after some optimization
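As a rough illustration of the first option, a functional-dependency pass over the desugared stages might look like the sketch below. Everything here is an assumption: `fieldsDeterminedByPartitionKey` is a hypothetical name, the stages are treated as plain objects rather than the server's internal representation, and the sketch assumes no stage between the $set and the window stage rewrites the source fields.

```javascript
// Hypothetical analysis for illustration; not how the server represents pipelines.
// Returns the fields whose values are fixed once the partition key is fixed.
function fieldsDeterminedByPartitionKey(stages) {
  const win = stages.find((s) => "$_internalSetWindowFields" in s);
  if (!win) return new Set();

  const partitionBy = win.$_internalSetWindowFields.partitionBy;
  if (typeof partitionBy !== "string" || !partitionBy.startsWith("$")) {
    return new Set();
  }
  const keyField = partitionBy.slice(1);
  const determined = new Set([keyField]);

  // Trace the key back through an earlier {$set: {<keyField>: [<field paths>]}}.
  // Assumption: nothing in between overwrites those source fields.
  for (const stage of stages.slice(0, stages.indexOf(win))) {
    const def = stage.$set && stage.$set[keyField];
    if (!Array.isArray(def)) continue;
    for (const e of def) {
      if (typeof e === "string" && e.startsWith("$") && !e.startsWith("$$")) {
        determined.add(e.slice(1));
      }
    }
  }
  return determined;
}

// The desugared stages from this ticket, up to the window stage.
const desugared = [
  { $set: { __tmp: ["$state", "$city"] } },
  { $sort: { __tmp: 1 } },
  { $_internalSetWindowFields: { partitionBy: "$__tmp", output: { total: { $sum: "$x" } } } },
];
const determined = fieldsDeterminedByPartitionKey(desugared);
```

Here `determined` contains `__tmp`, `state`, and `city`, so a $match that only reads `state` could be recognized as constant within each partition even after desugaring.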

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            david.percy@mongodb.com David Percy