Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-79622

[CQF] Improve intersectDNFIntervals behavior for variable bounds

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • QO 2023-09-18, QO 2023-10-02, QO 2023-10-16, QO 2023-10-30, QO 2023-11-13, QO 2023-11-27, QO 2023-12-11, QO 2023-12-25

      intersectDNFIntervals produces an overly complicated result for simple inputs:

      Input:

      {{
          {<Const [""]}
       ^ 
          {>FunctionCall [getParam] Const [0]}
      }}
      

      Output:

      {
          {{[If [] BinaryOp [And] BinaryOp [And] BinaryOp [Or] BinaryOp [And] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [""] Const [true] BinaryOp [Or] BinaryOp [And] BinaryOp [Lt] Const [minKey] FunctionCall [getParam] Const [0] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [""] Const [false] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [maxKey] Const [false] Const [minKey] Const [maxKey], Const [minKey]]}}
       U 
          {{[Const [maxKey], If [] BinaryOp [And] BinaryOp [And] BinaryOp [Or] BinaryOp [And] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [""] Const [true] BinaryOp [Or] BinaryOp [And] BinaryOp [Lt] Const [minKey] FunctionCall [getParam] Const [0] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [""] Const [false] BinaryOp [Lt] FunctionCall [getParam] Const [0] Const [maxKey] Const [false] Const [maxKey] Const [minKey]]}}
       U 
          {{(If [] BinaryOp [Gte] Const [minKey] FunctionCall [getParam] Const [0] Const [minKey] FunctionCall [getParam] Const [0], Const [""])}}
      }
      

      The optimal result from this function would be a single interval, (getParam(0), ""). intersectDNFIntervals likely does not have special cases for intervals where the low bound is MinKey or the high bound is MaxKey.

      The number of intervals unioned in the result approximately doubles for each additional variable interval. We get 6 intervals with 3 variable intervals, 14 intervals for 4 variable intervals, etc. This is likely because the function intersects two intervals at a time rather than analyzing the intervals all at once for a better result.

      This is relevant for parameterization of match expressions for the plan cache.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            matt.boros@mongodb.com Matt Boros
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated: