  1. Core Server
  2. SERVER-94545

PBT - prototype an "interdependent-stages" model using the DISTINCT_SCAN optimization as the target

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      The DISTINCT_SCAN optimization can only be triggered if the entire pipeline and the indexes have a specific shape and reference specific fields in a specific order.

      This can be used as an exercise to construct such pipelines in PBT:

      • generate a list of field names
      • create indexes over said fields
      • $sort over that same list of fields
      • optionally $match over that same list of fields
      • use the first few fields for the _id field of the $group, and the remainder for the actual accumulators
      • add additional stages, e.g. $project, that also use the initial list of fields
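A minimal sketch of the generation steps above, in Python. Everything in it is illustrative rather than taken from the PBT framework: the function name `build_case`, the `$gte: 0` predicate, the `first_<field>` output names, and the choice of `$first` as the accumulator are all assumptions. The point is only that the index, the $sort, the $match and the $group are all derived from the same ordered field list.

```python
def build_case(fields, n_group_keys, with_match=True):
    """Derive an index spec and a pipeline from one ordered field list.

    Hypothetical helper: names and predicates are placeholders.
    """
    # Compound index over the fields, in list order.
    index_spec = [(f, 1) for f in fields]

    pipeline = []
    if with_match:
        # Optional $match over the same fields (placeholder predicate).
        pipeline.append({"$match": {f: {"$gte": 0} for f in fields}})

    # $sort over the same fields, in the same order as the index.
    pipeline.append({"$sort": {f: 1 for f in fields}})

    # The first few fields form the $group _id; the rest feed accumulators.
    keys, rest = fields[:n_group_keys], fields[n_group_keys:]
    group = {"_id": {k: f"${k}" for k in keys}}
    for f in rest:
        group[f"first_{f}"] = {"$first": f"${f}"}  # assumed accumulator
    pipeline.append({"$group": group})

    return index_spec, pipeline
```

For example, `build_case(["a", "b", "c"], 1)` yields an index over (a, b, c), a $sort over the same three fields, and a $group keyed on `a` with `$first` accumulators over `b` and `c`.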

       
      Then enforce the property that the indexed plan returns the same results as the collection scan. Because the $sort and the $group are generated from the same input list of field names, the output will be deterministic.
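The property itself could be checked along these lines. This is a sketch under stated assumptions: `run` is a hypothetical callable that executes a pipeline with a given hint and returns the result documents, and `{"$natural": 1}` is used as the hint that forces a collection scan. Results are compared without regard to document order, since $group does not promise any output order.

```python
import json


def canonical(results):
    """Order-insensitive form of a result set, for comparison only."""
    return sorted(json.dumps(doc, sort_keys=True, default=str) for doc in results)


def plans_agree(run, pipeline, index_hint):
    """True if the indexed plan and the collection scan return the same documents.

    `run(pipeline, hint)` is a hypothetical executor supplied by the caller.
    """
    indexed = run(pipeline, index_hint)
    collscan = run(pipeline, {"$natural": 1})  # force a collection scan
    return canonical(indexed) == canonical(collscan)
```

In a property-based test, `plans_agree` would be asserted for every generated field list and pipeline.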
       
      This can be further extended by adding additional stages, e.g. $project 

            Assignee: Unassigned
            Reporter: Philip Stoev (philip.stoev@mongodb.com)