  1. Core Server
  2. SERVER-62405

Partially-streaming compound $sort

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      In a pipeline like this:

      {$sort: {x: 1}}
      ... // order-preserving stages
      {$sort: {x: 1, y: 1}}
      

      we could optimize the second $sort to take advantage of the fact that its input is already sorted on {x: 1}. Documents with the same 'x' value are already adjacent, so whenever we see 'x' change, we can sort and return the buffered run before asking for more input. Depending on how large these runs of equal 'x' are, this could save a lot of memory, and maybe avoid spilling to disk.

      Also, we only need to compare values of 'y', since we're only breaking ties within each run of equal 'x' values. Maybe that would be useful if 'x' is large, or the overall document is small.
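As a rough illustration of the idea above (not the server's implementation), the following Python sketch buffers one run of equal 'x' at a time and sorts each run on 'y' alone. The function name `partially_streaming_sort` and the sample documents are hypothetical. It assumes both fields are present in every document and hold plain comparable values, and it ignores BSON comparison rules and memory limits.

```python
from itertools import groupby
from operator import itemgetter


def partially_streaming_sort(docs, prefix_key, tie_key):
    """Yield docs ordered by (prefix_key, tie_key).

    Assumes `docs` is already sorted on prefix_key. Only one run of
    equal prefix_key values is held in memory at a time.
    """
    for _, run in groupby(docs, key=itemgetter(prefix_key)):
        # Every doc in this run has the same prefix value, so comparing
        # tie_key alone is enough to break ties within the run.
        yield from sorted(run, key=itemgetter(tie_key))


# Hypothetical input, already sorted on 'x' as after {$sort: {x: 1}}.
docs = [
    {"x": 1, "y": 3},
    {"x": 1, "y": 1},
    {"x": 2, "y": 2},
    {"x": 2, "y": 0},
    {"x": 3, "y": 5},
]

result = list(partially_streaming_sort(docs, "x", "y"))
```

Because the generator emits a run as soon as it sees the next 'x' value, peak memory in this sketch is bounded by the largest run rather than by the whole input.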

      This could be useful for cases like:

      • {$densify ...} {$sort: {partitionField: 1, someOtherField: 1}}, because the output of $densify will be sorted by {partitionField: 1}.

      • {$unpackBucket ...} {$sort: {meta: 1, temperature: 1}}, because we could push down the sort on {meta: 1}.

      • {$unwind ...} {$sort ...}, similar to $unpackBucket.

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            david.percy@mongodb.com David Percy
            Votes:
            0
            Watchers:
            8
