- Type: New Feature
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Query Optimization
This is an idea to take the SERVER-88087 optimization further. The benefits are not limited to optimization: it also gives users a clear way to express a sort pattern shared across multiple $top/$bottom accumulators.
The motivation is to avoid generating the same sort key multiple times for multiple accumulators that use the same sort key.
In SERVER-88087, we group $top(N)/$bottom(N) accumulators that have the same sort pattern into one $top, one $topN, one $bottom, or one $bottomN, due to the current limitation of the $group specification. So, in the worst case, we need to generate the same sort key 4 times.
Update: the worst case is actually unbounded, because for $topN/$bottomN the "n" argument has to be part of the grouping key.
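To make the worst case concrete, here is a rough counting sketch. It is not the server's implementation: countSortKeyGenerations is a hypothetical helper that assumes every accumulator passed in is one of $top/$topN/$bottom/$bottomN, and it simply counts distinct (operator, n, sort pattern) groups versus distinct sort patterns.

```javascript
// Hypothetical model, not server code: counts how many times a sort key
// would be generated per document under each grouping scheme.
function countSortKeyGenerations(accumulators) {
  const perOperatorGroup = new Set(); // grouping described for SERVER-88087
  const perSortPattern = new Set();   // grouping a shared sort key would allow
  for (const acc of Object.values(accumulators)) {
    const [op] = Object.keys(acc);
    const { sortBy, n } = acc[op];
    const pattern = JSON.stringify(sortBy);
    perSortPattern.add(pattern);
    // "n" is part of the grouping key for $topN/$bottomN.
    perOperatorGroup.add(JSON.stringify([op, n ?? null, pattern]));
  }
  return { current: perOperatorGroup.size, proposed: perSortPattern.size };
}

const sortBy = { time: 1, tag: -1 };
const counts = countSortKeyGenerations({
  a: { $top: { sortBy, output: "$m" } },
  b: { $top: { sortBy, output: "$i" } },
  c: { $bottom: { sortBy, output: "$m" } },
  d: { $topN: { n: 2, sortBy, output: "$m" } },
  e: { $topN: { n: 3, sortBy, output: "$m" } },
  f: { $bottomN: { n: 2, sortBy, output: "$m" } },
});
console.log(counts); // { current: 5, proposed: 1 }
```

Each additional distinct "n" adds another group, which is why the count has no fixed upper bound.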
If $group were able to define a common sort key for multiple $top(N)/$bottom(N) accumulators using a new syntax, we could generate the sort key only once and let $top(N) and $bottom(N) refer to it.
Off the top of my head, I can think of this syntax:
{
  $group: {
    sortKeys: {k1: {time: 1, tag: -1}},
    _id: ...,
    tm: {$top: {sortBy: "$$k1", output: "$m"}},
    bi: {$bottom: {sortBy: "$$k1", output: "$i"}}
  }
}

// This is equivalent to the following syntax
{
  $group: {
    _id: ...,
    tm: {$top: {sortBy: {time: 1, tag: -1}, output: "$m"}},
    bi: {$bottom: {sortBy: {time: 1, tag: -1}, output: "$i"}}
  }
}
Basically, the common sort pattern itself does not need to be part of each accumulator's spec. Different $tops and $bottoms may use different sort keys, so the new syntax needs to support multiple named sort keys. An accumulator refers to a defined sort key by prefixing its name with "$$".
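The equivalence between the two forms can be sketched as a small rewrite. This is only an illustration of the proposed semantics: expandSortKeys and inlineSortBy are hypothetical helpers, the sortKeys field is the syntax proposed in this ticket and does not exist today, and "$host" is a made-up grouping key standing in for the elided _id.

```javascript
// Accumulators that take a sortBy argument.
const SORTED_ACCUMULATORS = new Set(["$top", "$topN", "$bottom", "$bottomN"]);

// Hypothetical: replace a "$$name" sortBy reference with the named pattern.
function inlineSortBy(spec, sortKeys) {
  if (spec === null || typeof spec !== "object" || Array.isArray(spec)) return spec;
  const [op] = Object.keys(spec);
  if (!SORTED_ACCUMULATORS.has(op)) return spec;
  const args = spec[op];
  const ref = args.sortBy;
  if (typeof ref !== "string" || !ref.startsWith("$$")) return spec;
  const name = ref.slice(2);
  if (!Object.prototype.hasOwnProperty.call(sortKeys, name)) {
    throw new Error("undefined sort key: " + name);
  }
  return { [op]: { ...args, sortBy: sortKeys[name] } };
}

// Hypothetical: turn the proposed $group spec into today's syntax.
function expandSortKeys(groupSpec) {
  const { sortKeys = {}, ...fields } = groupSpec;
  const expanded = {};
  for (const [field, spec] of Object.entries(fields)) {
    expanded[field] = inlineSortBy(spec, sortKeys);
  }
  return expanded;
}

const proposed = {
  sortKeys: { k1: { time: 1, tag: -1 } },
  _id: "$host", // assumed grouping key, for illustration only
  tm: { $top: { sortBy: "$$k1", output: "$m" } },
  bi: { $bottom: { sortBy: "$$k1", output: "$i" } },
};
console.log(JSON.stringify(expandSortKeys(proposed), null, 2));
```

An actual implementation would go the other way and keep the shared key, generating it once per document; the expansion here only shows that the new form carries no extra meaning beyond the existing one.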
SERVER-88087 could have leveraged this syntax.
Any ideas for a better syntax are welcome.
I think this is a small-to-medium-sized project since we need to
- define a new syntax for $group
- support the new syntax in the classic pipeline
- support the new syntax in the SBE group
- support the new syntax in the SBE block group
- apply this new syntax to the SERVER-88087 optimization
Benefits of this proposal are
- By exposing this syntax to users,
  - we can encourage users to write optimized $group queries.
  - users can express their intention more clearly when there is a sort pattern shared across multiple $top/$bottoms. This is the most important benefit and motivation.
- We can completely avoid the overhead of generating the same sort key multiple times, which has been found to be significant.
- It's a common time-series query pattern to have $sort + $group with $first/$last, which can be optimized into $group with $top/$bottom. This proposal would optimize that pattern further.
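The $sort + $group pattern in the last bullet can be sketched as follows. This is a simplified illustration, not the server's rewrite logic: rewriteSortThenGroup is a hypothetical helper, and the field names ("$sensor", "$temp") are made up. It maps $first to $top and $last to $bottom using the preceding $sort pattern, so every rewritten accumulator ends up with the same sortBy.

```javascript
// Hypothetical sketch: fold a preceding $sort into a $group by turning
// $first/$last into $top/$bottom with the same sort pattern.
function rewriteSortThenGroup(sortSpec, groupSpec) {
  const REWRITES = { $first: "$top", $last: "$bottom" };
  const rewritten = {};
  for (const [field, spec] of Object.entries(groupSpec)) {
    const isAccumulator =
      field !== "_id" && spec !== null && typeof spec === "object" && !Array.isArray(spec);
    const op = isAccumulator ? Object.keys(spec)[0] : undefined;
    if (op !== undefined && Object.prototype.hasOwnProperty.call(REWRITES, op)) {
      rewritten[field] = { [REWRITES[op]]: { sortBy: sortSpec, output: spec[op] } };
    } else {
      rewritten[field] = spec; // _id and other accumulators pass through
    }
  }
  return rewritten;
}

const group = rewriteSortThenGroup(
  { time: 1 },
  {
    _id: "$sensor",
    firstTemp: { $first: "$temp" },
    lastTemp: { $last: "$temp" },
    count: { $sum: 1 },
  }
);
console.log(JSON.stringify(group, null, 2));
```

Both firstTemp and lastTemp come out with the identical sortBy of {time: 1}, which is exactly the case where a single shared sort key would be generated once instead of once per accumulator group.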
- is related to: SERVER-88087 Rewrite many $topNs/$bottomNs that have the same sort pattern so that it only creates one sort key (Closed)