- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- None
- Query Execution
- Fully Compatible
- ALL
- QE 2024-03-04, QE 2024-03-18
In a query such as
[{ $match: {state: {$in: [1,3,5]}} },{ $group: {_id: "$state", "avgHR": {$avg: "$heartrate"}} }]
where the group key can be processed in block mode but the accumulator cannot (once $avg is supported in block mode, the repro needs a different, still unsupported accumulator, e.g. $stdDevPop), the generated plan is:
[3] group [s22] [s26 = aggDoubleDoubleSum(s20), s27 = sum( if ((typeMatch(s20, 1088) ?: true) || !(isNumber(s20))) then 0ll else 1ll )] spillSlots[s23, s24] mergingExprs[aggMergeDoubleDoubleSums(s23), sum(s24)]
[3] project [s25 = cellFoldValues_P(cellBlockGetFlatValuesBlock(s10), s10)]
[3] block_to_row blocks[s10, s11, s19] row[s20, s21, s22] s14
[3] project [s19 = valueBlockFillEmpty(cellFoldValues_P(cellBlockGetFlatValuesBlock(s11), s11), null)]
...
[2] ts_bucket_to_cellblock s2 pathReqs[s10 = ProjectPath(Get(heartrate)/Id), s11 = ProjectPath(Get(state)/Id), s12 = FilterPath(Get(state)/Traverse/Id)]
The plan computes the block version of $heartrate in s25 even though block_to_row has already been inserted and the $group reads the scalar version of $heartrate from s20.
Because that project sits above block_to_row, cellFoldValues_P runs once for every time-series measurement, which is wasted work. At the very least it should run once per block.
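To make the cost difference concrete, here is a toy model in Python. It is only a sketch and not SBE code: CountingFold, block_to_row, project_above_unpack and project_below_unpack are hypothetical names invented for illustration, and the fold is a stand-in for an expensive block expression such as cellFoldValues_P. It counts how often the fold runs when the projection sits above the unpack stage versus below it.

```python
from typing import Iterator, List, Sequence, Tuple

Block = Sequence[float]


class CountingFold:
    """Hypothetical stand-in for an expensive block-level expression."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, block: Block) -> List[float]:
        self.calls += 1
        return list(block)


def block_to_row(blocks: Sequence[Block]) -> Iterator[Tuple[Block, float]]:
    """Emit one row per measurement; the block slot stays visible to parents."""
    for block in blocks:
        for value in block:
            yield block, value


def project_above_unpack(blocks: Sequence[Block], fold: CountingFold) -> float:
    """Projection placed above block_to_row: the fold runs once per row."""
    total = 0.0
    for block, value in block_to_row(blocks):
        fold(block)  # result is never consumed, like s25 in the plan
        total += value
    return total


def project_below_unpack(blocks: Sequence[Block], fold: CountingFold) -> float:
    """Projection placed below block_to_row: the fold runs once per block."""
    total = 0.0
    for block in blocks:
        fold(block)
        for value in block:
            total += value
    return total


# Two blocks holding five measurements in total (made-up sample values).
sample_blocks = [[60.0, 62.0, 61.0], [70.0, 71.0]]
per_row, per_block = CountingFold(), CountingFold()
row_total = project_above_unpack(sample_blocks, per_row)
block_total = project_below_unpack(sample_blocks, per_block)
```

Both variants produce the same aggregate, but per_row.calls equals the number of measurements while per_block.calls equals the number of blocks. Since nothing consumes the folded value in the scalar path, dropping the projection entirely would be cheaper still.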