- Type: New Feature
- Resolution: Duplicate
- Priority: Minor - P4
- None
- Affects Version/s: None
- Component/s: Aggregation Framework
- Query
It might be nice if the aggregation framework had expressions in $project for generating values such as sequential numbers.
Here's one use for such a thing: given some inputs of the form
{ A : <Avalue>, B : <Bvalue> }
produce groupings on A values and remove exactly 1 instance of the minimum B value per group. (AFAICT, this problem can't be solved yet in the aggregation framework.)
If $project could attach serial numbers to those inputs, then it would be possible by constructing a unique minimum (B, s) subdocument per group, like this:
[
  /* Add a serial number to every input using a new $generate operator. */
  { $project : { A : 1, B : 1, s : { $generate : { $serial : 1 } } } },
  /* Group by A, collecting (B, s) pairs and computing the minimum pair for the next stages. */
  { $group : { _id : "$A", B : { $push : { B : "$B", s : "$s" } }, min : { $min : { B : "$B", s : "$s" } } } },
  /* Unwind on B so as to project & filter later. */
  { $unwind : "$B" },
  /* Figure out if we're looking at the minimum (B, s) pair. */
  { $project : { _id : 1, B : "$B.B", isMin : { $eq : [ "$B", "$min" ] } } },
  /* Filter out the isMin=true cases. */
  { $match : { isMin : false } },
  /* Re-group by A (which is called _id at this point). */
  { $group : { _id : "$_id", B : { $push : "$B" } } }
]
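The intended effect of the pipeline above can be sketched in plain JavaScript, without any MongoDB calls. This is only an illustration of the semantics: removeOneMinPerGroup is a hypothetical helper name, the serial number is assumed to be the 1-based input position, and B values are assumed to be numbers.

```javascript
// Plain-JavaScript simulation of the proposed pipeline; not a MongoDB API.
// removeOneMinPerGroup is a hypothetical name used only for illustration.
function removeOneMinPerGroup(docs) {
  // Stage 1: attach a serial number to each input (the role of the made-up $generate).
  const numbered = docs.map((d, i) => ({ A: d.A, B: d.B, s: i + 1 }));

  // Stage 2: group by A, collecting (B, s) pairs and tracking the minimum pair.
  const groups = new Map();
  for (const d of numbered) {
    let g = groups.get(d.A);
    if (!g) {
      g = { _id: d.A, pairs: [], min: null };
      groups.set(d.A, g);
    }
    g.pairs.push({ B: d.B, s: d.s });
    // Compare by B first, then by s, so ties on B are broken by the serial number.
    if (g.min === null || d.B < g.min.B || (d.B === g.min.B && d.s < g.min.s)) {
      g.min = { B: d.B, s: d.s };
    }
  }

  // Remaining stages: drop the one pair equal to the minimum, re-collect B values,
  // and omit groups left empty (they would produce no documents after $match).
  return [...groups.values()]
    .map((g) => ({
      _id: g._id,
      B: g.pairs.filter((p) => p.s !== g.min.s).map((p) => p.B),
    }))
    .filter((g) => g.B.length > 0);
}

const input = [
  { A: "x", B: 3 },
  { A: "x", B: 1 },
  { A: "x", B: 1 },
  { A: "y", B: 5 },
];
console.log(JSON.stringify(removeOneMinPerGroup(input)));
// prints [{"_id":"x","B":[3,1]}]
```

Because serial numbers are unique, exactly one of the two B = 1 entries in group "x" is removed; group "y" disappears because its only entry was its minimum.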
Of course there are other (and probably better) aggregation extensions that would solve this problem, but the requested feature both helps with this one and might be useful elsewhere.
(In the made-up $generate expression above, I stuck the keyword $serial in there in case it turns out to be useful to have things other than serial numbers in future, e.g., random numbers, ObjectIds, timestamps, etc.)
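As a rough sketch of how a keyword inside $generate could select the kind of value produced, here is a plain-JavaScript dispatcher. Everything in it is hypothetical: makeGenerator, the $random and $timestamp keys, and the reading of { $serial : 1 } as "start counting at 1" are assumptions for illustration, not existing or planned server behaviour.

```javascript
// Hypothetical sketch: none of these names exist in MongoDB.
function makeGenerator(spec) {
  if ("$serial" in spec) {
    // Assumption: the value of $serial is the starting number.
    let next = spec.$serial;
    return () => next++;
  }
  if ("$random" in spec) {
    return () => Math.random();
  }
  if ("$timestamp" in spec) {
    return () => Date.now();
  }
  throw new Error("unknown generator kind: " + JSON.stringify(spec));
}

// Tag each input document with a serial number, as the $project stage would.
const gen = makeGenerator({ $serial: 1 });
const tagged = [{ A: "x" }, { A: "y" }].map((d) => ({ ...d, s: gen() }));
console.log(JSON.stringify(tagged));
// prints [{"A":"x","s":1},{"A":"y","s":2}]
```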
Doc changes: if we do it, we oughtta doc it.
- duplicates
- SERVER-9377 Allow collecting "top" N values for each group (Closed)