-
Type: New Feature
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Aggregation Framework
-
None
-
Major Change
-
Quint Iteration 4, Quint Iteration 5, Quint Iteration 6
In the aggregation framework, arrays are interpreted as literals, which is not the intuitive behavior. Take the following code as an example:
> db.foo.drop() true > db.foo.insert({_id: 0, a: ['foo', 'bar', 'baz'], b: 'bar', c: 'Baz'}) WriteResult({ "nInserted" : 1 }) > db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}]) { "_id" : 0, "intersection" : [ ] } // Intuitively would expect [ 'bar', 'baz' ] > db.foo.insert({_id: 1, a: ['foo', '$b'], b: 'bar', c: 'Baz'}) WriteResult({ "nInserted" : 1 }) > db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}]) { "_id" : 0, "intersection" : [ ] } { "_id" : 1, "intersection" : [ "$b" ] } // Instead, it is matching against literals '$b' and {$toLower: '$c'}
Instead of evaluating '$b' to be the value of the field 'b' in the current document, as is done with '$a', '$b' is treated as a literal when it is parsed inside of an array. Because arrays are treated as literals, it also prevents a projection stage from creating arrays of fields:
> db.bar.drop() true > db.bar.insert({_id: 0, point: {x: 10, y: 20}}) WriteResult({ "nInserted" : 1 }) > db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}]) assert: command failed: { "errmsg" : "exception: disallowed field type Array in object expression (at 'coords')", "code" : 15992, "ok" : 0 } : aggregate failed
The aggregation framework should parse arrays in the same ways it parses expressions elsewhere in the aggregation framework. This would yield the following results from above:
// With documents from collection 'foo' from above. > db.foo.aggregate([{$project: {intersection: {$setIntersection: ['$a', ['$b', {$toLower: '$c'}]]}}}]) { "_id" : 0, "intersection" : [ "baz", "bar" ] } { "_id" : 1, "intersection" : [ ] }
// With document from collection 'bar' from above. > db.bar.aggregate([{$project: {coords: ['$point.x', '$point.y']}}]) { "_id" : 0, "coords" : [ 10, 20 ] }
Note this is a backwards breaking change.
=======================================
Original description:
Builds an array.
input: {a:1, b:2, c:3} operation: {$project: {array: {$array: ['$a', '$c', {$add:['$b', '$c'] }] } } } output: {array: [1, 3, 5]}
An issue is how to handle missing fields (eg {$push:['$d']}). When building an object we would omit that field from the output. This would be expected if the array is a list or set, but could be a problem if indexed access is important. Possible solutions:
- Omit the field, shrinking the array.
- Replace the field with null, keeping the array the same size, but creating a value and differing from object behavior.
- Error out.
- Separate operators like $array when indexes are important and $list for when they aren't.
- related to
-
CSHARP-1365 Avoid treating arrays as literals in the aggregation pipeline
- Closed
-
DRIVERS-234 Aggregation Builder Support for 3.2
- Closed