Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-89419

Aggregation is inconsistent in applying $sum vs. $min/$max/$avg for non-existing fields

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • ALL
    • Hide

      Insert these documents:

       db.abc.insertMany([{a: "A", b: 1}, {a: "B"}, {a: "B"}, {a: "A"}]) 

      , and run this pipeline:

      db.abc.aggregate([{$group: {_id: "$a",
          sum: {$sum: "$b"}, avg: {$avg: "$b"},
          min: {$min: "$b"}, max: {$max: "$b"}}
        }])

      This result is returned:

      [
        { _id: 'B', sum: 0, avg: null, min: null, max: null },
        { _id: 'A', sum: 1, avg: 1, min: 1, max: 1 }
      ] 

      $sum should yield consistent results, compared to its counterparts.

      Take SQL for example:

      create table abc (a char, b int NULL);
      insert into abc values ('A', 1); 
      insert into abc values ('A', NULL);
      insert into abc values ('B', NULL);
      insert into abc values ('B', NULL);

      , which will yield expected results - mixed input ignores NULL and NULL-only input yields NULL:

      a           sum(b)      avg(b)      min(b)      max(b)
      ----------  ----------  ----------  ----------  ----------
      A           1           1.0         1           1
      B           <NULL>      <NULL>      <NULL>      <NULL> 
      Show
      Insert these documents: db.abc.insertMany([{a: "A" , b: 1}, {a: "B" }, {a: "B" }, {a: "A" }]) , and run this pipeline: db.abc.aggregate([{$group: {_id: "$a", sum: {$sum: "$b"}, avg: {$avg: "$b"}, min: {$min: "$b"}, max: {$max: "$b"}} }]) This result is returned: [   { _id: 'B', sum: 0, avg: null, min: null, max: null },   { _id: 'A', sum: 1, avg: 1, min: 1, max: 1 } ] $sum should yield consistent results, compared to its counterparts. Take SQL for example: create table abc (a char , b int NULL); insert into abc values ( 'A' , 1); insert into abc values ( 'A' , NULL); insert into abc values ( 'B' , NULL); insert into abc values ( 'B' , NULL); , which will yield expected results - mixed input ignores NULL and NULL -only input yields NULL : a           sum(b)      avg(b)      min(b)      max(b) ----------  ----------  ----------  ----------  ---------- A           1           1.0         1           1 B           <NULL>      <NULL>      <NULL>      <NULL>

      $sum should behave consistently with other grouping operators, such as $avg, when applied against non-existing fields, but instead it manufactures bogus zero values and there is no simple way to work it around.

      Mongo DB also won't process $cond or $ifNull with enclosed $sum (it really should strip off conditionals to get to the accumulators, but that's another issue), and applying something like $first, with enclosed $cond, with enclosed $sum won't generate errors, but will yield bogus results instead of returning the one and only sum of the result.

      These aggregations mess up things like charts, where instead of missing data zeros are output, which is misleading.

            Assignee:
            chris.kelly@mongodb.com Chris Kelly
            Reporter:
            cis74633@bell.net Andre M
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: