SERVER-85424 (Core Server)

$group relies on stable sorting

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None

      > db.coll.insertMany([
        {a: 1, b: 2, c: 3},
        {a: 1, b: 2, c: 4},
        {a: 1, b: 3, c: 4},
        {a: 1, b: 10, c: 20},
        {a: 2, b: 20, c: 30},
        {a: 2, b: 30, c: 40},
        {a: 2, b: 10, c: 30}
      ])
      {
              "acknowledged" : true,
              "insertedIds" : [
                      ObjectId("65a9a670f6bb772ae439c829"),
                      ObjectId("65a9a670f6bb772ae439c82a"),
                      ObjectId("65a9a670f6bb772ae439c82b"),
                      ObjectId("65a9a670f6bb772ae439c82c"),
                      ObjectId("65a9a670f6bb772ae439c82d"),
                      ObjectId("65a9a670f6bb772ae439c82e"),
                      ObjectId("65a9a670f6bb772ae439c82f")
              ]
      }
      > db.coll.aggregate([
        {$sort: {b: 1, c: 1}},
        {$group: {_id: "$a", f: {$first: "$c"}}}
      ])
      { "_id" : 1, "f" : 3 }
      { "_id" : 2, "f" : 30 }   <---- This should be {"f": 20} because we sorted by [b, c].


      We would like to replace std::stable_sort with std::sort (an unstable sort) in sorter.cpp for the performance gains. While experimenting with this change, I found that $group with $first relies on stable sorting. With an unstable sort, the behavior no longer matches the documentation at https://www.mongodb.com/docs/manual/reference/operator/aggregation/first/#behaviors, specifically:

      > To define the document order for $first in other pipeline stages, add a preceding $sort stage.
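To illustrate why the first document of a group can depend on how ties are ordered, here is a minimal sketch in plain JavaScript. It is not mongo shell or server code: `sortThenFirst` is a hypothetical helper that only mimics an ascending sort followed by a group-by that keeps the first value seen, and it assumes a runtime where `Array.prototype.sort` is stable (ES2019 or later). Reversing the input stands in for an unstable sort reordering tied documents.

```javascript
// Hypothetical helper (not a MongoDB API): sort ascending on sortKeys, then
// keep the first value of firstField seen for each distinct groupKey value.
function sortThenFirst(docs, sortKeys, groupKey, firstField) {
  // Array.prototype.sort is stable in ES2019+, so documents that tie on
  // every sort key keep their relative input order.
  const sorted = [...docs].sort((x, y) => {
    for (const k of sortKeys) {
      if (x[k] !== y[k]) return x[k] < y[k] ? -1 : 1;
    }
    return 0;
  });
  const firsts = new Map();
  for (const d of sorted) {
    if (!firsts.has(d[groupKey])) firsts.set(d[groupKey], d[firstField]);
  }
  return Object.fromEntries(firsts);
}

// Same documents as in the ticket.
const docs = [
  {a: 1, b: 2, c: 3},
  {a: 1, b: 2, c: 4},
  {a: 1, b: 3, c: 4},
  {a: 1, b: 10, c: 20},
  {a: 2, b: 20, c: 30},
  {a: 2, b: 30, c: 40},
  {a: 2, b: 10, c: 30},
];
// Reversed input: stands in for tied documents arriving in another order.
const reversed = [...docs].reverse();

// Sorting on b alone leaves the two a=1 documents with b=2 tied, so the
// "first" c for a=1 follows whatever order the ties end up in.
const byB = sortThenFirst(docs, ["b"], "a", "c");
const byBReversed = sortThenFirst(reversed, ["b"], "a", "c");

// Sorting on [b, c] breaks that tie on c itself, so input order no longer matters.
const byBC = sortThenFirst(docs, ["b", "c"], "a", "c");
const byBCReversed = sortThenFirst(reversed, ["b", "c"], "a", "c");

console.log(byB, byBReversed);   // a=1 gives 3 for docs, 4 for reversed
console.log(byBC, byBCReversed); // a=1 gives 3 in both cases
```

In this sketch, the result is independent of tie ordering only when the sort key fully determines the field that the first-value step reads; whether the server's sorter offers the same guarantee is the question this ticket raises.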

      The sorter change is in https://github.com/10gen/mongo/pull/17859.

      cc david.percy@mongodb.com: this may be related to SERVER-85337.

            Assignee: David Percy (david.percy@mongodb.com)
            Reporter: Brad Cater (brad.cater@mongodb.com)