Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-7568

Aggregation framework favors non-blocking sorts

    • Fully Compatible
    • Query 2017-05-08, Query 2017-05-29, Query 2019-10-07, Query 2019-10-21, Query 2019-11-04
    • 15

      In some situations, an aggregation pipeline with a sort will attempt to use an index to cover the sort. In such situations, the aggregation pipeline will always favor plans that do not have a blocking sort, even if other plans may be faster.

      We should somehow integrate the aggregation framework and the query system more tightly so that we can consider both blocking and non-blocking plans at the same time, or we should try to otherwise determine if the blocking plan would outperform the non-blocking plan.

      Original Description

      We have a dataset with some hundred thousand documents with aggregation framework we select a small amount of it and sort it by _id. Match works very fast (by index) but sort {_id: 1} seems to load all data instead of sorting only selected documents. If we sort by some other field it works like expected.

      Example
      function test(num) {
          function time(fun) {
              var start = (new Date).getTime();
              fun.apply();
              return (new Date).getTime() - start;
          }
      
          if (num === undefined) {
              num = 1000;
          }
          db.demo.remove();
          for (var i = 0; i < num; ++i) {
              db.demo.insert({num: i});
          }
          db.demo.ensureIndex({num: 1});
          var query = {num: {$lt: 10}};
          var sortId = {_id: 1};
          var sortNum = {num: 1};
          printjson(db.demo.find(query).explain());
          print("find sort by id: " + time(function ()
          {
              db.demo.find(query).sort(sortId);
          }));
          print("aggregate sort by id: " + time(function ()
          {
              db.demo.aggregate([{$match: query}, {$sort: sortId}]);
          }));
          print("find sort by num: " + time(function ()
          {
              db.demo.find(query).sort(sortNum);
          }));
      
          print("aggregate sort by num: " + time(function ()
          {
              db.demo.aggregate([{$match: query}, {$sort: sortNum}]);
          }));
      }
      
      test(100000)
      {
      	"cursor" : "BtreeCursor num_1",
      	"isMultiKey" : false,
      	"n" : 10,
      	"nscannedObjects" : 10,
      	"nscanned" : 10,
      	"nscannedObjectsAllPlans" : 10,
      	"nscannedAllPlans" : 10,
      	"scanAndOrder" : false,
      	"indexOnly" : false,
      	"nYields" : 0,
      	"nChunkSkips" : 0,
      	"millis" : 0,
      	"indexBounds" : {
      		"num" : [
      			[
      				-1.7976931348623157e+308,
      				10
      			]
      		]
      	},
      	"server" : "mongodb01:27228",
      	"millis" : 0
      }
      find sort by id: 0
      aggregate sort by id: 373
      find sort by num: 0
      aggregate sort by num: 1
      

            Votes:
            25 Vote for this issue
            Watchers:
            57 Start watching this issue

              Created:
              Updated:
              Resolved: