Core Server / SERVER-14299

For sharded limit=N queries with sort, mongos can request >N results from shard

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.6.6, 2.7.4
    • Affects Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Operating System: ALL

      If a user issues a limit=N query that includes a sort against a sharded collection and all of the first N results come back from one shard, mongos will request an extra batch of results from that shard.

      This bug has a particularly harmful interaction with the "top-K sort" logic introduced in version 2.6.0 of the server. If mongos requests >N results from a shard for a query that has a limit of N and uses an unindexed sort, the shard performs a full "unlimited" sort, and returns "getMore runner error: Overflow sort stage buffered data usage..." to the user if the size of the data to be sorted exceeds the 32MB sort stage limit (see the script attached to this ticket for an example that exhibits this failure mode, and the sketch after the log excerpts below).

      Reproduce with the following:

      var st = new ShardingTest({shards: 2, other: {shardOptions: {verbose: 1}}});
      st.stopBalancer();
      var coll = st.s0.getDB("test").foo;
      
      assert.commandWorked(coll.getDB().adminCommand({enableSharding: coll.getDB().getName()}));
      assert.commandWorked(coll.getDB().adminCommand({movePrimary: coll.getDB().getName(),
                                                      to: "shard0000"}));
      assert.commandWorked(coll.getDB().adminCommand({shardCollection: coll.getFullName(),
                                                      key: {_id: 1}}));
      assert.commandWorked(coll.getDB().adminCommand({split: coll.getFullName(), middle: {_id: 0}}));
      assert.commandWorked(coll.getDB().adminCommand({moveChunk: coll.getFullName(),
                                                      find: {_id: 0},
                                                      to: "shard0001"}));
      
      // After the moveChunk, shard0000 owns {_id < 0} and shard0001 owns {_id >= 0}.
      // Write 10 documents to each shard.
      for (var i = 1; i <= 10; ++i) {
          assert.writeOK(coll.insert({_id: i, x: i, y: i}));   // shard0001
          assert.writeOK(coll.insert({_id: -i, x: -i, y: i})); // shard0000
      }
      
      coll.find().sort({x: 1}).limit(9).itcount(); // Shard 0 contributes all 9 results.
      coll.find().sort({y: 1}).limit(9).itcount(); // Both shards contribute to the 9 results.
      

      The first query (sort on x) requests 9 documents from both shards, and then an unnecessary additional batch from shard 0:

       m30000| 2014-06-18T17:36:14.380-0400 [conn6] query test.foo query: { query: {}, orderby: { x: 1.0 } } planSummary: COLLSCAN, COLLSCAN cursorid:70249920950 ntoreturn:9 ntoskip:0 keyUpdates:0 numYields:0 locks(micros) r:201 nreturned:9 reslen:380 0ms
       m30001| 2014-06-18T17:36:14.380-0400 [conn4] query test.foo query: { query: {}, orderby: { x: 1.0 } } planSummary: COLLSCAN, COLLSCAN cursorid:23467375332 ntoreturn:9 ntoskip:0 keyUpdates:0 numYields:0 locks(micros) r:214 nreturned:9 reslen:380 0ms
       m30000| 2014-06-18T17:36:14.380-0400 [conn4] getmore test.foo cursorid:70249920950 ntoreturn:9 keyUpdates:0 numYields:0 locks(micros) r:133 nreturned:1 reslen:60 0ms
      

      Whereas the second query (sort on y) requests 9 documents from both shards and does not follow up with any getmore requests:

       m30000| 2014-06-18T17:36:14.381-0400 [conn6] query test.foo query: { query: {}, orderby: { y: 1.0 } } planSummary: COLLSCAN, COLLSCAN cursorid:70393723465 ntoreturn:9 ntoskip:0 keyUpdates:0 numYields:0 locks(micros) r:162 nreturned:9 reslen:380 0ms
       m30001| 2014-06-18T17:36:14.382-0400 [conn4] query test.foo query: { query: {}, orderby: { y: 1.0 } } planSummary: COLLSCAN, COLLSCAN cursorid:22786735268 ntoreturn:9 ntoskip:0 keyUpdates:0 numYields:0 locks(micros) r:149 nreturned:9 reslen:380 0ms
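
      For reference, the overflow failure mode described above can be provoked by extending the repro so that shard 0 holds more than 32MB of data before the sort on x runs. The following is only a minimal sketch standing in for the script attached to this ticket (the payload field name, document count, and sizes are illustrative, chosen so that shard 0's data exceeds the 32MB sort stage limit):

      // Continues the repro above: shard0000 owns {_id < 0}.
      var big = new Array(128 * 1024).join("x"); // ~128KB string
      for (var i = 11; i <= 300; ++i) {
          // ~38MB of sortable data on shard0000, above the 32MB sort stage limit.
          assert.writeOK(coll.insert({_id: -i, x: -i, payload: big}));
      }

      // The 9 smallest x values all live on shard0000, so mongos follows up with
      // a getmore there; per the description above, requesting >N results forces
      // a full "unlimited" sort, which fails with "getMore runner error:
      // Overflow sort stage buffered data usage...".
      coll.find().sort({x: 1}).limit(9).itcount();

      With the shard's data kept under the 32MB limit, the same query would instead exhibit only the spurious extra getmore shown in the logs above.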
      

            Assignee: David Storch
            Reporter: J Rassi
            Votes: 0
            Watchers: 6
