Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-13487

Incorrect mongos shard routing for some range based finds on single shard key

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Minor - P4 Minor - P4
    • None
    • Affects Version/s: 2.6.0-rc2
    • Component/s: Sharding
    • None
    • ALL
    • Hide
      • Create a sharded DB environment - 3 shards (just one mongod per shard), 1 config server and 1 mongos. Important to have more than 2 shards to easily distinguish the erroneous 2 shard routing behaviour from a general scatter-gather to all 3 shards behaviour. Using mongo shell, via mongos ensure you have called addShard() for each of the 3 shards
      • Load up a database (eg. 'test') with a collection of randomly generated documents - 2 million small docs should be sufficient to allow some chunking to occur.
       sh.enableSharding('test')
       use test
       db.rawdata.ensureIndex({myfield: 1})
       sh.shardCollection('test.rawdata', {myfield: 1})
       for (var i = 0; i < 2000000; ++i) {
         db.rawdata.insert({myfield: Math.floor(Math.random()*10000)});
       }
      
      • See what chunks are on what shards
      sh.status()
      • Run a query on a subset of data that lives on just once shard (eg. choose one chunk start and end value from the chunk list above and just query that), IMPORTANT - choose a value where the next chunk after it lives on different shard <-- THIS IS KEY
         // Change the 2 values as appropriate
         db.rawdata.find({"myfield" : {"$gte" : 80, "$lt" : 1059}}).explain()
        
      • Notice in the explain output, 2 shards are being targeted when should only need to target one, because the shard key range used in the find() belongs to just one shard
      • Modify the find to use $lte and decrement the value by 1
         // Change the 2 values as appropriate
         db.rawdata.find({"myfield" : {"$gte" : 80, "$lte" : 1058}}).explain()
        
      • This time notice that now only one shard is correctly being targeted in the explain plan
      • The aggregation framework exhibits the same behaviour.
         // Change the 2 values as appropriate
         var pipeline = [{"$match" : {"myfield" : {"$gte" : 80, "$lt" : 1059}}}]
         db.rawdata.aggregate(pipeline, {explain: true});
        
      Show
      Create a sharded DB environment - 3 shards (just one mongod per shard), 1 config server and 1 mongos. Important to have more than 2 shards to easily distinguish the erroneous 2 shard routing behaviour from a general scatter-gather to all 3 shards behaviour. Using mongo shell, via mongos ensure you have called addShard() for each of the 3 shards Load up a database (eg. 'test') with a collection of randomly generated documents - 2 million small docs should be sufficient to allow some chunking to occur. sh.enableSharding( 'test' ) use test db.rawdata.ensureIndex({myfield: 1}) sh.shardCollection( 'test.rawdata' , {myfield: 1}) for ( var i = 0; i < 2000000; ++i) { db.rawdata.insert({myfield: Math .floor( Math .random()*10000)}); } See what chunks are on what shards sh.status() Run a query on a subset of data that lives on just once shard (eg. choose one chunk start and end value from the chunk list above and just query that), IMPORTANT - choose a value where the next chunk after it lives on different shard <-- THIS IS KEY // Change the 2 values as appropriate db.rawdata.find({ "myfield" : { "$gte" : 80, "$lt" : 1059}}).explain() Notice in the explain output, 2 shards are being targeted when should only need to target one, because the shard key range used in the find() belongs to just one shard Modify the find to use $lte and decrement the value by 1 // Change the 2 values as appropriate db.rawdata.find({ "myfield" : { "$gte" : 80, "$lte" : 1058}}).explain() This time notice that now only one shard is correctly being targeted in the explain plan The aggregation framework exhibits the same behaviour. // Change the 2 values as appropriate var pipeline = [{ "$match" : { "myfield" : { "$gte" : 80, "$lt" : 1059}}}] db.rawdata.aggregate(pipeline, {explain: true });

      MongoDB version: 2.6.0rc2
      OS: Linux x86-64

      When running a query against a sharded collection using a range in the find() criteria based on values in the shard key, depending on the range of values used, the mongos / query optimiser is incorrectly targeting 2 shards when it only needs to target 1 shard. Specifically, the problem is for a sharded collection that has a single (non-compound) shard key, and $lte (or $gte) is used in the range query. The exact behaviour can be seen when running .explain().

      Say I have a sharded collection on shard key "myfield" and in shard s1 a chunk exists for {"myfield":80}-->>{"myfield":1059} and then in shard s0 a chunk exists for {"myfield":1059}-->>{"myfield":2492}, then when I issue a find() with criteria "$lt: 1059", both shards s1 and s0 are routed to when only one shard (s1) needs to be routed to.

      As a workaround, if I modify the criteria to be "$lte: 1058", then only one shard (s1) is correctly routed to. HOWEVER, this means my generic query code for my more real world application has to be explicitly aware of field types (no schema) and the knowledge that the field actually contains an integer that can be possible decremented by 1, just to try to optimise the sharded query.

            Assignee:
            greg_10gen Greg Studer
            Reporter:
            paul.done@mongodb.com Paul Done
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: