Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14319

Counts on sharded clusters should use the same algorithm that find.explain uses

    • Type: Icon: Improvement Improvement
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Sharding
    • None

      In a sharded cluster it is known that counts do not return the correct values. If there are orphaned documents then even a count with a query will return an inflated value.

      It is possible to get around this by using the (nigh undocumented) itcount() function on a cursor. This is very slow however.

      A faster solution would be to use the explain function on a find. The code in explain already exists to provide the correct count and in our tests it provides this count through explain four times faster than itcount(). A much more detailed explanation of the problem and the proposed solution to follow :

      I am using our current production system, where this discreptancy was identified, as the example case here. All of these examples are run on the same sharded cluster which is currently in production.
      In a sharded collection where there exists orphaned documents a count returns the following

      mongos> db["message.blast.20140617"].find({_id : {$regex : "^2749336"}}).count()
      1001904
      

      However if you run through the find cursor and count every document you get 999999, this is obviously the correct number. Running an explain on the find cursor gets the following:

      mongos> db["message.blast.20140617"].find({_id : {$regex : "^2749336"}}).explain()
      {
      	"clusteredType" : "ParallelSort",
      	"shards" : {
      		"messageA/HOSTNAME:27017,HOSTNAME:27017" : [
      			{
      				"cursor" : "BtreeCursor _id_ multi",
      				"isMultiKey" : false,
      				"n" : 344267,
      				"nscannedObjects" : 344267,
      				"nscanned" : 344268,
      				"nscannedObjectsAllPlans" : 344267,
      				"nscannedAllPlans" : 344268,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 1406,
      				"nChunkSkips" : 0,
      				"millis" : 4235,
      				"indexBounds" : {
      					"_id" : [
      						[
      							"2749336",
      							"2749337"
      						],
      						[
      							/^2749336/,
      							/^2749336/
      						]
      					]
      				},
      				"server" : "HOSTNAME:27017"
      			}
      		],
      		"messageB/HOSTNAME:27017,HOSTNAME:27017" : [
      			{
      				"cursor" : "BtreeCursor _id_ multi",
      				"isMultiKey" : false,
      				"n" : 337542,
      				"nscannedObjects" : 339447,
      				"nscanned" : 339448,
      				"nscannedObjectsAllPlans" : 339447,
      				"nscannedAllPlans" : 339448,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 312,
      				"nChunkSkips" : 1905,
      				"millis" : 2197,
      				"indexBounds" : {
      					"_id" : [
      						[
      							"2749336",
      							"2749337"
      						],
      						[
      							/^2749336/,
      							/^2749336/
      						]
      					]
      				},
      				"server" : "HOSTNAME:27017"
      			}
      		],
      		"messageC/HOSTNAME:27017,HOSTNAME:27017" : [
      			{
      				"cursor" : "BtreeCursor _id_ multi",
      				"isMultiKey" : false,
      				"n" : 318190,
      				"nscannedObjects" : 318190,
      				"nscanned" : 318191,
      				"nscannedObjectsAllPlans" : 318190,
      				"nscannedAllPlans" : 318191,
      				"scanAndOrder" : false,
      				"indexOnly" : false,
      				"nYields" : 201,
      				"nChunkSkips" : 0,
      				"millis" : 1904,
      				"indexBounds" : {
      					"_id" : [
      						[
      							"2749336",
      							"2749337"
      						],
      						[
      							/^2749336/,
      							/^2749336/
      						]
      					]
      				},
      				"server" : "HOSTNAME:27017"
      			}
      		]
      	},
      	"cursor" : "BtreeCursor _id_ multi",
      	"n" : 999999,
      	"nChunkSkips" : 1905,
      	"nYields" : 1919,
      	"nscanned" : 1001907,
      	"nscannedAllPlans" : 1001907,
      	"nscannedObjects" : 1001904,
      	"nscannedObjectsAllPlans" : 1001904,
      	"millisShardTotal" : 8336,
      	"millisShardAvg" : 2778,
      	"numQueries" : 3,
      	"numShards" : 3,
      	"millis" : 4236
      }
      

      The explain has the results that we want, namely

      "n" : 999999
      "nChunkSkips" : 1905
      

      n is the correct count, and if one adds the n and the nChunkSkips you get the value that the count replies. The count does not exclude these orphaned documents causing an inflated count. This can be gotten around by using itcount() on the find cursor, this was timed versus how long it takes for an explain to return, as follows:

      $ time mongo blast --port 17018 --eval 'db["message.blast.20140617"].find({_id : {$regex : "^2749336"}}).itcount();'
      MongoDB shell version: 2.4.8
      connecting to: 127.0.0.1:17018/blast
      999999
      
      real	0m17.725s
      user	0m5.248s
      sys	0m0.060s
      
      $ time mongo blast --port 17018 --eval 'db["message.blast.20140617"].find({_id : {$regex : "^2749336"}}).explain().n;'
      MongoDB shell version: 2.4.8
      connecting to: 127.0.0.1:17018/blast
      999999
      
      real	0m3.353s
      user	0m0.020s
      sys	0m0.008s
      

      while a count currently runs faster than any of these but it returns the wrong value

      $ time mongo blast --port 17018 --eval 'db["message.blast.20140617"].find({_id : {$regex : "^2749336"}}).count();'
      MongoDB shell version: 2.4.8
      connecting to: 127.0.0.1:17018/blast
      1001904
      
      real	0m0.952s
      user	0m0.024s
      sys	0m0.008s
      

      A faster way to implement count on a sharded cluster that would return the correct value is to use the same code that is in explain to return the n and use that to return the count if the count has a query.

      Solution : For sharded clusters any count that is implemented with a query should return the same value, in the same method, that explain().n does. This will be faster than iterating through the cursor as well as providing correct results during migrations and in the presance of orphaned documents.

            Assignee:
            greg_10gen Greg Studer
            Reporter:
            jberger@sailthru.com Jeffrey Berger
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

              Created:
              Updated:
              Resolved: