Summary
On a sharded cluster, count() does not filter out unowned (orphaned) documents, so it can report a larger value than a normal query returns or than itcount() reports in the shell.
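One way to see the discrepancy is to compare the two numbers side by side. The following is a minimal sketch for the shell, not an official diagnostic; the helper name compareCounts and the result field names are made up for illustration, and `coll` is assumed to be a collection handle such as db.foo.

```javascript
// Sketch: compare the count command with a full cursor iteration.
// `coll` is a shell collection handle (for example db.foo); the helper
// name and the returned field names are hypothetical.
function compareCounts(coll) {
  const reported = coll.count();          // count command, may include orphans
  const iterated = coll.find().itcount(); // iterates every matching document
  return {
    reported: reported,
    iterated: iterated,
    surplus: reported - iterated,         // > 0 suggests over-counting
  };
}
```

Usage in the shell would look like compareCounts(db.foo). Note that itcount() iterates the whole collection, so it can be slow on large collections.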
Causes
The following conditions can lead to counts being off:
- Active migrations
- Orphaned documents (left from failed migrations)
- Non-Primary read preferences (see SERVER-5931)
Workaround
A workaround to get accurate counts is to ensure that no migrations are active and that orphaned documents left by earlier migrations have been cleaned up. To query non-primaries, you must additionally ensure that there is no replication lag, including for migration data.
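Part of this precondition can be checked from the shell. The sketch below uses the balancerStatus admin command (run through mongos) to see whether the balancer is currently in a round. It is only a partial check, under the assumption that migrations are driven by the balancer: it does not detect manually issued moveChunk operations, orphaned documents, or replication lag. The helper name balancerIsIdle is hypothetical.

```javascript
// Sketch: report whether the balancer is currently in a balancing round.
// `database` is a shell database handle (for example db) connected to mongos.
// This does NOT prove that orphans are gone or that no manual migration runs.
function balancerIsIdle(database) {
  const status = database.adminCommand({ balancerStatus: 1 });
  if (status.ok !== 1) {
    throw new Error("balancerStatus failed");
  }
  return status.inBalancerRound === false;
}
```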
Non-Primary Reads
For issues with counts or reads from non-primaries, see SERVER-5931.
Behavior of "fast count" and non-"fast count"
A "fast count" is a count run without a predicate. It is "fast" because the implementation only reads the metadata, without fetching any documents.
The problem of count() reporting inaccurate results has been fixed for non-"fast counts": starting in 4.0, counts run with a predicate are accurate on sharded clusters. "Fast counts" (count() run without a predicate) may still report too many documents (see SERVER-33753).
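The difference between the two paths can be illustrated with a toy model. This is plain JavaScript, not MongoDB internals: each hypothetical shard holds an array of shard-key values and an `owns` function for the range it currently owns, and the array length stands in for the per-shard metadata that a "fast count" reads.

```javascript
// Toy model (an illustration only, not how the server is implemented).
// Shard 0 owns keys < 50 but still holds orphans 60 and 70 from a migration.
const shards = [
  { owns: (k) => k < 50, docs: [10, 20, 30, 60, 70] },
  { owns: (k) => k >= 50, docs: [60, 70, 80] },
];

// "Fast count": sums each shard's stored total without looking at documents,
// so orphaned documents are included.
function fastCount(shardList) {
  return shardList.reduce((n, s) => n + s.docs.length, 0);
}

// Count with a predicate: examines documents and keeps only those that the
// shard owns and that match the predicate.
function filteredCount(shardList, predicate) {
  return shardList.reduce(
    (n, s) => n + s.docs.filter((k) => s.owns(k) && predicate(k)).length,
    0
  );
}

console.log(fastCount(shards));                // 8 (two orphans counted)
console.log(filteredCount(shards, () => true)); // 6
```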
In general, if one needs an accurate count of how many documents are in a collection, we do not recommend using the count command. Instead, we suggest using the $count aggregation stage, like this:
db.foo.aggregate([{$count: "nDocs"}]);
See the docs.
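When wrapping this in a helper, one detail is worth handling: $count emits no document at all when its input is empty, so the result array can be empty. A minimal sketch for the shell follows; the helper name countDocs is hypothetical, and `coll` is assumed to be a collection handle such as db.foo.

```javascript
// Sketch: accurate document count via the $count aggregation stage.
// Returns 0 for an empty collection, where $count produces no output document.
function countDocs(coll) {
  const result = coll.aggregate([{ $count: "nDocs" }]).toArray();
  return result.length > 0 ? result[0].nDocs : 0;
}
```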
For users who need the performance of "fast count", and are okay with approximate results, we suggest using $collStats instead of the count command:
db.matrices.aggregate( [ { $collStats: { count: { } } } ] )
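On a sharded collection, $collStats returns one document per shard, so the per-shard `count` values need to be added up to get a cluster-wide figure. The following is a minimal sketch for the shell; the helper name approximateCount is hypothetical, and the result is still approximate because it comes from collection metadata and may include orphaned documents.

```javascript
// Sketch: sum the per-shard metadata counts reported by $collStats.
// `coll` is a shell collection handle (for example db.matrices).
function approximateCount(coll) {
  const perShard = coll.aggregate([{ $collStats: { count: {} } }]).toArray();
  return perShard.reduce((total, doc) => total + doc.count, 0);
}
```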
Issue Links
causes
- SERVER-39191 Performance regression for counts post-sharding (Closed)
duplicates
- SERVER-48685 very high IO utilization after upgrade from mongo 3.6 to 4.0.18 (Closed)
is depended on by
- SERVER-5366 balancer did not remove chunk from old shard (Closed)
- SERVER-5902 check fileMD5 failure for count() problems (Closed)
is duplicated by
- SERVER-24079 Different result between db.count and db.find (Closed)
- SERVER-5665 count command does not check chunk ranges (Closed)
- SERVER-8178 Odd Count / Document Results Differential On Sharded Collection (Closed)
- SERVER-8405 sharded count may incorrectly count migrating or orphaned documents (does not filter using chunk manager) (Closed)
- SERVER-12082 count() on a sharded cluster includes orphan documents (Closed)
- SERVER-15092 count is greater than itcount (Closed)
- SERVER-26038 Count and distinct operations do not include the shard filter stage (Closed)
- SERVER-29742 mongodump only creates a partial dump (Closed)
- SERVER-14319 Counts on sharded clusters should use the same algorithm that find.explain uses (Closed)
is related to
- SERVER-13116 distinct isn't sharding aware (Backlog)
- SERVER-70810 SHARDING_FILTER stage missing on shards from cluster count command explain with query predicate (Backlog)
- SERVER-30708 _id index returning more than one document with same _id in aggregations and counts. (Closed)
- SERVER-8948 Count() can be wrong in sharded collections (Closed)
- SERVER-26316 cleanupOrphaned command is too slow (Closed)
- SERVER-50857 Improve count() performance in sharded clusters (Closed)
related to
- SERVER-33753 count without predicate should be sharding aware (Backlog)
- SERVER-5931 Secondary reads in sharded clusters need stronger consistency (Closed)