- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Performance, Sharding
- None
- ALL
Generally, a shard only has documents belonging to chunks owned by that shard. However, while a migration is in progress, the shard may additionally have documents belonging to a chunk that it does not own but that is in the process of being migrated. After an aborted migration, a shard may have "orphaned" documents belonging to a chunk that never completed migration.
When a query (not count) runs on a sharded cluster, each shard checks the documents that match the query to see if they also belong to chunks owned by that shard. If they do not belong to owned chunks, they are not returned.
However, when a count runs, the check to see if matching documents belong to owned chunks is not performed. This implementation has the advantage that sharded counts can use covered indexes. (Checking whether a document belongs to a valid chunk cannot currently be done from an index only; it requires loading the full document. See SERVER-5022.) However, it means sharded counts can return incorrect results: the counts on the individual shards don't filter out unowned documents that may exist because of in-flight or aborted migrations.
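To make the discrepancy concrete, here is a minimal toy model in Python. It is not MongoDB code: the `Shard` class, the shard key field `k`, and the chunk ranges are all hypothetical and exist only to illustrate the behavior described above, where a query filters by chunk ownership and a count does not.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """Toy shard: the chunk ranges it owns plus every document stored locally."""
    owned_ranges: list  # (min_key, max_key) pairs; min inclusive, max exclusive
    docs: list = field(default_factory=list)

    def owns(self, shard_key):
        # True if the key falls inside any chunk range owned by this shard.
        return any(lo <= shard_key < hi for lo, hi in self.owned_ranges)

    def query(self, predicate):
        # Queries drop matching documents whose chunk is not owned here.
        return [d for d in self.docs if predicate(d) and self.owns(d["k"])]

    def count_unfiltered(self, predicate):
        # Counts, as described in this ticket, skip the ownership check.
        return sum(1 for d in self.docs if predicate(d))

    def count_filtered(self, predicate):
        # What a count would return if it applied the same filter as a query.
        return len(self.query(predicate))


# Shard A owns [0, 100) but still holds an orphan with k=150,
# left behind by an aborted migration (hypothetical data).
shard_a = Shard(owned_ranges=[(0, 100)],
                docs=[{"k": 10, "x": 1}, {"k": 50, "x": 1}, {"k": 150, "x": 1}])
shard_b = Shard(owned_ranges=[(100, 200)], docs=[{"k": 150, "x": 1}])


def match(doc):
    return doc["x"] == 1


shards = (shard_a, shard_b)
query_total = sum(len(s.query(match)) for s in shards)
count_total = sum(s.count_unfiltered(match) for s in shards)
filtered_count_total = sum(s.count_filtered(match) for s in shards)
print(query_total, count_total, filtered_count_total)
```

In this model the orphan on shard A is counted twice by the unfiltered count, so the cluster-wide count exceeds the number of documents a query returns.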
Here are some potential ways of handling this, to prompt discussion:
- Filter count results using a chunk manager, the same as we currently do for queries.
- Give a mongod chunk manager knowledge of whether or not there may be an in-flight migration or orphaned documents, and filter using a chunk manager only when necessary.
- For indexes that include the shard key, add covered index support for checking chunk membership, otherwise fall back to checking the full document for chunk membership.
- Track disk locations of documents participating in migrations (including aborted migrations) and exclude these documents from count, query, and other results. Filtering based on disk location does not require reading the document. If the list of disk locations is kept in memory only, orphans might be deleted on startup.
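As a rough illustration of how the second and third options might combine, the following Python sketch counts index entries and applies the ownership filter only when unowned documents may exist, reading the shard key from the index when the index includes it. Everything here is hypothetical (the `count_on_shard` function, its parameters, and the sample data); it is a sketch under these assumptions, not a proposal for the actual mongod implementation.

```python
def count_on_shard(index_entries, index_fields, shard_key_field,
                   owns, may_have_unowned, fetch_doc):
    """Count index entries that already match the count predicate.

    index_entries: list of (key_tuple, doc_id) pairs from the index scan
    index_fields: names of the indexed fields, in key_tuple order
    owns: callable mapping a shard key value to True if its chunk is owned
    may_have_unowned: True if a migration is in flight or orphans may exist
    fetch_doc: callable loading the full document for a doc_id (slow path)
    """
    if not may_have_unowned:
        # Fast path: no filtering needed, the count stays covered.
        return len(index_entries)
    if shard_key_field in index_fields:
        # Covered check: the shard key is available in the index key itself.
        pos = index_fields.index(shard_key_field)
        return sum(1 for key, _ in index_entries if owns(key[pos]))
    # Fallback: load each full document to read its shard key.
    return sum(1 for _, doc_id in index_entries
               if owns(fetch_doc(doc_id)[shard_key_field]))


# Hypothetical data: the shard owns k in [0, 100); document 2 is an orphan.
docs = {1: {"k": 10, "x": 1}, 2: {"k": 150, "x": 1}}
fetches = []


def owns(k):
    return 0 <= k < 100


def fetch_doc(doc_id):
    fetches.append(doc_id)  # record each full-document load
    return docs[doc_id]


entries_xk = [((1, 10), 1), ((1, 150), 2)]  # entries from an index on (x, k)
entries_x = [((1,), 1), ((1,), 2)]          # entries from an index on (x)

print(count_on_shard(entries_xk, ["x", "k"], "k", owns, True, fetch_doc))
```

The trade-off in this sketch is that the common case (no migration activity, or an index containing the shard key) never loads documents, while the fallback pays the full-document cost only when it cannot be avoided.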
- duplicates
  - SERVER-3645 Sharded collection counts (on primary) can report too many results (Closed)
- is duplicated by
  - SERVER-8178 Odd Count / Document Results Differential On Sharded Collection (Closed)