- Type: Task
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
Sharding
Since unsharded collections are always assigned _id as their shard key, shardCollection will behave like "resharding" the collection with a new shard key, which is permitted only if the collection has a single chunk. Additionally, it will set “allowSplit: true” for the collection.
As in 3.6, calling shardCollection for a nonexistent collection will remain possible, solely to continue supporting mapReduce with output to a new sharded collection. However, in this case, shardCollection will create the collection with the same visibility rules as if createCollection had been called explicitly by the user. That is, it is legal for the collection to be dropped between the execution of the createCollection logic and the execution of the shardCollection logic.
_configsvrShardCollection logic
- ScopedDistLock dbLock(dbName)
- ScopedDistLock collLock(collName)
- If config.collections does not have an entry for the collection, drop the distlocks and execute the createCollection logic once. Then re-obtain the distlocks and check again, and if the entry still doesn’t exist, return ConflictingOperationInProgress.
- If config.collections shows the collection is already sharded with the same key, return OK
- If config.collections has a different key, check that the collection has only a single chunk; otherwise return an error.
- Ensure the primary shard and shard that owns the (single) chunk have an index on the requested shard key (create the indexes if necessary)
- Replace the single chunk with a new chunk with the major version bumped while preserving the same epoch and UUID.
- Update the config.collections entry’s “key” field and set allowSplit to true
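The steps above can be sketched as a small in-memory simulation. This is only an illustration of the decision flow, not the server implementation: the names `Chunk`, `CollectionEntry`, `ShardCollectionError`, and `shard_collection`, the dict standing in for config.collections, and the `shard_indexes` map are all hypothetical, and the two distlocks are not modeled. The note does not say how "already sharded" is detected or which error is returned for a multi-chunk collection, so the sketch compares only the key and uses a descriptive message.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

ShardKey = Tuple[Tuple[str, int], ...]  # e.g. (("x", 1),)


@dataclass
class Chunk:
    shard: str
    major_version: int
    epoch: str


@dataclass
class CollectionEntry:
    # Hypothetical stand-in for one config.collections document.
    key: ShardKey
    epoch: str
    uuid: str
    allow_split: bool = False
    chunks: List[Chunk] = field(default_factory=list)


class ShardCollectionError(Exception):
    """Carries the error the steps above would return."""


def shard_collection(
    config_collections: Dict[str, CollectionEntry],
    shard_indexes: Dict[str, Set[ShardKey]],
    primary_shard: str,
    ns: str,
    requested_key: ShardKey,
    create_collection: Callable[[str], None],
) -> str:
    # The db and collection distlocks are not modeled in this sketch.
    entry = config_collections.get(ns)
    if entry is None:
        # Locks dropped here: run the createCollection logic exactly once,
        # then (with locks re-obtained) check for the entry again.
        create_collection(ns)
        entry = config_collections.get(ns)
        if entry is None:
            raise ShardCollectionError("ConflictingOperationInProgress")

    # Assumption: "already sharded with the same key" is reduced to a key match.
    if entry.key == requested_key:
        return "OK"

    # Different key: only allowed while the collection has a single chunk.
    if len(entry.chunks) != 1:
        raise ShardCollectionError("collection has more than one chunk")

    old = entry.chunks[0]
    # Ensure the primary shard and the chunk's owner have the shard key index.
    for shard in (primary_shard, old.shard):
        shard_indexes.setdefault(shard, set()).add(requested_key)

    # Replace the chunk: bump the major version, keep epoch (and UUID) as-is.
    entry.chunks = [Chunk(old.shard, old.major_version + 1, old.epoch)]
    entry.key = requested_key
    entry.allow_split = True
    return "OK"
```

In this model, a second call with the same key returns "OK" without touching the chunk version, which mirrors the idempotent early return in the steps.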
Whenever a node does an incremental refresh, it needs to check whether the shardKey field changed and make sure the new ChunkManager/CollectionMetadata is assigned the new shard key (today, the old shard key is blindly copied, because shard keys are never expected to change).
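A minimal sketch of the refresh-side check described above, under the same caveats: `RoutingTable` and `incremental_refresh` are hypothetical names for illustration, not the ChunkManager or CollectionMetadata API, and chunk removal is not modeled. The point is only that the refreshed object takes its shard key from the current config.collections entry rather than copying the cached one.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

ShardKey = Tuple[Tuple[str, int], ...]  # e.g. (("x", 1),)


@dataclass(frozen=True)
class RoutingTable:
    # Hypothetical stand-in for the cached routing information on a node.
    shard_key: ShardKey
    chunk_versions: Dict[str, int]  # chunk id -> major version


def incremental_refresh(
    cached: RoutingTable,
    current_key: ShardKey,
    changed_chunks: Dict[str, int],
) -> RoutingTable:
    # Apply only the chunks that changed since the cached version.
    merged = dict(cached.chunk_versions)
    merged.update(changed_chunks)

    # Do not blindly copy cached.shard_key: use the key read from the
    # collection entry during this refresh, which may differ from the cache.
    key_changed = current_key != cached.shard_key
    shard_key = current_key if key_changed else cached.shard_key
    return RoutingTable(shard_key=shard_key, chunk_versions=merged)
```

For example, refreshing a table cached with key `(("_id", 1),)` against an entry whose key is now `(("x", 1),)` yields a table carrying the new key, while a refresh with an unchanged key keeps the cached one.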