- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- v4.2, v4.0
- Sharding 2019-11-04, Sharding 2019-11-18
- (copied to CRM)
Background and motivation
Both replica set and sharded cluster MongoDB installations support implicit database and collection creation. In a sharded cluster, implicitly created databases by default do not support creating sharded collections under them. For this reason, sharding provides the enableSharding command, which explicitly creates the database and marks it as permitting sharded collections.
Currently, both implicitly created databases and those created through enableSharding (partially) use the balancer's statistics gathering logic to find the shard with the smallest data size and place the database's primary on it.
We have seen pathological cases where multiple concurrent implicit database creations end up placing all database primaries on the same shard. In addition, because the implicit database placement doesn't use the complete balancer placement logic, it does not take zones into account, which may lead to database primaries violating location requirements such as GDPR.
Proposed solution
Expose an optional string parameter called primaryShard on the enableSharding command.
If this parameter is present, it must contain the id of a valid shard, and the new database's primary should be placed on that shard. If the database already exists and its current primary is the same as the one specified through primaryShard, the command succeeds. If the database already exists with a different primary, the command should fail with error code NamespaceExists = 48.
If the parameter is omitted, the command should behave as it does currently and place the database's primary on the shard that currently has the smallest data size.
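The rules above can be sketched as a small in-memory model. This is only an illustration of the proposed semantics, not the server implementation: the function enable_sharding, the CommandError class, and the dictionaries standing in for the sharding catalog and per-shard data sizes are all hypothetical. Two behaviours are assumptions not stated in the proposal: an unknown shard id raises ValueError, and calling the command without primaryShard on an existing database leaves its primary unchanged.

```python
NAMESPACE_EXISTS = 48  # error code named in the proposal


class CommandError(Exception):
    """Hypothetical stand-in for a failed command response carrying an error code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def enable_sharding(databases, shard_sizes, db_name, primary_shard=None):
    """Model of the proposed enableSharding placement rules.

    databases:   dict mapping database name -> primary shard id (mutated in place)
    shard_sizes: dict mapping shard id -> data size in bytes
    Returns the primary shard id of the database after the call.
    """
    # Assumption: an invalid shard id is rejected; the proposal names no error for this.
    if primary_shard is not None and primary_shard not in shard_sizes:
        raise ValueError(f"unknown shard id: {primary_shard}")

    current = databases.get(db_name)
    if current is not None:
        # Existing database: succeed if no primary was requested (assumption)
        # or if the requested primary matches the current one.
        if primary_shard is None or primary_shard == current:
            return current
        raise CommandError(
            NAMESPACE_EXISTS,
            f"database {db_name} already exists with primary {current}",
        )

    # New database: honour the explicit request, otherwise fall back to
    # the shard with the smallest data size.
    if primary_shard is not None:
        chosen = primary_shard
    else:
        chosen = min(shard_sizes, key=shard_sizes.get)
    databases[db_name] = chosen
    return chosen
```

For example, with shard_sizes = {"shardA": 500, "shardB": 100}, calling enable_sharding(databases, shard_sizes, "eu_data", "shardA") places the primary on shardA even though shardB holds less data, and a later call requesting shardB for the same database raises CommandError with code 48.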
- is related to SERVER-31020 Sharding database creation is slow because of shard disk-usage statistics gathering
- Blocked