We are moving shardCollection to the config server, and shardCollection is not idempotent:
- it writes chunks to config.chunks before it inserts an entry for the collection into config.collections
- at the start of shardCollection, it fails if no entry for the collection exists in config.collections but there are chunks for the collection in config.chunks
The continuous config stepdown suite can cause shardCollection to fail midway and be retried from the start by mongos, and the retry then fails on the chunks left behind by the first attempt. We must therefore add a workaround in the continuous stepdown suite override.
The workaround should clean up partially written chunks and retry a failed shardCollection if it failed because it saw partially written chunks.
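The retry logic described above could look roughly like the following sketch. It is not the actual override code: `shardCollectionWithCleanup`, the `ops` object, and its three callbacks are hypothetical names introduced for illustration, and how a "partially written chunks" failure is recognized is left to the caller because the ticket does not specify the error code or message.

```javascript
// Hypothetical sketch of the stepdown-suite workaround; all names are assumptions.
// ops.shardCollection(ns)        -> runs shardCollection and returns its response
// ops.failedOnPartialChunks(res) -> true if the failure was caused by leftover chunks
// ops.removeChunks(ns)           -> deletes the collection's entries from config.chunks
function shardCollectionWithCleanup(ns, ops, maxAttempts = 3) {
    let res;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        res = ops.shardCollection(ns);

        // Success, or a failure unrelated to leftover chunks: return it as-is.
        if (res.ok === 1 || !ops.failedOnPartialChunks(res)) {
            return res;
        }

        // Assume the leftover chunks came from a config stepdown during an
        // earlier attempt, clean them up, and retry from the start.
        ops.removeChunks(ns);
    }
    // Still failing after maxAttempts: surface the last response.
    return res;
}
```

Bounding the number of attempts keeps a genuine bug that always leaves chunks behind from looping forever, although it does not remove the risk noted under Cons below.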
Cons: We have to assume the partially written chunks were due to a config server stepdown, and if there was actually a bug, we will gloss over it by deleting the chunks and retrying.
Pros: It allows us to have a config stepdown suite that uses all the existing jstests in jstests/sharding without modifying any of those tests. Since basically every test calls shardCollection, we can't get away with just blacklisting affected tests.
Note: This was not an issue while shardCollection was on mongos, because we do not have a mongos stepdown suite; if there were config stepdowns during shardCollection, only a particular read or write from mongos to the config servers was retried (and DuplicateKeyError was handled gracefully by mongos).
Has to be finished together with: SERVER-29107 move shardCollection logic into new _configsvrShardCollection command on config server (Closed)