- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 3.2.12, 3.4.4, 3.5.6
- Component/s: Sharding
- Minor Change
- Sharding 2020-11-02, Sharding 2020-11-16
If a shard encounters a shardVersion mismatch in checkShardVersionOrThrow, it returns a stale version error even if the sender's routing information was fresher than the shard's.
When a shard is returning a stale version error, it also refreshes its own routing table just before sending the response.
If the mongos was the fresher side, it is still forced to refresh its routing table cache and send the request again. Since the shard refreshed just before responding, the shard will then accept the request (unless another migration/dropCollection/unshardCollection has happened in the meantime).
This wastes two network round-trips: the mongos has to refresh from the config servers even though it's not stale, and the mongos has to re-send the request to the shard.
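The wasted round-trips can be illustrated with a toy simulation. This is only a sketch under loud assumptions: versions are simplified to plain integers, and `Node`, `shard_handle_current`, and `router_send` are hypothetical names, not server code.

```python
from dataclasses import dataclass


class StaleVersionError(Exception):
    """Stand-in for the stale version error a shard returns to the router."""


@dataclass
class Node:
    """A router or shard with a cached routing-table version (simplified to an int)."""
    cached_version: int


def shard_handle_current(shard, sent_version, config_version):
    """Current behavior: any mismatch is reported as stale, then the shard refreshes."""
    if sent_version != shard.cached_version:
        # The shard refreshes its own routing table just before responding.
        shard.cached_version = config_version
        raise StaleVersionError()
    return "ok"


def router_send(router, shard, config_version):
    """Send a request, retrying on stale version errors; returns (result, trips)."""
    trips = {"to_shard": 0, "to_config": 0}
    # Terminates here because config_version is fixed; a real router caps its retries.
    while True:
        trips["to_shard"] += 1
        try:
            return shard_handle_current(shard, router.cached_version, config_version), trips
        except StaleVersionError:
            # The router refreshes even if it was already up to date.
            trips["to_config"] += 1
            router.cached_version = config_version


# Router is already at the latest version (5); only the shard (4) is behind.
result, trips = router_send(Node(5), Node(4), config_version=5)
print(result, trips)
```

In this model the router already holds the latest version, yet it still makes one refresh trip to the config servers and sends the request to the shard twice.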
It would be better if the shard refreshed and retried checkShardVersionOrThrow, and only responded with a stale version error if the sender was actually the stale side.
This is an improvement/optimization and not an easy change: the collection is locked when checkShardVersionOrThrow is called, so the shard's routing table refresh can't simply be moved into checkShardVersionOrThrow.
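One possible shape for the proposed behavior is sketched below. It is not the actual fix: versions are again simplified to integers, and `Shard`, `check_version`, `handle_request`, and `fetch_config_version` are hypothetical names. The point it illustrates is the locking constraint: the check runs under the collection lock, the refresh runs with the lock released, and the check is then retried once.

```python
import threading
from dataclasses import dataclass, field


class StaleVersionError(Exception):
    """Stand-in for the stale version error returned to the sender."""


@dataclass
class Shard:
    cached_version: int  # simplified: real shard versions are not plain integers
    collection_lock: threading.Lock = field(default_factory=threading.Lock)

    def check_version(self, sent_version):
        """Hypothetical check; call with collection_lock held. True on a match."""
        return sent_version == self.cached_version


def handle_request(shard, sent_version, fetch_config_version):
    """Check, refresh outside the lock on mismatch, then retry the check once."""
    with shard.collection_lock:
        if shard.check_version(sent_version):
            return "ok"
    # The lock is released here, so the refresh does not block the collection.
    shard.cached_version = fetch_config_version()
    with shard.collection_lock:
        if shard.check_version(sent_version):
            return "ok"
    # Still mismatched after the shard's refresh: treat the sender as stale.
    raise StaleVersionError()


# The sender (5) is fresher than the shard (4): the request succeeds without an error.
shard = Shard(cached_version=4)
print(handle_request(shard, 5, lambda: 5), shard.cached_version)
```

With a fresher sender, the shard catches up and accepts the request in a single round-trip. A sender that is still mismatched after the shard's refresh gets the stale version error as before. The sketch ignores the case where another metadata change lands between the refresh and the retry.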
- causes
- SERVER-57051 Shard may fail to notify that router was stale for command in multi-statement transaction
- Closed
- is related to
- SERVER-29630 bump number of stale version retries from 3 to 10 in mongos
- Closed