-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Fully Compatible
-
ALL
-
v4.0, v3.6
-
Sharding 2019-01-28, Sharding 2019-02-11
-
12
If a node steps down while waiting for write concern, it can return a PrimarySteppedDown error code in the WriteConcernError field in its command response. PrimarySteppedDown is treated as a retryable error in the AsyncRequestsSender (for the kIdempotent retry policy), but the ARS doesn't look for retryable errors in the WriteConcernError field.
This means that retryable writes (writes with a txnNumber) won't automatically be retried by mongos if a failover led to a write concern error, because they rely on the ARS to retry them. The replica set monitor for the targeted shard should also be updated, since if there was a retry attempt, it would target the old primary.
For reference, ShardRemote::_runCommand does inspect write concern errors and update the replica set monitor for them.