-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Sharding, Testing Infrastructure
-
Cluster Scalability
Given the changes from SERVER-34665 which exposes a Mongo.prototype._markHostAsFailed() function to call ReplicaSetMonitor::failedHost(), it shouldn't be necessary to use multiple retry attempts as a way to wait for the ReplicaSetMonitor to discover a new primary has been elected because retargeting can be triggered explicitly. The auto_retry_on_network_error.js override could then use this mechanism rather than setting kMaxNumRetries=3 and could similarly remove TestData.overrideRetryAttempts=3 from the YAML suite definition.
Note: SERVER-34608 describes a case where after receiving an InterruptedDueToReplStateChange error response that an "isMaster" command could still observe ismaster=true and could therefore cause server selection to pick a node which is still in the midst of stepping down. We could avoid decrementing the numRetries counter in this case of an InterruptedDueToReplStateChange error response because the first retry (i.e. the second attempt) will synchronize with the stepdown to finish and the mongo shell would observe a network error. A second retry (i.e. a third attempt) would be successfully targeted at whichever node is then elected the new primary.
- depends on
-
SERVER-36128 ReplicationCoordinatorImpl::fillIsMasterForReplSet should return isMaster:false while in shutdown
- Closed
-
SERVER-34665 The mongo shell should retry writes on a WriteConcernFailure error response from the server
- Closed
- is duplicated by
-
SERVER-35225 retryOnNetworkErrors does not subtract from number of retries
- Closed
- is related to
-
SERVER-34608 Drivers may still see ismaster=true from primary in midst of stepping down immediately after operations are killed with InterruptedDueToReplStateChange
- Closed