-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Replication
-
Replication
-
(copied to CRM)
EDIT: Cancelling the reporter on changing sync source is a more robust solution.
syncSourceFeedbackNetworkTimeoutSecs is currently hardcoded to 30s. In case of network partition, the sync source feedback report might need to take 30s before timing out on the replSetUpdatePosition remote command against the old sync source even though the node has selected a new sync source. This could result in majority commit point lag after failovers. One idea is to have the syncSourceFeedbackNetworkTimeoutSecs the same as the feedback reporter's interval (or plus a buffer). Another idea is to hardcode the syncSourceFeedbackNetworkTimeoutSecs to a smaller number because we don't generally expect replSetUpdatePosition to block and the current 30s seem too much for a socket timeout.