- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Internal Code, Networking
- Cluster Scalability
- ALL
- (copied to CRM)
- 6
ServerDiscoveryMonitor::requestImmediateCheck can be called so frequently that each request cancels the previous one before it has a chance to run, so none of them ever succeeds.
This flag is supposed to short-circuit rescheduling when there is already an outstanding 'hello' request. However, it is not set until after the request is actually scheduled, which can happen some time after requestImmediateCheck is called, so it does not help in this case.
Note that this applies to both 4.4 and master, so we should make sure any fix is backportable.
Acceptance criteria:
Add a unit test that demonstrates the problem, then add a throttle so that the test passes.
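To make the starvation concrete, here is a minimal sketch in virtual time. It is not the server code: `SimulatedMonitor`, `advanceTo`, `simulate`, and the 5 ms delay are all hypothetical names and values chosen for illustration, and the throttle shown (keep an already-pending check instead of cancelling it) is only one possible approach, not necessarily the fix that should land.

```cpp
#include <cassert>

// Hypothetical stand-in for the monitor's scheduling logic.
// All times are virtual milliseconds; nothing here is real server code.
class SimulatedMonitor {
public:
    SimulatedMonitor(int delayMs, bool throttle)
        : _delayMs(delayMs), _throttle(throttle) {}

    // Models requestImmediateCheck: ask for a check to run _delayMs from now.
    void requestImmediateCheck(int nowMs) {
        if (_throttle && _hasPending) {
            // Assumed throttle: a check is already queued and will run no
            // later than a new one would, so leave it alone.
            return;
        }
        // Unthrottled: overwriting the due time models cancel + reschedule.
        _hasPending = true;
        _pendingDueMs = nowMs + _delayMs;
    }

    // Runs the pending check if its due time has arrived.
    void advanceTo(int nowMs) {
        if (_hasPending && _pendingDueMs <= nowMs) {
            ++_checksRun;
            _hasPending = false;
        }
    }

    int checksRun() const { return _checksRun; }

private:
    int _delayMs;
    bool _throttle;
    bool _hasPending = false;
    int _pendingDueMs = 0;
    int _checksRun = 0;
};

// Calls requestImmediateCheck once per millisecond for 100 ms and reports
// how many checks actually ran.
int simulate(bool throttle) {
    SimulatedMonitor monitor(/*delayMs=*/5, throttle);
    for (int nowMs = 0; nowMs < 100; ++nowMs) {
        monitor.advanceTo(nowMs);
        monitor.requestImmediateCheck(nowMs);
    }
    return monitor.checksRun();
}
```

In this model `simulate(false)` returns 0, because every request pushes the due time out again before it is reached, while `simulate(true)` lets a check run every 5 ms. A real unit test would need to drive the actual executor with a mock clock rather than this simplified loop.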
- related to: SERVER-54739 Race in ServerDiscoveryMonitor::requestImmediateCheck could lead to multiple outstanding exhaust requests (Closed)