- Type: Task
- Resolution: Won't Do
- Priority: Minor - P4
- Affects Version/s: 4.4.0-rc9
- Component/s: Networking
Two mongod processes crashed at point A in the attached graph and were re-introduced into the replica set after being repaired at point B. Between A and B the numSlowDNSOperations metric spiked, then plateaued as soon as the nodes were restored at point B. The logs for the period from A to B are not available, but we can simulate this scenario to see whether the behavior is reproducible.
Since this metric only counts DNS operations that take longer than a threshold, the latency may simply have been hovering around that threshold, in which case the observed behavior is benign. It is also possible that there was an actual network problem external to the server. Unfortunately, without logs we cannot know for sure. The fact that the slow DNS operations stopped exactly at point B makes an external network problem less likely.
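To illustrate why latency close to the threshold can inflate a counter like this, here is a minimal Python sketch of a threshold-based slow-lookup counter. It is not the server's implementation: the function names, the injectable `resolver` and `clock` parameters, and the 1-second threshold are all assumptions made for the example, since the ticket does not state the real threshold.

```python
import socket
import time
from typing import Callable, Iterable

# Assumed value for illustration only; the ticket does not state the
# threshold the server actually uses.
SLOW_DNS_THRESHOLD_SECS = 1.0


def default_resolver(host: str) -> None:
    """Resolve a host name with the system resolver."""
    socket.getaddrinfo(host, None)


def count_slow_dns_operations(
    hosts: Iterable[str],
    threshold_secs: float = SLOW_DNS_THRESHOLD_SECS,
    resolver: Callable[[str], None] = default_resolver,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Count lookups whose duration strictly exceeds threshold_secs.

    Failed lookups are still timed, because a lookup for an unreachable
    host can be slow precisely because it fails.
    """
    slow = 0
    for host in hosts:
        start = clock()
        try:
            resolver(host)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            pass
        if clock() - start > threshold_secs:
            slow += 1
    return slow
```

Calling `count_slow_dns_operations(["localhost"])` times a real lookup. Because the counter records only whether each lookup crossed the threshold, a lookup slightly above it counts the same as one far above it, so the counter alone cannot distinguish borderline latency from a real outage.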