-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: 4.4.30, 6.0.15, 7.0.8, 5.0.27
-
Component/s: None
-
Server Programmability
-
ALL
-
-
Service Arch 2024-04-01, Service Arch 2024-04-15
We observed a strange behavior. After in-depth investigation, we traced it to a bug in the implementation of ScanningReplicaSetMonitor.
First, prepare a single-shard cluster with one primary and two secondaries, and run a script that queries the secondaries, like this:
```
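import pymongo
from pymongo import MongoClient
import time

# secondaryPreferred routes reads to secondaries whenever one is available
c = MongoClient("mongodb://xxxxx/admin?readPreference=secondaryPreferred")

# Issue queries in a tight loop so the per-node query distribution is easy to observe
while True:
    for _ in c.db.coll.find():
        pass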
```
Then, let's look at a series of common operations and the behavior observed after each.
- Set secondary 1 to hidden = true and, a few seconds later, set it back to hidden = false.
At this point, only node 2 receives queries, and node 1 has a very large value in replicaSetPingTimesMillis:
mongos> db.adminCommand("getDiagnosticData").data.connPoolStats.replicaSetPingTimesMillis
{ "mongo109" : { "x1:27017" : 2.459, "x2:27017" : 2.289, "x3:27017" : 9223372036854776 } }
- Restart node 1.
Then everything returns to normal: queries are distributed normally, and replicaSetPingTimesMillis is normal.
- Set both secondary 1 and secondary 2 to hidden = true and, a few seconds later, revert both to hidden = false.
At this point, queries are distributed normally, but replicaSetPingTimesMillis is very large for all secondaries.
- Restart node 1.
After the restart, only secondary 1 receives queries.
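A quick arithmetic check on the value 9223372036854776 reported by replicaSetPingTimesMillis: it matches the largest signed 64-bit integer converted from microseconds to milliseconds. The sketch below assumes the RTT is stored as a signed 64-bit microsecond count and reported as a double; that storage detail is an assumption, not something confirmed in this ticket.

```python
# Assumption: the round-trip time is kept as a signed 64-bit count of microseconds,
# and "never measured" is represented by the largest possible value.
INT64_MAX = 2**63 - 1


def micros_to_millis(micros: int) -> float:
    """Convert microseconds to milliseconds as a double, as a diagnostic dump might."""
    return micros / 1000


sentinel_ms = micros_to_millis(INT64_MAX)

# Printed as an integer, this is the value seen for the affected node.
print(int(sentinel_ms))
```

Under that assumption, the large number is a "no measurement yet" sentinel rather than a real ping time.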
The key reasons behind the above behavior are:
- ServerPingMonitor::onTopologyDescriptionChangedEvent only removes monitors for servers that are missing from the topology; it does not add monitors for new servers.
- In struct LatencyWindow, the following code can produce a window of (max(), max()):
upper = (lowerBound == HelloRTT::max()) ? lowerBound : lowerBound + windowWidth;
I think the ServerPingMonitor behavior is a bug, and the LatencyWindow behavior is a feature.
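To illustrate how the quoted `upper` expression could explain the query distribution seen in each step, here is a minimal Python model of the latency window. It is a sketch and not the server code: `RTT_MAX`, `eligible`, the host names, and the 15 ms window width are all assumptions chosen for illustration, and the inclusive bounds check is a guess at how membership in the window is tested.

```python
from dataclasses import dataclass
from typing import Dict, List

RTT_MAX = 2**63 - 1  # hypothetical stand-in for HelloRTT::max(), in microseconds


@dataclass(frozen=True)
class LatencyWindow:
    lower: int
    upper: int

    @classmethod
    def from_lower_bound(cls, lower_bound: int, window_width: int) -> "LatencyWindow":
        # Mirrors the quoted expression: the window is not widened when the
        # lower bound is already max, which yields the window (max, max).
        upper = lower_bound if lower_bound == RTT_MAX else lower_bound + window_width
        return cls(lower_bound, upper)

    def contains(self, rtt: int) -> bool:
        return self.lower <= rtt <= self.upper


def eligible(rtts: Dict[str, int], window_width: int = 15_000) -> List[str]:
    """Return the hosts whose RTT falls inside the window anchored at the lowest RTT."""
    window = LatencyWindow.from_lower_bound(min(rtts.values()), window_width)
    return sorted(host for host, rtt in rtts.items() if window.contains(rtt))


# One secondary stuck at max RTT: only the other secondary is eligible.
print(eligible({"secondary1": RTT_MAX, "secondary2": 2_289}))

# Both secondaries stuck at max RTT: the window is (max, max) and both are eligible.
print(eligible({"secondary1": RTT_MAX, "secondary2": RTT_MAX}))

# Secondary 1 restarted and measured again: only secondary 1 is eligible.
print(eligible({"secondary1": 2_459, "secondary2": RTT_MAX}))
```

In this model, a node whose RTT stays at max is excluded as soon as any other candidate has a real measurement, but becomes eligible again when every candidate is at max. That is consistent with the three distributions described above.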
is related to
- DRIVERS-2899 Verify drivers can target newly-unhidden nodes (Backlog)
- SERVER-62079 remove scanning RSM (Closed)