- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 2.0.2, 2.0.5
- Component/s: Replication
- Environment: Debian stable with latest .deb from 10gen
- Linux
I have two machines, each running mongos locally, which connect to 4 replica sets in a sharded environment.
One set has a primary, a secondary, and an arbiter. Sometimes when I swap the primary and secondary, mongos has trouble figuring out which one is the primary.
The new primary was elected after I used rs.reconfig() to set a new priority on the other member.
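For reference, the priority change described above can be sketched roughly as follows in the mongo shell. This is only an illustration, not the exact commands that were run: the helper name `setMemberPriority`, the host `mongo4:27017` as the target, and the priority value `2` are all assumptions.

```javascript
// Hypothetical helper: set the priority of one member in a replica set
// config document (the shape returned by rs.conf()). Mutates and returns cfg.
function setMemberPriority(cfg, host, priority) {
    var found = false;
    for (var i = 0; i < cfg.members.length; i++) {
        if (cfg.members[i].host === host) {
            cfg.members[i].priority = priority;
            found = true;
        }
    }
    if (!found) {
        throw new Error("no replica set member with host " + host);
    }
    return cfg;
}

// Usage in the mongo shell, connected to the current primary
// (host and priority are assumed values):
//   var cfg = rs.conf();
//   rs.reconfig(setMemberPriority(cfg, "mongo4:27017", 2));
```

After the reconfig, the member with the highest priority should eventually be elected primary, which is the point at which mongos needs to notice the change.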
This is what the error looks like after switching primary:
Wed May 23 11:14:28 [conn386] DBException in process: could not initialize cursor across all shards because : socket exception @ DuegoB/mongo1:27017,mongo4:27017
Wed May 23 11:14:29 [conn389] ns: xxx.communications could not initialize cursor across all shards because : stale config detected for ns: xxx.communications ParallelCursor::_init @ DuegoB/mongo1:27017,mongo4:27017 attempt: 0
Also see the attached mongos.log for what sometimes happens when I try to restart mongos (since it never seemed to find the new primary).
It freezes up and never restarts.
Another weird entry in that log is:
Wed May 23 11:15:55 [conn458] Socket say send() errno:32 Broken pipe 172.16.49.111:27017
Wed May 23 11:15:55 [conn458] DBException in process: could not initialize cursor across all shards because : socket exception @ Duego2/mongo2:27027,mongo3:27027
Here 172.16.49.111 is one of the servers in the replica set that switched primary. Why does the log also mention Duego2/mongo2:27027,mongo3:27027, which is not part of this set?
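To double-check which shard a given host belongs to, one option is to compare it against the shard host strings that mongos reads from the config database. The sketch below assumes those strings have the same `setName/host1,host2` form seen in the log lines above; the helper names `parseShardHost` and `shardsContaining` are made up for illustration. Note that it matches on the `host:port` strings as written, so an IP address such as 172.16.49.111 would first have to be mapped to its hostname.

```javascript
// Hypothetical helper: split "DuegoB/mongo1:27017,mongo4:27017" into
// the replica set name and the list of member host:port strings.
function parseShardHost(hostString) {
    var slash = hostString.indexOf("/");
    var setName = slash === -1 ? null : hostString.substring(0, slash);
    var members = hostString.substring(slash + 1).split(",");
    return { setName: setName, members: members };
}

// Hypothetical helper: return the _id of every shard document whose
// host string lists the given host:port.
function shardsContaining(shardDocs, host) {
    var ids = [];
    for (var i = 0; i < shardDocs.length; i++) {
        var parsed = parseShardHost(shardDocs[i].host);
        if (parsed.members.indexOf(host) !== -1) {
            ids.push(shardDocs[i]._id);
        }
    }
    return ids;
}

// Usage in the mongo shell, connected to a mongos:
//   var docs = db.getSiblingDB("config").shards.find().toArray();
//   shardsContaining(docs, "mongo1:27017");
```

If a host turns up under an unexpected shard, or under none, that would point at the shard metadata rather than at the election itself.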