After replica set failover, I get the following exception in my apps:
com.mongodb.MongoException: setShardVersion failed host[mongodb04.example.com:27018]
{ errmsg: "not master", ok: 0.0 }

Strangely, I do not get this error on every request. About 50% of requests fail, and the other 50% succeed without errors. I have seen this go on for several minutes (e.g. 15-20 minutes). The only ways I have found to resolve it are failing back to the original master or restarting the mongos on the appserver.

Reproducing this problem is very simple:
1.) Run "watch -n 1 curl -v http://example.com/something/that/queries/mongodb" on the appserver
2.) Run "rs.stepDown()" on a MongoDB master
3.) Watch your curl command intermittently fail (seemingly) forever
4.) Run "rs.stepDown()" on the new MongoDB master (fail back to original master)
5.) Watch your curl command succeed
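
To put a number on the "about 50%" figure instead of eyeballing the curl output, the polling in step 1 can be scripted. This is only a rough sketch: the function names `probe` and `measure_failure_rate` are made up for illustration, the URL is the placeholder from step 1, and it assumes the app returns a non-2xx status (or drops the connection) when the MongoDB query fails.

```python
import http.client
import time
import urllib.request
from collections import Counter


def probe(url, timeout=5.0):
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # HTTPError (4xx/5xx), URLError and timeouts are all OSError subclasses.
        return False


def measure_failure_rate(check, attempts, interval=1.0, sleep=time.sleep):
    """Call check() `attempts` times; return (failures, attempts, ratio)."""
    tally = Counter()
    for i in range(attempts):
        tally["ok" if check() else "failed"] += 1
        if i < attempts - 1:
            sleep(interval)  # roughly mirrors "watch -n 1"
    failures = tally["failed"]
    ratio = failures / attempts if attempts else 0.0
    return failures, attempts, ratio


if __name__ == "__main__":
    # Placeholder URL taken from step 1; replace with a real endpoint.
    url = "http://example.com/something/that/queries/mongodb"
    failed, total, ratio = measure_failure_rate(lambda: probe(url), attempts=60)
    print(f"{failed}/{total} requests failed ({ratio:.0%})")
```

Running it once before step 2 and once after gives a before/after failure ratio that can be attached to the ticket.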
Additionally, after failover I see several messages like this in my mongos.log (on the order of 30-40 per second):
Thu May 5 16:34:43 [conn78] ReplicaSetMonitor::_checkConnection: mongodb05.example.com:27018
{ setName: "2", ismaster: true, secondary: false, hosts: [ "mongodb05.example.com:27018", "mongodb04.example.com:27018" ], arbiters: [ "mongodb06.example.com:27018" ], maxBsonObjectSize: 16777216, ok: 1.0 }

These go away as soon as I fail back to the original master. I don't know if this is related to the same issue, so I created another ticket for this: https://jira.mongodb.org/browse/SERVER-3040
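
The "30-40 per second" estimate for the _checkConnection messages can be checked by counting matching log lines per timestamp. The sketch below assumes every such line looks like the sample above (a one-second-resolution timestamp, then "[connNN]", then the message); the function name `check_connection_rate` and the log path in the usage section are hypothetical.

```python
from collections import Counter

MARKER = "ReplicaSetMonitor::_checkConnection"


def check_connection_rate(lines):
    """Count MARKER lines per one-second timestamp.

    The timestamp is taken to be everything before the first "[",
    e.g. "Thu May 5 16:34:43" (assumed format, based on the sample line).
    """
    per_second = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        prefix = line.split("[", 1)[0]
        # Normalize whitespace so "May  5" and "May 5" count as the same key.
        timestamp = " ".join(prefix.split())
        per_second[timestamp] += 1
    return per_second


if __name__ == "__main__":
    # Hypothetical path; point this at the actual mongos log.
    with open("mongos.log", encoding="utf-8", errors="replace") as log:
        for timestamp, count in check_connection_rate(log).items():
            print(f"{timestamp}  {count}")
```

Comparing the counts before and after the stepDown shows whether the message rate really tracks the failover.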