ISSUE SUMMARY
New sharded connections may fail to connect if any shard primary is down.
This issue is one of four related issues that affect cluster availability when no primary is available for a shard. See SERVER-7246, SERVER-5625, SERVER-11971, and SERVER-12041 for more details.
USER IMPACT
When any replica set in a sharded cluster has no available primary, new connections may be unable to perform secondary reads because the initial heuristic shard version check or the initial authorization check fails.
This issue is present in MongoDB versions up to and including v2.4.8.
SOLUTION
Ignore failures of the initial version check during connection, and allow authorization against secondaries (the primary is still preferred when available).
In v2.4.9 only, this behavior must be enabled with the following two startup parameters for mongos (it is set by default in v2.6.0 and later):
--setParameter ignoreInitialVersionFailure=true --setParameter authOnPrimaryOnly=false
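As a rough sketch, a complete v2.4.9 mongos invocation with both parameters might look like the following. The config server hostnames and port are hypothetical placeholders, not values from this ticket; only the two --setParameter options come from the text above.

```shell
# Hypothetical config server addresses; substitute the ones for your cluster.
# The two --setParameter options are the ones named in this ticket (v2.4.9 only).
mongos --configdb cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019 \
    --setParameter ignoreInitialVersionFailure=true \
    --setParameter authOnPrimaryOnly=false
```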
WORKAROUNDS
There is no workaround.
PATCHES
Production release v2.4.9 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.
Original Description
... if a shard is down, we get a socket exception or no master exception. Issue is in _check in s/shardConnection.cpp
is duplicated by:
- SERVER-4498 Creating a ShardConnection calls checkShardVersion on every shard (Closed)
- SERVER-7257 mongos log message seems wrong / referring to the wrong shard??? (Closed)
is related to:
- SERVER-6134 stale mongod view of shards causes m/r error (Closed)
- SERVER-7246 Mongos cannot do slaveOk queries when primary is down (Closed)
- SERVER-9646 MongoS checkVersion fails if any shard is down (Closed)