- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 3.2.0-rc0
- Component/s: None
- None
- Fully Compatible
- ALL
Following an upgrade of mmapv1 SCCC config servers to CSRS, I sometimes (about 60% of the time) see the new replica set get stuck without a primary. This happens after the first config server is restarted without --configsvrMode=sccc set and enters the REMOVED state. The remaining 3 replica set members stay in SECONDARY state and none of them is elected primary.
This is with commit dbbc9a2e3d8c4d7fe1748fa980ba7d01b9489dbe.
rs.status():
csrs:REMOVED> rs.status()
{
    "set" : "csrs",
    "date" : ISODate("2015-10-21T21:51:22.697Z"),
    "myState" : 10,
    "term" : NumberLong(1),
    "configsvr" : true,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "neurofunk.local:9007",
            "health" : 1,
            "state" : 10,
            "stateStr" : "REMOVED",
            "uptime" : 53,
            "optime" : {
                "ts" : Timestamp(1445464229, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2015-10-21T21:50:29Z"),
            "infoMessage" : "could not find member to sync from",
            "configVersion" : 3,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "neurofunk.local:53836",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 52,
            "optime" : {
                "ts" : Timestamp(1445464217, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2015-10-21T21:50:17Z"),
            "lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
            "lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
            "pingMs" : NumberLong(0),
            "configVersion" : 3
        },
        {
            "_id" : 2,
            "name" : "neurofunk.local:53835",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 52,
            "optime" : {
                "ts" : Timestamp(1445464217, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2015-10-21T21:50:17Z"),
            "lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
            "lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
            "pingMs" : NumberLong(0),
            "configVersion" : 3
        },
        {
            "_id" : 4,
            "name" : "neurofunk.local:53834",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 52,
            "optime" : {
                "ts" : Timestamp(1445464217, 1),
                "t" : NumberLong(1)
            },
            "optimeDate" : ISODate("2015-10-21T21:50:17Z"),
            "lastHeartbeat" : ISODate("2015-10-21T21:51:22.161Z"),
            "lastHeartbeatRecv" : ISODate("2015-10-21T21:51:22.111Z"),
            "pingMs" : NumberLong(0),
            "configVersion" : 3
        }
    ],
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("000000000000000000000000")
    }
}
csrs:REMOVED>
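For anyone trying to reproduce this repeatedly, a small helper along these lines may make the stuck state easier to spot than reading the full status by eye. It is only a sketch: `summarizeReplSet` is a hypothetical name, not a shell built-in, and it assumes the input has the same shape as the rs.status() output above (a `members` array whose entries carry `stateStr`). The sample document is a trimmed copy of that output.

```javascript
// Hypothetical helper (not part of the mongo shell): counts members per state
// and reports whether any member is currently PRIMARY.
function summarizeReplSet(status) {
    var counts = {};
    for (var i = 0; i < status.members.length; i++) {
        var state = status.members[i].stateStr;
        counts[state] = (counts[state] || 0) + 1;
    }
    return {
        counts: counts,
        hasPrimary: (counts.PRIMARY || 0) > 0
    };
}

// Trimmed copy of the status from this report; only the fields the helper reads.
var sampleStatus = {
    set: "csrs",
    members: [
        { _id: 0, stateStr: "REMOVED" },
        { _id: 1, stateStr: "SECONDARY" },
        { _id: 2, stateStr: "SECONDARY" },
        { _id: 4, stateStr: "SECONDARY" }
    ]
};

var summary = summarizeReplSet(sampleStatus);
```

In the shell this would presumably be called as `printjson(summarizeReplSet(rs.status()))`. For the status in this report, `hasPrimary` comes out false, with one REMOVED member and three SECONDARY members.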
I will attach logs.
- related to: SERVER-21110 pv1 should not call TopologyCoordinatorImpl::_isOpTimeCloseEnoughToLatestToElect() (Closed)