- Type: Improvement
- Resolution: Unresolved
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: Replication
The only secondary in a three-member PSA (primary, secondary, arbiter) replica set mistakenly has a priority of zero. The secondary is not lagging and is fully caught up.
When trying to step down the primary, the command fails as follows.
In 3.0.9:
test> rs.stepDown()
{ "ok" : 0, "errmsg" : "No electable secondaries caught up as of 2016-03-06T07:32:04.336+0200", "code" : 50 }
In 3.2.3:
test> rs.stepDown()
{ "ok" : 0, "errmsg" : "No electable secondaries caught up as of 2016-03-06T09:24:08.753+0200. Please use {force: true} to force node to step down.", "code" : 50 }
The "caught up as of" clause in the errmsg is confusing because it leads the user to believe the primary cannot step down due to a lagging secondary. This causes the user to pointlessly check and recheck rs.printReplicationInfo() and rs.printSlaveReplicationInfo() to verify that there is no lag.
When no secondary is electable at all, the errmsg would be more helpful if it simply said "No electable secondaries" and omitted the "caught up as of" clause.
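The requested behavior can be sketched roughly as follows. This is only an illustration of the proposed wording, not the server's actual implementation: the function name `stepDownErrmsg` and the `electable` and `caughtUp` flags are hypothetical.

```javascript
// Hypothetical sketch of the proposed message selection.
// Each secondary is described by two assumed boolean flags:
//   electable - the member could become primary (e.g. priority > 0)
//   caughtUp  - the member has replicated up to the primary's last write
function stepDownErrmsg(secondaries, asOf) {
  var electable = secondaries.filter(function (m) {
    return m.electable;
  });
  if (electable.length === 0) {
    // Nobody could ever take over, so replication lag is irrelevant.
    return "No electable secondaries";
  }
  var anyCaughtUp = electable.some(function (m) {
    return m.caughtUp;
  });
  if (!anyCaughtUp) {
    // Electable members exist but are behind: lag is the real cause.
    return "No electable secondaries caught up as of " + asOf;
  }
  return null; // no error, the step down may proceed
}
```

With the configuration from this ticket (one caught-up secondary that is not electable), the sketch returns the short message, so the lag wording appears only when lag is actually the cause.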
The streamlined error message would correctly prompt the user to check rs.conf() for the reason the secondary is unelectable, instead of pointlessly checking lag times.
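As an illustration of that check, a small helper along these lines could be run against the output of rs.conf(). It is a sketch only: the helper name `unelectableMembers` and the host names in the example config are made up, and it looks only at the `priority` and `arbiterOnly` member fields.

```javascript
// Sketch: list members of a replica set config that can never become primary.
// `config` is a document shaped like the output of rs.conf().
function unelectableMembers(config) {
  return config.members
    .filter(function (m) {
      return m.arbiterOnly === true || m.priority === 0;
    })
    .map(function (m) {
      return {
        host: m.host,
        reason: m.arbiterOnly === true ? "arbiterOnly: true" : "priority: 0",
      };
    });
}

// Example config mirroring the PSA set in this ticket (host names are invented).
var exampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017", priority: 1 },
    { _id: 1, host: "b.example.net:27017", priority: 0 },
    { _id: 2, host: "c.example.net:27017", arbiterOnly: true },
  ],
};

var result = unelectableMembers(exampleConfig);
```

In the mongo shell the same function could be called as unelectableMembers(rs.conf()); a data-bearing member listed with "priority: 0" is the likely culprit in a case like this one.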
Thanks