- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- ALL
- v4.2, v4.0
- 20
This test does:
1. start a 3-node set
2. configure key files on nodes 1 and 2 so node 0 gets auth errors in replSetHeartbeat
3. confirm node 0 goes into RECOVERING due to auth errors (SERVER-3715)
4. stop nodes 1 and 2
5. confirm node 0 goes into SECONDARY
6. restart nodes 1 and 2
7. stop node 0 <-- the build failure (BF) occurs here
8. more test operations....
Rarely, after nodes 1 and 2 are restarted in step 6, node 0 has time to heartbeat another node, get an auth error, and go from SECONDARY back to RECOVERING before step 7 runs. When ReplSetTest then tries to stop node 0 in step 7, it fails, because ReplSetTest's procedure for stopping a node includes validating collections, which does not work on a RECOVERING node.
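Stopping node 0 before restarting nodes 1 and 2 should fix the race: node 0 is then already down when the other nodes come back, so it cannot heartbeat them and return to RECOVERING.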
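To make the ordering argument concrete, here is a toy model of the race in plain JavaScript. It is only a sketch: `ToyNode`, `heartbeat`, `stopNode`, and `runScenario` are hypothetical names invented for illustration and are not part of ReplSetTest or the actual test. The model also assumes the unlucky heartbeat always happens, whereas in the real test it is rare.

```javascript
// Toy model only: these names are hypothetical, not ReplSetTest APIs.
class ToyNode {
    constructor(name) {
        this.name = name;
        this.running = true;
        this.state = "SECONDARY";
    }
}

// Node 0 heartbeats its peers. Any running peer rejects it with an auth
// error, which moves node 0 to RECOVERING; with no peers up it is SECONDARY.
function heartbeat(node0, peers) {
    if (!node0.running) {
        return;  // a stopped node sends no heartbeats
    }
    node0.state = peers.some(p => p.running) ? "RECOVERING" : "SECONDARY";
}

// Stopping a node validates collections first, which is modeled here as
// failing whenever the node is in RECOVERING.
function stopNode(node) {
    if (node.state === "RECOVERING") {
        throw new Error("cannot validate collections on RECOVERING node " + node.name);
    }
    node.running = false;
}

function runScenario(stopNode0First) {
    const node0 = new ToyNode("node0");
    const peers = [new ToyNode("node1"), new ToyNode("node2")];

    peers.forEach(p => { p.running = false; });  // step 4: stop nodes 1 and 2
    heartbeat(node0, peers);                     // step 5: node 0 is SECONDARY

    try {
        if (stopNode0First) {
            stopNode(node0);                            // proposed order: stop node 0 first
            peers.forEach(p => { p.running = true; });  // then restart nodes 1 and 2
            heartbeat(node0, peers);                    // no effect, node 0 is down
        } else {
            peers.forEach(p => { p.running = true; });  // step 6: restart nodes 1 and 2
            heartbeat(node0, peers);                    // unlucky heartbeat: RECOVERING
            stopNode(node0);                            // step 7: throws
        }
        return "ok";
    } catch (e) {
        return "failed";
    }
}

console.log(runScenario(false));  // original order
console.log(runScenario(true));   // proposed order
```

In this model the original order reports `failed` and the proposed order reports `ok`, because a stopped node 0 never gets the chance to heartbeat the restarted nodes.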
I believe the test has had this bug since it was first written for MongoDB 2.2 and backported to 2.0.