- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Replication
The resync procedure recommends dropping everything from the dbpath. However, this also deletes the lastVote document, so during the resync the node could vote again in a term in which it has already voted.
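To make the hazard concrete, here is a minimal toy model of the one-vote-per-term rule. It is not server code: the `Voter` class, its methods, and the `preserve_last_vote` flag are hypothetical names chosen for illustration, and the vote check is simplified to "refuse if a vote is already recorded for this term or a later one".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Voter:
    """Toy voting node (hypothetical; not the server's implementation)."""
    name: str
    # Durable record of the most recent vote as (term, candidate), or None.
    last_vote: Optional[Tuple[int, str]] = None

    def request_vote(self, term: int, candidate: str) -> bool:
        # Simplified rule: refuse if a vote is already recorded for this
        # term or a later one; otherwise record the vote and grant it.
        if self.last_vote is not None and self.last_vote[0] >= term:
            return False
        self.last_vote = (term, candidate)
        return True

    def wipe_dbpath(self, preserve_last_vote: bool = False) -> None:
        # Clearing the data directory discards the vote record unless it
        # is explicitly carried over (assumption: modeled as a flag).
        if not preserve_last_vote:
            self.last_vote = None


# Without preservation: the node votes twice in term 5.
node = Voter("n1")
first = node.request_vote(5, "A")
node.wipe_dbpath()
second = node.request_vote(5, "B")

# With preservation: the second request in term 5 is refused.
safe = Voter("n2")
safe.request_vote(5, "A")
safe.wipe_dbpath(preserve_last_vote=True)
refused = not safe.request_vote(5, "B")
```

In this model `first` and `second` are both granted, which is the double vote the ticket describes, while the node that keeps its record refuses the second candidate.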
We should explore the most feasible way to modify the resync procedure safely. There are two options:
- Preserve the lastVote document from the local database across clearing the dbpath, similar to what we do for FCBIS (SERVER-69861).
- Use reconfig to first remove the node to be resynced, then clear the dbpath, and finally add it back. This approach would leverage the Initial Sync Semantics project, which prevents newly added nodes from voting (SPM-1096). We still need to prove that this approach is safe; see SERVER-48257.
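As a rough illustration of the reconfig-based option, the sketch below builds the two config documents involved: one with the node removed and one with it added back. It only manipulates plain dictionaries shaped like a replica set config (`version`, `members`, each member with `_id` and `host`) and does not talk to a server. The helper names are hypothetical, and assigning a fresh member `_id` on re-add is an assumption of this sketch, not something the ticket specifies.

```python
import copy


def config_without_member(config: dict, host: str) -> dict:
    """Return a copy of `config` with `host` removed and the version bumped."""
    new_config = copy.deepcopy(config)
    remaining = [m for m in new_config["members"] if m["host"] != host]
    if len(remaining) == len(new_config["members"]):
        raise ValueError(f"{host} is not a member of the config")
    new_config["members"] = remaining
    new_config["version"] += 1  # a reconfig needs a higher config version
    return new_config


def config_with_member(config: dict, host: str) -> dict:
    """Return a copy of `config` with `host` added under a fresh member _id."""
    new_config = copy.deepcopy(config)
    if any(m["host"] == host for m in new_config["members"]):
        raise ValueError(f"{host} is already a member of the config")
    # Assumption: pick an unused _id rather than reusing the old one.
    next_id = max((m["_id"] for m in new_config["members"]), default=-1) + 1
    new_config["members"].append({"_id": next_id, "host": host})
    new_config["version"] += 1
    return new_config


# Hypothetical three-node set; hosts are placeholders.
current = {
    "_id": "rs0",
    "version": 3,
    "members": [
        {"_id": 0, "host": "a.example:27017"},
        {"_id": 1, "host": "b.example:27017"},
        {"_id": 2, "host": "c.example:27017"},
    ],
}

step1 = config_without_member(current, "b.example:27017")  # remove the node
# ... clear the dbpath on b.example here ...
step2 = config_with_member(step1, "b.example:27017")       # add it back
```

Each resulting document would be applied with a separate reconfig, with the dbpath cleared between the two steps. Whether this ordering is actually safe is the open question the ticket raises.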
Since this method is widely used for initial syncs, we should consult with TSEs and the Production team to evaluate the procedure.
To move forward, we need to:
- Propose a new procedure.
- Discuss the procedure with TSEs and Production for evaluation.
- Test the new procedure and show its correctness.
- Obtain approval from stakeholders.
- File documentation tickets.
Is related to:
- SERVER-48257 Reject heartbeat reconfig when running for election (Closed)
- SERVER-69861 Uninterruptible lock guard in election causes FCBIS to hang (Closed)
- SERVER-94710 It may be possible for a PSA replica set to have two primaries during upgrade / downgrade (Closed)