Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.4.0-rc3, 4.7.0
Affects Version/s: None
Component/s: Replication
Labels: None
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4
Sprint: Repl 2020-04-06, Repl 2020-04-20
Linked BF Score: 42
Confidence Status: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

There are currently two problems.

1) We do not check if we are still primary before writing down a new config document locally. Consider the following scenario:

Node1 receives a reconfig command
Node1 begins stepping down because it hears of a new term
Node1 starts killing two categories of operations: writes (and some system ops) that hold the global lock in X, IX, or S mode, and reads that encounter prepare conflicts. The replSetReconfig command does not fall into either category.
Node1 finishes killing ops and steps down, transitioning to secondary
Node1 writes down the new config document. This write takes the DB lock in X mode, but it is not killed because stepdown has already finished.

Node1's config will continue to get propagated via heartbeats even though it already stepped down.
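The interleaving above can be sketched with a toy model. This is not MongoDB code: ToyNode, step_down, store_config_unchecked, store_config_checked, and run_scenario are hypothetical names made up for illustration, and the model only captures the ordering "stepdown finishes, then the config write happens".

```python
class ToyNode:
    """Hypothetical stand-in for a replica set member; not MongoDB code."""

    def __init__(self):
        self.is_primary = True
        self.config_version = 1

    def step_down(self):
        # Models the end of stepdown: op killing is done, node is secondary.
        self.is_primary = False

    def store_config_unchecked(self, new_version):
        # Problematic path: persists the config without re-checking state.
        self.config_version = new_version

    def store_config_checked(self, new_version):
        # Guarded path: refuse to persist unless the node is still primary.
        if not self.is_primary:
            raise RuntimeError("NotPrimary: stepped down before config write")
        self.config_version = new_version


def run_scenario(store):
    """Replay the scenario: reconfig is received, stepdown completes, then the write runs."""
    node = ToyNode()
    node.step_down()  # stepdown finishes before the config write starts
    try:
        store(node, 2)
        outcome = "installed"
    except RuntimeError:
        outcome = "rejected"
    return outcome, node.config_version
```

In this sketch, passing ToyNode.store_config_unchecked to run_scenario installs the new config on a node that is already secondary, while ToyNode.store_config_checked rejects it and leaves the old config in place.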

2) The replSetReconfig command does a no-op write but does not check that the node is still primary before doing so (a similar example is readConcern: linearizable).

We end up calling onInternalOpMessage, which passes in an empty namespace. Because of this, we don't actually do the primary check in _logOpsInner, which means the reconfig no-op write can occur on a secondary.
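A minimal sketch of how a namespace-gated check lets the write through. The function log_ops_inner_toy and its parameters are hypothetical; the actual condition inside _logOpsInner is not shown in this ticket, so the code models only the behavior described here (no primary check when the namespace is empty).

```python
def log_ops_inner_toy(namespace, node_is_primary):
    """Toy model: the primary check runs only for a non-empty namespace."""
    # Assumption for illustration: an empty namespace skips the check entirely.
    if namespace and not node_is_primary:
        raise RuntimeError("NotWritablePrimary")
    return "oplog entry written"
```

Under this model, calling log_ops_inner_toy("", False) succeeds on a secondary, whereas the same call with a non-empty namespace raises.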

Since these two things should happen together to avoid any inconsistent states, we should consider refactoring the code so we can do the primary check once.
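One possible shape for such a refactor, sketched under assumptions: the ticket does not say how the check should be combined, so ToyReplNode, its lock, and its fields are hypothetical. The idea shown is that a single primary check, made while holding whatever serializes state transitions, guards both the config write and the no-op write.

```python
import threading


class ToyReplNode:
    """Hypothetical model; the lock stands in for whatever serializes state changes."""

    def __init__(self):
        self._lock = threading.Lock()
        self.is_primary = True
        self.config_version = 1
        self.oplog = []

    def step_down(self):
        with self._lock:
            self.is_primary = False

    def reconfig(self, new_version):
        with self._lock:
            # Single primary check covering both writes below.
            if not self.is_primary:
                return False
            self.config_version = new_version               # config document write
            self.oplog.append(("noop", "reconfig", new_version))  # no-op write
            return True
```

Because step_down and reconfig take the same lock in this sketch, the two writes either both happen while the node is primary or neither happens.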

depends on

SERVER-47205 Stopping dropping snapshots after safe reconfig that does not change writeConcernMajorityJournalDefault

Closed

is duplicated by

SERVER-46516 Majority write concern is blocked by dropping snapshot on reconfig

Closed

is related to

SERVER-47206 Majority commit point is not set backward after force reconfig or reconfig that changes writeConcernMajorityJournalDefault

Backlog

SERVER-46516 Majority write concern is blocked by dropping snapshot on reconfig

Closed

SERVER-47636 Force reconfig running concurrently with step up can cause reconfig in drain mode to fail

Closed

SERVER-47205 Stopping dropping snapshots after safe reconfig that does not change writeConcernMajorityJournalDefault

Closed

related to

SERVER-47184 replSetReconfig command should check if the node is primary before no-op write

Closed

SERVER-47369 doReplSetReconfig should fail during primary drain mode

Closed

SERVER-47973 Address TODOs in SERVER-47142

Closed


Assignee: Siyuan Zhou
Reporter: Pavithra Vetriselvan
Participants: Githook User, Pavithra Vetriselvan, Siyuan Zhou, Tess Avitabile
Votes: 0
Watchers: 7
Created: Mar 26 2020 11:46:35 PM UTC
Updated: Oct 29 2023 10:10:16 PM UTC
Resolved: Apr 17 2020 03:33:44 PM UTC
Confidence Status Last Update: 07/Apr/20 10:04 PM
GA Target Date: None
Public Preview Target Date: None
Private Preview Target Date: None
Experiment Target Date: None
