Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.2.1, 4.3.1, 4.0.17
Affects Version/s: 4.2.0-rc2
Component/s: Replication
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v4.2, v4.0, v3.6
Steps To Reproduce:
Add a failpoint to sync_tail.cpp and if it is set, sleep for a second after this line.

Set the failpoint in drain.js after setting the rsSyncApply failpoint.

Run jstests/replsets/drain.js
Sprint:
Repl 2019-08-12, Repl 2019-08-26, Repl 2019-09-09
Linked BF Score:
68
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

If a new primary is in drain mode and the thread that fetches the next batch from the oplog buffer is slow to run, the node can exit drain mode prematurely because it did not receive a new batch within 1 second. This is a problem because the oplog buffer may still contain entries that the node has yet to apply. Once the node exits drain mode, it writes an oplog entry in the new term. Because the thread running oplog application is not stopped when the node exits drain mode, that thread can then fetch a batch of oplog entries that precede the new-term entry. Attempting to apply them triggers an fassert, because a node cannot apply oplog entries that are before its lastApplied.
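
The sequence described above can be illustrated with a small deterministic toy model. This is only a sketch and not MongoDB code: `ToyNode`, `simulate`, the integer timestamps, and the `check_buffer` guard are all hypothetical names and simplifications introduced for illustration. In particular, `check_buffer` is not claimed to be the fix that was actually committed; it only shows that the failure disappears if the node refuses to leave drain mode while the buffer still holds entries.

```python
from collections import deque


class ToyNode:
    """Toy model of a new primary in drain mode (hypothetical, not MongoDB code)."""

    def __init__(self, buffered_ts, last_applied_ts):
        self.buffer = deque(buffered_ts)      # oplog entries still waiting to be applied
        self.last_applied = last_applied_ts   # timestamp of the last applied entry
        self.draining = True

    def next_batch(self, batcher_slow):
        # A slow batcher thread is modeled as a timed-out wait: the caller
        # sees an empty batch even though the buffer is not empty.
        if batcher_slow:
            return []
        batch = list(self.buffer)
        self.buffer.clear()
        return batch

    def apply(self, batch):
        for ts in batch:
            # Stand-in for the fatal assertion: entries at or before
            # lastApplied cannot be applied.
            if ts <= self.last_applied:
                raise RuntimeError(
                    f"cannot apply ts={ts} at or before lastApplied={self.last_applied}"
                )
            self.last_applied = ts

    def maybe_exit_drain(self, batch, new_term_ts, check_buffer):
        if not self.draining or batch:
            return
        # Hypothetical guard: stay in drain mode while entries remain buffered.
        if check_buffer and self.buffer:
            return
        self.draining = False
        # Exiting drain mode writes an oplog entry in the new term.
        self.last_applied = new_term_ts


def simulate(check_buffer):
    node = ToyNode(buffered_ts=[3, 4], last_applied_ts=2)
    # The batcher is slow on the first round only.
    for batcher_slow in (True, False, False):
        batch = node.next_batch(batcher_slow)
        try:
            node.apply(batch)
        except RuntimeError as exc:
            return f"fatal: {exc}"
        node.maybe_exit_drain(batch, new_term_ts=10, check_buffer=check_buffer)
    return f"ok: lastApplied={node.last_applied}, draining={node.draining}"


print(simulate(check_buffer=False))
print(simulate(check_buffer=True))
```

With `check_buffer=False`, the empty first batch lets the toy node leave drain mode and advance lastApplied to the new-term entry, so the buffered entries fail when they arrive in the next batch. With `check_buffer=True`, the node stays in drain mode until the buffer is empty and the run completes without error.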

related to

SERVER-42910 Oplog query with higher timestamp but lower term than the sync source shouldn't time out due to afterClusterTime

Closed

SERVER-39112 Primary drain mode can be unnecessarily slow

Closed

Assignee: Siyuan Zhou
Reporter: Samyukta Lanka
Participants: Githook User, Judah Schvimer, Samyukta Lanka, Siyuan Zhou, Will Schultz
Votes: 0
Watchers: 12

Created: Jul 12 2019 09:06:54 PM UTC
Updated: Oct 29 2023 10:19:04 PM UTC
Resolved: Aug 23 2019 11:13:13 PM UTC
Confidence Status Last Update: 07/Aug/19 9:59 PM
