-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: 4.2.0-rc2
-
Component/s: Replication
-
None
-
Fully Compatible
-
ALL
-
v4.2, v4.0, v3.6
-
-
Repl 2019-08-12, Repl 2019-08-26, Repl 2019-09-09
-
68
If a new primary is in drain mode and the thread that gets the next batch from the oplog buffer is slow to run, the node can exit drain mode prematurely because no new batch arrived within 1 second. This is a problem because the oplog buffer may still hold oplog entries that the node has to apply. Once the node exits drain mode, it writes an oplog entry in the new term. Because the thread running oplog application is not stopped when drain mode ends, it can then receive a new batch of oplog entries that precede the new-term oplog entry. Trying to apply them triggers an fassert, because a node cannot apply oplog entries that are before its lastApplied.
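The sequence above can be sketched as a small single-threaded model. This is an illustration only, not the server's code: `OpTime`, `Node`, `on_batch`, `run`, and the `check_buffer_empty` flag are hypothetical names, an empty batch stands in for the 1-second timeout, and a `RuntimeError` stands in for the fassert. The buffer-emptiness check is one possible guard and is not necessarily the fix that was committed.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class OpTime:
    # Compared as (term, ts): a higher term always sorts later.
    term: int
    ts: int


@dataclass
class Node:
    buffer: deque          # fetched oplog entries not yet handed to the applier
    last_applied: OpTime
    new_term: int
    draining: bool = True

    def on_batch(self, batch, check_buffer_empty):
        """Handle one batch; an empty batch models the batcher timing out."""
        if not batch:
            buffer_ok = (not check_buffer_empty) or (not self.buffer)
            if self.draining and buffer_ok:
                # Exit drain mode and write the first entry of the new term.
                self.draining = False
                self.last_applied = OpTime(self.new_term, self.last_applied.ts + 1)
            return
        for entry in batch:
            if entry <= self.last_applied:
                # Stand-in for the fassert described in the ticket.
                raise RuntimeError(f"cannot apply {entry} at or before {self.last_applied}")
            self.last_applied = entry


def run(check_buffer_empty):
    node = Node(
        buffer=deque([OpTime(1, 2), OpTime(1, 3)]),
        last_applied=OpTime(1, 1),
        new_term=2,
    )
    # Slow batcher: the first delivery times out although the buffer is not empty.
    node.on_batch([], check_buffer_empty)
    # The batcher finally hands over everything that was buffered.
    batch = list(node.buffer)
    node.buffer.clear()
    node.on_batch(batch, check_buffer_empty)
    # A later timeout with an empty buffer.
    node.on_batch([], check_buffer_empty)
    return node
```

With `run(check_buffer_empty=False)` the model exits drain mode on the first timeout and then raises when the old-term entries arrive. With `run(check_buffer_empty=True)` it stays in drain mode until the buffer is empty, so the new-term entry is written only after the old-term entries have been applied.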
- related to
-
SERVER-42910 Oplog query with higher timestamp but lower term than the sync source shouldn't time out due to afterClusterTime
- Closed
-
SERVER-39112 Primary drain mode can be unnecessarily slow
- Closed