It's possible for the JournalFlusher to miss the killOp interrupt if its opCtx reset lands at just the wrong moment: the killOp marks the JournalFlusher's opCtx as killed, but the JournalFlusher then resets the opCtx and never throws the expected error.
The opId that the test fetches via currentOp is associated with the JournalFlusher's opCtx at that moment, but the opCtx has changed by the time the test tries to kill the journal flusher thread via killOp. It's a small window of time.
The test sets the JournalFlusher interval (how frequently it runs) to 500 ms. We could decrease the frequency (a longer interval), but we also need the JournalFlusher to run in order to get that error thrown.
I recommend adding a new failpoint that stops the JournalFlusher before the test calls currentOp and releases it after the killOp has been sent.
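To make the proposed ordering concrete, here is a minimal standalone sketch of the idea. It is not server code: ToyOpCtx, ToyFlusher, setPause, waitUntilPaused and killObservedWithPause are all hypothetical names, the pause gate only stands in for a real failpoint, and the sketch assumes the flusher checks for interruption right after it is released and before it resets its opCtx.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Toy stand-in for an operation context: an id plus a kill flag.
struct ToyOpCtx {
    std::uint64_t opId;
    bool killed;
};

// Hypothetical model of the flusher loop with a pause gate (the "failpoint").
class ToyFlusher {
public:
    // Test side: turn the pause gate on or off.
    void setPause(bool on) {
        {
            std::lock_guard<std::mutex> lk(_mutex);
            _pause = on;
        }
        _cv.notify_all();
    }

    // Test side: block until the flusher thread is parked at the gate.
    void waitUntilPaused() {
        std::unique_lock<std::mutex> lk(_mutex);
        _cv.wait(lk, [this] { return _parked; });
    }

    // Test side: analogue of reading the opId via currentOp.
    std::uint64_t currentOpId() {
        std::lock_guard<std::mutex> lk(_mutex);
        return _opCtx.opId;
    }

    // Test side: analogue of killOp. Returns false if the id is stale.
    bool killOp(std::uint64_t opId) {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_opCtx.opId != opId) {
            return false;
        }
        _opCtx.killed = true;
        return true;
    }

    // Flusher side: one round. Returns true if this round observed a kill.
    bool runOneRound() {
        std::unique_lock<std::mutex> lk(_mutex);
        // Pause gate: park here, before the opCtx is reset, while it is on.
        _parked = true;
        _cv.notify_all();
        _cv.wait(lk, [this] { return !_pause; });
        _parked = false;

        // Assumed ordering: check for interruption, then reset the opCtx.
        bool sawKill = _opCtx.killed;
        _opCtx = ToyOpCtx{_nextOpId++, false};
        return sawKill;
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _pause = false;
    bool _parked = false;
    std::uint64_t _nextOpId = 2;
    ToyOpCtx _opCtx{1, false};
};

// Test flow: pause, fetch the opId, kill, release, then expect the interrupt.
bool killObservedWithPause() {
    ToyFlusher flusher;
    flusher.setPause(true);

    bool sawKill = false;
    std::thread flusherThread([&] {
        while (!sawKill) {
            sawKill = flusher.runOneRound();
        }
    });

    flusher.waitUntilPaused();                        // flusher is parked
    std::uint64_t opId = flusher.currentOpId();       // cannot go stale now
    bool matched = flusher.killOp(opId);              // marks the live opCtx
    flusher.setPause(false);                          // release the flusher
    flusherThread.join();
    return matched && sawKill;
}
```

Because the flusher is parked between the opId lookup and the kill, the opCtx cannot be reset inside that window, so the kill always lands on the live opCtx and is seen on the next interrupt check. In the real test the gate would be a server failpoint toggled from the test rather than a condition variable.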
Related to: SERVER-79810 make JournalFlusher::waitForJournalFlush() interruptible when waiting for write concern (Closed)