-
Type: Bug
-
Resolution: Unresolved
-
Priority: Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: None
-
Replication
-
ALL
-
Execution Team 2024-08-19, Execution Team 2024-09-02, Execution Team 2024-10-28, Repl 2024-12-09, Repl 2024-12-23, Repl 2025-01-06, Repl 2025-01-20
This a problem only to capped collection.
I noticed in the help ticket where a slow query log message reporting that a single capped insert took 2 seconds (i.e, durationMillis ), but the totalOplogSlotDurationMicros took 13 seconds (but there were zero write conflicts and oplog visibility lag is under 2secs)
When performing multiple writes within a single storage transaction and oplog slots reserved separately for each write, a timer is started for each oplog slot. As a result, totalOplogSlotDurationMicros can report the cumulative duration of all open oplog slots in that single storage transaction. Basically, we do this
WUOW start Insert Doc55 -> (Reserves oplog slot1; Timer1 starts) Delete 1 -> (Reserves oplog slot2; Timer2 starts) Delete2 -> (Reserves oplog slot2; Timer3 starts ) Delete 3 -> (Reserves oplog slot3; Timer4 starts) WUOW commit ( totalOplogSlotDurationMicros = (Timer1+Timer2+Timer3+Timer4).elapsed())
This would be a problem for capped collections because deletes are performed within the same storage transaction as inserts, with optimes reserved separately for each delete op.
In vectored inserts, even though multiple writes occur within a single storage transaction, we track the longest duration of an open oplog slot for each storage transaction (WUOW) and sum it up. This is because we reserve slots in bulk for each WUOW.
Currently, the calculation for totalOplogSlotDuration varies depending on how we reserve oplog slots (bulk vs individually) within a storage transaction (in a given WUOW) and causes confusion.
- is related to
-
SERVER-88911 WiredTigerStats reported in curOp can accumulate from previous operations in bulk writes
- Open
-
SERVER-84467 Add duration of how long an oplog slot is kept open to FTDC
- Open
-
SERVER-95422 Move CurOp code out of local_oplog_info
- Open