Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Fixed
Priority: Minor - P4
Fix Version/s: 8.2.0-rc0
Affects Version/s: None
Component/s: None
Labels:
- storex-ranked

Assigned Teams:

Replication
Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Sprint:
Execution Team 2024-08-19, Execution Team 2024-09-02, Execution Team 2024-10-28, Repl 2024-12-09, Repl 2024-12-23, Repl 2025-01-06, Repl 2025-01-20, Repl 2025-02-03, Repl 2025-02-17, Repl 2025-03-03, Repl 2025-03-17, Repl 2025-03-31, Repl 2025-04-14
Linked BF Score:
200
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

This a problem only to capped collection.

I noticed in the help ticket where a slow query log message reporting that a single capped insert took 2 seconds (i.e, durationMillis ), but the totalOplogSlotDurationMicros took 13 seconds (but there were zero write conflicts and oplog visibility lag is under 2secs)

When performing multiple writes within a single storage transaction and oplog slots reserved separately for each write, a timer is started for each oplog slot. As a result, totalOplogSlotDurationMicros can report the cumulative duration of all open oplog slots in that single storage transaction. Basically, we do this

WUOW start
Insert Doc55 -> (Reserves oplog slot1; Timer1 starts)
Delete 1 -> (Reserves oplog slot2; Timer2 starts)
Delete2 -> (Reserves oplog slot2; Timer3 starts )
Delete 3 -> (Reserves oplog slot3; Timer4 starts)
WUOW commit (
totalOplogSlotDurationMicros = (Timer1+Timer2+Timer3+Timer4).elapsed())

This would be a problem for capped collections because deletes are performed within the same storage transaction as inserts, with optimes reserved separately for each delete op.

In vectored inserts, even though multiple writes occur within a single storage transaction, we track the longest duration of an open oplog slot for each storage transaction (WUOW) and sum it up. This is because we reserve slots in bulk for each WUOW.

Currently, the calculation for totalOplogSlotDuration varies depending on how we reserve oplog slots (bulk vs individually) within a storage transaction (in a given WUOW) and causes confusion.

is related to

SERVER-88911 WiredTigerStats reported in curOp can accumulate from previous operations in bulk writes

Closed

SERVER-84467 Add duration of how long an oplog slot is kept open to FTDC

Open

SERVER-95422 Move CurOp code out of local_oplog_info

Closed

Assignee:: Solomon Lifshits
Reporter:: Suganthi Mani
Participants:: Githook User, Solomon Lifshits, Suganthi Mani
Votes:: 0 Vote for this issue
Watchers:: 9 Start watching this issue

Created:: Apr 09 2024 05:48:03 AM UTC
Updated:: Apr 10 2025 05:44:19 AM UTC
Resolved:: Apr 08 2025 02:11:16 PM UTC
Confidence Status Last Update:: 17/Mar/25 5:58 PM

Details

Description

Attachments

Issue Links

Activity

People

Dates