Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Storage Execution
The current gauge metrics (serverstatus_globallock_currentqueue_readers/writers) are sampled only at each monitoring ping (every 5 minutes), so they miss short-lived queue spikes that occur between pings.
We need counter metrics that track the cumulative number of operations added to each queue since server start, so that rates can be calculated over arbitrary time periods.
For pre-7.0 versions, queue size is derived as (serverStatus.globalLock.currentQueue.readers + serverStatus.globalLock.activeClients.readers - serverStatus.wiredTiger.concurrentTransactions.read.out), but this is still only a point-in-time value.
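As an illustration of the pre-7.0 calculation, here is a minimal Python sketch that applies the formula to one serverStatus document already loaded as a dict. The function name and the sample values are hypothetical, and the queue field is assumed to be `currentQueue.readers`, as in the serverStatus documentation.

```python
def queued_readers(status: dict) -> int:
    """Estimate the read queue size from a single serverStatus snapshot.

    Applies the pre-7.0 formula from this ticket:
    currentQueue.readers + activeClients.readers - read tickets out.
    """
    global_lock = status["globalLock"]
    queued = global_lock["currentQueue"]["readers"]
    active = global_lock["activeClients"]["readers"]
    tickets_out = status["wiredTiger"]["concurrentTransactions"]["read"]["out"]
    return queued + active - tickets_out


# Hypothetical snapshot, values chosen only for illustration.
sample_status = {
    "globalLock": {
        "currentQueue": {"readers": 3},
        "activeClients": {"readers": 128},
    },
    "wiredTiger": {"concurrentTransactions": {"read": {"out": 128}}},
}

print(queued_readers(sample_status))
```

The result describes only the instant the snapshot was taken, which is the limitation this ticket is about.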
Proposed solution: add two new counters:
- serverStatus.globalLock.totalQueuedReaders
- serverStatus.globalLock.totalQueuedWriters
Each counter would increment when an operation joins the corresponding queue and reset only on node restart, so a queueing rate can be calculated between any two timestamps.
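A rough Python sketch of how a monitoring client might turn two samples of such a counter into a rate. The counters do not exist yet, so the `Sample` type, its field names, and the choice to return `None` after a restart are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    timestamp: float           # seconds since epoch when the sample was taken
    total_queued_readers: int  # proposed cumulative counter (hypothetical)


def queue_rate(prev: Sample, curr: Sample) -> Optional[float]:
    """Return operations queued per second between two samples.

    Returns None when no rate can be computed: the samples are not in
    time order, or the counter went backwards (node restarted in between).
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        return None
    delta = curr.total_queued_readers - prev.total_queued_readers
    if delta < 0:
        # The counter resets only on restart, so a decrease means a restart.
        return None
    return delta / elapsed


# Two samples taken 300 seconds (one 5-minute ping) apart.
before = Sample(timestamp=0.0, total_queued_readers=100)
after = Sample(timestamp=300.0, total_queued_readers=700)
print(queue_rate(before, after))  # 600 operations queued in 300 s -> 2.0
```

Because the counter is cumulative, a spike that begins and ends between the two samples still contributes to the difference, which a gauge sampled at the same two instants would not show.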
Use Case
Detect system overload by tracking queue growth rates, in particular short-lived spikes that begin and end between two 5-minute monitoring samples and are therefore invisible to the current gauges.