-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Server Programmability
We're using futexes to wait on an atomic in the TicketPool, which is a component of the Execution Control rate-limiter and scheduler. This is how we queue waiters and wake them up when a ticket is available.
The problem is that this wait is not interruptible by the MongoDB interruption mechanism, which just sets an atomic flag. To work around that problem, we wait for 500ms and then re-queue.
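To make the workaround concrete, here is a minimal Linux-only sketch of a timed futex wait that re-checks an interrupt flag after every slice. It is not the actual TicketPool code: `WaitResult`, `futexWaitFor`, `futexWakeOne`, `acquireTicket`, and `releaseTicket` are hypothetical names, and a plain `std::atomic<bool>` stands in for the real interruption mechanism.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <ctime>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical stand-ins for illustration; not MongoDB's real types.
enum class WaitResult { kAcquired, kInterrupted };

// The futex syscall operates on a plain 32-bit word. This sketch assumes
// std::atomic<uint32_t> has the same layout, which holds on mainstream Linux ABIs.
static_assert(sizeof(std::atomic<uint32_t>) == sizeof(uint32_t), "unexpected atomic layout");

// Sleeps while *word == expected, for at most timeoutMs milliseconds.
// Returns on wake-up, timeout, signal, or value mismatch; the caller re-checks state.
inline void futexWaitFor(std::atomic<uint32_t>* word, uint32_t expected, long timeoutMs) {
    timespec ts{timeoutMs / 1000, (timeoutMs % 1000) * 1000000L};
    syscall(SYS_futex, reinterpret_cast<uint32_t*>(word), FUTEX_WAIT_PRIVATE,
            expected, &ts, nullptr, 0);
}

// Wakes at most one thread sleeping on the word.
inline void futexWakeOne(std::atomic<uint32_t>* word) {
    syscall(SYS_futex, reinterpret_cast<uint32_t*>(word), FUTEX_WAKE_PRIVATE,
            1, nullptr, nullptr, 0);
}

// Takes one ticket, or returns kInterrupted once the flag is observed.
// The flag is only checked between slices, so interruption latency is up to sliceMs.
inline WaitResult acquireTicket(std::atomic<uint32_t>& tickets,
                                const std::atomic<bool>& interrupted,
                                long sliceMs = 500) {
    assert(sliceMs > 0);
    for (;;) {
        if (interrupted.load(std::memory_order_acquire)) {
            return WaitResult::kInterrupted;
        }
        uint32_t available = tickets.load(std::memory_order_acquire);
        while (available > 0) {
            // On failure, compare_exchange_weak reloads `available` and we retry.
            if (tickets.compare_exchange_weak(available, available - 1,
                                              std::memory_order_acq_rel)) {
                return WaitResult::kAcquired;
            }
        }
        // No ticket available: sleep until woken or the slice expires, then
        // loop again. This re-queue step is the source of the extra wake-ups.
        futexWaitFor(&tickets, 0, sliceMs);
    }
}

// Returns a ticket to the pool and wakes one waiter.
inline void releaseTicket(std::atomic<uint32_t>& tickets) {
    tickets.fetch_add(1, std::memory_order_release);
    futexWakeOne(&tickets);
}
```

In this sketch setting the flag does not wake a sleeping waiter, which is why the wait has to be sliced at all.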
If we enter a state where most operations queue for more than 500ms, we'll likely enter an undesirable metastable failure state where every operation is queueing, timing out, and re-queueing, which does extra context switching and wastes CPU. 500ms is a long interval, but each queued thread still wakes up twice per second, so with thousands of client threads this adds up to thousands of extra wake-ups per second and could be problematic.
It would be nice to have something like OperationContext::waitForAtomicOrInterrupt. On Linux, futex_waitv (available since kernel 5.16) supports waiting on multiple futex words at once, so I don't believe this would be complicated to support. A futex word is 32 bits, and our error codes are all less than the uint32_t max value, so an error code would fit in one. One challenge is that just using a futex would circumvent the existing behavior of waitForConditionOrInterrupt, which allows waiters to actively participate in network IO. That said, the ticketholder today is already not participating in this system, so this would not be a change of behavior.