Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-63883

Report operation latency from receiving request to sending response

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Service Arch
    • Service Arch 2022-09-19, Service Arch 2022-10-03, Service Arch 2023-02-20
    • 105

      The slow query latencies reported in the mongod log and in ftdc are based on a timer in CurOp that does not include substantial mongod code path from the time the request is received until the timer is started, and from when the timer is stopped until the response is transmitted. This can result in the mongod latencies being substantially underreported in the logs and in ftdc. Some examples:

      • If the server is bottlenecked on CPU with a large number of concurrent simple queries, we've seen client-side latencies of >10 ms (most of which was CPU time in mongod) whereas latencies reported in FTDC and logs are <<1 ms, because the untimed overhead was much larger than the time needed to actually execute the query.
      • We have seen bottlenecks in places like the allocator or SSL processing that resulted in significant query latency but which is not reported in FTDC or logs because it was outside the timed code path. Reporting the entire duration including the entire request/response lifecycle code path would help us diagnose problems outside the currently timed code path.

      Would it be possible to start a timer when recv() returns a new request from the o/s, and stop it when send() returns from sending the response to the o/s? TBD how or where this would be reported - in place of current latency metrics in FTDC and logs? As new field in slow query logs alongside durationMillis? As new separate log entry with its own durationMillis?

            Assignee:
            shameek.ray@mongodb.com Shameek Ray
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            15 Start watching this issue

              Created:
              Updated: