Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-32351

The task-specific totalTimeSpentExecutingMetrics should not count time they are not actually running due to recursion

    • Type: Icon: Bug Bug
    • Resolution: Won't Fix
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Diagnostics
    • Service Arch
    • ALL

      The above chart presents some overall and some task-specific serviceExecutorTaskStats.metricsByTask.* execution time metrics introduced in SERVER-30567.

      • "tasks total currently executing" is the overall number of tasks currently executing, derived from serviceExecutorTaskStats.totalTimeExecutingMicros
      • "tasks source currently executing" is sourceMessage tasks currently executing, derived from serviceExecutorTaskStats.metricsByTask.sourceMessage.totalTimeExecutingMicros
      • "tasks process currently executing" is processMessage tasks currently executing, derived from serviceExecutorTaskStats.metricsByTask.processMessage.totalTimeExecutingMicros

      Note that, surprisingly, the sum of the two task-specific execution time metrics is 2x the total execution time metrics, making it appear that there are ~500 total tasks executing at the same time (~250 processMessage tasks and ~250 sourceMessage tasks), even though there are only ~250 threads.

      I believe this is because a sourceMessage task can be called recursively on the same stack as a processMessage task by dispatching the task rather than posting it, but when this happens the timer for the processMessage task continues running even though the task itself is not executing. So if this recursion occurs the time for the recursive sourceMessage task gets added to the time for the outer processMessage task, so that time is double-counted.

      (On the other hand the overall serviceExecutorTaskStats.totalTimeExecutingMicros is correct and it does not double-count time in this way because the timer is only started and stopped based on posted, not dispatched, tasks.)

            Assignee:
            backlog-server-servicearch [DO NOT USE] Backlog - Service Architecture
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: