  1. Core Server
  2. SERVER-88387

Have the classic find sort & DocumentSourceSort release their sort tables when they are done outputting all data

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Execution
    • Fully Compatible

      In the context of the cursor storm issue, I was examining the find sort and DocumentSourceSort code and found that they do not release their sort tables even after they have finished outputting all data.

      Consider releasing the sort table once the sort has finished outputting all data, if this is really the case.

      Update #1:
      To limit the scope of this ticket, we decided to remove this goal.

      -And also consider defining the desired behavior (or protocol) for how blocking stages handle their own memory once they are done outputting their data.

      Currently, DocumentSourceGroup disposes of itself when it is done outputting its data.

      And also consider defining the desired behavior (or protocol) for how blocking stages handle their child's memory once they have exhausted the child's input.

      The SBE stages close their child once they have exhausted it, but the classic find/aggregate blocking stages do not.

      Update #2: we had an offline discussion.

      For the DocumentSourceSort issue, the in-memory sort table is actually released when SortExecutor resets _output. The issue actually does not exist.
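
      As an illustration of why resetting an owning member releases the table, here is a minimal sketch. It is not the real SortExecutor: ToySortExecutor and all of its methods are hypothetical, and only the member name _output is taken from this ticket. It assumes the sorted rows are owned through a pointer, so that resetting the pointer at EOF destroys the whole buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy, NOT the real SortExecutor. Only the name `_output`
// comes from the ticket; everything else is an assumption for illustration.
class ToySortExecutor {
public:
    // Input phase: collect one row. Not allowed once loading is done.
    void add(int value) {
        assert(!_loaded);
        _input.push_back(value);
    }

    // Ends the input phase: sorts the rows and moves them into _output.
    void loadingDone() {
        std::sort(_input.begin(), _input.end());
        _output = std::make_unique<std::vector<int>>(std::move(_input));
        _input = std::vector<int>{};  // put the moved-from vector in a known state
        _loaded = true;
    }

    // Output phase: returns the next row, or nullopt at EOF.
    // Reaching EOF resets _output, which frees the in-memory sort table.
    std::optional<int> getNext() {
        if (!_output) return std::nullopt;
        if (_next == _output->size()) {
            _output.reset();
            return std::nullopt;
        }
        return (*_output)[_next++];
    }

    bool holdsSortTable() const { return _output != nullptr; }

private:
    std::vector<int> _input;
    std::unique_ptr<std::vector<int>> _output;
    std::size_t _next = 0;
    bool _loaded = false;
};
```

      In this sketch the table is still held after the last row is returned and is only freed by the call that reports EOF, so a caller that stops early would keep the memory alive.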

      Conceptually, any blocking stage has two phases: input phase & output phase.

      • The input phase is the period when the blocking stage reads input from its source or child
      • The output phase is the period when the blocking stage outputs data from its internal state

      I think it would be desirable that either 1) every blocking stage consistently releases its memory when it is done outputting data, or 2) every blocking stage consistently releases its child's or source's memory when it exhausts its input. It is important for every stage to behave consistently with respect to query resource management; otherwise, it is hard to keep track of overall query resource usage in one's mental model. It would be worth checking DocumentSourceGroup in this respect.
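
      The two phases and the two candidate protocols can be sketched as follows. This is a toy model under stated assumptions, not server code: ToyStage, ToyScan and ToyBlockingReverse are hypothetical names, and the sketch applies both options at once only to show where each release point would sit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stage interface used only by this sketch.
struct ToyStage {
    virtual ~ToyStage() = default;
    virtual std::optional<int> getNext() = 0;
};

// Hypothetical leaf stage that owns a buffer of rows.
class ToyScan : public ToyStage {
public:
    explicit ToyScan(std::vector<int> rows) : _rows(std::move(rows)) {}
    std::optional<int> getNext() override {
        if (_pos == _rows.size()) return std::nullopt;
        return _rows[_pos++];
    }

private:
    std::vector<int> _rows;
    std::size_t _pos = 0;
};

// Hypothetical blocking stage: reads all input, then outputs it in reverse.
class ToyBlockingReverse : public ToyStage {
public:
    explicit ToyBlockingReverse(std::unique_ptr<ToyStage> child)
        : _child(std::move(child)), _state(std::vector<int>{}) {
        assert(_child);
    }

    std::optional<int> getNext() override {
        if (_child) {
            // Input phase: drain the child into internal state.
            while (auto row = _child->getNext()) _state->push_back(*row);
            // Option 2: release the child as soon as its input is exhausted.
            _child.reset();
        }
        // Output phase: serve rows from internal state.
        if (!_state) return std::nullopt;
        if (_state->empty()) {
            // Option 1: release own memory once all output has been produced.
            _state.reset();
            return std::nullopt;
        }
        int row = _state->back();
        _state->pop_back();
        return row;
    }

    bool holdsChild() const { return _child != nullptr; }
    bool holdsState() const { return _state.has_value(); }

private:
    std::unique_ptr<ToyStage> _child;
    std::optional<std::vector<int>> _state;
};
```

      With a rule like this applied by every stage, the memory held at any point can be reasoned about from the phase each stage is in, without reading each implementation.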

      There are other cases as well. For example, it would be better for DocumentSourceUnionWith to release each source's memory before reading the next source's input, because each source may contain blocking or buffering stages. It would be worth checking DocumentSourceUnionWith in this respect. If every source of the unionWith stage releases its resources when it is done outputting data, that should also be fine.
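
      A minimal sketch of the ordering suggested for the union case, assuming a union stage that reads its sources one after another. ToySource and ToyUnion are hypothetical and do not reflect how DocumentSourceUnionWith is actually implemented; the point is only that an exhausted source is destroyed before the next one is read.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical source that buffers its rows, standing in for a sub-pipeline
// that may contain blocking or buffering stages.
class ToySource {
public:
    explicit ToySource(const std::vector<int>& rows) : _rows(rows.begin(), rows.end()) {}
    std::optional<int> getNext() {
        if (_rows.empty()) return std::nullopt;
        int row = _rows.front();
        _rows.pop_front();
        return row;
    }

private:
    std::deque<int> _rows;
};

// Hypothetical union stage: reads the sources in order and destroys each one
// as soon as it is exhausted, before touching the next source.
class ToyUnion {
public:
    explicit ToyUnion(std::vector<std::unique_ptr<ToySource>> sources)
        : _sources(std::move(sources)) {}

    std::optional<int> getNext() {
        while (_current < _sources.size()) {
            assert(_sources[_current]);
            if (auto row = _sources[_current]->getNext()) return row;
            _sources[_current].reset();  // free this source's memory first
            ++_current;
        }
        return std::nullopt;
    }

    // Number of sources that have not been released yet.
    std::size_t liveSources() const {
        std::size_t live = 0;
        for (const auto& source : _sources) {
            if (source) ++live;
        }
        return live;
    }

private:
    std::vector<std::unique_ptr<ToySource>> _sources;
    std::size_t _current = 0;
};
```

      In this toy, at most one exhausted source is alive at a time, so peak memory is bounded by the largest single source plus the sources that have not been read yet.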

      Lastly, DocumentSourceCursor is a buffering stage that uses a deque container to buffer its input. It would be worth checking whether DocumentSourceCursor releases its deque container after outputting its data.
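
      When checking this, note that emptying a std::deque is not the same as releasing its storage: clear() is not required to return memory, and shrink_to_fit() is a non-binding request. A sketch of one way a buffering stage could hand its storage back, using a hypothetical ToyBufferingCursor that is not the real DocumentSourceCursor:

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <vector>

// Hypothetical buffering stage, NOT the real DocumentSourceCursor.
class ToyBufferingCursor {
public:
    // Buffers one batch of rows read from the (imaginary) underlying input.
    void loadBatch(const std::vector<int>& batch) {
        assert(!_exhausted);
        _buffer.insert(_buffer.end(), batch.begin(), batch.end());
    }

    // Marks that no further batches will arrive.
    void markInputExhausted() { _exhausted = true; }

    // Returns the next buffered row. In this toy, nullopt means either
    // "buffer empty, load more" or, once input is exhausted, EOF.
    std::optional<int> getNext() {
        if (_buffer.empty()) {
            if (_exhausted) releaseBuffer();
            return std::nullopt;
        }
        int row = _buffer.front();
        _buffer.pop_front();
        return row;
    }

    bool bufferReleased() const { return _released; }

private:
    void releaseBuffer() {
        // Swap with a temporary so the old storage is owned by an object
        // that is destroyed at the end of this statement.
        std::deque<int>().swap(_buffer);
        _released = true;
    }

    std::deque<int> _buffer;
    bool _exhausted = false;
    bool _released = false;
};
```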

      While we're doing this, it would be better to focus on big memory consumers.

            Assignee:
            stephanie.eristoff@mongodb.com Stephanie Eristoff
            Reporter:
            yoonsoo.kim@mongodb.com Yoon Soo Kim