  1. Core Server
  2. SERVER-101722

Add StageConstraints to LiteParsedDocumentSource

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Integration

      For context, there are two types of DocumentSource / Pipeline classes:

      • LiteParsedDocumentSource / LiteParsedPipeline: Each stage is a direct representation of the BSON the user provided, so the pipeline closely mirrors the original user-provided query. If a stage is implemented by desugaring or by introducing internal-only stages, that is not yet reflected here; neither are optimizations or view resolutions. This representation is built early in query processing, does minimal validation, and is mostly used for reference by the main query processing code.
      • DocumentSource / Pipeline: The stages (and list of stages) that actually execute to get the results of the query. These stages may have been desugared or converted into other, possibly internal-only, stages and then reordered by the optimizer. As such, they may not obviously represent the original user query, though they logically produce its results. This representation is introduced later in query processing.

      The main DocumentSource class has a notion of "StageConstraints", which are conceptual tags describing the behavior of the stage. For example, the StageConstraints may specify that a stage can only be the first stage in a pipeline, or that a stage produces sorted results (which could be true for many stages). With these StageConstraints available on any DocumentSource, generic validation, processing, or optimization logic can be written in a function that is agnostic to any particular stage and avoids special-casing. For example, Pipeline::validateCommon() can validate that the structure of a pipeline is valid prior to executing it, without knowing what the specific stages in the pipeline are, by checking that the StageConstraints on the abstract DocumentSources are respected.
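
      As a rough illustration of the stage-agnostic validation described above, here is a minimal sketch. It does not reproduce the server's actual definitions: the StageConstraints struct is reduced to a single position field, and MustBeFirstStage, AnywhereStage, and the free function validateCommon() are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Simplified stand-in for StageConstraints; the real struct carries more tags.
struct StageConstraints {
    enum class PositionRequirement { kNone, kFirst, kLast };
    PositionRequirement requiredPosition = PositionRequirement::kNone;
};

// Hypothetical abstract stage: every concrete stage reports its own constraints.
class DocumentSource {
public:
    virtual ~DocumentSource() = default;
    virtual StageConstraints constraints() const = 0;
};

// Hypothetical stage that may only appear first in a pipeline.
class MustBeFirstStage final : public DocumentSource {
public:
    StageConstraints constraints() const override {
        StageConstraints c;
        c.requiredPosition = StageConstraints::PositionRequirement::kFirst;
        return c;
    }
};

// Hypothetical stage with no positional requirement.
class AnywhereStage final : public DocumentSource {
public:
    StageConstraints constraints() const override {
        return StageConstraints{};
    }
};

using Pipeline = std::vector<std::unique_ptr<DocumentSource>>;

// Generic check: never inspects the concrete stage type, only its constraints.
bool validateCommon(const Pipeline& pipeline) {
    using Position = StageConstraints::PositionRequirement;
    for (std::size_t i = 0; i < pipeline.size(); ++i) {
        assert(pipeline[i] != nullptr);  // precondition: no null stages
        const Position pos = pipeline[i]->constraints().requiredPosition;
        if (pos == Position::kFirst && i != 0) {
            return false;
        }
        if (pos == Position::kLast && i + 1 != pipeline.size()) {
            return false;
        }
    }
    return true;
}
```

      In this sketch, a pipeline of [MustBeFirstStage, AnywhereStage] passes, while [AnywhereStage, MustBeFirstStage] is rejected, and validateCommon() needs no per-stage special case to decide that.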

      Ideally, we should have a similar concept for LiteParsedDocumentSource. All of the aforementioned concepts apply equally when we have a set of stages that is representative of the original BSON the user provided. This way we could, for example, agnostically validate that some stage must be the first stage in a pipeline, without knowing exactly what stage it is, in the LiteParsedPipeline logic (i.e. LiteParsedPipeline::validate()). Some logic that currently lives in the Pipeline code could even be migrated to the LiteParsedPipeline code after this change, if all the necessary information exists earlier in query processing.

      It's unclear whether the exact StageConstraints structure should be added to LiteParsedDocumentSource, or something similar.
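
      One possible shape for the lite-parsed side is sketched below, purely as a discussion aid. Everything here is an assumption: LiteStageConstraints, LiteParsedStage, and firstViolation() are hypothetical names, and the sketch deliberately uses a small separate struct rather than the existing StageConstraints, since that choice is still open. Because lite-parsed stages mirror the user's BSON, a violation can be reported against the stage exactly as the user wrote it.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, reduced set of constraints known at lite-parse time.
struct LiteStageConstraints {
    bool mustBeFirst = false;
};

// Hypothetical lite-parsed stage: the name as written in the user's BSON,
// plus whatever constraints that stage declares.
struct LiteParsedStage {
    std::string name;
    LiteStageConstraints constraints;
};

// Returns the index of the first stage that violates its constraints,
// or std::nullopt if the pipeline is structurally valid.
std::optional<std::size_t> firstViolation(const std::vector<LiteParsedStage>& stages) {
    for (std::size_t i = 0; i < stages.size(); ++i) {
        assert(!stages[i].name.empty());  // every stage comes from a named BSON field
        if (stages[i].constraints.mustBeFirst && i != 0) {
            return i;
        }
    }
    return std::nullopt;
}
```

      A caller such as a LiteParsedPipeline::validate()-style function could use the returned index to build an error message naming stages[i].name, before any DocumentSource is constructed.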

            Assignee:
            Unassigned
            Reporter:
            joseph.shalabi@mongodb.com Joe Shalabi
            Votes:
            0
            Watchers:
            2

              Created:
              Updated: