SERVER-91197

Allow $elemMatch expressions to be swapped before $map projections

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      Context:

      Consider the following aggregation query:

      [
          {
              $addFields: {
                  flattened: {
                      $map: {input: "$outer", as: "iter", in: "$$iter.inner"},
                  },
              },
          },
          {
              $match: {flattened: {$elemMatch: {$eq: true}}},
          },
      ]

      Currently, the pipeline optimisation rules try to swap the $match stage ahead of the $addFields stage, which implies renaming "flattened" to "outer.inner". As a result, the generated filter for the underlying find-land query looks like:

      {filter: { "outer.inner": {$elemMatch: {$eq: true}}}}

      This is wrong.

      The field outer is an array, so the path outer.inner triggers array traversal. Consequently, {$elemMatch: {$eq: true}} would be matched against the individual, non-array inner values rather than against the array of inner values that $map produces, which differs from the original behaviour.

      Proposed solution:

      As asya.kamsky@mongodb.com proposed, dotted $elemMatch expressions can be renamed in a different way, such that the semantics are preserved after the expression rewrite.

      Instead of generating 

      "outer.inner":{$elemMatch:{$eq:true}} 

      the rewrite engine could output

      "outer":{$elemMatch:{inner:true}} 
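
      The difference between the three forms can be illustrated with a small toy model. This is only a sketch: Element, Document and the three functions are hypothetical names, the matching logic is a deliberately simplified stand-in for the server's semantics, and it assumes every inner value is a scalar boolean (never an array).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy document model: {outer: [{inner: <bool>}, ...]}.
// Assumption: every `inner` value is a scalar boolean, never an array.
struct Element {
    bool inner;
};

struct Document {
    std::vector<Element> outer;
};

// Original pipeline: $map builds the array of inner values, then
// {$elemMatch: {$eq: true}} is tested against the elements of that array.
bool matchesOriginalPipeline(const Document& doc) {
    std::vector<bool> flattened;
    for (const Element& e : doc.outer) {
        flattened.push_back(e.inner);
    }
    return std::any_of(flattened.begin(), flattened.end(),
                       [](bool v) { return v; });
}

// Naive rename: {"outer.inner": {$elemMatch: {$eq: true}}}.
// Traversing outer.inner yields each scalar `inner` value. $elemMatch only
// applies to array values, so under the scalar assumption nothing can match.
bool matchesNaiveRename(const Document& doc) {
    (void)doc;  // no scalar value can satisfy $elemMatch in this model
    return false;
}

// Proposed rewrite: {"outer": {$elemMatch: {inner: true}}}.
// $elemMatch is applied to the `outer` array and tests `inner` per element.
bool matchesProposedRewrite(const Document& doc) {
    return std::any_of(doc.outer.begin(), doc.outer.end(),
                       [](const Element& e) { return e.inner; });
}
```

      In this model, a document whose outer array contains one element with inner set to true matches the original pipeline and the proposed rewrite, but not the naive rename.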

      As things currently stand, the renaming implementation doesn't have sufficient context to distinguish user-provided dots from $map-generated ones. This could be addressed in a series of sub-tasks:

      Step 1.0: Introduce a new type: expression::RenameMapping

      • Currently, renames are represented as a StringMap<std::string>.
      • This is not expressive enough. We need to know which dot was introduced by $map.
      • For now we could alias it to absl::flat_hash_map<FieldRef, FieldRef>.
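
      A rough, self-contained sketch of what such a mapping could look like follows. It does not use the server's FieldRef or absl: PathRef and PathRefHash are hypothetical stand-ins, and std::unordered_map stands in for absl::flat_hash_map.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for FieldRef: a path stored as its components,
// so "outer.inner" is {"outer", "inner"}.
struct PathRef {
    std::vector<std::string> parts;
    bool operator==(const PathRef& other) const { return parts == other.parts; }
};

// Simple component-wise hash so PathRef can be used as a map key.
struct PathRefHash {
    std::size_t operator()(const PathRef& p) const {
        std::size_t h = 0;
        for (const std::string& part : p.parts) {
            h = h * 31 + std::hash<std::string>{}(part);
        }
        return h;
    }
};

// Stand-in for the proposed expression::RenameMapping alias
// (std::unordered_map is used here instead of absl::flat_hash_map).
using RenameMapping = std::unordered_map<PathRef, PathRef, PathRefHash>;

// Builds the mapping implied by the example pipeline:
// "flattened" -> "outer.inner".
RenameMapping exampleMapping() {
    RenameMapping renames;
    renames[PathRef{{"flattened"}}] = PathRef{{"outer", "inner"}};
    assert(renames.size() == 1);
    return renames;
}
```

      Keeping the target as a structured path rather than a flat string is what leaves room for recording, in a later step, which of its dots came from $map.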

       
       
      Step 2: Make FieldRef aware of which part was generated by $map.
      Why: If the dot was generated by $map, outer.inner should then be split into [<elemMatch path>, <child prefix>].

      • Both parts could have multiple dots, which weren't necessarily introduced by $map.
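
      The split could be sketched as below. MapAwarePath, mapDot, join and splitAtMapDot are hypothetical names introduced for illustration; the real FieldRef has a different interface.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical path type that records which dot, if any, came from $map.
struct MapAwarePath {
    std::vector<std::string> parts;
    // Index i means the dot between parts[i] and parts[i + 1] was
    // introduced by $map. std::nullopt means all dots are user-provided.
    std::optional<std::size_t> mapDot;
};

// Joins parts[begin, end) with dots.
std::string join(const std::vector<std::string>& parts,
                 std::size_t begin, std::size_t end) {
    std::string out;
    for (std::size_t i = begin; i < end; ++i) {
        if (i > begin) out += '.';
        out += parts[i];
    }
    return out;
}

// Returns {elemMatch path, child prefix}, or std::nullopt when no dot was
// generated by $map and the current behaviour should be kept.
std::optional<std::pair<std::string, std::string>> splitAtMapDot(
    const MapAwarePath& path) {
    if (!path.mapDot) return std::nullopt;
    const std::size_t cut = *path.mapDot + 1;
    assert(cut < path.parts.size());
    return std::make_pair(join(path.parts, 0, cut),
                          join(path.parts, cut, path.parts.size()));
}
```

      For a path a.outer.inner.b whose $map-generated dot sits between outer and inner, this yields the pair "a.outer" and "inner.b", so both halves keep their user-provided dots.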

       

      Step 3.0: Update the renaming algorithm.
      Make wouldRenameSucceed generate multiple Renamables. [Source]

      • Currently wouldRenameSucceed can only generate one rewritten path. This won't suffice.
      • If we encounter an $elemMatch with a $map-generated dot:
        • Split the rename into head (rename for $elemMatch) and tail (prefix for children).
        • Recurse into the children with a new RenameMapping (see Step 1.1) which would prefix the children's paths with tail.
        • This should be done via hasOnlyRenameableMatchExpressionChildren, which, contrary to its name, also mutates the list of renamables.
      • Otherwise keep the current behaviour.

       

      Step 1.1: Extend expression::RenameMapping to support adding prefixes.

      • This could simply be represented by a single FieldRef.
      • This type of rename mapping should:
        • When faced with a non-existent path (boost::none), set it to prefix.
        • When faced with an existing path, prefix it with prefix.
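
      The prefix rule, and how it combines with the head/tail split from Step 3.0, might look roughly like this. MatchNode, applyPrefix and renameElemMatch are hypothetical names, std::optional stands in for boost::optional, and only direct children are handled rather than a full recursion.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, heavily simplified match-expression node.
struct MatchNode {
    std::string op;                   // e.g. "$elemMatch", "$eq"
    std::optional<std::string> path;  // std::nullopt plays the role of boost::none
    std::vector<MatchNode> children;
};

// Prefix rule from Step 1.1:
//   no path        -> the path becomes the prefix
//   existing path  -> the path becomes prefix + "." + path
std::optional<std::string> applyPrefix(const std::optional<std::string>& path,
                                       const std::string& prefix) {
    if (!path) return prefix;
    return prefix + "." + *path;
}

// Rewrites an $elemMatch whose renamed path was split at the $map-generated
// dot: the node itself takes `head`, its direct children are prefixed
// with `tail`.
void renameElemMatch(MatchNode& node, const std::string& head,
                     const std::string& tail) {
    assert(node.op == "$elemMatch");
    node.path = head;
    for (MatchNode& child : node.children) {
        child.path = applyPrefix(child.path, tail);
    }
}
```

      Applied to the example, the $elemMatch on flattened with a pathless $eq child becomes an $elemMatch on outer whose $eq child has the path inner, which corresponds to "outer":{$elemMatch:{inner:true}}.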

       

      Step 3.1: Extend the renaming algorithm to support adding prefixes.

      • We can't really do that now if the path is boost::none. [Source]

            Assignee: Unassigned
            Reporter: Catalin Sumanaru (catalin.sumanaru@mongodb.com)
            Votes: 0
            Watchers: 3
