  1. Core Server
  2. SERVER-85964

Improve SBE join strategy selection

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Execution

      When choosing the join strategy for a $lookup, if no suitable index is available on the foreign collection (which would trigger LookupStrategy::kIndexedLoopJoin), SBE currently decides between LookupStrategy::kHashJoin and LookupStrategy::kNestedLoopJoin based entirely on the stats of the foreign collection. As a result, if the foreign collection has stats and is small enough, HashJoin is chosen even when the local collection contains only one document. In edge cases like this, NestedLoopJoin would be faster.
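
      The current decision described above can be sketched roughly as follows. This is an illustration only, not the server's actual code: the enum is a stand-in for the strategies named in this ticket, and `CollectionStats`, `HashJoinLimits`, `chooseStrategyCurrent`, and the specific size checks are hypothetical names and assumptions made for the example.

```cpp
#include <cstdint>
#include <optional>

// Stand-in for the strategies named in this ticket (not the server's declaration).
enum class LookupStrategy { kIndexedLoopJoin, kHashJoin, kNestedLoopJoin };

// Hypothetical summary of collection stats.
struct CollectionStats {
    std::uint64_t numDocs;
    std::uint64_t dataSizeBytes;
};

// Hypothetical limits that define "small enough" for the foreign collection.
struct HashJoinLimits {
    std::uint64_t maxForeignDocs;
    std::uint64_t maxForeignDataSizeBytes;
};

// Sketch of the current behavior: the local collection is never consulted.
LookupStrategy chooseStrategyCurrent(bool foreignHasSuitableIndex,
                                     const std::optional<CollectionStats>& foreignStats,
                                     const HashJoinLimits& limits) {
    if (foreignHasSuitableIndex) {
        return LookupStrategy::kIndexedLoopJoin;
    }
    // HashJoin requires foreign stats that show a small enough collection.
    if (foreignStats && foreignStats->numDocs <= limits.maxForeignDocs &&
        foreignStats->dataSizeBytes <= limits.maxForeignDataSizeBytes) {
        return LookupStrategy::kHashJoin;
    }
    return LookupStrategy::kNestedLoopJoin;
}
```

      Because the function takes no information about the local collection, a one-document local collection and a very large one receive the same answer.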

      The join strategy selection should be improved to veto HashJoin when stats are available for the local collection AND they show it has a very small number of documents (maybe < 1,000? Some experimentation is needed to find a reasonable cutoff point; in practice it may be as low as single digits). If no stats are available for the local collection, selection should continue to assume that the hash table will pay for itself and choose HashJoin, as that will be the more common case in practice.
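
      A minimal sketch of the proposed veto, under stated assumptions: `chooseStrategyProposed`, its parameters, and the `minLocalDocsForHashJoin` cutoff are hypothetical names for illustration, the enum is a stand-in for the strategies named in this ticket, and the foreign-side check is collapsed into a single boolean. The cutoff value itself is still to be determined by experimentation.

```cpp
#include <cstdint>
#include <optional>

// Stand-in for the strategies named in this ticket (not the server's declaration).
enum class LookupStrategy { kIndexedLoopJoin, kHashJoin, kNestedLoopJoin };

// localNumDocs is std::nullopt when no stats exist for the local collection.
// minLocalDocsForHashJoin is a hypothetical, tunable cutoff.
LookupStrategy chooseStrategyProposed(bool foreignHasSuitableIndex,
                                      bool foreignSmallEnoughForHashJoin,
                                      std::optional<std::uint64_t> localNumDocs,
                                      std::uint64_t minLocalDocsForHashJoin) {
    if (foreignHasSuitableIndex) {
        return LookupStrategy::kIndexedLoopJoin;
    }
    if (!foreignSmallEnoughForHashJoin) {
        return LookupStrategy::kNestedLoopJoin;
    }
    // Veto HashJoin only when local stats exist AND show a very small collection.
    if (localNumDocs && *localNumDocs < minLocalDocsForHashJoin) {
        return LookupStrategy::kNestedLoopJoin;
    }
    // No local stats, or a large enough local collection: assume the hash table pays for itself.
    return LookupStrategy::kHashJoin;
}
```

      With a cutoff of 1,000, a local collection known to hold one document would get NestedLoopJoin, while a local collection with no stats would still get HashJoin.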

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            kevin.cherkauer@mongodb.com Kevin Cherkauer
            Votes:
            0
            Watchers:
            7

              Created:
              Updated: