Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-86248

Better index tie breaking in group distinct scan

    • Type: Icon: Improvement Improvement
    • Resolution: Gone away
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Query Optimization
    • QO 2024-02-19, QO 2024-03-04, QO 2024-03-18

      In the bestbuy high-value workload distinct_types_with_in_predicate-useAgg, we run an aggregation

      db.products.aggregate([{$match: {type: {$in: ["Game", "Movie", "Music", "HardGood"]}}}, {$group: {_id: "$type"}}]) 

      which results in a plan

       

       

        stages: [
          {
            '$cursor': {
              queryPlanner: {
                namespace: 'bestbuy.products',
                indexFilterSet: false,
                parsedQuery: { type: { '$in': [ 'Game', 'HardGood', 'Movie', 'Music' ] } },
                queryHash: '24FB8A6A',
                planCacheKey: '6DBEACC2',
                optimizationTimeMillis: 0,
                maxIndexedOrSolutionsReached: false,
                maxIndexedAndSolutionsReached: false,
                maxScansToExplodeReached: false,
                winningPlan: {
                  isCached: false,
                  stage: 'PROJECTION_COVERED',
                  transformBy: { type: 1, _id: 0 },
                  inputStage: {
                    stage: 'DISTINCT_SCAN',
                    keyPattern: { type: 1, subclass: 1, genre: 1, format: 1 },
                    indexName: 'type_1_subclass_1_genre_1_format_1',
                    isMultiKey: false,
                    multiKeyPaths: { type: [], subclass: [], genre: [], format: [] },
                    isUnique: false,
                    isSparse: false,
                    isPartial: false,
                    indexVersion: 2,
                    direction: 'forward',
                    indexBounds: {
                      type: [
                        '["Game", "Game"]',
                        '["HardGood", "HardGood"]',
                        '["Movie", "Movie"]',
                        '["Music", "Music"]'
                      ],
                      subclass: [ '[MinKey, MaxKey]' ],
                      genre: [ '[MinKey, MaxKey]' ],
                      format: [ '[MinKey, MaxKey]' ]
                    }
                  }
                },
                rejectedPlans: []
              }
            }
          },
          { '$groupByDistinctScan': { newRoot: { _id: '$type' } } }
        ],

      However, there is a shorter index which can satisfy the query that the planner does not consider,

       

       

      db.products.aggregate([{$match: {type: {$in: ["Game", "Movie", "Music", "HardGood"]}}}, {$group: {_id: "$type"}}], {hint: {type: 1, name: 1}})
        stages: [
          {
            '$cursor': {
              queryPlanner: {
                namespace: 'bestbuy.products',
                indexFilterSet: false,
                parsedQuery: { type: { '$in': [ 'Game', 'HardGood', 'Movie', 'Music' ] } },
                queryHash: '24FB8A6A',
                planCacheKey: '6DBEACC2',
                optimizationTimeMillis: 0,
                maxIndexedOrSolutionsReached: false,
                maxIndexedAndSolutionsReached: false,
                maxScansToExplodeReached: false,
                winningPlan: {
                  isCached: false,
                  stage: 'PROJECTION_COVERED',
                  transformBy: { type: 1, _id: 0 },
                  inputStage: {
                    stage: 'DISTINCT_SCAN',
                    keyPattern: { type: 1, name: 1 },
                    indexName: 'type_1_name_1',
                    isMultiKey: false,
                    multiKeyPaths: { type: [], name: [] },
                    isUnique: false,
                    isSparse: false,
                    isPartial: false,
                    indexVersion: 2,
                    direction: 'forward',
                    indexBounds: {
                      type: [
                        '["Game", "Game"]',
                        '["HardGood", "HardGood"]',
                        '["Movie", "Movie"]',
                        '["Music", "Music"]'
                      ],
                      name: [ '[MinKey, MaxKey]' ]
                    }
                  }
                },
                rejectedPlans: []
              }
            }
          },
          { '$groupByDistinctScan': { newRoot: { _id: '$type' } } }
        ], 

      We should use the shorter index in this case.

            Assignee:
            Unassigned Unassigned
            Reporter:
            ben.shteinfeld@mongodb.com Ben Shteinfeld
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: