Core Server / SERVER-13197

Tighten index bounds and allow compound index to be chosen when predicate on leading field is not provided

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 2.6.0
    • Component/s: Querying, Query Optimization

      Updated description:

      For a compound index {a: 1, b: 1}, a range query {b: {$gt: 10, $lt: 20}} (using a hint on the compound index) produces the following index bounds:

      // 2.4:
      {a: [{$minElement: 1}, {$maxElement: 1}], b: [10, 20]}
      
      // 2.6:
      {a: [{$minElement: 1}, {$maxElement: 1}], b: [{$minElement: 1}, {$maxElement: 1}]}
      

      Bounds could be tighter under 2.6.
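
      A minimal shell sketch for inspecting this, assuming a collection named foo that carries the {a: 1, b: 1} index (the collection name is illustrative):

      // force the compound index and inspect the bounds the planner produces
      db.foo.ensureIndex({a: 1, b: 1})
      db.foo.find({b: {$gt: 10, $lt: 20}}).hint({a: 1, b: 1}).explain().indexBounds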

      As a workaround for 2.6, consider adding an explicit predicate on the leading field to the query.

      // before:
      {b: {$gt: 10, $lt: 20}}
      
      // workaround:
      {a: {$gt: MinKey, $lt: MaxKey}, b: {$gt: 10, $lt: 20}}
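
      In the shell this can be issued directly, since MinKey and MaxKey are available as constants (the collection name foo is again just an example):

      // add an explicit full-range predicate on the leading field a
      db.foo.find({a: {$gt: MinKey, $lt: MaxKey}, b: {$gt: 10, $lt: 20}}).hint({a: 1, b: 1}).explain()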
      

      --------
      This is the output from the reproduction steps in current 2.4 releases:

      {
              "cursor" : "BtreeCursor name_1_age_1",
              "isMultiKey" : false,
              "n" : 60803,
              "nscannedObjects" : 60803,
              "nscanned" : 60827,
              "nscannedObjectsAllPlans" : 60803,
              "nscannedAllPlans" : 60827,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 0,
              "nChunkSkips" : 0,
              "millis" : 207,
              "indexBounds" : {
                      "name" : [
                              [
                                      {
                                              "$minElement" : 1
                                      },
                                      {
                                              "$maxElement" : 1
                                      }
                              ]
                      ],
                      "age" : [
                              [
                                      21,
                                      30
                              ]
                      ]
              },
              "server" : "ubuntu:27017"
      }
      

      The range shown in the index bounds comes from the range in the query, which is on the second field of the compound key.

      The same data in 2.6.0-rc0 and rc1 produces this:

      {
              "cursor" : "BtreeCursor name_1_age_1",
              "isMultiKey" : false,
              "n" : 60803,
              "nscannedObjects" : 200000,
              "nscanned" : 200000,
              "nscannedObjectsAllPlans" : 200000,
              "nscannedAllPlans" : 200000,
              "scanAndOrder" : false,
              "indexOnly" : false,
              "nYields" : 1562,
              "nChunkSkips" : 0,
              "millis" : 646,
              "indexBounds" : {
                      "name" : [
                              [
                                      {
                                              "$minElement" : 1
                                      },
                                      {
                                              "$maxElement" : 1
                                      }
                              ]
                      ],
                      "age" : [
                              [
                                      {
                                              "$minElement" : 1
                                      },
                                      {
                                              "$maxElement" : 1
                                      }
                              ]
                      ]
              },
              "server" : "raring:27017",
              "filterSet" : false
      }
      

      So the range is now gone, nscanned is up to the full size of the collection, and the execution time has increased as a result of the full scan.

      As stated, the matching range is on the second field of the compound key, but if this is intentional, why does the prior version match the same number of documents in considerably less time?
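
      For anyone trying to reproduce this, a rough sketch along these lines should yield comparable explain output; the collection name, document count, and age distribution are assumptions inferred from the numbers above, so the exact counts will differ:

      // hypothetical data set: 200,000 documents with a {name: 1, age: 1} compound index
      for (var i = 0; i < 200000; i++) {
          db.people.insert({name: "user" + i, age: i % 50});
      }
      db.people.ensureIndex({name: 1, age: 1});
      // hinted range query on the second field only, as in the outputs above
      db.people.find({age: {$gt: 21, $lt: 30}}).hint({name: 1, age: 1}).explain();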

            Assignee:
            backlog-query-optimization [DO NOT USE] Backlog - Query Optimization
            Reporter:
            neillunn Neil Lunn
            Votes:
            9
            Watchers:
            38
