Core Server / SERVER-8954

Index Key Extraction Much Slower for Some Data Schemas Than Others

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.2.3
    • Component/s: Index Maintenance
    • Environment: Ubuntu 12.04 LTS
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      To run the test case using objects in the array:
      python test-indexes.py --object

      To run the test case using integers in the array:
      python test-indexes.py --integer


      I have a collection that is essentially an _id and a list of objects of a
      complex type. There is a multikey index on this collection from the _id to
      the id of each of the items in the list, i.e.:

      db.test.ensureIndex({_id: 1, 'items.id': 1})
      db.test.insert({
          _id: ObjectId("...."),
          items: [
              {id: 1},
              {id: 2},
              ....
              {id: 1000}
          ]
      })
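
      For reference, the same schema and multikey index can be reproduced from
      Python. A minimal sketch using current pymongo (the database and collection
      names here are placeholders, not taken from the attached script):

      from pymongo import MongoClient, ASCENDING

      coll = MongoClient().test_db.test
      # Compound multikey index on _id plus the id of every embedded item.
      coll.create_index([("_id", ASCENDING), ("items.id", ASCENDING)])
      # One document whose array holds 1000 small sub-documents.
      coll.insert_one({"items": [{"id": i} for i in range(1000)]})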

      With a small number of items in the list, insert and update times for an
      individual item are reasonable, but once the number of items in the list
      is greater than 1,000, the time to insert or update just one item starts
      to slow down dramatically (the growth looks roughly quadratic in the
      number of items):

      Inserted document with 1000 items in 0.048251 seconds
      Updated document ($set) with 1000 items in 0.104173 seconds
      Updated document ($push) with 1000 items in 0.318420 seconds
      Inserted document with 2000 items in 0.199266 seconds
      Updated document ($set) with 2000 items in 0.483723 seconds
      Updated document ($push) with 2000 items in 1.026530 seconds
      Inserted document with 3000 items in 0.593618 seconds
      Updated document ($set) with 3000 items in 1.053177 seconds
      Updated document ($push) with 3000 items in 2.245902 seconds
      Inserted document with 4000 items in 0.991389 seconds
      Updated document ($set) with 4000 items in 1.898991 seconds
      Updated document ($push) with 4000 items in 4.001129 seconds
      Inserted document with 5000 items in 1.490980 seconds
      Updated document ($set) with 5000 items in 3.080210 seconds
      Updated document ($push) with 5000 items in 6.076108 seconds
      Inserted document with 6000 items in 2.144194 seconds
      Updated document ($set) with 6000 items in 4.325883 seconds

      I've attached a test program that produces the output above. It inserts a
      test document with an ever-increasing number of items, then $sets the list
      on the newly inserted document back to itself, and finally attempts to
      $push one new item onto the list.

      I've run the same test as above with integers as the list items instead of
      objects. As the number of items increases the insert/update speed still
      slows down, but the performance doesn't degrade nearly as severely as it
      does when using objects.
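
      The attached test-indexes.py is the authoritative script; the sketch below
      only approximates the behaviour described above (insert a document with N
      items, $set the array back onto itself, then $push one more item, timing
      each step), with --object/--integer selecting the element type. The index
      key used in the integer case ('items' rather than 'items.id') is an
      assumption:

      import argparse
      import time

      from pymongo import MongoClient, ASCENDING

      parser = argparse.ArgumentParser()
      group = parser.add_mutually_exclusive_group(required=True)
      group.add_argument("--object", action="store_true")
      group.add_argument("--integer", action="store_true")
      args = parser.parse_args()

      coll = MongoClient().test_db.test
      coll.drop()

      # Assumed index key: 'items.id' for object elements, 'items' for integers.
      key = "items.id" if args.object else "items"
      coll.create_index([("_id", ASCENDING), (key, ASCENDING)])

      def make_items(n):
          return [{"id": i} for i in range(n)] if args.object else list(range(n))

      for n in range(1000, 7000, 1000):
          items = make_items(n)

          start = time.time()
          doc_id = coll.insert_one({"items": items}).inserted_id
          print("Inserted document with %d items in %f seconds" % (n, time.time() - start))

          start = time.time()
          coll.update_one({"_id": doc_id}, {"$set": {"items": items}})
          print("Updated document ($set) with %d items in %f seconds" % (n, time.time() - start))

          start = time.time()
          new_item = {"id": n} if args.object else n
          coll.update_one({"_id": doc_id}, {"$push": {"items": new_item}})
          print("Updated document ($push) with %d items in %f seconds" % (n, time.time() - start))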

        1. test-indexes.py
          1 kB
        2. integer.svg
          123 kB
        3. object.svg
          68 kB

            Assignee: Unassigned
            Reporter: Michael Henson (michael@songza.com)
            Votes: 0
            Watchers: 7
