Core Server / SERVER-99346

Do not keep entire fields in memory when constructing a histogram

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Optimization

      Regardless of any other enhancements that can be made to the histogram construction algorithm, this one looks straightforward and is a big win: do not store entire strings when constructing the histogram. Keep only the prefix that is actually used and discard the rest at the first opportunity.

      This will avoid OOMs when a user needs to build a histogram on a very large field.

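      // Start from an empty collection.
      db.foo.drop();
      
      // Insert 500,000 documents whose string field grows by one character each time.
      for (let i = 0; i < 500000; i++) {
          db.foo.insertOne({a: 'a'.repeat(i)});
      }
      
      // Build a histogram on the large string field.
      db.foo.runCommand({analyze: "foo", key: "a"});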
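
      The idea in the description can be sketched in a few lines of JavaScript. This is only an illustration of "keep the prefix, drop the rest", not the server's code: the names truncateForHistogram and collectSample and the PREFIX_LEN value are hypothetical, since this ticket does not state how long a prefix the histogram actually uses.

```javascript
// Hypothetical prefix length; the ticket does not say how many
// characters or bytes of a string the histogram really consults.
const PREFIX_LEN = 16;

// Return only the part of a string value that the histogram would use.
// Non-string values are passed through unchanged.
function truncateForHistogram(value, prefixLen = PREFIX_LEN) {
    return typeof value === "string" ? value.slice(0, prefixLen) : value;
}

// Buffer sample values, truncating each string as soon as it is seen
// so the buffer never holds more than prefixLen characters per string.
function collectSample(values, prefixLen = PREFIX_LEN) {
    const sample = [];
    for (const v of values) {
        sample.push(truncateForHistogram(v, prefixLen));
    }
    return sample;
}
```

      With truncation applied at collection time, the memory held per sampled string is bounded by the prefix length rather than by the size of the field, which is the property the ticket asks for.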
      

            Assignee:
            Unassigned
            Reporter:
            Philip Stoev (philip.stoev@mongodb.com)
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: