- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
Regardless of any other enhancements that can be made to the histogram construction algorithm, this one seems straightforward and a big win: do not store entire strings when constructing the histogram. Keep only the prefix that is actually used and discard the rest at the first opportunity.
This will avoid OOMs when a user needs to build a histogram on a very large field.
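The idea can be illustrated with a minimal sketch in shell-style JavaScript. This is not the server's implementation: `PREFIX_LENGTH`, `truncateForHistogram`, and `collectValueCounts` are hypothetical names, and the prefix length of 8 is an assumed placeholder, since the length the histogram actually uses is not stated here. The sketch only shows values being truncated before they are retained, so the collected state is bounded by the prefix length rather than by the field size.

```javascript
// Hypothetical prefix length (an assumption for illustration only).
const PREFIX_LENGTH = 8;

// Return only the prefix of a string value; non-string values pass through unchanged.
function truncateForHistogram(value, prefixLength = PREFIX_LENGTH) {
  if (typeof value !== "string") {
    return value;
  }
  return value.length > prefixLength ? value.slice(0, prefixLength) : value;
}

// Count occurrences per truncated value, so only prefixes are kept as keys.
function collectValueCounts(values, prefixLength = PREFIX_LENGTH) {
  const counts = new Map();
  for (const v of values) {
    const key = truncateForHistogram(v, prefixLength);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

Strings that share a prefix collapse into a single key, which loses nothing only if the histogram ignores everything past the prefix, as the description above says it does.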
// Repro: insert documents whose field "a" holds a string of length i, then build a histogram on "a".
db.foo.drop();
for (let i = 0; i < 500000; i++) {
    db.foo.insertOne({a: 'a'.repeat(i)});
}
db.foo.runCommand({analyze: "foo", key: "a"});