As of the fix for SERVER-30189, the $in parsing code makes use of a boost::flat_set in order to avoid the many small allocations required for a tree-based sorted set. This, however, results in O(n^2) worst-case time complexity for parsing an $in, since the boost::flat_set constructor is known to be quadratic:
https://svn.boost.org/trac10/ticket/13140
For $in predicates with many elements, this causes a substantial performance regression. It is observable in that a large $in where the elements are pre-sorted outperforms a large $in where the elements are random by orders of magnitude. (When the $in elements are pre-sorted, the boost::flat_set constructor always inserts at the end of its vector, resulting in O(n log n) runtime.)
We should either patch boost to improve the boost::flat_set constructor's performance, or work around the issue by pre-sorting the $in elements prior to invoking the constructor.
- is related to
-
SERVER-30189 Reduce calls to allocator for large $in expressions
- Closed
- related to
-
SERVER-35699 Avoid holding locks during query parsing
- Backlog
-
SERVER-34012 Planner's logic for taking union of index bounds intervals is slow for large $or queries
- Closed
-
SERVER-35712 boost::flat_set constructor should be O(n log n) given an unsorted range
- Closed
-
SERVER-37176 IndexBoundsBuilder::unionize() can be skipped for $in
- Closed
- mentioned in
-
Page Loading...