-
Type: Improvement
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Performance, Querying
-
Fully Compatible
-
QE 2021-08-23, QE 2021-09-06
There are some low-hanging-fruit optimizations available to ComparisonMatchExpression:
- Call canonicalType() less
SERVER-55043will help a lot, but testing shows it is still faster to avoid canonicalizing where possible.- You only need to enter the first block if the type() doesn't match since same type implies same canonical type. This optimization is already done in woCompare, so this is about doing the same thing in CME.
- Inside the if block it immediately canonicalizes again.
- Optimize NaN checks
- The special NaN logic isn't needed for EQ because it is already done correctly and efficiently in that case by BSONElement::compareElements.
- It will do the NaN checks even for non-numeric types, such as strings. The logic ends up being correct, but it is wasteful.
- It recomputes whether _rhs is NaN every time even though _rhs doesn't change.
- It checks whether Int and Long are NaN, even though they don't have a NaN representation
- It does an expensive Decimal -> double conversion just to check for nan, rather than checking for NaN while still a Decimal.
- It then repeats the NaN check for both _rhs and e if either is NaN to check if both are NaN.
- Do an early-out check when doing EQ comparisons with String to see if the lengths don't match before comparing contents.
- woCompare/compareElements can't do that because it doesn't know if the caller is interested in just equality or relative order, but we know that here.
- We can only do that with the simple collator, but that seems like a case worth optimizing for.
I'm attaching a patch which addresses all of these. I saw a huge improvement in perf with this patch vs without.
I also tried templating ComparisonMatchExpression::matchesSingleElement as matchesSingleElementImpl<MatchType> and calling it from each of the subclasses' matchesSingleElement() so that all of the branching on matchType() would compile away, but I didn't see any improvement in my workload, so I didn't include it in this patch. It is possible that workloads involving more than one MatchType would see some benefit, so it may be worth exploring in the future.