-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Query Optimization
As noted in our report of performance improvement opportunities for Time Series:
This is the how some users implement the "from the start of the period" predicate, including with a small change the "year to date", "month to date" pattern.
$match on time is most efficiently evaluated using the built in _id index which exists for all time series collections. Our current engine does not recognize enough patterns that the index can solve, eg this two stage predicate is not rewritten and not recognized as sargable (which is a predicate that can be solved by index lookup):
[{ "$addFields" :
{ 'hourDiff':
{ '$dateDiff':
} } },
{ "$match" : {'hourDiff':
}}]
Whilst an equivalent single $match stage is recognized:
{$match: {$expr:
{$gte: ["$time",
{$subtract: ["$$NOW", {$const: 24 * 60 * 60 * 1000} ]}]}}}
On one hand, we can encourage customers to write single stage predicates to make it easier for us to choose the optimal access path; on the other hand, we should also improve our ability to do deeper (cross stage) analysis and rewrite. The question is whether we try to improve the analysis and rewrite in the current engine or only attempt to do this once ABT (part of the new optimizer) is mature enough.