Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.3.0-rc0
Affects Version/s: None
Component/s: Query Execution, Query Planning
Labels: None
Assigned Teams: Query Optimization
Backwards Compatibility: Fully Compatible
Sprint: QO 2023-10-16, QO 2023-10-30, QO 2023-11-13, QO 2023-11-27
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

A QuerySolutionNode.filter is always generated for clustered collection scans if the bounds are from expressions, apparently solely to distinguish < from <= and > from >=. In these cases, the scan does a bounds-inclusive scan, then the filter eliminates any records for bounds that are actually exclusive.

For example, a query like this against a clustered collection always generates a filter:

db.ni.find({$and: [{_id: {$gt: 1}}, {_id: {$lt: 3}}]})

However, if the bounds were specified via the "min" (always inclusive) and "max" (always exclusive) options, the plan does not generate a filter, and the scan operator is expected to enforce the correct bounds itself. For example, a query like the following against a clustered collection does NOT generate a filter:

db.ni.find().min({_id: 1}).max({_id: 2}).hint({_id: 1})

Given that it is trivially easy to enforce the correct bounds inside the scan operator, and it is already responsible for doing so for the min-max case, the optimizer should stop generating collection scan filters that exist solely for scan bound inclusive vs exclusive enforcement.
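To illustrate how little the scan itself needs in order to honor exclusive bounds, here is a minimal, self-contained sketch. It is not the server's scan stage: `RecordId` as a plain integer, the `ScanBounds` struct, and `scanWithBounds` are all hypothetical names invented for this example, and the "collection" is just a sorted vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a clustered collection key; the real RecordId is richer.
using RecordId = std::int64_t;

// Hypothetical bound descriptor (an assumption, not CollectionScanParams).
struct ScanBounds {
    RecordId lower;
    RecordId upper;
    bool lowerInclusive;
    bool upperInclusive;
};

// Forward scan over ids sorted ascending. Inclusivity is applied inside the
// scan loop, so no separate filter stage is needed to drop boundary records.
std::vector<RecordId> scanWithBounds(const std::vector<RecordId>& sortedIds,
                                     const ScanBounds& b) {
    assert(b.lower <= b.upper);
    std::vector<RecordId> out;
    for (RecordId id : sortedIds) {
        // Skip records before the start, or equal to an exclusive start.
        if (id < b.lower || (id == b.lower && !b.lowerInclusive)) {
            continue;
        }
        // Stop at records past the end, or equal to an exclusive end.
        if (id > b.upper || (id == b.upper && !b.upperInclusive)) {
            break;
        }
        out.push_back(id);
    }
    return out;
}
```

Under these assumptions, the `$gt: 1` / `$lt: 3` example corresponds to bounds `{1, 3, false, false}`, and the min/max example corresponds to `{1, 2, true, false}`; both are handled by the same loop.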

This optimization may also be applicable to index scans that have been decomposed into one or more intervals.

The scan operator will need to know whether the lower and upper bounds are inclusive or exclusive. CollectionScanParams (collection_scan_common.h) has a type that is used in plan nodes to indicate this, although it is a bit hard to consume:

    enum class ScanBoundInclusion {
        kExcludeBothStartAndEndRecords,
        kIncludeStartRecordOnly,
        kIncludeEndRecordOnly,
        kIncludeBothStartAndEndRecords,
    };

It would be easier to consume if it were just two booleans, such as:

// A scan bound is exclusive if the respective flag is false and inclusive if it is true.
bool scanLowerBoundInclusive;
bool scanUpperBoundInclusive;
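If the enum is kept, a pair of small conversion helpers could bridge the two representations. The sketch below is only an illustration: `BoundFlags`, `toFlags`, and `fromFlags` are hypothetical names, and it assumes a forward scan in which the "start" record corresponds to the lower bound and the "end" record to the upper bound.

```cpp
#include <cassert>

// Enum values as quoted in this ticket.
enum class ScanBoundInclusion {
    kExcludeBothStartAndEndRecords,
    kIncludeStartRecordOnly,
    kIncludeEndRecordOnly,
    kIncludeBothStartAndEndRecords,
};

// Hypothetical two-boolean form. Assumes start == lower bound, end == upper bound.
struct BoundFlags {
    bool scanLowerBoundInclusive;
    bool scanUpperBoundInclusive;
};

// Expand the four-valued enum into two independent flags.
inline BoundFlags toFlags(ScanBoundInclusion v) {
    switch (v) {
        case ScanBoundInclusion::kExcludeBothStartAndEndRecords:
            return {false, false};
        case ScanBoundInclusion::kIncludeStartRecordOnly:
            return {true, false};
        case ScanBoundInclusion::kIncludeEndRecordOnly:
            return {false, true};
        case ScanBoundInclusion::kIncludeBothStartAndEndRecords:
            return {true, true};
    }
    assert(false && "unhandled ScanBoundInclusion value");
    return {false, false};
}

// Collapse the two flags back into the enum.
inline ScanBoundInclusion fromFlags(BoundFlags f) {
    if (f.scanLowerBoundInclusive) {
        return f.scanUpperBoundInclusive
            ? ScanBoundInclusion::kIncludeBothStartAndEndRecords
            : ScanBoundInclusion::kIncludeStartRecordOnly;
    }
    return f.scanUpperBoundInclusive
        ? ScanBoundInclusion::kIncludeEndRecordOnly
        : ScanBoundInclusion::kExcludeBothStartAndEndRecords;
}
```

With helpers like these, the scan could test each flag independently instead of switching over four enum cases at every bound check.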

Whether booleans or the existing enum are used, these values must be parameterized in the SBE plan cache so that cached plans do not permanently bake in inclusive vs. exclusive bounds and can instead be correctly reused at runtime for queries that have different bounds. (I do not know whether this is already the case for the CollectionScanNode.boundInclusion parameter, which has type CollectionScanParams::ScanBoundInclusion.)

FYI david.storch@mongodb.com hana.pearlman@mongodb.com amr.elhelw@mongodb.com

Assignee: James Harrison
Reporter: Kevin Cherkauer (Inactive)
Participants: Billy Donahue, Githook User, James Harrison, Kevin Cherkauer
Votes: 0
Watchers: 6

Created: Apr 03 2023 04:53:50 PM UTC
Updated: Nov 14 2023 06:29:10 PM UTC
Resolved: Nov 13 2023 11:43:01 AM UTC
Confidence Status Last Update: 28/Sep/23 12:03 PM