Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Aggregation Framework
Labels:
- optimization

Operating System: ALL
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

While implementing a feature to handle CSV-like input of the form:

A,B,C // header
1,2,3
4,5,6
etc...

We naively implemented it with the following $match condition:

$or: [
    { A: 1, B: 2, C: 3},
    { A: 4, B: 5, C: 6},
    etc...
]
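
A minimal sketch of how such a condition might be generated from the CSV text, in plain JavaScript. The helper names `parseCsv` and `buildOrMatch` are hypothetical, and the parser assumes comma-separated numeric cells with no quoting or escaping.

```javascript
// Hypothetical helper: parse simple CSV text (no quoting, numeric cells)
// into an array of row objects keyed by the header names.
function parseCsv(text) {
  const [headerLine, ...rowLines] = text.trim().split("\n");
  const header = headerLine.split(",").map((name) => name.trim());
  return rowLines.map((line) => {
    const cells = line.split(",").map((cell) => Number(cell.trim()));
    const row = {};
    header.forEach((name, i) => { row[name] = cells[i]; });
    return row;
  });
}

// Hypothetical helper: one $or branch per CSV row.
function buildOrMatch(rows) {
  return { $match: { $or: rows } };
}

const rows = parseCsv("A,B,C\n1,2,3\n4,5,6");
console.log(JSON.stringify(buildOrMatch(rows)));
```

Every CSV row becomes its own branch, so the size of the query grows linearly with the input.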

After seeing poor performance and scalability with this approach, we tried two alternatives (both in an aggregation pipeline):

One with $in:

$project: {
    computed_obj: { "1": "$A", "2": "$B", "3": "$C" }
},
$match: {
    computed_obj: { 
        $in: [
            { "1": 1, "2": 2, "3": 3 },
            { "1": 4, "2": 5, "3": 6 },
            etc...
        ]
    }
}
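
The two stages of the `$in` variant could be generated along these lines. This is only a sketch: `buildInPipeline` is a hypothetical helper, and it assumes the field names and the rows of values have already been parsed from the CSV input.

```javascript
// Hypothetical helper: build the $project and $match stages of the $in
// variant from a list of field names and rows of values (one array per row).
function buildInPipeline(fields, valueRows) {
  // Map each field to a positional key: "1" -> "$A", "2" -> "$B", ...
  const computed = {};
  fields.forEach((name, i) => { computed[String(i + 1)] = "$" + name; });

  // Re-key every row the same way so it can compare equal to computed_obj.
  const candidates = valueRows.map((values) => {
    const doc = {};
    values.forEach((value, i) => { doc[String(i + 1)] = value; });
    return doc;
  });

  return [
    { $project: { computed_obj: computed } },
    { $match: { computed_obj: { $in: candidates } } },
  ];
}

const pipeline = buildInPipeline(["A", "B", "C"], [[1, 2, 3], [4, 5, 6]]);
console.log(JSON.stringify(pipeline, null, 2));
```

Both sides are built with their keys in the same order, because equality on embedded documents is sensitive to field order.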

One with $setIsSubset:

$project: {
    condition_value: {
        $setIsSubset: [
            {
                $map: {
                    input: [null], 
                    as: "var__", 
                    in: { "1": "$A", "2": "$B", "3": "$C" }
                }
            }, 
            [
               {"1": 1, "2": 2, "3": 3},
               {"1": 4, "2": 5, "3": 6},
               etc...
            ]
        ]
    }
}, 
$match: { condition_value: true }
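
In this variant the `$map` over the one-element array `[null]` appears to serve only to wrap the computed object in a single-element array, so that `$setIsSubset` can test whether that one object is contained in the candidate list. A sketch of a generator for these stages follows; `rekey` and `buildSetIsSubsetPipeline` are hypothetical helper names.

```javascript
// Hypothetical helper: re-key an array of values as {"1": v1, "2": v2, ...}.
function rekey(values) {
  const doc = {};
  values.forEach((value, i) => { doc[String(i + 1)] = value; });
  return doc;
}

// Hypothetical helper: build the $project and $match stages of the
// $setIsSubset variant from field names and rows of values.
function buildSetIsSubsetPipeline(fields, valueRows) {
  const computed = rekey(fields.map((name) => "$" + name));
  return [
    {
      $project: {
        condition_value: {
          $setIsSubset: [
            // One-element array holding the computed object.
            { $map: { input: [null], as: "var__", in: computed } },
            // Candidate rows, re-keyed the same way.
            valueRows.map((values) => rekey(values)),
          ],
        },
      },
    },
    { $match: { condition_value: true } },
  ];
}

const stages = buildSetIsSubsetPipeline(["A", "B", "C"], [[1, 2, 3], [4, 5, 6]]);
console.log(JSON.stringify(stages, null, 2));
```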

We found that once the sets became large enough, the $in approach was in fact slower than the $setIsSubset one, and did not even scale with the same complexity.
We then noticed that $setIsSubset uses a std::unordered_set whereas $in uses a plain std::set.

Is there a reason why $in uses a std::set rather than a std::unordered_set?
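
For context on why the container choice matters: `std::set` is an ordered tree with O(log n) comparisons per lookup, while `std::unordered_set` is a hash table with O(1) lookups on average. The sketch below imitates the two strategies in plain JavaScript. It is only an analogy and not the server's code: `keyOf`, `orderedLookup`, and `hashedLookup` are hypothetical names, and serializing with `JSON.stringify` stands in for whatever comparison and hashing the server actually uses.

```javascript
// Canonical string key for a row object (an assumption for this sketch).
const keyOf = (doc) => JSON.stringify(doc);

// Ordered strategy: binary search over sorted keys, similar in cost to a
// lookup in a balanced tree. Returns whether the key was found and how many
// probes (key comparisons steps) were needed.
function orderedLookup(sortedKeys, key) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let probes = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    probes += 1;
    if (sortedKeys[mid] === key) return { found: true, probes };
    if (sortedKeys[mid] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, probes };
}

// Hashed strategy: a single hash-table lookup.
function hashedLookup(keySet, key) {
  return keySet.has(key);
}

// Build 100,000 candidate rows of the form {"1": i, "2": i + 1, "3": i + 2}.
const candidates = [];
for (let i = 0; i < 100000; i++) {
  candidates.push(keyOf({ 1: i, 2: i + 1, 3: i + 2 }));
}
const sortedKeys = [...candidates].sort();
const keySet = new Set(candidates);

const target = keyOf({ 1: 4, 2: 5, 3: 6 });
console.log(orderedLookup(sortedKeys, target)); // probe count grows with log2(n)
console.log(hashedLookup(keySet, target));
```

With the ordered strategy each probe is also a full key comparison, so the per-document cost grows with the size of the candidate list, whereas the hashed strategy stays roughly constant.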

related to

SERVER-18733 Streamline set cache optimization for set operations

Backlog

Assignee: Charlie Swanson
Reporter: Antoine Hom
Participants: Antoine Hom, Charlie Swanson, Ramon Fernandez Marina
Votes: 1
Watchers: 14

Created: May 29 2015 11:43:07 AM UTC
Updated: Nov 04 2015 07:53:50 PM UTC
Resolved: Nov 04 2015 07:53:50 PM UTC
