- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Query Optimization
The MatchExpression interface provides MatchExpression::equivalent(), which checks whether two match expressions are the same. Consider the following two $expr match expressions:
// Display the data in the collection.
MongoDB Enterprise > db.c.find()
{ "_id" : ObjectId("64caa40c416866f24e97cc48"), "str" : "a" }
{ "_id" : ObjectId("64caa40e416866f24e97cc4a"), "str" : "A" }
{ "_id" : ObjectId("64caa410416866f24e97cc4c"), "str" : "b" }

// Query using lowercase constant.
MongoDB Enterprise > db.c.find({$expr: {$eq: ["$str", "a"]}}).collation({locale: "en_US", strength: 2})
{ "_id" : ObjectId("64caa40c416866f24e97cc48"), "str" : "a" }
{ "_id" : ObjectId("64caa40e416866f24e97cc4a"), "str" : "A" }

// Query using uppercase constant.
MongoDB Enterprise > db.c.find({$expr: {$eq: ["$str", "A"]}}).collation({locale: "en_US", strength: 2})
{ "_id" : ObjectId("64caa40c416866f24e97cc48"), "str" : "a" }
{ "_id" : ObjectId("64caa40e416866f24e97cc4a"), "str" : "A" }
Both queries use the same case-insensitive collation ({locale: "en_US", strength: 2}) and are therefore identical in meaning. However, the implementation of ExprMatchExpression::equivalent() is not collation-aware. Since the related ticket SERVER-30982 has not been implemented yet, ExprMatchExpression::equivalent() currently works by serializing both expressions being compared (the left-hand side and the right-hand side) to a mongo::Value representation and then comparing the resulting values with the simple collator. Because the simple collator is used, these two expressions are erroneously considered non-equivalent.
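The difference between the two comparison modes can be illustrated with a minimal, standalone sketch. It does not use the server's real classes: `SerializedEq`, `Collator`, `simpleKey`, `caseInsensitiveKey`, and `equivalentUnder` are hypothetical names invented for this illustration, and the ASCII lowercasing is only a rough stand-in for an ICU collation such as {locale: "en_US", strength: 2}.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// Hypothetical stand-in for a serialized {$eq: ["$str", <constant>]} expression.
struct SerializedEq {
    std::string fieldPath;  // e.g. "$str"
    std::string constant;   // e.g. "a" or "A"
};

// Hypothetical collator: maps a string to the key used for comparison.
using Collator = std::function<std::string(const std::string&)>;

// Simple collator: binary comparison, so the key is the string itself.
inline std::string simpleKey(const std::string& s) {
    return s;
}

// Assumption: ASCII lowercasing approximates a case-insensitive collation.
inline std::string caseInsensitiveKey(const std::string& s) {
    std::string key(s);
    std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return key;
}

// Field paths are compared byte-wise; only the string constant goes
// through the collator.
inline bool equivalentUnder(const SerializedEq& lhs,
                            const SerializedEq& rhs,
                            const Collator& collator) {
    return lhs.fieldPath == rhs.fieldPath &&
        collator(lhs.constant) == collator(rhs.constant);
}

// Mirrors the two queries from the ticket.
inline void runDemo() {
    const SerializedEq lower{"$str", "a"};
    const SerializedEq upper{"$str", "A"};
    assert(!equivalentUnder(lower, upper, simpleKey));          // current behavior
    assert(equivalentUnder(lower, upper, caseInsensitiveKey));  // desired behavior
}
```

With the simple key the two expressions compare as different, which corresponds to the behavior described above; with the query's collation applied to the constants they compare as equivalent.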
This issue does not result in a user-facing bug, because the simple collator is stricter than the query's collation: it can report equivalent expressions as non-equivalent, but never the reverse. However, queries may miss out on some optimizations because of this overly strict comparison. The same applies to the hashing function used by the Boolean simplification from SERVER-79018. For the scope of this ticket, the implementation of ExprMatchExpression::equivalent() should take the collation into account when comparing. This will have an effect on long-tail customers.
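The hashing concern can be sketched the same way: if equivalence becomes collation-aware, expressions that compare as equivalent should also hash to the same value. The snippet below is an illustrative assumption, not the SERVER-79018 hasher; `foldCase`, `hashConstant`, and `checkHashContract` are hypothetical names, and ASCII lowercasing again stands in for a real case-insensitive collation.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Assumption: ASCII lowercasing approximates a case-insensitive collation key.
inline std::string foldCase(const std::string& s) {
    std::string key(s);
    for (char& c : key) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return key;
}

// Hash a string constant either byte-wise (simple collator) or through the
// collation key, so that constants equal under the collation hash equally.
inline std::size_t hashConstant(const std::string& constant, bool caseInsensitive) {
    return std::hash<std::string>{}(caseInsensitive ? foldCase(constant) : constant);
}

// Contract: constants that are equal under the collation must share a hash.
inline void checkHashContract() {
    assert(hashConstant("a", true) == hashConstant("A", true));
}
```

Hashing the collation key rather than the raw bytes keeps the hasher consistent with a collation-aware equivalent(); hashing raw bytes would typically place "a" and "A" in different buckets even when the collation treats them as equal.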
- related to: SERVER-79018 Implement MatchExpression hasher (Closed)