- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Distributed Query Execution
- None
- Query Execution
- Fully Compatible
- ALL
- QE 2023-06-26, QE 2023-07-10, QE 2023-07-24, QE 2023-08-07, QE 2023-08-21, QE 2023-09-04, QE 2023-09-18, QE 2023-10-02, QE 2023-10-16
While reviewing new code from PM-1632, I spot-checked some existing patterns for how mongos resolves the collation. There appear to be multiple cases where the std::unique_ptr<CollatorInterface> is left as nullptr when the collation specification is an empty BSONObj(), despite the rule being "when the collation is unspecified in the request the collection's default collation is used to satisfy the operation." Without further investigation, it is unclear to what extent mongos post-processes results after merging cursor results (e.g. $group followed by $match), where the collator used by mongos matters for the correctness of query results.
Moreover, the contract desired by ExpressionContext is impossible for mongos to uphold given that mongos has no knowledge about the default collation for unsharded collections. The sharding catalog only stores the collation specification for sharded collections.
* The ExpressionContext is always set up with the fully-resolved collation. So even though
* SERVER-24433 describes an ambiguity between a null collator, here we can say confidently that
* null must mean simple since we have already handled "absence of a collator" before creating
* the ExpressionContext.
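To make the rule quoted in that comment concrete, here is a minimal sketch of resolving the collation before anything interprets a null collator. The `Collator` struct and the `resolveCollator` helper are hypothetical stand-ins invented for illustration, not the server's actual types or API, and collations are reduced to a locale string.

```cpp
#include <memory>
#include <optional>
#include <string>

// Hypothetical stand-in for CollatorInterface; only the locale matters here.
struct Collator {
    std::string locale;
};

// Hypothetical helper, not the server's API.
// An empty request locale models an unspecified collation (empty BSONObj()),
// in which case the collection default applies. A null result means the
// simple collation, and only after this resolution has taken place.
std::unique_ptr<Collator> resolveCollator(
    const std::string& requestLocale,
    const std::optional<std::string>& collectionDefaultLocale) {
    if (!requestLocale.empty()) {
        // The request named a collation explicitly; it wins over the default.
        if (requestLocale == "simple") {
            return nullptr;
        }
        return std::make_unique<Collator>(Collator{requestLocale});
    }
    // Unspecified in the request: fall back to the collection default, if any.
    if (collectionDefaultLocale && *collectionDefaultLocale != "simple") {
        return std::make_unique<Collator>(Collator{*collectionDefaultLocale});
    }
    return nullptr;
}
```

The sketch assumes the caller knows the collection default. That is exactly what mongos lacks for unsharded collections, so leaving the pointer null there conflates "unknown" with "simple".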
Hopefully we can produce an interface which leverages the C++ type system to enforce the following contract:
- mongos incorrectly fills in the collator for unsharded collections as simple collation but promises never to use it.
- mongos correctly fills in the collator for sharded collections by consulting the ChunkManager and uses it when responsible for merging + post-processing.
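One possible shape for such an interface, as a sketch only: give the "unknown" and "known" cases distinct types so that merging code cannot read a collator that was never resolved. All names here (`Collator`, `UnknownCollation`, `KnownCollation`, `RouterCollation`, `describeMergeCollation`, `canMergeOnRouter`) are hypothetical and are not taken from the server code. Note that this variant represents the unsharded case as an explicit "unknown" state rather than filling in simple collation.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for CollatorInterface.
struct Collator {
    std::string locale;
};

// Unsharded collection: the router cannot know the default collation,
// so this type deliberately exposes no accessor at all.
struct UnknownCollation {};

// Sharded collection: the collation was resolved from routing metadata.
// A null collator here unambiguously means the simple collation.
class KnownCollation {
public:
    explicit KnownCollation(std::unique_ptr<Collator> collator)
        : _collator(std::move(collator)) {}

    const Collator* get() const {
        return _collator.get();
    }

private:
    std::unique_ptr<Collator> _collator;
};

using RouterCollation = std::variant<UnknownCollation, KnownCollation>;

// Merging and post-processing accept only KnownCollation, so passing the
// unsharded case is a compile-time error rather than a silent fallback.
std::string describeMergeCollation(const KnownCollation& collation) {
    return collation.get() ? collation.get()->locale : "simple";
}

// Callers must check which alternative they hold before merging on the router.
bool canMergeOnRouter(const RouterCollation& collation) {
    return std::holds_alternative<KnownCollation>(collation);
}
```

With this shape, code paths that only forward the request to the primary shard can carry an `UnknownCollation`, while anything that merges results has to obtain a `KnownCollation` first.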
- is depended on by
  - SERVER-80145 Avoid explicit callers of ChunkManager::dbPrimary() when doing shard targeting in agg code (Closed)
- is related to
  - SERVER-71896 Validate if a query with _id or shard key is directly targetable to a shard (Closed)
  - SERVER-76857 Have useTwoPhaseProtocol use the collection default collation if the collation is not specified (Closed)
  - SERVER-24433 Distinguish between the simple collator and the absence of a collator (Backlog)
- related to
  - SERVER-85572 Follow up on audit in mongos for improper usage of collation and incorrectly assuming simple collation rather than collection default (Open)
  - SERVER-81991 Delete RoutingCollator after branching for 8.0 (Open)
  - SERVER-92967 Refactor index spec collations to be a proper type (Backlog)