Type: New Feature
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: 3.4.3
Component/s: Aggregation Framework
Query Optimization
(copied to CRM)
The Spark connector uses the Aggregation Framework to create data partitions that are sent to Spark workers. In a sharded cluster it makes sense to align these partitions with chunk boundaries so that each worker's data-loading query targets a single shard.
This is impossible, however, when the shard key is a hashed index: a simple find can be bounded with $min / $max, but the aggregation framework has no comparable facility.
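To illustrate the gap, here is a hedged mongosh sketch. The collection name (`test.data`) and the chunk boundary values are hypothetical; the point is that `min()` / `max()` on a find cursor accept hashed-key values directly, while an aggregation `$match` compares the raw field value, so it cannot express chunk boundaries of a hashed shard key:

```javascript
// Hypothetical collection sharded on { _id: "hashed" }.
// A plain find can be bounded to one chunk range using hashed values:
db.data.find()
  .hint({ _id: "hashed" })
  .min({ _id: NumberLong("-4611686018427387902") })  // example chunk lower bound
  .max({ _id: NumberLong("0") })                     // example chunk upper bound

// No aggregation equivalent exists: $match filters on the raw _id value,
// not on hash(_id), so it cannot be targeted to a hashed-key chunk:
db.data.aggregate([
  { $match: { _id: { $gte: /* raw value, not hashed */ 0 } } }
])
```

This is only a sketch of the asymmetry the ticket describes, not a workaround: there is no pipeline stage that applies the shard-key hash before comparison.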
Issue Links:
- is related to:
  - SPARK-98 MongoShardedPartitioner and hashed shard keys not working correctly (Closed)
  - SERVER-14400 Using $min and $max on shard key doesn't target queries (Backlog)
- related to:
  - SERVER-24274 Create a command to provide query bounds for partitioning data in a collection (Backlog)