Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: Querying
Query Execution
Both the Spark and Hadoop connectors contain custom code to partition the data in a collection so that it can be processed externally in parallel.
This currently requires either the splitVector command for non-sharded systems, or permission to query the config database for sharded systems. The permissions needed to determine the partitions may not be available in a sharded or hosted MongoDB deployment.
Adding a command that returns the min/max query bounds for splitting a collection into multiple parts would allow any external framework to query each partition independently and process them in parallel.
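To illustrate what an external framework could do with such bounds, here is a minimal sketch (the field name, split points, and helper are hypothetical, not part of any proposed command interface): given a sorted list of split points on a key, each partition becomes a simple range filter that a separate worker could pass to a normal find query.

```python
def partition_filters(key, split_points):
    """Build one range filter per partition from sorted split points.

    The first partition is unbounded below the first split point and the
    last is unbounded above the final one, so the full key range is covered
    with no gaps or overlaps ($gte lower bound, $lt upper bound).
    """
    bounds = [None] + list(split_points) + [None]
    filters = []
    for lo, hi in zip(bounds, bounds[1:]):
        rng = {}
        if lo is not None:
            rng["$gte"] = lo
        if hi is not None:
            rng["$lt"] = hi
        filters.append({key: rng})
    return filters

# Hypothetical split points on _id; each resulting filter could be
# handed to a separate worker's collection.find(filter) call.
filters = partition_filters("_id", [100, 200])
# -> [{'_id': {'$lt': 100}},
#     {'_id': {'$gte': 100, '$lt': 200}},
#     {'_id': {'$gte': 200}}]
```

Because adjacent partitions share a split point as an exclusive upper bound and inclusive lower bound, every document falls into exactly one partition, which is what makes the parallel scans safe to merge.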
- is duplicated by
- SERVER-25289 Make it possible to select a subset of documents based on the shard key (Closed)
- is related to
- SERVER-28667 Provide a way for the Aggregation framework to query against intervals of a hashed index (Backlog)
- SERVER-33998 Remove the parallelCollectionScan command (Closed)