-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.2.1
-
Component/s: Aggregation Framework
-
None
Please confirm that the `$sample` aggregation operator streams results to the client immediately whenever possible. If not, consider this a request for improvement.
For example:
MongoDB Enterprise > db.serverStatus().storageEngine { "name" : "wiredTiger", "supportsCommittedReads" : true } MongoDB Enterprise > db.serverBuildInfo().version 3.2.1 MongoDB Enterprise > db.users.count() 10000 MongoDB Enterprise > db.users.aggregate([{$sample:{size: 5}}])
Here the sampling implementation will use a WiredTiger random cursor to obtain each document. I wish to confirm that the first sample document identified is available to the client before the full sample set is computed.
Please confirm for each of the sampling mechanism implemented in MongoDB 3.2 (with and without a query predicate), and also sharded cluster behavior.
Use case: MongoDB Compass. We want to permit users to specify large sample sizes in Compass, then progressively display the results to users. Ideally Compass can begin to display an inferred schema after a very small number of documents are received from the server.
- duplicates
-
SERVER-533 Aggregation stage to randomly sample documents
- Closed