-
Type: New Feature
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Aggregation Framework, Index Maintenance, Usability
-
None
-
Environment: Global, all environments
-
Fully Compatible
-
Quint Iteration 6
We've decided to go about this by adding a new aggregation stage: $sample. Given a positive integer, the stage will pseudo-randomly choose that number of documents from the incoming stream of documents, which is implicitly the entire collection when $sample is the first stage in the pipeline.
Note that this ticket will only track the aggregation stage functionality, and this implementation will be very slow until SERVER-19183 and SERVER-19182 are resolved.
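As a rough illustration of how the stage described above might be used from a driver, the sketch below builds an aggregation pipeline ending in `$sample`. It assumes the `{"$sample": {"size": N}}` form documented for the released server; the helper name `sample_pipeline` and its parameters are hypothetical and exist only for this example.

```python
def sample_pipeline(size, query=None):
    """Build a pipeline that pseudo-randomly picks `size` documents.

    `sample_pipeline` is a hypothetical helper, not part of any driver.
    """
    # The stage takes a positive integer; reject anything else up front.
    if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
        raise ValueError("size must be a positive integer")

    pipeline = []
    if query:
        # With a leading $match, $sample draws from the matching documents only.
        pipeline.append({"$match": query})
    # With no earlier stage, the incoming stream is the entire collection.
    pipeline.append({"$sample": {"size": size}})
    return pipeline


if __name__ == "__main__":
    print(sample_pipeline(1, {"author": "johndoe"}))
```

With pymongo, the returned list could be passed to `photos.aggregate(...)`, where `photos` is the example collection from the original description.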
Original description:
Picking a random item from a collection is a common need. For example, you might want a random item from the collection photos. Currently this can be accomplished by counting the documents that match a query, computing a random index within that count, and then fetching the item at that index.
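The count-then-index workaround described above could look roughly like the following sketch. It assumes a pymongo-style collection exposing `count_documents` and `find(...).skip(...).limit(...)`; the function name `random_one` is borrowed from the request in this ticket and is not an existing driver method.

```python
import random


def random_one(collection, query):
    """Return one random document matching `query`, or None if none match.

    Hypothetical helper illustrating the workaround; it needs two round trips.
    """
    # Step 1: count the documents matching the query.
    count = collection.count_documents(query)
    if count == 0:
        return None

    # Step 2: compute a random index within that count.
    index = random.randrange(count)

    # Step 3: fetch the document at that index.
    cursor = collection.find(query).skip(index).limit(1)
    # Documents may be removed between the count and the find, so allow for
    # an empty result.
    return next(iter(cursor), None)
```

Because the count and the fetch are separate operations, the result is only as consistent as the collection is stable between them, and skipping to a large index is slow on big result sets.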
An easier approach would be to request a random item directly from mongo, given a query:
photos.find({"author":"johndoe"}).random()
// this would act like .next() but instead would simply return a random item that matches the query
photos.random_one({"author":"johndoe"})
// this would act just like find_one, except it would return a random item that matches the query
- depends on
-
SERVER-20121 XorShift PRNG should use unsigned arithmetic
- Closed
- is duplicated by
-
SERVER-22573 $sample should stream results when possible
- Closed
- is related to
-
DRIVERS-234 Aggregation Builder Support for 3.2
- Closed
- related to
-
CSHARP-1366 Aggregation stage to randomly sample documents
- Closed
-
JAVA-1937 Aggregation stage to randomly sample documents
- Closed
-
SERVER-19182 Integrate storage engine optimizations into $sample stage
- Closed
-
SERVER-19183 Allow storage engines to provide optimized random cursors for use by $sample
- Closed