Type: Improvement
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: MapReduce
Labels: None
Assigned Teams: Query Execution

The mapReduce command in MongoDB takes two required functions, "map" and "reduce", and an optional "finalize" function. "reduce" is supposed to return data in the same format that "map" emits.
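To illustrate why "reduce" has to return the same format that "map" emits, here is a minimal Python sketch of the current semantics. It is a simulation, not MongoDB code: `map_reduce`, `chunk_size`, and the sample documents are all hypothetical names made up for this example. It assumes that the output of "reduce" can be fed back into "reduce" for the same key, and that "reduce" is skipped for a key with a single value (the behaviour discussed in SERVER-5818).

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn, finalize_fn=None, chunk_size=2):
    """Hypothetical simulation of map / reduce / optional finalize."""
    if chunk_size < 2:
        raise ValueError("chunk_size must be at least 2")

    # Map phase: collect every emitted value under its key.
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)

    results = {}
    for key, values in grouped.items():
        # Reduce in small chunks, feeding earlier reduce outputs back in.
        # This only works if reduce returns the same shape that map emits.
        while len(values) > 1:
            partials = []
            for i in range(0, len(values), chunk_size):
                chunk = values[i:i + chunk_size]
                # A lone value is passed through without calling reduce.
                partials.append(reduce_fn(key, chunk) if len(chunk) > 1 else chunk[0])
            values = partials
        result = values[0]
        # Finalize runs once per key and may change the output shape.
        results[key] = finalize_fn(key, result) if finalize_fn else result
    return results


def emit_word(doc):
    yield doc["word"], {"count": 1}


def sum_counts(key, values):
    return {"count": sum(v["count"] for v in values)}


docs = [{"word": "a"}, {"word": "b"}, {"word": "a"}, {"word": "a"}]
counts = map_reduce(docs, emit_word, sum_counts)
```

Here the key "b" never reaches `sum_counts`, and the key "a" is reduced twice, so both the raw emitted value and a reduced value must be valid inputs to "reduce".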

In some other frameworks, the functions are "map", "shuffle" and "reduce". There, "shuffle" (also known as "local reduce") is the one that must output the same data format as "map", just as "reduce" does in MongoDB. However, "shuffle" is the optional function, and the non-optional "reduce" behaves more like MongoDB's "finalize".

It would be great if MongoDB could work like this instead, with the same nomenclature and the same choice of optional parameters. This could be done by changing the mapReduce method, or by creating a new method.

Another useful change would be to always deliver the data to the final step ("finalize"/"reduce") as a list, even if there is just one item. That way the function can always assume it has a list to process, which makes it simpler to write.
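A rough Python sketch of this idea follows. The helper names (`group_by_key`, `run_final_step`, `mean`) and the sample pairs are hypothetical and only illustrate the proposal; they are not an existing MongoDB API.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Group (key, value) pairs so every key maps to a list of values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def run_final_step(pairs, final_fn):
    """Call the final step once per key, always with a list of values."""
    return {key: final_fn(key, values) for key, values in group_by_key(pairs).items()}


def mean(key, values):
    # No special case is needed for a key that has only one value.
    return sum(values) / len(values)


pairs = [("a", 2.0), ("b", 5.0), ("a", 4.0)]
averages = run_final_step(pairs, mean)
```

The key "b" has a single value, but `mean` still receives `[5.0]` and does not need to check whether it was handed a list or a bare value.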

It should also be easy to have an "identity reducer", which could be the default when no reducer is specified.
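One possible reading of the "identity reducer" default, sketched in Python. The function names and the sample documents are hypothetical, and this assumes the reducer receives the full list of values for each key.

```python
from collections import defaultdict


def identity_reducer(key, values):
    """Return the grouped values unchanged."""
    return values


def map_reduce(docs, map_fn, reduce_fn=identity_reducer):
    """Hypothetical API: the reducer is optional and defaults to identity."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


def emit_item_by_user(doc):
    yield doc["user"], doc["item"]


orders = [
    {"user": "ann", "item": "pen"},
    {"user": "bob", "item": "cup"},
    {"user": "ann", "item": "ink"},
]
items_by_user = map_reduce(orders, emit_item_by_user)
```

With no reducer given, the call behaves as a plain group-by: each key maps to the list of values emitted for it.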

Related tickets:

related to SERVER-5818 reduce in map reduce doesn't run with only one input document (Closed)
related to SERVER-2333 mapreduce optimization: do not execute reduce on unique keys (Closed)

Assignee: [DO NOT USE] Backlog - Query Execution
Reporter: Nicolau Leal Werneck
Participants: [DO NOT USE] Backlog - Query Execution, Esha Bhargava, Nicolau Leal Werneck, Rafael
Votes: 1
Watchers: 6
Created: Sep 11 2013 03:47:39 AM UTC
Updated: Dec 06 2022 05:17:59 AM UTC
Resolved: Feb 04 2022 03:09:20 PM UTC
