ISSUE SUMMARY
Starting with version 2.6, certain temporary map/reduce collections are incorrectly replicated to secondary nodes, which adds traffic between replica set nodes. In addition, the documents in these collections have no _id value, which causes collection scans during replication on the primary and can impact performance.
USER IMPACT
Large map/reduce jobs with millions of documents can noticeably degrade server performance and increase oplog churn, and with it network traffic between replica set members.
WORKAROUNDS
There is no workaround that prevents inserts to the temporary collections from being replicated. If the impact on the server becomes intolerable, move the map/reduce job to a dedicated hidden secondary node to mitigate the issue.
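Making a member hidden is an ordinary replica set reconfiguration. The following is a minimal sketch of that step only, not a complete migration procedure. The helper name hideMember, the host names, and the member index are hypothetical and chosen for illustration.

```javascript
// Hypothetical helper: mark one member of a replica set config as hidden.
// cfg is a config document shaped like the one returned by rs.conf().
function hideMember(cfg, memberIndex) {
  if (!cfg.members || !cfg.members[memberIndex]) {
    throw new Error("no replica set member at index " + memberIndex);
  }
  // A hidden member must have priority 0 so it can never become primary.
  cfg.members[memberIndex].priority = 0;
  cfg.members[memberIndex].hidden = true;
  return cfg;
}

// Example configuration with made-up hosts, used only to show the effect.
var exampleCfg = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 1, host: "b.example.net:27017" },
    { _id: 2, host: "c.example.net:27017" }
  ]
};
var updatedCfg = hideMember(exampleCfg, 2);
```

In the mongo shell, connected to the primary, this would be applied with rs.reconfig(hideMember(rs.conf(), 2)), where 2 is the assumed array index of the member to hide.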
AFFECTED VERSIONS
MongoDB 2.6.0 and 2.6.1 are affected by this issue.
FIX VERSION
The fix is included in the 2.6.2 production release.
RESOLUTION DETAILS
Documents in temporary *_inc collections are explicitly not replicated. This restores the behavior prior to development version 2.5.5.
Original Description
Run the map reduce example on a 2.6 replica set.
From the oplog, I can see that mapReduce generates tmp collections <database.tmp.mr.collection_x_inc> whose documents have no _id field. This causes performance issues when these tmp collections are replicated to the secondaries.
> db.oplog.rs.find({ns:/_inc/})
{ "ts" : Timestamp(1400390715, 1), "h" : NumberLong("9062785211345050513"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 36, "1" : 256193 } }
{ "ts" : Timestamp(1400390715, 2), "h" : NumberLong("-6347065931779322235"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 1, "1" : 237298 } }
{ "ts" : Timestamp(1400390715, 3), "h" : NumberLong("5305159503718125362"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 2, "1" : 247543 } }
{ "ts" : Timestamp(1400390715, 4), "h" : NumberLong("242292647194800186"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 3, "1" : 246875 } }
{ "ts" : Timestamp(1400390715, 5), "h" : NumberLong("-3801567793714329373"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 4, "1" : 250808 } }
{ "ts" : Timestamp(1400390715, 6), "h" : NumberLong("-7467661728084668641"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 5, "1" : 266786 } }
{ "ts" : Timestamp(1400390715, 7), "h" : NumberLong("48771082501147428"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 6, "1" : 239294 } }
{ "ts" : Timestamp(1400390715, 8), "h" : NumberLong("7947765402550217396"), "v" : 2, "op" : "i", "ns" : "test.tmp.mr.docs_0_inc", "o" : { "0" : 7, "1" : 246862 } }
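To gauge how much of the oplog is taken up by these entries, the inserts can be tallied per temporary namespace. The following is a rough sketch, assuming oplog entries shaped like the output above. The helper name summarizeIncInserts and the namespace pattern are assumptions for illustration and are not part of the server.

```javascript
// Assumed pattern for the temporary incremental map/reduce namespaces
// seen in the oplog output, e.g. "test.tmp.mr.docs_0_inc".
var INC_NS = /\.tmp\.mr\..*_inc$/;

// Hypothetical helper: count replicated inserts per *_inc namespace and
// how many of the inserted documents have no _id field.
function summarizeIncInserts(entries) {
  var summary = {};
  entries.forEach(function (entry) {
    // Only insert operations into temporary *_inc collections are of interest.
    if (entry.op !== "i" || !INC_NS.test(entry.ns)) {
      return;
    }
    if (!summary[entry.ns]) {
      summary[entry.ns] = { inserts: 0, missingId: 0 };
    }
    summary[entry.ns].inserts += 1;
    if (!entry.o || entry.o._id === undefined) {
      summary[entry.ns].missingId += 1;
    }
  });
  return summary;
}

// Entries modeled on the oplog output (ts and h omitted), plus one
// made-up insert into a regular collection for contrast.
var sampleEntries = [
  { op: "i", ns: "test.tmp.mr.docs_0_inc", o: { "0": 36, "1": 256193 } },
  { op: "i", ns: "test.tmp.mr.docs_0_inc", o: { "0": 1, "1": 237298 } },
  { op: "i", ns: "test.docs", o: { _id: 1 } }
];
var incSummary = summarizeIncInserts(sampleEntries);
```

In the mongo shell the input could come from the oplog itself, for example summarizeIncInserts(db.getSiblingDB("local").oplog.rs.find({ns: /_inc$/}).limit(1000).toArray()). The limit of 1000 is an arbitrary cap to keep the result set small.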
Related to:
- SERVER-10154 Suppress replication of temporary collections used for mapReduce (Closed)
- SERVER-14168 Warning logged when incremental MR collections are unsuccessfully dropped on secondaries (Closed)