Core Server / SERVER-27312

MapReduce does not run on Secondary in sharded clusters even with out: inline

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Operating System: ALL

      Nature of the job: a fairly routine MapReduce job.

      Map:
      public static final String mapfunction = "function() { emit(this.custid, this.txnval); }";

      Reduce:
      public static final String reducefunction = "function(key, values) { return Array.sum(values); }";

      (Code is attached.)
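
      The attached code is not reproduced in the ticket body; the following is a minimal sketch of an equivalent job using the 3.x Java driver, with inline output (no output collection) and a secondary read preference. The class name, connection details, and the main() wiring are assumptions; only the map/reduce functions and the test.txns namespace come from the ticket.

      import com.mongodb.MongoClient;
      import com.mongodb.ReadPreference;
      import com.mongodb.client.MapReduceIterable;
      import com.mongodb.client.MongoCollection;
      import org.bson.Document;

      public class TxnMapReduce {

          public static final String mapfunction =
                  "function() { emit(this.custid, this.txnval); }";
          public static final String reducefunction =
                  "function(key, values) { return Array.sum(values); }";

          public static void main(String[] args) {
              // Hypothetical connection; the real deployment is the replica set / sharded
              // cluster described in this ticket.
              MongoClient client = new MongoClient("localhost", 27017);

              // Secondary read preference; no output collection is named, so the driver
              // uses inline output and the job should be eligible to run on a secondary.
              MongoCollection<Document> txns = client.getDatabase("test")
                      .getCollection("txns")
                      .withReadPreference(ReadPreference.secondary());

              MapReduceIterable<Document> results = txns.mapReduce(mapfunction, reducefunction);
              for (Document doc : results) {
                  System.out.println(doc.toJson());
              }

              client.close();
          }
      }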

      When executed on a Replica Set, the following log entry can be seen on the Slave:

      2016-11-23T15:05:26.735+0000 I COMMAND [conn671] command test.txns command: mapReduce { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true } planSummary: COUNT keyUpdates:0 writeConflicts:0 numYields:7 reslen:4331 locks:{ Global: { acquireCount: { r: 44 } }, Database: { acquireCount: { r: 3, R: 19 } }, Collection: { acquireCount: { r: 3 } } } protocol:op_query 124ms

      The out: { inline: 1 } ensures that the job is routed to the slave.

      When executed against the sharded collection (via mongos):

      mongos> db.txns.getShardDistribution()

      Shard Shard-0 at Shard-0/SG-shard3-281.devservers.mongodirector.com:27017,SG-shard3-282.devservers.mongodirector.com:27017
      data : 498KiB docs : 9474 chunks : 3
      estimated data per chunk : 166KiB
      estimated docs per chunk : 3158

      Shard Shard-1 at Shard-1/SG-shard3-284.devservers.mongodirector.com:27017,SG-shard3-285.devservers.mongodirector.com:27017
      data : 80KiB docs : 1526 chunks : 3
      estimated data per chunk : 26KiB
      estimated docs per chunk : 508

      Totals
      data : 579KiB docs : 11000 chunks : 6
      Shard Shard-0 contains 86.12% data, 86.12% docs in cluster, avg obj size on shard : 53B
      Shard Shard-1 contains 13.87% data, 13.87% docs in cluster, avg obj size on shard : 53B
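
      The same job was then issued through mongos with the same secondary read preference, which is what shows up as $queryOptions: { $readPreference: { mode: "secondary" } } in the shardedfinish log further down. Below is a sketch of an equivalent raw command via the Java driver; the mongos address and class name are assumptions, while the command fields match the ticket.

      import com.mongodb.MongoClient;
      import com.mongodb.ReadPreference;
      import com.mongodb.client.MongoDatabase;
      import org.bson.Document;

      public class ShardedTxnMapReduce {
          public static void main(String[] args) {
              // Hypothetical mongos address; the real cluster uses the SG-shard3-* hosts above.
              MongoClient mongos = new MongoClient("localhost", 27017);
              MongoDatabase test = mongos.getDatabase("test");

              Document cmd = new Document("mapReduce", "txns")
                      .append("map", "function() { emit(this.custid, this.txnval); }")
                      .append("reduce", "function(key, values) { return Array.sum(values); }")
                      .append("out", new Document("inline", 1));

              // The secondary read preference is attached to the command and forwarded by
              // mongos to the shards, yet the logs below show the work running on the primaries.
              Document result = test.runCommand(cmd, ReadPreference.secondary());
              System.out.println(result.toJson());

              mongos.close();
          }
      }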


      Shard 0:
      Mongos does not log the MapReduce job at all.
      The Slave only logs the deletion of the tmp collection.

      Primary:

      2016-11-24T08:46:30.828+0000 I COMMAND [conn357] command test.$cmd command: mapreduce.shardedfinish { mapreduce.shardedfinish: { mapreduce: "txns", map: function() { emit(this.custid, this.txnval); }, reduce: function(key, values) { return Array.sum(values); }, out: { inline: 1 }, query: null, sort: null, finalize: null, scope: null, verbose: true, $queryOptions: { $readPreference: { mode: "secondary" } } }, inputDB: "test", shardedOutputCollection: "tmp.mrs.txns_1479977190_0", shards: { Shard-0/SG-shard3-281.devservers.mongodirector.com:27017,SG-shard3-282.devservers.mongodirector.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 123, timing: { mapTime: 51, emitLoop: 116, reduceTime: 9, mode: "mixed", total: 123 }, counts: { input: 9474, emit: 9474, reduce: 909, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } }, Shard-1/SG-shard3-284.devservers.mongodirector.com:27017,SG-shard3-285.devservers.mongodirector.com:27017: { result: "tmp.mrs.txns_1479977190_0", timeMillis: 71, timing: { mapTime: 8, emitLoop: 63, reduceTime: 4, mode: "mixed", total: 71 }, counts: { input: 1526, emit: 1526, reduce: 197, output: 101 }, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1479977190000|103, electionId: ObjectId('7fffffff0000000000000001') } } }, shardCounts: { Shard-0/SG-shard3-281.devservers.mongodirector.com:27017,SG-shard3-282.devservers.mongodirector.com:27017: { input: 9474, emit: 9474, reduce: 909, output: 101 }, Shard-1/SG-shard3-284.devservers.mongodirector.com:27017,SG-shard3-285.devservers.mongodirector.com:27017: { input: 1526, emit: 1526, reduce: 197, output: 101 } }, counts: { emit: 11000, input: 11000, output: 202, reduce: 1106 } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:4368 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 115ms
      2016-11-24T08:46:30.830+0000 I COMMAND [conn46] CMD: drop test.tmp.mrs.txns_1479977190_0

      It looks like temp collections are created when the database/collection is sharded; I couldn't find anything specific about this behavior in the documentation (https://docs.mongodb.com/v3.2/core/map-reduce-sharded-collections/).
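
      As an aside, here is a small assumed check (not part of the attached code) that can be run against a shard primary while the job is in flight to observe the temporary map-reduce collections mentioned above; the class name and address are hypothetical.

      import com.mongodb.MongoClient;

      public class ListTempMapReduceCollections {
          public static void main(String[] args) {
              // Hypothetical shard-primary address.
              MongoClient shardPrimary = new MongoClient("localhost", 27018);

              // Temporary map-reduce collections are named tmp.mr.* / tmp.mrs.* and are
              // dropped once mongos has merged the per-shard results.
              for (String name : shardPrimary.getDatabase("test").listCollectionNames()) {
                  if (name.startsWith("tmp.mr")) {
                      System.out.println("temporary map-reduce collection: " + name);
                  }
              }
              shardPrimary.close();
          }
      }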

      Shard 1, Primary:
      2016-11-24T08:46:30.597+0000 I COMMAND [conn351] CMD: drop test.tmp.mr.txns_0
      2016-11-24T08:46:30.651+0000 I COMMAND [conn351] CMD: drop test.tmp.mrs.txns_1479977190_0
      2016-11-24T08:46:30.654+0000 I COMMAND [conn351] CMD: drop test.tmp.mr.txns_0
      2016-11-24T08:46:30.654+0000 I COMMAND [conn351] CMD: drop test.tmp.mr.txns_0
      2016-11-24T08:46:30.654+0000 I COMMAND [conn351] CMD: drop test.tmp.mr.txns_0

      Secondary:
      2016-11-24T08:46:30.838+0000 I COMMAND [repl writer worker 3] CMD: drop test.tmp.mrs.txns_1479977190_0

            Assignee: Unassigned
            Reporter: Dharshan Rangegowda (dharshanr@scalegrid.net)
            Votes: 0
            Watchers: 4
