  1. Core Server
  2. SERVER-14985

Merge stages in aggregation should be distributed beyond primary shard

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.6.0
    • Component/s: Aggregation Framework
    • Operating System: ALL

      http://docs.mongodb.org/manual/core/aggregation-pipeline-sharded-collections/

      "The second pipeline consists of the remaining pipeline stages and runs on the primary shard. The primary shard merges the cursors from the other shards and runs the second pipeline on these results. The primary shard forwards the final results to the mongos. In previous versions, the second pipeline would run on the mongos."

      • This prevents non-trivial aggregation pipeline queries from scaling out, because the merge step always runs on a single shard.
      • As a specific case, consider $redact: once you use it, all of your queries become aggregation queries, so 100% of your reads have to flow through the primary shard, even those that would otherwise be targeted queries.
      • Minor, but related: selection of the primary shard is usually implicit, i.e. random from the user's point of view, and cannot be changed afterwards.
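
      The concern in the points above can be illustrated with a toy model. This is only a sketch under stated assumptions, not server code: the shard names, the documents, the `PRIMARY` constant, and the functions `run_shard_part` and `aggregate` are all hypothetical. It models the behavior described in the quoted documentation, where partial results are merged on the primary shard, and counts how many documents each node handles during the merge step.

```python
from collections import Counter

# Hypothetical three-shard cluster; shard names and documents are made up.
SHARDS = {
    "shard0": [{"user": "a", "level": 1}, {"user": "a", "level": 5}],
    "shard1": [{"user": "b", "level": 2}],
    "shard2": [{"user": "c", "level": 3}, {"user": "c", "level": 4}],
}
PRIMARY = "shard0"  # assumed primary shard for the database


def run_shard_part(docs, user, max_level):
    """Stand-in for the first pipeline: a $match on the shard key
    followed by a $redact-like filter on an access level."""
    return [d for d in docs if d["user"] == user and d["level"] <= max_level]


def aggregate(user, max_level, merge_on_primary=True):
    """Return (results, merge_load), where merge_load counts the
    documents each node handles while merging partial results."""
    merge_load = Counter()
    results = []
    # Assumption: the node doing the merge is either the primary shard
    # (as in the quoted docs) or mongos (as in previous versions).
    merger = PRIMARY if merge_on_primary else "mongos"
    for docs in SHARDS.values():
        partial = run_shard_part(docs, user, max_level)
        merge_load[merger] += len(partial)
        results.extend(partial)
    return results, merge_load


if __name__ == "__main__":
    # All matching data for user "c" lives on shard2, yet the merge
    # work is accounted to shard0, the assumed primary shard.
    print(aggregate("c", 3))
    print(aggregate("c", 3, merge_on_primary=False))
```

      In this model the merge load always lands on one node regardless of where the matching data lives, which is the scaling limit this ticket describes.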

            Assignee: Unassigned
            Reporter: Henrik Ingo (Inactive), henrik.ingo@mongodb.com
            Votes: 1
            Watchers: 14

              Created:
              Updated:
              Resolved: