Core Server / SERVER-25652

Slow chunk migrations when there are large chunk counts. 3.0, 3.2, 3.3.11

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 3.0.8, 3.3.11
    • Component/s: Sharding
    • ALL
    • Steps to reproduce:
      • Create a two-shard cluster;
      • Stop the balancer;
      • Create an empty sharded collection and split it into 500,000 chunks.
        (100k chunks may be enough to make a noticeable difference, but at 500k the slowdown becomes very easy to observe.)
      • Start the balancer and observe the time taken to move each chunk.
    • Sharding 2016-09-19, Sharding 2016-10-10, Sharding 2016-10-31
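
As a rough illustration of the chunk-splitting step in the reproduction steps, the sketch below generates one `split` admin command document per split point. It is not the attached createChunks_v3.js; the namespace `test.foo`, the integer `_id` shard key, and the helper name `split_commands` are assumptions made for the example.

```python
from typing import Dict, Iterator


def split_commands(namespace: str, num_chunks: int,
                   key: str = "_id", step: int = 1) -> Iterator[Dict]:
    """Yield `split` command documents that cut a collection into num_chunks chunks.

    Assumes (hypothetically) a shard key on a single integer field, so that
    evenly spaced integers can serve as split points.
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    # N chunks require N - 1 split points.
    for i in range(1, num_chunks):
        yield {"split": namespace, "middle": {key: i * step}}


# Example: the first few of the 499,999 commands needed for 500,000 chunks.
# Each document would be sent to the admin database of a mongos.
commands = split_commands("test.foo", 500_000)
for _ in range(3):
    print(next(commands))
```

With the balancer stopped and the collection empty, each split is a metadata-only operation, which is why the test can create so many chunks without loading any data.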

      I've been testing the speed of chunk migrations in an all-on-one-server test cluster. Even when the chunks being migrated are empty (i.e. the chunk move itself takes only ~0.1 secs), the entire cycle run by the balancer takes much longer:

      Version               Balance round time
      3.0.8                 ~4.5 secs
      v3.3.11-30-gc96009e   ~9 secs

      From someone else's case with v3.2, on different servers and a different network from my test, I heard of a ~6 second cycle. Not sure if that was a replica set config db or the older SCCC-style one.
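
To put these round times in perspective, here is a back-of-the-envelope estimate. It assumes that the balancer moves one chunk of the collection per round and that half of the 500,000 chunks must move to the second shard. Both are assumptions for illustration, not measurements.

```python
def total_migration_days(chunks_to_move: int, round_secs: float,
                         chunks_per_round: int = 1) -> float:
    """Estimate the days needed to move chunks at a fixed balance round time."""
    # Ceiling division: a partially filled final round still costs a full round.
    rounds = -(-chunks_to_move // chunks_per_round)
    return rounds * round_secs / 86400.0


# Assumption: an even split across two shards means moving half of the chunks.
CHUNKS_TO_MOVE = 500_000 // 2

# Round times as reported in this ticket.
ROUND_SECS = {
    "3.0.8": 4.5,
    "3.2 (second-hand report)": 6.0,
    "v3.3.11-30-gc96009e": 9.0,
}

for version, secs in ROUND_SECS.items():
    days = total_migration_days(CHUNKS_TO_MOVE, secs)
    print(f"{version}: ~{days:.1f} days")
```

Under these assumptions the estimates come out at roughly 13.0, 17.4 and 26.0 days respectively, even though the empty chunk moves themselves would add up to only about 7 hours at ~0.1 secs each.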

      Can the balancer be changed so that a balance round moves multiple chunks of each collection, so long as they finish quickly? E.g. the balance round identifies candidate chunks for migration and keeps doing chunk moves for them serially until a window of, say, 10 secs completes.
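
The time-window idea could be sketched roughly as follows. This is only an illustration of the request, not the balancer's actual code; `run_balance_round`, `move_chunk`, `FakeClock` and `simulate_round` are hypothetical names, and the simulation uses a fake clock rather than real migrations.

```python
import time
from typing import Callable, Iterable, List


def run_balance_round(candidates: Iterable[str],
                      move_chunk: Callable[[str], None],
                      window_secs: float = 10.0,
                      clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Move candidate chunks serially until the time window has elapsed."""
    deadline = clock() + window_secs
    moved = []
    for chunk in candidates:
        # A new move starts only while the window is open, so a slow move
        # may overrun the window but is never followed by another one.
        if clock() >= deadline:
            break
        move_chunk(chunk)
        moved.append(chunk)
    return moved


class FakeClock:
    """Deterministic clock for the simulation, counting whole milliseconds."""

    def __init__(self) -> None:
        self.ms = 0

    def __call__(self) -> float:
        return self.ms / 1000


def simulate_round(num_candidates: int, move_ms: int,
                   window_secs: float = 10.0) -> int:
    """Return how many chunks one round moves if every move takes move_ms."""
    clock = FakeClock()

    def fake_move(_chunk: str) -> None:
        clock.ms += move_ms  # pretend the migration took move_ms

    chunks = [f"chunk-{i}" for i in range(num_candidates)]
    return len(run_balance_round(chunks, fake_move, window_secs, clock))


print(simulate_round(500, move_ms=100))     # empty chunks at ~0.1 secs each
print(simulate_round(500, move_ms=30_000))  # a single slow, data-heavy chunk
```

In this simulation, empty chunks at ~0.1 secs per move give 100 moves per 10 sec window, while a chunk that takes 30 secs is moved alone, which is how rounds behave when only one chunk is moved at a time.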

      At any rate, if data has been completely deleted from a big fraction of chunk ranges before adding a new shard, it would be good if those chunk moves happened a lot more quickly.

        1. chunksOnWrongTier_v2.js
          1 kB
        2. createChunks_v3.js
          3 kB

            Assignee:
            kaloian.manassiev@mongodb.com Kaloian Manassiev
            Reporter:
            akira.kurogane Akira Kurogane
            Votes:
            0
            Watchers:
            8
