  1. Core Server
  2. SERVER-24871

Add shard split feature (mitosis)

    • Type: New Feature
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding

      Rationale – In a large sharded cluster with heavy insert/update load, balancing data across shards via chunk migration is extremely slow, especially compared with the rate of data ingestion. After new shards are added, it often takes months for the data to fully rebalance. The extra load of chunk migrations also often causes the cluster to fail to serve actual traffic, making it impossible to keep the balancer running.

      Even with the new parallel chunk migration strategy in 3.4, rebalancing to a single new shard will not improve, since you can still move only one chunk at a time to the new shard. This means that when operating such a cluster, you have to plan and provision sufficient extra capacity months ahead of time. Otherwise you risk getting into a situation where load or disk utilization on existing shards is far too high, and the cluster then runs in a degraded state for months until the data rebalances and relieves the pressure.

      Proposed feature – Add a new admin command similar to split chunk, but instead of creating a new chunk, create a new shard and update the cluster metadata such that roughly half the chunks on the shard being split are now owned by the new shard.
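To make the metadata change concrete, here is a minimal Python sketch of reassigning roughly half of one shard's chunks to a new shard. It is an in-memory model only, not the server's implementation: the `(min_key, max_key)` chunk representation, the `split_shard_chunks` name, and the choice to move the upper half of the key ranges are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical chunk representation: a half-open key range (min_key, max_key).
Chunk = Tuple[int, int]


def split_shard_chunks(
    owners: Dict[Chunk, str], source: str, new_shard: str
) -> Dict[Chunk, str]:
    """Return a new chunk-to-shard map in which roughly half of the chunks
    owned by `source` are owned by `new_shard` instead.

    No data moves in this model: only ownership metadata changes.
    """
    if new_shard in owners.values():
        raise ValueError(f"shard {new_shard!r} already owns chunks")

    # Sort the source shard's chunks by key range so the split is deterministic.
    owned = sorted(chunk for chunk, shard in owners.items() if shard == source)
    if not owned:
        raise ValueError(f"shard {source!r} owns no chunks")

    # The source keeps the lower half (rounded up); the rest go to the new shard.
    keep = (len(owned) + 1) // 2
    updated = dict(owners)
    for chunk in owned[keep:]:
        updated[chunk] = new_shard
    return updated
```

Because only ownership entries are rewritten, the cost of this step depends on the number of chunks rather than on the amount of data they hold, which is the property the proposal relies on.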

      How this might work in practice – Say you have a shard that is a replica set of 3 mongod nodes. First you would add an additional 3 (or more) members to the replica set and wait for them to complete their initial sync. Then you would issue the new split shard command and designate which nodes will belong to the new shard, so that you end up with two replica sets of 3 (or more) members each. The chunk metadata can then be updated so that half the chunks on the original shard belong to the newly created shard, and each shard can then delete the data in the chunks it no longer owns.
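The workflow above can be sketched as a small planning function. This is a hypothetical model in plain Python, not an existing MongoDB command or driver API: `SplitPlan`, `plan_shard_split`, and the `min_members` parameter are names invented here, and the sketch assumes every listed member has already finished its initial sync.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SplitPlan:
    original_members: List[str]  # nodes that stay in the original replica set
    new_members: List[str]       # nodes that form the new shard's replica set
    original_chunks: List[str]   # chunks the original shard keeps
    new_chunks: List[str]        # chunks the new shard keeps


def plan_shard_split(
    members: List[str],
    new_members: List[str],
    chunks: List[str],
    min_members: int = 3,  # assumed minimum replica set size per shard
) -> SplitPlan:
    """Partition a replica set and its chunks into two shards."""
    unknown = set(new_members) - set(members)
    if unknown:
        raise ValueError(f"not members of the replica set: {sorted(unknown)}")

    designated = set(new_members)
    remaining = [m for m in members if m not in designated]
    if len(remaining) < min_members or len(designated) < min_members:
        raise ValueError(f"each shard needs at least {min_members} members")

    # The original shard keeps the first half of the chunks (rounded up).
    keep = (len(chunks) + 1) // 2
    return SplitPlan(remaining, list(new_members), chunks[:keep], chunks[keep:])


# Example: a 3-node shard that was grown to 6 nodes before the split.
plan = plan_shard_split(
    members=["a1", "a2", "a3", "b1", "b2", "b3"],
    new_members=["b1", "b2", "b3"],
    chunks=["c1", "c2", "c3", "c4"],
)
# After the split, each shard deletes the chunks listed for the other shard.
```

Both halves start with a full copy of the data, so each side can serve its own chunks immediately and delete the other side's chunks in the background.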

      With this mechanism, adding a new shard that already holds balanced data could be completed in seconds or minutes instead of months.

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            dai@foursquare.com Dai Shi
            Votes:
            0
            Watchers:
            19

              Created:
              Updated:
              Resolved: