  1. Core Server
  2. SERVER-95773

Resharding does not need to sample documents if the key is hashed

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability

      Resharding uses SamplingBasedSplitPolicy, which is a subclass of the regular InitialSplitPolicy base class.

      The function calculateHashedSplitPoints, defined on the parent class, is used only by other child classes
      (SplitPointsBasedSplitPolicy::SplitPointsBasedSplitPolicy and AbstractTagsBasedSplitPolicy). Based on code inspection, SamplingBasedSplitPolicy does not rely on this method.

      If the shard key consists of only a hashed field, resharding does not need to sample and can instead split the hashed key space deterministically among the recipients. This mitigates known issues with the $sample implementation and allows the final distribution of chunks to mirror the distribution of the customer's data without the downsides of sampling.
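
      A minimal sketch of what "splitting the space deterministically" could look like, assuming the hashed values span the signed 64-bit integer range and each recipient gets the same number of contiguous chunks. This is not the server's calculateHashedSplitPoints implementation; the function names, the chunks_per_shard parameter, and the use of None for the open MinKey/MaxKey bounds are all hypothetical and chosen only for illustration.

```python
# Illustrative only: evenly divide a signed 64-bit hash space among recipients.
# Assumption: hashed shard key values fall in [-2**63, 2**63 - 1].
INT64_MIN = -(2**63)
HASH_SPACE_SIZE = 2**64


def hashed_split_points(num_chunks):
    """Return the num_chunks - 1 boundaries that cut the hash space evenly."""
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    width = HASH_SPACE_SIZE // num_chunks
    return [INT64_MIN + i * width for i in range(1, num_chunks)]


def assign_ranges(recipients, chunks_per_shard=1):
    """Map each chunk range to a recipient without sampling any documents.

    Returns (recipient, lower, upper) tuples; None stands in for the
    open-ended MinKey / MaxKey bound at either end of the key space.
    """
    if not recipients:
        raise ValueError("at least one recipient is required")
    num_chunks = len(recipients) * chunks_per_shard
    bounds = [None] + hashed_split_points(num_chunks) + [None]
    ranges = []
    for i in range(num_chunks):
        # Contiguous blocks of chunks_per_shard chunks go to each recipient.
        owner = recipients[i // chunks_per_shard]
        ranges.append((owner, bounds[i], bounds[i + 1]))
    return ranges


if __name__ == "__main__":
    for owner, lower, upper in assign_ranges(["shardA", "shardB", "shardC"]):
        print(owner, lower, upper)
```

      The split points depend only on the number of recipients, so no $sample pass over the collection is needed. Whether chunks should be laid out in contiguous blocks or interleaved across recipients is a design choice this sketch does not settle.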

            Assignee:
            Unassigned
            Reporter:
            Lamont Nelson (lamont.nelson@mongodb.com)
            Votes:
            0
            Watchers:
            6

              Created:
              Updated: