Make analyzeShardKey command support sampling documents

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • 7.1.0-rc0, 7.0.0-rc7
    • Affects Version/s: None
    • Component/s: None
    • Fully Compatible
    • v7.0
    • Sharding NYC 2023-06-26, Sharding NYC 2023-07-10
    • 3

      According to the experiments in SERVER-68763, the aggregate command that the analyzeShardKey command runs to calculate the metrics about the characteristics of the shard key can take hours to run if the collection contains hundreds of millions of documents and the cardinality of the shard key is also very large. Given this, we should make the command support calculating the metrics based on sampled documents instead of all of the documents in the collection.
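
      To illustrate the idea only, here is a minimal Python sketch of computing shard key characteristics from a random sample rather than from every document. It is not the server implementation: the function name `shard_key_metrics`, its parameters, and the returned field names are hypothetical, and the collection is assumed to be an in-memory list of dicts.

```python
import random
from collections import Counter


def shard_key_metrics(docs, key_field, sample_size=None, seed=None):
    """Compute simple shard key characteristics, optionally from a sample.

    Hypothetical helper for illustration; `docs` is a list of dicts and
    `key_field` is the name of the candidate shard key field.
    """
    if not docs:
        raise ValueError("docs must not be empty")

    # Draw a uniform random sample without replacement when requested.
    # If the sample size covers the whole list, fall back to a full scan.
    if sample_size is not None and sample_size < len(docs):
        rng = random.Random(seed)
        docs = rng.sample(docs, sample_size)

    # Count how often each shard key value occurs in the examined docs.
    counts = Counter(doc[key_field] for doc in docs)
    most_common_value, most_common_count = counts.most_common(1)[0]

    return {
        "numDocs": len(docs),
        "numDistinctValues": len(counts),
        "isUnique": len(counts) == len(docs),
        "mostCommonValue": most_common_value,
        "mostCommonFrequency": most_common_count / len(docs),
    }


if __name__ == "__main__":
    # Synthetic data: 10,000 documents spread over 100 customer IDs.
    collection = [{"customerId": i % 100} for i in range(10_000)]
    print(shard_key_metrics(collection, "customerId"))
    print(shard_key_metrics(collection, "customerId", sample_size=1_000, seed=1))
```

      The sampled run examines only `sample_size` documents, so its cost no longer grows with the collection size. The trade-off is that the metrics become estimates; in particular, the number of distinct values seen in a sample is a lower bound on the true cardinality.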

            Assignee:
            Cheahuychou Mao
            Reporter:
            Cheahuychou Mao
            Votes:
            0
            Watchers:
            2