  1. Core Server
  2. SERVER-103340

Consider making resharding create _id index in the "building-index" phase instead of upon creating the temporary resharding collection

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability

      As described in this comment in SERVER-100264, the ReshardingCollectionCloner fetches documents in RecordId order, which is not _id order (unless the collection is clustered). Let N be the number of documents in each insert batch; in the worst case, inserting a batch dirties N different _id index pages. Making resharding create the _id index in the "building-index" phase, instead of upon creating the temporary resharding collection, may significantly speed up the cloning phase of resharding and reduce its resource utilization. This is what initial sync does. One downside is that if the collection contains documents with duplicate _id values (which should be very rare), the duplicates would not be caught until the "building-index" state instead of in the "cloning" state.
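The page-dirtying argument above can be illustrated with a rough simulation. This is only a toy model, not the server's storage engine: it assumes the finished _id index is a sorted run of keys split into fixed-size leaf pages, and the names `pages_touched_per_batch`, `simulate`, and `keys_per_page` are hypothetical, introduced here for illustration. It compares how many distinct leaf pages one insert batch touches when documents arrive in an order unrelated to _id (as with RecordId order on a non-clustered collection) versus in _id order (as a deferred, sorted index build would see them).

```python
import random


def pages_touched_per_batch(ids_in_insert_order, batch_size, keys_per_page):
    """Return, for each insert batch, how many distinct index leaf pages it touches.

    Toy model (an assumption, not the real B-tree layout): a key's leaf page
    is its rank in sorted order divided by keys_per_page.
    """
    rank = {key: i for i, key in enumerate(sorted(ids_in_insert_order))}
    counts = []
    for start in range(0, len(ids_in_insert_order), batch_size):
        batch = ids_in_insert_order[start:start + batch_size]
        counts.append(len({rank[key] // keys_per_page for key in batch}))
    return counts


def simulate(num_docs=100_000, batch_size=100, keys_per_page=200, seed=0):
    """Return the worst-case pages touched per batch for both insert orders."""
    rng = random.Random(seed)
    ids = list(range(num_docs))
    # Shuffle so arrival order is unrelated to _id order, standing in for
    # RecordId order on a non-clustered collection.
    rng.shuffle(ids)
    record_id_order = pages_touched_per_batch(ids, batch_size, keys_per_page)
    # Sorted arrival stands in for building the index after cloning.
    id_order = pages_touched_per_batch(sorted(ids), batch_size, keys_per_page)
    return max(record_id_order), max(id_order)


if __name__ == "__main__":
    worst_unordered, worst_ordered = simulate()
    print("worst pages touched per batch, unordered arrival:", worst_unordered)
    print("worst pages touched per batch, _id-ordered arrival:", worst_ordered)
```

With these default parameters, every batch in the _id-ordered case falls within a single leaf page, while a batch in the unordered case spreads across many pages, bounded above by the batch size N. Real index pages split and vary in size, so the absolute numbers are not meaningful; only the contrast is.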

            Assignee: Unassigned
            Reporter: Cheahuychou Mao (cheahuychou.mao@mongodb.com)
            Votes: 0
            Watchers: 5
