-
Type: Task
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Cluster Scalability
As described in a comment on SERVER-100264, the ReshardingCollectionCloner fetches documents in RecordId order, which is not _id order unless the collection is clustered. Let N be the number of documents in each insert batch. In the worst case, every document in a batch lands on a different _id index page, so a single batch dirties N index pages. Having resharding create the _id index in the "building-index" phase, instead of when it creates the temporary resharding collection, may significantly speed up the cloning phase of resharding and reduce its resource utilization. This is what initial sync does. One downside is that if the collection contains documents with duplicate _id values (which should be very rare), the duplicates would not be caught until the "building-index" state instead of in the "cloning" state.
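The worst case described above can be illustrated with a toy model. The sketch below is not the server's code and does not model the real storage engine's B-tree. It assumes integer _id values, fixed-size leaf pages, and a hypothetical helper named pages_touched_per_batch, and it only compares how many distinct index pages a batch touches when documents arrive in _id order versus in an order unrelated to _id.

```python
import random


def pages_touched_per_batch(ids, batch_size, keys_per_page):
    """Average number of distinct index pages dirtied per insert batch.

    Toy model (an assumption, not the real index layout): the _id index
    is a sorted key space cut into fixed-size leaf pages, so key k lives
    on page k // keys_per_page.
    """
    touched = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        touched.append(len({k // keys_per_page for k in batch}))
    return sum(touched) / len(touched)


random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical sizes chosen only for illustration.
num_docs, batch_size, keys_per_page = 100_000, 100, 100

id_order = list(range(num_docs))   # documents arrive sorted by _id
record_id_order = id_order[:]      # the same documents ...
random.shuffle(record_id_order)    # ... in an order unrelated to _id

sorted_avg = pages_touched_per_batch(id_order, batch_size, keys_per_page)
shuffled_avg = pages_touched_per_batch(record_id_order, batch_size, keys_per_page)

print(f"pages per batch, _id order:      {sorted_avg:.1f}")
print(f"pages per batch, RecordId order: {shuffled_avg:.1f}")
```

In this model a batch that arrives in _id order touches a single page, while a batch in unrelated order touches close to N pages, which is the cost that deferring the _id index build to the "building-index" phase would avoid during cloning.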
- is related to
-
SERVER-100264 Resharding Natural Order Pipeline Does Not Respect reshardingCollectionClonerBatchSizeInBytes
-
- In Code Review
-