- Type: Task
- Resolution: Won't Do
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Replication
In SERVER-73536, we went with a naive implementation for estimating the size of a bulkWrite command (excluding its ops): we simply serialize a command object with the top-level fields copied over and placeholders added as needed, and take the size of the result (a rough sketch follows the list below).
Our rationale for this was that:
- we only do it once up-front per bulkWrite command mongos receives;
- for most bulkWrite commands, we expect the ops field (which we skip serializing here) to take up the bulk of the command;
- this is strictly less expensive than serializing an actual sub-batch command, which we often do numerous times for a single incoming request on mongos that targets multiple shards.
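For illustration only, a minimal standalone C++ sketch of that naive approach; BulkWriteCommand, the field names, and the placeholder values are hypothetical stand-ins, not the server's actual command or BSON types.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a parsed bulkWrite command: only the top-level
// fields matter here, since the (potentially huge) ops array is skipped.
struct BulkWriteCommand {
    std::map<std::string, std::string> topLevelFields;  // field name -> serialized value
};

// Naive base-size estimate: materialize a serialized form of the command
// without its ops, add a placeholder for fields appended later, and simply
// measure the result. The cost is proportional to the size of the top-level
// fields, but it is only paid once per incoming command.
std::size_t estimateBaseSizeNaive(const BulkWriteCommand& cmd) {
    std::string serialized = "{";
    for (const auto& [name, value] : cmd.topLevelFields) {
        serialized += "\"" + name + "\":" + value + ",";
    }
    serialized += "\"ops\":[]}";  // placeholder for the ops array added per sub-batch
    return serialized.size();
}
```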
That said, for certain workloads (e.g. when all writes target a single shard so we rarely split batches, and/or the command has large top-level fields), this could prove costly. When we do performance testing, it may be worth re-evaluating this. A smarter implementation could compute an estimate arithmetically, without actually serializing the data, similar to what we do for estimating the sizes of individual ops.
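A rough sketch of what such an arithmetic-only estimate could look like, reusing the hypothetical BulkWriteCommand from the sketch above; the constants reflect generic BSON layout (4-byte length prefix, trailing NUL, one type byte and a NUL-terminated name per element), not the server's actual accounting.

```cpp
#include <cstddef>
#include <string>

// Arithmetic estimate: sum per-field sizes directly instead of building a
// serialized object. Assumes values are already held in their serialized form.
std::size_t estimateBaseSizeArithmetic(const BulkWriteCommand& cmd) {
    constexpr std::size_t kDocOverhead = 5;       // 4-byte length prefix + trailing 0x00
    constexpr std::size_t kPerFieldOverhead = 2;  // type byte + field-name NUL terminator

    std::size_t total = kDocOverhead;
    for (const auto& [name, value] : cmd.topLevelFields) {
        total += kPerFieldOverhead + name.size() + value.size();
    }
    // Account for the empty ops array placeholder appended per sub-batch.
    total += kPerFieldOverhead + std::string("ops").size() + kDocOverhead;
    return total;
}
```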
- related to SERVER-81086: Complete TODO listed in SERVER-78301 (Closed)