- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Execution
fixDocumentForInsert goes through each document to insert four times:
1. First, we validate the document's depth here.
2. Then we iterate through the document to validate it (check whether there are Timestamps needing fixing, whether _id appears more than once, etc.).
3. If _id is not the first element in the BSON, we fetch the _id element, which under the hood iterates through the BSON doc again.
4. We iterate through the BSON doc yet again to copy its elements into the new BSON doc here.
We should need at most two passes to do this (maybe even fewer if we are clever), and some of these steps are easy to avoid. For example, step (3) can be skipped entirely by remembering where the _id field is during step (2); a rough sketch of that shape follows.
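A minimal sketch of the two-pass idea, assuming the existing BSONObj/BSONObjBuilder APIs. InsertScanResult, scanForInsert, and rewriteForInsert are hypothetical names, not existing helpers, and the actual validation/timestamp-fixing logic is elided; the point is only that the _id position and the "needs rewriting at all?" decision can fall out of the same pass that already validates the document:

```cpp
#include "mongo/bson/bsonelement.h"
#include "mongo/bson/bsonobj.h"
#include "mongo/bson/bsonobjbuilder.h"

namespace mongo {

// Hypothetical result of a single combined scan. Depth checks, duplicate-_id
// checks, and Timestamp(0, 0) handling would live inside scanForInsert; they
// are elided here because only the traversal structure matters.
struct InsertScanResult {
    bool idSeen = false;
    bool idIsFirst = false;
    BSONElement idElem;             // only meaningful if idSeen
    bool needsTimestampFix = false;
};

// Pass 1: validate and, in the same loop, remember where _id is so we never
// have to re-scan the document just to fetch it (today's step 3).
InsertScanResult scanForInsert(const BSONObj& doc) {
    InsertScanResult res;
    bool first = true;
    for (auto&& elem : doc) {
        if (elem.fieldNameStringData() == "_id") {
            // Duplicate-_id and depth uasserts would go here.
            res.idSeen = true;
            res.idIsFirst = first;
            res.idElem = elem;
        }
        if (elem.type() == BSONType::bsonTimestamp) {
            res.needsTimestampFix = true;
        }
        first = false;
    }
    return res;
}

// Pass 2 (and last): build the output document, emitting the remembered _id
// first and then everything else, generating an _id only if none was found.
// (Actual timestamp fixing is omitted for brevity.)
BSONObj rewriteForInsert(const BSONObj& doc, const InsertScanResult& res) {
    if (res.idIsFirst && !res.needsTimestampFix) {
        return doc;  // already in the desired shape; no copy needed
    }
    BSONObjBuilder b(doc.objsize() + 16);
    if (res.idSeen) {
        b.append(res.idElem);
    } else {
        b.appendOID("_id", nullptr, /*generateIfBlank=*/true);
    }
    for (auto&& elem : doc) {
        if (elem.fieldNameStringData() != "_id") {
            b.append(elem);
        }
    }
    return b.obj();
}

}  // namespace mongo
```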
I also think we can generate the _id ObjectIds (b.appendOID()) and reserve optimes to fill in timestamps in batches instead of one at a time; the attached flamegraphs show this taking a considerable amount of time. A sketch of the batching idea is below.
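A rough sketch of what batching the _id generation could look like. generateIdsForBatch is a hypothetical helper, and OID::gen() stands in for whatever appendOID does internally per document today; the same shape would apply to reserving a block of optimes up front for timestamp fixing:

```cpp
#include <cstddef>
#include <vector>

#include "mongo/bson/oid.h"

namespace mongo {

// Hypothetical helper: generate ObjectIds for an entire insert batch up front
// instead of one appendOID() call per document, amortizing the per-call
// overhead across the batch.
std::vector<OID> generateIdsForBatch(std::size_t nDocs) {
    std::vector<OID> ids;
    ids.reserve(nDocs);
    for (std::size_t i = 0; i < nDocs; ++i) {
        ids.push_back(OID::gen());
    }
    return ids;
}

}  // namespace mongo
```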
See comments for more info.
- is related to: SERVER-83148 Investigate hand parsing bulkWrite command instead of IDL (Closed)