Limit the BufferedBulkInserter's batch size by bytes

    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 100.5.2
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      As part of TOOLS-1956 we removed the byte limit on batch sizes in the BufferedBulkInserter. (See mtc and tools.)

      This means each batch of the BufferedBulkInserter can hold up to ~16 GB of data before it gets flushed. The theoretical maximum amount of data that can be held across all BufferedBulkInserters in mongorestore is ~16 GB * NumParallelCollections * NumInsertionWorkers. This is ~64 GB by default.
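As a rough check of the figures above, the arithmetic can be sketched in Go. The inputs are assumptions, not values stated in this ticket: 1000 documents per batch, a 16 MB maximum document size, 4 parallel collections and 1 insertion worker per collection. The function name worstCaseBuffered is hypothetical, and decimal units are used so the results match the ticket's round numbers.

```go
package main

import "fmt"

// Decimal units, chosen to match the round figures in the ticket.
const (
	mb int64 = 1000 * 1000
	gb int64 = 1000 * mb
)

// worstCaseBuffered returns an upper bound on the bytes buffered at once:
// one full batch per insertion worker per collection being restored.
// (Hypothetical helper, for illustration only.)
func worstCaseBuffered(docsPerBatch, docBytes, parallelCollections, insertionWorkers int64) int64 {
	return docsPerBatch * docBytes * parallelCollections * insertionWorkers
}

func main() {
	// Assumed defaults: 1000 documents per batch, 4 parallel collections,
	// 1 insertion worker per collection.
	fmt.Printf("16 MB docs: %d GB\n", worstCaseBuffered(1000, 16*mb, 4, 1)/gb) // prints "16 MB docs: 64 GB"
	fmt.Printf("2 MB docs: %d GB\n", worstCaseBuffered(1000, 2*mb, 4, 1)/gb)   // prints "2 MB docs: 8 GB"
}
```

Under the same assumptions, even 2 MB documents could leave about 8 GB buffered at once.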

      This can have a severe impact on performance, even for average document sizes of 1-2 MB.

      We should limit batches to 48 MB. The BufferedBulkInserter will flush its batch whenever the document count reaches the batchSize or the total size of the documents in the batch reaches 48 MB, whichever comes first.
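The flush rule described above could look roughly like the following Go sketch. It is not the actual BufferedBulkInserter code: the batcher type, its fields and the send callback are hypothetical, and 48 MB is taken to mean 48,000,000 bytes, which is an assumption.

```go
package main

import "fmt"

// batcher is a hypothetical stand-in for the BufferedBulkInserter.
type batcher struct {
	docs      [][]byte
	byteCount int
	maxDocs   int                       // document-count limit (batchSize)
	maxBytes  int                       // byte limit per batch
	send      func(docs [][]byte) error // writes one batch to the server
}

// Insert buffers doc, then flushes if either limit has been reached.
func (b *batcher) Insert(doc []byte) error {
	b.docs = append(b.docs, doc)
	b.byteCount += len(doc)
	if len(b.docs) >= b.maxDocs || b.byteCount >= b.maxBytes {
		return b.Flush()
	}
	return nil
}

// Flush sends any buffered documents and resets the buffer.
func (b *batcher) Flush() error {
	if len(b.docs) == 0 {
		return nil
	}
	err := b.send(b.docs)
	b.docs = nil
	b.byteCount = 0
	return err
}

func main() {
	var sizes []int
	b := &batcher{
		maxDocs:  1000,
		maxBytes: 48 * 1000 * 1000, // assumed meaning of "48 MB"
		send: func(docs [][]byte) error {
			sizes = append(sizes, len(docs))
			return nil
		},
	}
	doc := make([]byte, 2*1000*1000) // one 2 MB document, reused for every insert
	for i := 0; i < 50; i++ {
		if err := b.Insert(doc); err != nil {
			panic(err)
		}
	}
	if err := b.Flush(); err != nil {
		panic(err)
	}
	fmt.Println(sizes) // prints "[24 24 2]"
}
```

Because this sketch checks the limits after appending, a batch can exceed the byte limit by up to one document. Checking before appending would keep every batch at or under the limit, at the cost of a slightly more involved Insert.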

      The Go driver splits batches over 48 MB, so there is no benefit to having batches larger than this.

      Additionally, this will provide a limit for the size of the sync.Pool in TOOLS-1856.

       

            Assignee:
            Tim Fogarty
            Reporter:
            Tim Fogarty
            Evgeni Dobranov (Inactive)
            Votes:
            1
            Watchers:
            8
