Node.js Driver / NODE-2793

Add streaming support to BSON

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: js-bson

      Parsing large BSON blobs is difficult because this library requires the entire blob to be loaded into memory.

      For example, I'm trying to inspect the data in a particular section of a mongodump file, which is entirely BSON. Unfortunately, the file is tens of gigabytes in size, so loading it is extremely memory heavy. In some cases these files exceed the memory of the host, making them impossible to parse. If this library supported piped streams, the memory footprint would be reduced drastically.

      I am aware of the 'deserializeStream' method; however, it does not seem to have anything to do with Node streams, as it still requires a buffer to be passed in rather than accepting a piped stream.

      There is also a third-party implementation here: https://github.com/timkuijsten/node-bson-stream. However, it is no longer maintained; the last update was over 5 years ago. It would be nice to have first-party support.

      Expected usage:

       

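      const fs = require('fs');
      const BSON = require('bson');

      // Proposed API from this request: BSON.streamParser is what I would like
      // to exist; opts is a placeholder for whatever options it would accept.
      const opts = {};
      const bsonStreamParser = new BSON.streamParser(opts);
      const fileStream = fs.createReadStream('./myData.dump');

      fileStream
        .pipe(bsonStreamParser)
        .on('data', (deserializedBsonDocument) => {
          // logic
        });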
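
For anyone needing a stopgap, the usage above could be approximated with a small Transform stream. This is only a rough sketch, not part of the bson package: `BsonFrameSplitter` is a hypothetical name, and it assumes the input is a plain concatenation of BSON documents. It relies on the BSON format starting every document with a little-endian int32 holding that document's total byte length, so only one document (plus a partial chunk) needs to be in memory at a time.

```javascript
const { Transform } = require('stream');

// Hypothetical helper (not provided by the bson package): splits a byte
// stream into one Buffer per BSON document using the int32 length prefix.
class BsonFrameSplitter extends Transform {
  constructor(options = {}) {
    // Each emitted item is a whole document, so the readable side is in object mode.
    super({ ...options, readableObjectMode: true });
    this.pending = Buffer.alloc(0); // bytes received but not yet emitted
  }

  _transform(chunk, _encoding, callback) {
    this.pending = this.pending.length
      ? Buffer.concat([this.pending, chunk])
      : chunk;

    let offset = 0;
    // Need at least 4 bytes to read the length prefix of the next document.
    while (this.pending.length - offset >= 4) {
      const size = this.pending.readInt32LE(offset);
      // The smallest valid document is 5 bytes: 4 length bytes plus a terminator.
      if (size < 5) {
        return callback(new Error(`Invalid BSON document length: ${size}`));
      }
      // The document is not complete yet; wait for more data.
      if (this.pending.length - offset < size) break;
      // Copy the bytes so the emitted Buffer does not pin the larger chunk.
      this.push(Buffer.from(this.pending.subarray(offset, offset + size)));
      offset += size;
    }

    this.pending = this.pending.subarray(offset);
    callback();
  }

  _flush(callback) {
    // Leftover bytes at end of input mean the last document was truncated.
    callback(this.pending.length ? new Error('Stream ended mid-document') : null);
  }
}
```

Piping `fs.createReadStream(...)` into `new BsonFrameSplitter()` and calling `BSON.deserialize` on each emitted Buffer in the 'data' handler would then yield one deserialized document at a time. The splitter deliberately does no deserialization itself, so it has no dependency on the bson package.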

       

            Assignee: Unassigned
            Reporter: unusualbob (Rob)