- Type: Improvement
- Resolution: Works as Designed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: BSON
Since BSON.encode returns a BSON instance and BSON.decode requires a BSON instance, they both do an extra copy of the bytes.
For example, encoding a RawBSONDocument with bson.BSON.encode takes about 1.65 times as long as calling bson._dict_to_bson directly (22.8 msec vs. 13.8 msec per loop):
$ python -m timeit -s 'from bson import BSON, DEFAULT_CODEC_OPTIONS, _dict_to_bson; from bson.raw_bson import RawBSONDocument;raw = RawBSONDocument(BSON.encode({"s": "s"*1024*1024*15}))' 'BSON.encode(raw)'
10 loops, best of 3: 22.8 msec per loop
$ python -m timeit -s 'from bson import BSON, DEFAULT_CODEC_OPTIONS, _dict_to_bson; from bson.raw_bson import RawBSONDocument;raw = RawBSONDocument(BSON.encode({"s": "s"*1024*1024*15}))' '_dict_to_bson(raw, False, DEFAULT_CODEC_OPTIONS)'
100 loops, best of 3: 13.8 msec per loop
Perhaps we should add new encode and decode functions to work with bytes as BSON without the extra copy.
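To illustrate where the extra copy comes from, here is a minimal standard-library sketch. It assumes only that bson.BSON is a subclass of bytes. WrappedBytes and the module-level encode function are hypothetical stand-ins, not the real bson API, and the payload merely pretends to be the output of the low-level encoder.

```python
# Hypothetical stand-in for bson.BSON (assumed to be a bytes subclass).
class WrappedBytes(bytes):
    @classmethod
    def encode(cls, payload: bytes) -> "WrappedBytes":
        # Building a bytes subclass instance allocates a new object,
        # so the whole payload is copied into it.
        return cls(payload)


def encode(payload: bytes) -> bytes:
    # Hypothetical module-level variant: return the encoder's bytes
    # unchanged, with no wrapper object and therefore no second copy.
    return payload


# Pretend this buffer was produced by the low-level encoder.
payload = b"s" * (1024 * 1024)

wrapped = WrappedBytes.encode(payload)  # new object holding a copy of the data
plain = encode(payload)                 # the very same object as payload
```

The wrapper result compares equal to the payload but is a distinct object with its own buffer, while the plain function returns the original object. The same reasoning would apply on the decode side if raw bytes had to be wrapped in a BSON instance before decoding.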
- is related to
  - PYTHON-1696 Stop encouraging the use of BSON.decode as a class method (Closed)
- related to
  - PYTHON-1785 Provide utility encode and decode methods in BSON module (Closed)
  - PYTHON-1404 _cbson_dict_to_bson should not copy RawBSONDocument.raw object (Closed)