- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
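When the C message module encodes a RawBSONDocument, it inflates the top-level keys of the raw document (decoding their values to Python objects) and then encodes them back into BSON.
This is a bug because not all BSON values can be round-tripped in Python. For example, a UUID may be inadvertently changed: in the session below, the binary subtype of the 'u' field is 4 in the document that was inserted but 3 in the document that is read back.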
>>> from pymongo import MongoClient
>>> from bson.binary import Binary
>>> from uuid import uuid4
>>> from bson import BSON
>>> from bson.raw_bson import RawBSONDocument, DEFAULT_RAW_BSON_OPTIONS
>>> client = MongoClient()
>>> coll = client.t.t
>>> doc = {'_id': 1, 'u': Binary(uuid4().bytes, 4)}
>>> raw = RawBSONDocument(BSON.encode(doc))
>>> coll.insert_one(raw)
<pymongo.results.InsertOneResult object at 0x103d7e948>
>>> raw_coll = coll.with_options(codec_options=DEFAULT_RAW_BSON_OPTIONS)
>>> raw2 = raw_coll.find_one()
>>> raw.raw
b'&\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05u\x00\x10\x00\x00\x00\x04\xdfM\xd3,r\x19H\xb8\x87h\x17\x81\xd2q\xcaK\x00'
>>> raw2.raw
b'&\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05u\x00\x10\x00\x00\x00\x03\xdfM\xd3,r\x19H\xb8\x87h\x17\x81\xd2q\xcaK\x00'
>>> raw == raw2
False
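To make the difference easier to see, the two byte strings from the session above can be compared position by position. This is only an illustrative sketch using the standard library; the helper name diff_bytes is made up for this example and is not part of PyMongo.

```python
# Byte strings copied verbatim from the session in this ticket.
raw_before = (b'&\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05u\x00'
              b'\x10\x00\x00\x00\x04\xdfM\xd3,r\x19H\xb8\x87h\x17\x81'
              b'\xd2q\xcaK\x00')
raw_after = (b'&\x00\x00\x00\x10_id\x00\x01\x00\x00\x00\x05u\x00'
             b'\x10\x00\x00\x00\x03\xdfM\xd3,r\x19H\xb8\x87h\x17\x81'
             b'\xd2q\xcaK\x00')


def diff_bytes(a, b):
    """Return (index, byte_in_a, byte_in_b) for every differing position.

    Hypothetical helper for this example; assumes equal-length inputs.
    """
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = diff_bytes(raw_before, raw_after)
print(len(raw_before), len(raw_after))  # 38 38
print(diffs)                            # [(20, 4, 3)]
```

The only differing byte is at offset 20, which is the binary subtype byte of the 'u' element: it follows the element type (0x05), the field name, and the 4-byte length. The 16 bytes of UUID data themselves are unchanged.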
This is also a performance issue because decoding and encoding a RawBSONDocument is unnecessary.
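The fix is simply to make the write_dict function check for RawBSONDocument, so that the document's existing raw bytes are written as-is instead of being decoded and re-encoded.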
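The idea behind the fix can be sketched in pure Python. This is not the actual C implementation: RawDocument is a hypothetical stand-in for RawBSONDocument, and write_dict here takes the regular encoder as a parameter purely so the example is self-contained.

```python
class RawDocument:
    """Hypothetical stand-in for RawBSONDocument: holds already-encoded bytes."""

    def __init__(self, raw):
        self.raw = raw


def write_dict(doc, encode_mapping):
    """Return the encoded bytes for doc.

    Sketch only: raw documents take a fast path and are passed through
    byte-for-byte; anything else goes to the regular encoder.
    """
    if isinstance(doc, RawDocument):
        return doc.raw  # no decode/re-encode, so the bytes cannot change
    return encode_mapping(doc)


# Usage: an encoder that records its calls shows the fast path skips it.
calls = []


def recording_encoder(doc):
    calls.append(doc)
    return b'encoded'


original = b'\x05\x00\x00\x00\x00'  # an empty BSON document
passed_through = write_dict(RawDocument(original), recording_encoder)
regular = write_dict({'a': 1}, recording_encoder)
```

With this check in place the raw bytes are returned unchanged and the encoder is never invoked for raw documents, which addresses both the correctness and the performance concern described above.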