Type: Spec Change
Resolution: Unresolved
Priority: Minor - P4
Component/s: BSON
This applies to the BSON Corpus spec: https://github.com/mongodb/specifications/blob/master/source/bson-corpus/bson-corpus.rst
The following test case exposed a bug in PyMongo's array decoding: https://github.com/mongodb/specifications/blob/master/source/bson-corpus/tests/array.json#L34-L37
PyMongo validates the array's size by reading the int32 length at the start of the array, jumping to the last byte that length implies, and checking whether that byte is 0x00: https://github.com/mongodb/mongo-python-driver/blob/3.3.0/bson/__init__.py#L158-L161
The interesting thing about the above test case is that the byte at that position is 0x00. PyMongo happily accepts such an array without raising an error, even though the 0x00 byte is part of the field value, not the terminator for the array.
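The check described above can be sketched roughly as follows. This is a simplified illustration, not PyMongo's actual code: naive_eoo_check is a hypothetical name, and the bytes are a hand-built array [10] whose length prefix has been decreased by one, not the corpus test case itself.

```python
import struct


def naive_eoo_check(data, position):
    """Return True if the byte at the declared end of the array is 0x00.

    Hypothetical helper that mirrors the check described above; it never
    confirms that the elements actually stop at that byte.
    """
    (size,) = struct.unpack_from("<i", data, position)
    end = position + size - 1
    return data[end:end + 1] == b"\x00"


# Array body for [10]: int32 length, one element (type 0x10, key "0",
# little-endian int32 value 10), then the 0x00 terminator.
element = b"\x10" + b"0\x00" + struct.pack("<i", 10)
correct = struct.pack("<i", 4 + len(element) + 1) + element + b"\x00"
too_short = struct.pack("<i", 4 + len(element)) + element + b"\x00"

ok_correct = naive_eoo_check(correct, 0)
# The shortened length points at the last byte of the int32 value 10,
# which is also 0x00, so the naive check passes here as well.
ok_too_short = naive_eoo_check(too_short, 0)
```

Both calls return True, which is the false acceptance described above: a single-byte terminator check cannot tell a real terminator from a zero byte inside a value.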
PyMongo has the same problem when decoding sub-documents, but the BSON corpus tests don't have a case for this. I'm proposing that we add one to document.json like this:
{ "description": "Subdocument too short, but terminator looks like EOO.", "bson": "140000000361000b0000001062000a0000000000" }
For reference, the above BSON is the following document, with the declared length of subdocument a decreased by one (0x0b instead of 0x0c):
{'a': {'b': 10}}
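To see why these bytes slip past a terminator-only check, the layout can be walked by hand. The following is a minimal sketch that assumes the fixed layout of this one document (a single subdocument holding a single int32 element); the variable names are illustrative, and it is not a general BSON decoder.

```python
import struct

doc = bytes.fromhex("140000000361000b0000001062000a0000000000")

# Outer layout: int32 total length, type byte 0x03, key "a\0", subdocument.
(total,) = struct.unpack_from("<i", doc, 0)
sub_start = 4 + 1 + 2
(declared,) = struct.unpack_from("<i", doc, sub_start)

# Walk the single int32 element inside the subdocument to find its real end.
pos = sub_start + 4
assert doc[pos] == 0x10                 # int32 element type
key_end = doc.index(b"\x00", pos + 1)   # NUL that ends the key "b"
value_end = key_end + 1 + 4             # index just past the 4-byte value
actual = value_end + 1 - sub_start      # real length, including terminator

# The byte a terminator-only check would inspect.
declared_end_byte = doc[sub_start + declared - 1]

# A stricter check: the elements must end exactly where the declared
# terminator sits.
strict_ok = value_end == sub_start + declared - 1
```

Here declared is 11 while actual is 12, yet declared_end_byte is still 0x00 because it is the high byte of the int32 value 10. The stricter comparison fails, which is the behavior the proposed test case is meant to require.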