- Type: Improvement
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 3.7.2
- Component/s: None
- None
pymongo behaves differently under Python 2 and Python 3 when serializing the same object.
Python 2.7.12:
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bson import BSON
>>> a = {u"unicode": b"binary\xda"}
>>> BSON.encode(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Gilad\Envs\mongobin2\lib\site-packages\bson\__init__.py", line 1027, in encode
    return cls(_dict_to_bson(document, check_keys, codec_options))
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 'binary\xda'
>>>
Python 3.5.2:
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:18:55) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bson import BSON
>>> a = {u"unicode": b"binary\xda"}
>>> BSON.encode(a)
b'\x1a\x00\x00\x00\x05unicode\x00\x07\x00\x00\x00\x00binary\xda\x00'
>>>
I would like to add a parameter to CodecOptions that makes pymongo treat Python 2's `str` as bytes.
A Python 2 `str` object can hold binary data. In that case I think pymongo should treat the object as `bytes`, because that is what it actually is.
To avoid backward compatibility issues, I think that adding an optional parameter to CodecOptions is the best solution here.
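Until such an option exists, one possible workaround is to wrap byte strings explicitly before encoding, since pymongo stores `bson.binary.Binary` values as BSON binary on both Python versions. The following is only a minimal sketch of that idea: `wrap_bytes` is a hypothetical helper, not part of pymongo, and it assumes the wrapper type is a `bytes` subclass (as `Binary` is).

```python
def wrap_bytes(value, wrapper_type):
    """Recursively wrap every bytes value in wrapper_type.

    Hypothetical helper for illustration, not a pymongo API.
    On Python 2, bytes is str, so every native str gets wrapped,
    while unicode values are left untouched.
    """
    # Already wrapped (e.g. a Binary with a custom subtype): keep as-is.
    if isinstance(value, wrapper_type):
        return value
    if isinstance(value, bytes):
        return wrapper_type(value)
    if isinstance(value, dict):
        # Keys are left unchanged; only values are visited.
        return {k: wrap_bytes(v, wrapper_type) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists.
        return [wrap_bytes(v, wrapper_type) for v in value]
    return value
```

With pymongo installed, this could be used as `BSON.encode(wrap_bytes(a, Binary))` after `from bson.binary import Binary`. The cost is that every native Python 2 `str` is stored as binary, including ones that hold plain text, which is why an opt-in setting seems preferable to changing the default.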
- duplicates: PYTHON-1748 Treat Python 2 str as Binary (Closed)