This occurs in a Python script when a cursor reaches a document that contains an invalid UTF-8 string.
I have confirmed that the document does contain an invalid UTF-8 string. The Python driver reads it and the cursor fails with a fatal error:
bson.errors.InvalidBSON: 'utf8' codec can't decode byte 0xfd in position 1: invalid start byte
I was able to find the affected document and load it in the mongo shell with no errors.
I propose that the Python driver, where it is able to, handle this in a similar fashion: construct the BSON document as best it can, emit a warning if necessary, and, most importantly, continue iterating through the cursor.
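To illustrate the behaviour proposed above, here is a minimal sketch using only the Python standard library. It is not the driver's code: `decode_lenient` and `iter_decoded` are hypothetical names, and the raw byte strings stand in for string values read from BSON documents. The idea is to try a strict decode first and, on failure, warn and fall back to a decode that substitutes U+FFFD for the bad bytes, so iteration carries on.

```python
import warnings


def decode_lenient(raw: bytes) -> str:
    """Hypothetical helper: decode UTF-8, degrading gracefully on bad bytes."""
    try:
        # Strict decode: this is the step that currently aborts the cursor.
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Report the problem, then substitute U+FFFD for each invalid byte.
        warnings.warn(f"invalid UTF-8 in string value: {exc}")
        return raw.decode("utf-8", errors="replace")


def iter_decoded(raw_values):
    """Hypothetical stand-in for cursor iteration over raw string values."""
    for raw in raw_values:
        yield decode_lenient(raw)


if __name__ == "__main__":
    # The second value has an invalid start byte (0xfd) at position 1.
    for value in iter_decoded([b"ok", b"a\xfdb", b"after"]):
        print(repr(value))
```

With this approach the value after the bad one is still returned, which is the main point of the proposal; whether a warning or a silent replacement is the right default is a separate design choice.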
- is duplicated by
  - PYTHON-995 Pymongo - Entry - decode error - codec can't decode byte (Closed)
- is related to
  - CSHARP-694 Provide some way for the C# driver to be more lenient about UTF8 validity (Closed)
  - JAVA-1305 Ensure that an exception is thrown if a BSON string is not valid UTF-8 (Closed)