Python Driver / PYTHON-4131

Add doc example for finding invalid/corrupt documents

    • Type: Task
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Fix Version/s: 4.10
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      In rare cases, a user runs into a bson.errors.InvalidBSON exception when reading documents, and it can be difficult to determine which document(s) caused it. We should add a docs example for this scenario. One approach is to use RawBSONDocument to find the specific document that is invalid or corrupted:

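      import bson
      from bson.raw_bson import RawBSONDocument

      # coll is an existing pymongo Collection.
      # Return documents as undecoded RawBSONDocument instances.
      raw_coll = coll.with_options(codec_options=coll.codec_options.with_options(document_class=RawBSONDocument))
      for doc in raw_coll.find():
          try:
              # Decode each document explicitly so that a failure identifies that document.
              bson.decode(doc.raw)
          except bson.errors.BSONError as exc:
              print(f"Invalid document: {exc}, raw BSON: {doc.raw}")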
      

      RawBSONDocument defers BSON decoding, so the read itself succeeds and the user can decode one document at a time to narrow down which ones are problematic.
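
      Once a bad document has been found, reporting its _id is usually more useful than printing the whole byte string. The following is a minimal sketch that reads the _id straight from the raw bytes without decoding the rest of the document. The helper name leading_object_id_hex is hypothetical, and the sketch assumes that _id is the first element of the document and is stored as a 12-byte ObjectId; for any other layout it returns None.

```python
from typing import Optional

OBJECT_ID_TYPE = 0x07  # BSON element type byte for ObjectId


def leading_object_id_hex(raw: bytes) -> Optional[str]:
    """Return the hex form of a leading ObjectId _id, or None.

    Hypothetical helper: assumes _id is the first element of the
    document and is a 12-byte ObjectId. Any other layout yields None.
    """
    # Layout: int32 length (4 bytes), type byte, "_id" + NUL, 12-byte ObjectId.
    if len(raw) < 21 or raw[4] != OBJECT_ID_TYPE or raw[5:9] != b"_id\x00":
        return None
    return raw[9:21].hex()
```

      In the except branch of the loop, leading_object_id_hex(doc.raw) could be printed next to the error message so the document can be looked up later by its _id.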

      Or, if the InvalidBSON error comes from Database.command:

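      import bson
      from bson.raw_bson import DEFAULT_RAW_BSON_OPTIONS

      # client is an existing pymongo MongoClient; the reply comes back as a RawBSONDocument.
      res = client.admin.command("serverStatus", codec_options=DEFAULT_RAW_BSON_OPTIONS)
      try:
          bson.decode(res.raw)
      except bson.errors.BSONError as exc:
          print(f"Invalid BSON found in serverStatus response: {exc}, raw BSON: {res.raw}")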
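
      When decoding fails, a quick look at the framing of the raw bytes can help separate a truncated or mis-sized buffer from a bad value inside an otherwise well-formed document. The sketch below relies only on the BSON layout rule that a document starts with a little-endian int32 holding its total size and ends with a NUL byte. The helper name bson_framing_problem is hypothetical, and a passing check does not mean the document is valid: an invalid UTF-8 string, for example, would still fail in bson.decode.

```python
import struct
from typing import Optional


def bson_framing_problem(raw: bytes) -> Optional[str]:
    """Return a description of a framing problem in raw, or None if none is found."""
    # The smallest possible document is 5 bytes: int32 length plus the NUL terminator.
    if len(raw) < 5:
        return f"too short to be a BSON document: {len(raw)} bytes"
    # The first four bytes are a little-endian int32 giving the total size.
    (declared,) = struct.unpack_from("<i", raw, 0)
    if declared != len(raw):
        return f"length prefix says {declared} bytes, buffer has {len(raw)}"
    if raw[-1] != 0:
        return "missing trailing NUL terminator"
    return None
```

      It could be called as bson_framing_problem(res.raw) in the except branch, or on doc.raw in the collection loop.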
      

            Assignee: Unassigned
            Reporter: Shane Harvey (shane.harvey@mongodb.com)
            Votes: 1
            Watchers: 4

              Created:
              Updated: