Python Driver / PYTHON-3717

Speed up _type_marker check in BSON

    • Type: Improvement
    • Resolution: Fixed
    • Priority: Unknown
    • 4.4
    • Affects Version/s: 4.4
    • Component/s: BSON
    • None

      In _cbsonmodule.c, the _type_marker function calls PyObject_HasAttrString(object, "_type_marker") and PyObject_GetAttrString(object, "_type_marker"). In my workloads (highly nested documents with many large array fields), these calls become severe performance bottlenecks, because each one creates a new Python string object by calling PyUnicode_FromString("_type_marker") every time it runs.


      A simple change that helped substantially (~60% faster) was to create a global TYPEMARKERSTR object, initialize it once in PyInit__cbson with PyUnicode_FromString("_type_marker"), and replace PyObject_HasAttrString(object, "_type_marker") and PyObject_GetAttrString(object, "_type_marker") with PyObject_HasAttr(object, TYPEMARKERSTR) and PyObject_GetAttr(object, TYPEMARKERSTR). One caveat is that the TYPEMARKERSTR object is leaked if the cbson module is unloaded.


      Also, correct me if I'm wrong, but I believe these lines are redundant because the function returns type at the end regardless.

            Assignee:
            steve.silvester@mongodb.com Steve Silvester
            Reporter:
            cheah_sean@yahoo.com thalassemia N/A
            Votes:
            0
            Watchers:
            4
