I have found a memory leak when using bson.decode_all(). I can replicate it with the following code:
    import bson
    import time

    msg = bson.encode({'a': list(range(100)), 'b': 'this is a test'})
    while True:
        b = bson.decode_all(msg)
        time.sleep(0.001)
The memory usage of this code grows continuously: after 5 seconds of CPU time, the process uses 252M virtual memory, 185M resident memory, and 177M DATA as viewed in htop. It keeps growing without bound; I have had processes using GBs of RAM after a day of running.
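For anyone who wants to confirm the growth without watching htop, here is a minimal sketch that samples the process's peak resident set size around a repeated call. The helper name peak_rss_growth is hypothetical, it relies on the Unix-only resource module, and the unit handling assumes Linux or macOS.

```python
import resource
import sys


def peak_rss_growth(func, iterations=10000):
    """Return how much the peak resident set size grew, in bytes, while calling func."""
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    unit = 1 if sys.platform == "darwin" else 1024
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return (after - before) * unit
```

Calling it as peak_rss_growth(lambda: bson.decode_all(msg)) with the msg from the reproduction should report a figure that keeps increasing with the iteration count if the leak is present. Because it reads the peak value, it cannot show memory being released again.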
The leak occurs only when the C extension _cbson is in use. It affects multiple versions of pymongo on both Python 2 and Python 3. When I used the tracemalloc library, it appeared that the Python classes CodecOptions and TypeRegistry were constantly being instantiated. This is strange because there seem to be compiled versions of these classes and, if not, there should be a single default instance that keeps being reused, i.e. DEFAULT_CODEC_OPTIONS.
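The tracemalloc check described above can be sketched roughly as follows. This is not the exact script I ran; the helper name allocation_growth is hypothetical, and it takes any callable so that it can be tried without pymongo installed.

```python
import tracemalloc


def allocation_growth(func, iterations=1000, top=5):
    """Call func repeatedly and return the source lines whose allocations grew the most."""
    tracemalloc.start()
    try:
        func()  # warm up so that one-time allocations are not counted as growth
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            func()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() sorts its StatisticDiff entries by largest absolute size change first.
    return after.compare_to(before, "lineno")[:top]


if __name__ == "__main__":
    leaked = []  # stand-in for a leak: every call keeps another buffer alive
    for stat in allocation_growth(lambda: leaked.append(bytearray(1024))):
        print(stat)
```

Passing lambda: bson.decode_all(msg) instead of the stand-in should list the lines where the growing objects are created, which is where I saw CodecOptions and TypeRegistry show up.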
I found a workaround for this memory leak by creating the CodecOptions once and passing it in explicitly:
    import bson
    from bson.codec_options import CodecOptions
    import time

    codec_options = CodecOptions()
    msg = bson.encode({'a': list(range(100)), 'b': 'this is a test'})
    while True:
        b = bson.decode_all(msg, codec_options)
        time.sleep(0.001)
I also found an additional error while doing this, when passing the codec options as a keyword argument:
Traceback (most recent call last):
  File "test1.py", line 21, in <module>
    b = bson.decode_all(msg, codec_options=options)
TypeError: decode_all() takes no keyword arguments
Is caused by: PYTHON-472 Provide an API for inserting and returning raw BSON (Closed)