Uploaded image for project: 'Python Driver'
  1. Python Driver
  2. PYTHON-2751

Report CPython GC bug related to connection pinning and socket cleanup

    • Type: Icon: Task Task
    • Resolution: Unresolved
    • Priority: Icon: Unknown Unknown
    • None
    • Affects Version/s: None
    • Component/s: None
    • None

      While working on PYTHON-2673, we ran into what we believe is a bug in CPython's garbage collector. To workaround the problem we needed to add the Pool.__pinned_sockets property which holds references to each SocketInfo checked out of the pool.

      Without the Pool.__pinned_sockets workaround, the bug triggered when:

      • the client is created with loadBalanced=True
      • a find or aggregate cursor fails on the initial attempt
      • the retry attempt succeeds
      • then soon after the cursor is garbage collected without being iterated or closed explicitly

      Here's some logs from a repro (https://github.com/ShaneHarvey/mongo-python-driver/blob/2054e80c0cd815171b345241d2dd2ba3642ce604/test/load_balancer/test_load_balancer.py#L78):

      Running Find
      2021-06-10 18:00:18,617 INFO event_loggers [pool ('127.0.0.1', 8000)] connection check out started
      2021-06-10 18:00:18,618 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #1] connection created
      2021-06-10 18:00:18,645 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #1] connection setup succeeded
      2021-06-10 18:00:18,645 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #1] connection checked out of pool
      get_socket checkout: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61222), raddr=('127.0.0.1', 8000)>) at 0x7f8e156389a0
      _close_socket: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61222), raddr=('127.0.0.1', 8000)>) at 0x7f8e156389a0
      get_socket checkout error: SocketInfo(<socket.socket [closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>) CLOSED at 0x7f8e156389a0: 127.0.0.1:8000: connection closed
      2021-06-10 18:00:18,648 INFO event_loggers [pool ('127.0.0.1', 8000)] pool cleared
      return socket: SocketInfo(<socket.socket [closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>) CLOSED at 0x7f8e156389a0
      2021-06-10 18:00:18,648 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #1] connection checked into pool
      2021-06-10 18:00:18,649 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #1] connection closed, reason: error
      server.pool after error: deque([])
      2021-06-10 18:00:18,649 INFO event_loggers [pool ('127.0.0.1', 8000)] connection check out started
      2021-06-10 18:00:18,650 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #2] connection created
      2021-06-10 18:00:18,652 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #2] connection setup succeeded
      2021-06-10 18:00:18,652 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #2] connection checked out of pool
      get_socket checkout: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      server.pool after success: deque([])
      Created _SocketManager: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      get_socket: SocketInfo(<socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61216), raddr=('127.0.0.1', 8000)>) at 0x7f8e075693a0
      return socket: SocketInfo(<socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61216), raddr=('127.0.0.1', 8000)>) at 0x7f8e075693a0
      sockets.appendleft: SocketInfo(<socket.socket fd=6, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61216), raddr=('127.0.0.1', 8000)>) at 0x7f8e075693a0
      sock: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      Pool before cursor del: deque([])
      Pool after cursor del: deque([])
      gc: collectable <Cursor 0x7f8df5d292e0>
      gc: collectable <dict 0x7f8df5d65b40>
      gc: collectable <Collection 0x7f8e07569fd0>
      gc: collectable <dict 0x7f8df5d65a80>
      gc: collectable <Database 0x7f8e07569e20>
      gc: collectable <CodecOptions 0x7f8df5d18c10>
      gc: collectable <dict 0x7f8df5d23d40>
      gc: collectable <_ServerSession 0x7f8e15a157c0>
      gc: collectable <Binary 0x7f8df5d2db80>
      gc: collectable <dict 0x7f8e12d9d780>
      gc: collectable <dict 0x7f8e14ab2a40>
      gc: collectable <SessionOptions 0x7f8e14ff4100>
      gc: collectable <ClientSession 0x7f8e14fe3b20>
      gc: collectable <dict 0x7f8e13d19840>
      gc: collectable <_Transaction 0x7f8e15256cd0>
      gc: collectable <_Query 0x7f8e147dd880>
      gc: collectable <method 0x7f8df5d2f080>
      gc: collectable <cell 0x7f8e15256c10>
      gc: collectable <cell 0x7f8e153f5a00>
      gc: collectable <cell 0x7f8e153f59a0>
      gc: collectable <cell 0x7f8e152cea90>
      gc: collectable <tuple 0x7f8df5d2da90>
      gc: collectable <function 0x7f8e1543b1f0>
      gc: collectable <function 0x7f8e15639c10>
      gc: collectable <frame 0x7f8df5f0e740>
      gc: collectable <frame 0x7f8e1553fc90>
      gc: collectable <_MongoClientErrorHandler 0x7f8df5d25760>
      gc: collectable <frame 0x7f8e15540a10>
      gc: collectable <socket 0x7f8e156dfb20>
      gc: collectable <SocketInfo 0x7f8e156389a0>
      gc: collectable <dict 0x7f8df5d2f200>
      gc: collectable <set 0x7f8e15a13660>
      gc: collectable <SocketChecker 0x7f8e15638af0>
      gc: collectable <SON 0x7f8e156e80e0>
      gc: collectable <dict 0x7f8e13d5c140>
      gc: collectable <list 0x7f8e147959c0>
      gc: collectable <tuple 0x7f8e1578de40>
      gc: collectable <tuple 0x7f8df5d08040>
      gc: collectable <managedbuffer 0x7f8e152660c0>
      gc: collectable <memoryview 0x7f8e08e0bf40>
      gc: collectable <tuple 0x7f8e152c2ca0>
      gc: collectable <OSError 0x7f8df5d51240>
      gc: collectable <traceback 0x7f8e08e68a00>
      gc: collectable <frame 0x7f8e1554f440>
      gc: collectable <traceback 0x7f8e08e68240>
      gc: collectable <frame 0x7f8e1554f0e0>
      gc: collectable <traceback 0x7f8e08e68340>
      gc: collectable <AutoReconnect 0x7f8e15796c40>
      gc: collectable <tuple 0x7f8e156389d0>
      gc: collectable <set 0x7f8e15a13820>
      gc: collectable <dict 0x7f8df5d2b300>
      gc: collectable <list 0x7f8e154509c0>
      gc: collectable <traceback 0x7f8e08e68780>
      gc: collectable <frame 0x7f8df5d35cf0>
      gc: collectable <traceback 0x7f8e1475d4c0>
      gc: collectable <frame 0x7f8df5d7d580>
      gc: collectable <traceback 0x7f8df5d23a00>
      gc: collectable <frame 0x7f8df5d27dd0>
      gc: collectable <traceback 0x7f8df5d237c0>
      gc: collectable <frame 0x7f8e155cf020>
      gc: collectable <traceback 0x7f8df5d238c0>
      gc: collectable <frame 0x7f8df5d30c40>
      gc: collectable <traceback 0x7f8df5d23940>
      gc: collectable <traceback 0x7f8df5d23840>
      gc: collectable <traceback 0x7f8df5d23c80>
      gc: collectable <traceback 0x7f8df5d23fc0>
      gc: collectable <socket 0x7f8df5d734c0>
      gc: collectable <SocketInfo 0x7f8e1543e7c0>
      gc: collectable <dict 0x7f8e14795a00>
      gc: collectable <set 0x7f8e15a13900>
      gc: collectable <SocketChecker 0x7f8e15638340>
      gc: collectable <ObjectId 0x7f8e14ab95b0>
      gc: collectable <_OpMsg 0x7f8df5d04340>
      gc: collectable <list 0x7f8e08e668c0>
      gc: collectable <list 0x7f8e157d83c0>
      gc: collectable <ObjectId 0x7f8e14ab9580>
      gc: collectable <dict 0x7f8df5d65900>
      gc: collectable <ObjectId 0x7f8e1525b370>
      gc: collectable <dict 0x7f8df5d658c0>
      gc: collectable <ObjectId 0x7f8df5d29400>
      gc: collectable <dict 0x7f8df5d65a40>
      gc: collectable <dict 0x7f8df5d04440>
      gc: collectable <Int64 0x7f8df5d65c40>
      gc: collectable <dict 0x7f8df5d23a40>
      gc: collectable <Timestamp 0x7f8df5d298b0>
      gc: collectable <PinnedResponse 0x7f8df5d73760>
      gc: collectable <frame 0x7f8df5f09ae0>
      gc: collectable <frame 0x7f8df5f09140>
      gc: collectable <_SocketManager 0x7f8e15619f70>
      gc: collectable <dict 0x7f8e08e666c0>
      gc: collectable <collections.deque 0x7f8df5d73640>
      gc: collectable <frame 0x7f8df5f08b60>
      gc: collectable <frame 0x7f8e14be5cd0>
      gc: collectable <frame 0x7f8df5d75d40>
      Cursor.__die
      _SocketManager.close: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      return socket: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      2021-06-10 18:00:18,687 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #2] connection checked into pool
      sockets.appendleft: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      _SocketManager.__del__: None
      /Users/shane/git/mongo-python-driver/test/load_balancer/test_load_balancer.py:99: ResourceWarning: unclosed <socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>
        gc.collect()
      Object allocated at (most recent call last):
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 84
          return self.run(*args, **kwds)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 122
          test(result)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 84
          return self.run(*args, **kwds)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 122
          test(result)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 84
          return self.run(*args, **kwds)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/suite.py", lineno 122
          test(result)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", lineno 736
          return self.run(*args, **kwds)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", lineno 676
          self._callTestMethod(testMethod)
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/unittest/case.py", lineno 633
          method()
        File "/Users/shane/git/mongo-python-driver/test/load_balancer/test_load_balancer.py", lineno 83
          next(cursor)
        File "/Users/shane/git/mongo-python-driver/pymongo/cursor.py", lineno 1237
          if len(self.__data) or self._refresh():
        File "/Users/shane/git/mongo-python-driver/pymongo/cursor.py", lineno 1158
          self.__send_message(q)
        File "/Users/shane/git/mongo-python-driver/pymongo/cursor.py", lineno 1046
          response = client._run_operation(
        File "/Users/shane/git/mongo-python-driver/pymongo/mongo_client.py", lineno 1289
          return self._retryable_read(
        File "/Users/shane/git/mongo-python-driver/pymongo/mongo_client.py", lineno 1391
          with self._slaveok_for_server(
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", lineno 113
          return next(self.gen)
        File "/Users/shane/git/mongo-python-driver/pymongo/mongo_client.py", lineno 1235
          with self._get_socket(server, session, pin=pin) as sock_info:
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", lineno 113
          return next(self.gen)
        File "/Users/shane/git/mongo-python-driver/pymongo/mongo_client.py", lineno 1171
          with server.get_socket(
        File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/contextlib.py", lineno 113
          return next(self.gen)
        File "/Users/shane/git/mongo-python-driver/pymongo/pool.py", lineno 1374
          sock_info = self._get_socket(all_credentials)
        File "/Users/shane/git/mongo-python-driver/pymongo/pool.py", lineno 1497
          sock_info = self.connect(all_credentials)
        File "/Users/shane/git/mongo-python-driver/pymongo/pool.py", lineno 1323
          sock = _configured_socket(self.address, self.opts)
        File "/Users/shane/git/mongo-python-driver/pymongo/pool.py", lineno 1020
          sock = _create_connection(address, options)
        File "/Users/shane/git/mongo-python-driver/pymongo/pool.py", lineno 982
          sock = socket.socket(
      gc: collectable <tuple 0x7f8df5da4b50>
      gc: collectable <type 0x7f8df5f326a0>
      gc: collectable <getset_descriptor 0x7f8df5db2580>
      gc: collectable <dict 0x7f8df5d83080>
      gc: collectable <tuple 0x7f8df5da8900>
      gc: collectable <tuple 0x7f8df5da4c70>
      gc: collectable <type 0x7f8df5f32da0>
      gc: collectable <tuple 0x7f8df5daef90>
      gc: collectable <tuple 0x7f8df5da4d00>
      gc: collectable <type 0x7f8df5f33410>
      gc: collectable <tuple 0x7f8df5db3220>
      Pool after gc.collect: deque([SocketInfo(<socket.socket fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>) at 0x7f8e1543e7c0])
      

      From the above we can see the GC executes Cursor._del_ which correctly transfers ownership of the SocketInfo to the Pool but then the SocketInfo's internal socket is then immediately closed (in the same GC call):

      gc: collectable <Cursor 0x7f8df5d292e0>
      gc: collectable <socket 0x7f8df5d734c0>
      gc: collectable <SocketInfo 0x7f8e1543e7c0>
      gc: collectable <_SocketManager 0x7f8e15619f70>
      gc: collectable <traceback 0x7f8e08e68340>
      gc: collectable <AutoReconnect 0x7f8e15796c40>
      Cursor.__die
      _SocketManager.close: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      return socket: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      2021-06-10 18:00:18,687 INFO event_loggers [pool ('127.0.0.1', 8000)][conn #2] connection checked into pool
      sockets.appendleft: SocketInfo(<socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>) at 0x7f8e1543e7c0
      _SocketManager.__del__: None
      /Users/shane/git/mongo-python-driver/test/load_balancer/test_load_balancer.py:99: ResourceWarning: unclosed <socket.socket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('127.0.0.1', 61224), raddr=('127.0.0.1', 8000)>
        gc.collect()
      

      It seems like the GC is mistakenly loosing a reference count to the socket.socket instance. We should whittle this down to a small reproducer script which doesn't reply on pymongo and report the bug to CPython.

            Assignee:
            Unassigned Unassigned
            Reporter:
            shane.harvey@mongodb.com Shane Harvey
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: