Uploaded image for project: 'Python Driver'
  1. Python Driver
  2. PYTHON-4664

Operation(s) getting randomly cancelled.

    • Type: Icon: Bug Bug
    • Resolution: Works as Designed
    • Priority: Icon: Unknown Unknown
    • None
    • Affects Version/s: 4.7, 4.8
    • Component/s: None
    • None
    • Python Drivers
    • Hide

      1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?

      Show
      1. What would you like to communicate to the user about this feature? 2. Would you like the user to see examples of the syntax and/or executable code and its output? 3. Which versions of the driver/connector does this apply to?

      Detailed steps to reproduce the problem?

      We write testcode where we are using a dockerised mongo mongo:latest and have our python program connect to it for doing unit/integration tests.

      Upon updating past 4.7.0 we see the test code starting to experience flakeyness, i.e. sometimes fails with a site-packages\pymongo\network.py:377: _OperationCancelled` exception.

      Stacktrace of it happening, from our code (using an collection).
       

      trade_item\trade_item_service.py:367: in _log_to_database
          audit_logs_collection.update_one(
      ..\..\..\site-packages\pymongo\collection.py:1077: in update_one
          self._update_retryable(
      ..\..\..\site-packages\pymongo\collection.py:872: in _update_retryable
          return self.__database.client._retryable_write(
      ..\..\..\site-packages\pymongo\mongo_client.py:1569: in _retryable_write
          return self._retry_with_session(retryable, func, s, bulk, operation, operation_id)
      ..\..\..\site-packages\pymongo\mongo_client.py:1455: in _retry_with_session
          return self._retry_internal(
      ..\..\..\site-packages\pymongo\_csot.py:108: in csot_wrapper
          return func(self, *args, **kwargs)
      ..\..\..\site-packages\pymongo\mongo_client.py:1501: in _retry_internal
          ).run()
      ..\..\..\site-packages\pymongo\mongo_client.py:2347: in run
          return self._read() if self._is_read else self._write()
      ..\..\..\site-packages\pymongo\mongo_client.py:2464: in _write
          return self._func(self._session, conn, self._retryable)  # type: ignore
      ..\..\..\site-packages\pymongo\collection.py:853: in _update
          return self._update(
      ..\..\..\site-packages\pymongo\collection.py:806: in _update
          result = conn.command(
      ..\..\..\site-packages\pymongo\helpers.py:327: in inner
          return func(*args, **kwargs)
      ..\..\..\site-packages\pymongo\pool.py:1013: in command
          self._raise_connection_failure(error)
      ..\..\..\site-packages\pymongo\pool.py:985: in command
          return command(
      ..\..\..\site-packages\pymongo\network.py:202: in command
          reply = receive_message(conn, request_id)
      ..\..\..\site-packages\pymongo\network.py:317: in receive_message
          length, _, response_to, op_code = _UNPACK_HEADER(_receive_data_on_socket(conn, 16, deadline))
      ..\..\..\site-packages\pymongo\network.py:394: in _receive_data_on_socket
          wait_for_read(conn, deadline)
      _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
      
      conn = Connection(<socket.socket [closed] fd=-1, family=2, type=1, proto=0>) CLOSED at 1338676934672, deadline = None    
      
      def wait_for_read(conn: Connection, deadline: Optional[float]) -> None:
              """Block until at least one byte is read, or a timeout, or a cancel."""
              sock = conn.conn
              timed_out = False
              # Check if the connection's socket has been manually closed
              if sock.fileno() == -1:
                  return
              while True:
                  # SSLSocket can have buffered data which won't be caught by select.
                  if hasattr(sock, "pending") and sock.pending() > 0:
                      readable = True
                  else:
                      # Wait up to 500ms for the socket to become readable and then
                      # check for cancellation.
                      if deadline:
                          remaining = deadline - time.monotonic()
                          # When the timeout has expired perform one final check to
                          # see if the socket is readable. This helps avoid spurious
                          # timeouts on AWS Lambda and other FaaS environments.
                          if remaining <= 0:
                              timed_out = True
                          timeout = max(min(remaining, _POLL_TIMEOUT), 0)
                      else:
                          timeout = _POLL_TIMEOUT
                      readable = conn.socket_checker.select(sock, read=True, timeout=timeout)
                  if conn.cancel_context.cancelled:
      >               raise _OperationCancelled("operation cancelled")
      E               pymongo.errors._OperationCancelled: operation cancelled
       

      In the test we usually get this problem through using either insert or update.

       
      Between the tests we also drop the whole database between tests - and my hunch would be that the problem occurs from here somewhere, i.e. the dropping of databases between tests and then new tests using the same MongoClient to reach a database/collection.

      The tests are written as they expect the database to be totally empty, and even some tests check and expect that indexes are or are not created. Which makes the easy solution to drop databases/collections between tests.

      Definition of done: what must be done to consider the task complete?

      Find whatever is causing the issue, and fix it would be nice. As one would not want to use packages where builds are failing "randomly" at quite high chance.

      The exact Python version used, with patch level:

      $ python -c "import sys; print(sys.version)"

      3.11.8 (tags/v3.11.8:db85d51, Feb  6 2024, 22:03:32) [MSC v.1937 64 bit (AMD64)]

      (The issue is also found in our CI, which is 3.11)

      The exact version of PyMongo used, with patch level:

      $ python -c "import pymongo; print(pymongo.version); print(pymongo.has_c())"

      4.7.0
      True

      AND

      4.8.0
      True

      Describe how MongoDB is set up. Local vs Hosted, version, topology, load balanced, etc.

      Windows 11, and Linux (Dockerized in a CI)
      No frameworks or asynchronous network libraries used. ...)

      Although we have installed the `aws` module, but it is probably not used in tests.

      Tests work as expected in all versions before 4.7.0 thus now we are updating to 4.6.3 as that is the highest version we found without our tests and CI failing due to the issue.

            Assignee:
            shane.harvey@mongodb.com Shane Harvey
            Reporter:
            petter.salminen@ica.se Petter Salminen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: