Python Driver / PYTHON-2452

PyMongo does not retry command responses with server-supplied RetryableWriteError error label

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.11.3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      On MongoDB 4.4, PyMongo relies on the server to supply the RetryableWriteError error label instead of adding it to the exception in the driver. However, the driver appears to ignore the top-level errorLabels field in the server's reply, so these commands are not retried. Repro script:

      from bson import SON
      from pymongo import MongoClient
      from pymongo.errors import WriteConcernError
      from pymongo.monitoring import CommandListener
      import logging
      
      
      class CommandLogger(CommandListener):
          def started(self, event):
              logging.info("Command {0.command_name} with request id "
                           "{0.request_id} started on server "
                           "{0.connection_id} - {0.command}".format(event))
      
          def succeeded(self, event):
              logging.info("Command {0.command_name} with request id "
                           "{0.request_id} on server {0.connection_id} "
                           "succeeded in {0.duration_micros} "
                           "microseconds - {0.reply}".format(event))
      
          def failed(self, event):
              logging.info("Command {0.command_name} with request id "
                           "{0.request_id} on server {0.connection_id} "
                           "failed in {0.duration_micros} "
                           "microseconds".format(event))
      
      
      logging.basicConfig(level=logging.INFO)
      
      
      client = MongoClient(directConnection=False, event_listeners=[CommandLogger()])
      client.admin.command(SON([('configureFailPoint', 'failCommand'),
                                ('mode', {'times': 1}),
                                ('data', {
                                    'failCommands': ['insert'],
                                    'writeConcernError': {
                                        'code': 91,
                                        'errmsg': 'Replication is being shut down'}})]))
      
      try:
          client.foo.bar.insert_one({})
      except WriteConcernError as exc:
          print(exc._error_labels)
          raise
      
      client.admin.command(SON([('configureFailPoint', 'failCommand'),
                                ('mode', 'off')]))
      
      

      The command monitoring output shows that the server returns the following reply to the insert operation:

      {'n': 1, 'opTime': {'ts': Timestamp(1607047144, 1), 't': 57}, 'electionId': ObjectId('7fffffff0000000000000039'), 'ok': 1.0, 'writeConcernError': {'code': 91, 'errmsg': 'Replication is being shut down'}, 'errorLabels': ['RetryableWriteError'], '$clusterTime': {'clusterTime': Timestamp(1607047144, 1), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}, 'operationTime': Timestamp(1607047144, 1)}
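
To make the shape of this reply concrete, here is a minimal sketch in plain Python (no PyMongo required). The helper names `server_error_labels` and `is_retryable_write_concern_failure` are hypothetical and exist only for illustration. The point is that the label sits at the top level of the reply, next to `writeConcernError`, not inside it.

```python
def server_error_labels(reply):
    """Return the top-level errorLabels of a command reply as a set."""
    return set(reply.get("errorLabels", []))


def is_retryable_write_concern_failure(reply):
    """True when the reply carries a writeConcernError and the server
    labelled the reply RetryableWriteError at the top level."""
    return ("writeConcernError" in reply
            and "RetryableWriteError" in server_error_labels(reply))


# Trimmed copy of the reply shown above (timestamps and cluster time omitted).
reply = {
    "n": 1,
    "ok": 1.0,
    "writeConcernError": {"code": 91,
                          "errmsg": "Replication is being shut down"},
    "errorLabels": ["RetryableWriteError"],
}

print(is_retryable_write_concern_failure(reply))    # True
print("errorLabels" in reply["writeConcernError"])  # False
```

Code that builds the exception only from the `writeConcernError` sub-document would therefore never see the label.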
      

      However, the exception raised by PyMongo does not carry this label:

      > exc
      WriteConcernError("Replication is being shut down, full error: {'code': 91, 'errmsg': 'Replication is being shut down'}")
      > exc.details
      {'code': 91, 'errmsg': 'Replication is being shut down'}
      > exc.has_error_label('RetryableWriteError')
      False
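
For comparison, the expected behaviour could look roughly like the sketch below. It is not PyMongo's code and not necessarily how the fix was implemented: `SketchWriteConcernError`, `check_write_reply`, `run_with_one_retry`, and the simulated `send` function are all hypothetical stand-ins. The idea is only that top-level `errorLabels` are copied onto the error details before the exception is raised, so a retry decision can be made from the label.

```python
class SketchWriteConcernError(Exception):
    """Hypothetical stand-in for a driver exception; not PyMongo's class."""

    def __init__(self, message, details):
        super().__init__(message)
        self.details = details
        self._labels = set(details.get("errorLabels", []))

    def has_error_label(self, label):
        return label in self._labels


def check_write_reply(reply):
    """Raise if the reply has a writeConcernError, carrying over any
    top-level errorLabels the server attached to the reply."""
    error = reply.get("writeConcernError")
    if not error:
        return reply
    details = dict(error)  # copy so the caller's reply is not mutated
    labels = reply.get("errorLabels")
    if labels:
        details["errorLabels"] = list(labels)
    raise SketchWriteConcernError(error.get("errmsg", ""), details)


def run_with_one_retry(send):
    """Call send() and retry exactly once on a RetryableWriteError label."""
    try:
        return check_write_reply(send())
    except SketchWriteConcernError as exc:
        if not exc.has_error_label("RetryableWriteError"):
            raise
    return check_write_reply(send())


# Simulated server: the first attempt fails the way the fail point does.
replies = iter([
    {"n": 1, "ok": 1.0,
     "writeConcernError": {"code": 91,
                           "errmsg": "Replication is being shut down"},
     "errorLabels": ["RetryableWriteError"]},
    {"n": 1, "ok": 1.0},
])
attempts = []


def send():
    attempts.append(1)
    return next(replies)


result = run_with_one_retry(send)
print(len(attempts), result)  # 2 {'n': 1, 'ok': 1.0}
```

A real retryable write also reuses the same session and transaction number on the second attempt so the server can deduplicate it; the sketch does not model that.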
      

            Assignee: Prashant Mital (Inactive)
            Reporter: Prashant Mital (Inactive)
            Votes: 0
            Watchers: 2
