  1. Core Server
  2. SERVER-84468

Fix deadlock when running runTransactionOnShardingCatalog()

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.6
    • Affects Version/s: 7.0.0, 7.3.0-rc0, 7.2.0
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • v7.2, v7.0
    • CAR Team 2024-01-08
    • 26

      The function runTransactionOnShardingCatalog() is called when a config server operation needs to execute an internal transaction.

      This function creates a new OperationContext under an AlternativeClientRegion, and only marks that OperationContext as interruptible a few lines after creating it.

      If the step-down thread tries to kill the OperationContext right before it is marked as interruptible, the kill has no effect and the server can end up in a deadlock. The sequence of events leading to the deadlock is:

       

      1. runTransactionOnShardingCatalog() creates a new OperationContext.

      2. The step-down thread kicks in and tries to kill the newly created OperationContext, but the kill has no effect because the OperationContext does not yet meet the conditions to be killed.

      3. The new OperationContext is marked as interruptible, but too late for the kill attempt to affect it.

      4. The internal transaction checks out a session.

      5. The step-down thread acquires the RSTL (replication state transition lock).

      6. The step-down thread tries to check out all active sessions in order to kill them, and blocks because one session is still checked out by the thread that was not interrupted.

      7. The internal transaction tries to acquire the RSTL and blocks, because the step-down thread already holds it. Neither thread can make progress.
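
The sequence above can be replayed as a small single-threaded model, which makes the circular wait explicit without actually hanging. This is only a sketch: `OpCtxModel`, `ResourceModel`, `tryKill`, `replay` and the `markInterruptibleBeforeKill` flag are hypothetical names invented for illustration, not MongoDB server classes or APIs. The flag only shows that the ordering of "mark interruptible" and "kill" matters; the ticket text does not describe the actual fix.

```cpp
#include <optional>

// Hypothetical, simplified stand-ins. These are NOT the real MongoDB classes.
enum class Actor { InternalTxn, StepDown };

struct OpCtxModel {
    bool interruptible = false;
    bool killed = false;
};

struct ResourceModel {
    std::optional<Actor> holder;  // current owner, if any
    std::optional<Actor> waiter;  // actor blocked on this resource, if any

    // Returns true if acquired; otherwise records the caller as blocked.
    bool acquire(Actor a) {
        if (!holder) {
            holder = a;
            return true;
        }
        waiter = a;
        return false;
    }
};

// Assumption: the step-down kill skips operations not yet marked interruptible.
bool tryKill(OpCtxModel& op) {
    if (!op.interruptible) return false;
    op.killed = true;
    return true;
}

struct Outcome {
    bool txnKilled;
    bool deadlocked;
};

// Replays the event sequence from the description in a fixed order.
Outcome replay(bool markInterruptibleBeforeKill) {
    OpCtxModel op;                 // new OperationContext is created
    ResourceModel session, rstl;

    if (markInterruptibleBeforeKill) op.interruptible = true;
    tryKill(op);                   // step-down attempts the kill
    op.interruptible = true;       // marked interruptible (possibly too late)

    // A killed operation unwinds before it checks out a session.
    if (op.killed) return {true, false};

    session.acquire(Actor::InternalTxn);  // transaction checks out a session
    rstl.acquire(Actor::StepDown);        // step-down takes the RSTL
    bool stepDownBlocked = !session.acquire(Actor::StepDown);
    bool txnBlocked = !rstl.acquire(Actor::InternalTxn);

    // Circular wait: each actor is blocked on a resource the other holds.
    bool deadlocked = stepDownBlocked && txnBlocked &&
                      session.holder == Actor::InternalTxn &&
                      rstl.holder == Actor::StepDown;
    return {false, deadlocked};
}
```

In this model, calling `replay(false)` reproduces the reported ordering and reports a circular wait, while `replay(true)` lets the kill take effect so the transaction never checks out a session.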

       

            Assignee:
            silvia.surroca@mongodb.com Silvia Surroca
            Reporter:
            silvia.surroca@mongodb.com Silvia Surroca
            Votes:
            0
            Watchers:
            3
