  1. Core Server
  2. SERVER-77633

Calling withTransaction with a checked out session may end up in a deadlock (on stepdown)

    • Type: Task
    • Resolution: Won't Do
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Sharding EMEA
    • Sharding EMEA 2023-06-12, Sharding EMEA 2023-06-26
    • 135

      Any code running `withTransaction` may deadlock if the given OperationContext holds a SessionId and a stepdown occurs while the transaction is in progress. At present no thread holds a session when it calls `withTransaction`, but this should still be fixed so that the deadlock cannot be hit in the future.

      The sequence of events leading to a deadlock is the following:

      • (A): withTransaction thread
      • (B): step-down thread

      1. (A) checks out a SessionId
      2. (A) runs `withTransaction`
      3. (B) the step-down thread starts and an interruption is sent to all threads
      4. (A) abortTransaction is executed
      5. (B) the step-down thread acquires the RSTL lock
      6. (B) tries to check out all sessions to kill them
      7. (B) gets blocked trying to check out the session held by thread A
      8. (A) gets blocked trying to acquire the RSTL lock to abort the transaction
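
      The circular wait in the sequence above can be reproduced with a toy model. This is only a sketch, not MongoDB server code: two standard timed mutexes stand in for the checked-out session and the RSTL, and the names `StepdownOutcome` and `simulateStepdownDeadlock` are hypothetical. Timed lock attempts are used so that the example terminates instead of hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Toy model only. `session` and `rstl` are plain mutexes standing in for the
// checked-out session and the RSTL; none of these names are server APIs.
struct StepdownOutcome {
    bool txnThreadGotRstl;    // did the withTransaction thread get the RSTL?
    bool stepdownGotSession;  // did the step-down thread check out the session?
};

inline StepdownOutcome simulateStepdownDeadlock(std::chrono::milliseconds timeout) {
    std::timed_mutex session;
    std::timed_mutex rstl;
    std::promise<void> sessionHeld, rstlHeld, stepdownTried;
    auto sessionHeldF = sessionHeld.get_future();
    auto rstlHeldF = rstlHeld.get_future();
    auto stepdownTriedF = stepdownTried.get_future();
    StepdownOutcome out{false, false};

    // withTransaction thread: holds the session, then needs the RSTL to abort.
    std::thread txn([&] {
        std::unique_lock<std::timed_mutex> s(session);  // check out the session
        sessionHeld.set_value();
        rstlHeldF.wait();  // "interrupted" once the step-down holds the RSTL
        std::unique_lock<std::timed_mutex> r(rstl, timeout);  // timed attempt
        out.txnThreadGotRstl = r.owns_lock();
        stepdownTriedF.wait();  // keep the session until the other attempt ends
    });

    // Step-down (calling thread): holds the RSTL, then needs the session.
    sessionHeldF.wait();
    std::unique_lock<std::timed_mutex> r(rstl);  // acquire the RSTL
    rstlHeld.set_value();
    std::unique_lock<std::timed_mutex> s(session, timeout);  // timed attempt
    out.stepdownGotSession = s.owns_lock();
    stepdownTried.set_value();
    txn.join();
    return out;
}
```

      Calling `simulateStepdownDeadlock(std::chrono::milliseconds(100))` leaves both flags false: each side holds the resource the other is waiting for. With untimed locks, as in the real scenario, neither thread would ever make progress.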

      withTransaction is a method that was implemented as a utility for the ShardingCatalogManager before the new transaction API existed.

      The new transaction API yields the session attached to the thread, which avoids this scenario. I therefore suggest removing the withTransaction code and using the new transaction API instead. This is an example of an implementation that uses the new transaction API.
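
      To illustrate why yielding the session breaks the cycle, here is a minimal sketch under the same toy assumptions: plain mutexes stand in for the session and the RSTL, and `YieldOutcome` and `simulateStepdownWithYield` are hypothetical names. It does not show how the new transaction API actually yields a session, only the lock-ordering idea behind it.

```cpp
#include <future>
#include <mutex>
#include <thread>

// Toy model only; these are not MongoDB server APIs.
struct YieldOutcome {
    bool stepdownGotSession;  // did the step-down thread check out the session?
    bool txnThreadGotRstl;    // did the transaction thread get the RSTL?
};

inline YieldOutcome simulateStepdownWithYield() {
    std::mutex session;  // stands in for the checked-out session
    std::mutex rstl;     // stands in for the RSTL
    std::promise<void> sessionHeld, rstlHeld;
    auto sessionHeldF = sessionHeld.get_future();
    auto rstlHeldF = rstlHeld.get_future();
    YieldOutcome out{false, false};

    // Transaction thread: yields the session before blocking on the RSTL.
    std::thread txn([&] {
        std::unique_lock<std::mutex> s(session);  // check out the session
        sessionHeld.set_value();
        rstlHeldF.wait();  // "interrupted" once the step-down holds the RSTL
        s.unlock();        // yield the session instead of holding it
        std::lock_guard<std::mutex> r(rstl);  // waits until the step-down ends
        out.txnThreadGotRstl = true;
    });

    // Step-down (calling thread): takes the RSTL, then the session.
    sessionHeldF.wait();
    {
        std::lock_guard<std::mutex> r(rstl);  // acquire the RSTL
        rstlHeld.set_value();
        std::lock_guard<std::mutex> s(session);  // succeeds once yielded
        out.stepdownGotSession = true;
    }  // step-down completes and releases both
    txn.join();
    return out;
}
```

      Here the blocking locks are safe: because the transaction thread releases the session before waiting on the RSTL, the step-down can check out the session, finish, and release the RSTL, after which the transaction thread proceeds.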

      This issue was discovered when the sessionId was attached to the ConfigsvrCollMod request. The sessionId was eventually removed from the request as a quick fix for the bug.

            Assignee:
            silvia.surroca@mongodb.com Silvia Surroca
            Reporter:
            silvia.surroca@mongodb.com Silvia Surroca
            Votes:
            0
            Watchers:
            3
