  1. Core Server
  2. SERVER-45916

On a primary, two-phase index build cleanup writes the abortIndexBuild oplog entry while holding the user collection lock in the stronger mode X, which can lead to a 3-way deadlock between a prepared transaction, stepDown, and the index build

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.7.0
    • Affects Version/s: None
    • Component/s: Storage
    • None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Execution Team 2020-04-06, Execution Team 2020-04-20, Execution Team 2020-05-04

      Consider the following sequence:
      1) Start an index build on collection A on the primary.
      2) Prepare a transaction on collection A.
      3) The index build is aborted, possibly due to a killOp command or a key constraint error.
      4) As a result of the index build failure, the index build thread enters the cleanup phase. Assume it is at this point: the index build thread has acquired the RSTL in mode IX and the uninterruptible lock guard is enabled.
      5) A stepDown command now arrives. It enqueues an RSTL request in mode X, but is blocked behind the index build thread.
      6) The index build thread now tries to acquire the collection lock in mode X to write the abortIndexBuild oplog entry and to tear down the index build. This step is blocked behind the prepared transaction due to a collection lock conflict.
      7) The prepared transaction's commit command is blocked behind the stepDown thread. This closes the cycle: the index build waits on the prepared transaction, the prepared transaction waits on stepDown, and stepDown waits on the index build.
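
      The sequence above can be sketched as a wait-for graph, where a deadlock shows up as a cycle. This is only an illustrative model, not server code: the thread names, the WAITS_FOR mapping, and the find_cycle helper are all hypothetical, and the sketch assumes each thread is blocked behind exactly one other thread.

```python
from typing import Dict, List, Optional

# Hypothetical wait-for graph: each thread maps to the thread it is blocked behind.
WAITS_FOR: Dict[str, str] = {
    # Step 6: the collection lock X request conflicts with the prepared transaction.
    "index_build": "prepared_txn",
    # Step 7: the commit command is blocked behind the stepDown thread.
    "prepared_txn": "step_down",
    # Step 5: the RSTL X request is queued behind the index build's RSTL IX.
    "step_down": "index_build",
}


def find_cycle(waits_for: Dict[str, str], start: str) -> Optional[List[str]]:
    """Follow wait-for edges from `start`; return the cycle reached, or None."""
    path: List[str] = []
    position: Dict[str, int] = {}
    node: Optional[str] = start
    while node is not None and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_for.get(node)  # None means this thread is not blocked
    if node is None:
        return None
    # Drop any lead-in nodes so that only the cycle itself is returned.
    return path[position[node]:]


if __name__ == "__main__":
    print(find_cycle(WAITS_FOR, "index_build"))
```

      In this model, removing any single edge breaks the cycle. For example, if the index build thread were not blocked behind the prepared transaction, find_cycle would return None.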

            Assignee:
            louis.williams@mongodb.com Louis Williams
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani
            Votes:
            0
            Watchers:
            3
