  1. Core Server
  2. SERVER-42494

Deadlock between aggregation pipeline and IndexBuildsCoordinator in storage engines that do not support document-level locking

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Concurrency, Storage
    • None
    • Storage Execution
    • ALL
    • 19

      The IndexBuildsCoordinator can deadlock in the following way:
      Thread 1:

      • Begin an index build on test.one.
      • Grab an intent lock on test and an exclusive lock on test.one.
      • Grab the IndexBuildsCoordinator mutex.
      • Initialize the MultiIndexBlock by calling init().
      • Log the operation to the oplog. This grabs an intent lock on local and then waits to grab an intent lock on local.oplog.rs (see the logic in logOp() below).
      if (!opCtx->getServiceContext()->getStorageEngine()->supportsDocLocking()) {
          dbWriteLock.emplace(opCtx, NamespaceString::kLocalDb, MODE_IX);
          collWriteLock.emplace(opCtx, oplogInfo->getOplogCollectionName(), MODE_IX);
      }
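
      The conditional above can be modelled in isolation. The following is a minimal sketch, not the server's code: RecordingLock and locksTakenByLogOp() are hypothetical names that only record which locks the branch would request, assuming the two optional guards behave like scoped locks that are emplaced on demand.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a scoped lock guard. It does not lock anything;
// it only records the resource and mode that would have been requested.
class RecordingLock {
public:
    RecordingLock(std::vector<std::string>& log,
                  const std::string& resource,
                  const std::string& mode) {
        log.push_back(resource + ":" + mode);
    }
};

// Mirrors the shape of the logOp() branch quoted above: the two locks are
// requested only when the storage engine lacks document-level locking.
std::vector<std::string> locksTakenByLogOp(bool supportsDocLocking) {
    std::vector<std::string> acquired;
    std::optional<RecordingLock> dbWriteLock;
    std::optional<RecordingLock> collWriteLock;
    if (!supportsDocLocking) {
        dbWriteLock.emplace(acquired, "local", "MODE_IX");
        collWriteLock.emplace(acquired, "local.oplog.rs", "MODE_IX");
    }
    return acquired;
}
```

      Calling locksTakenByLogOp(false) records the two intent-lock requests in order, while locksTakenByLogOp(true) records none, which is why only engines without document-level locking can block at this step.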
      

      Thread 2:

      • During the aggregation pipeline, run a rename command to rename a temporary collection.
      • Rename with source test.tmp.two and target test.two.
      • Grab the intent lock on test and exclusive locks on both test.tmp.two and test.two.
      • Check if there are any ongoing index builds by calling IndexBuildsCoordinator::assertNoIndexBuildInProgForCollection().
      • Wait to grab the IndexBuildsCoordinator mutex.

      Thread 1 is stuck waiting for an intent lock on local.oplog.rs, so thread 2 must already be holding an exclusive lock on local.oplog.rs before it runs the rename command. I could not find any acquisition of that lock during the rename call itself.
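
      The inference above can be checked with a small wait-for graph. This is an illustrative sketch only: LockState, hasDeadlock() and reportedState() are hypothetical names, and the model simplifies the lock manager by assuming each resource has a single exclusive holder and each thread waits on at most one resource.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified model of who holds and who waits for what.
struct LockState {
    std::map<std::string, std::string> holder;    // resource -> holding thread
    std::map<std::string, std::string> waitsFor;  // thread -> resource it is blocked on
};

// Follows thread -> awaited resource -> holder edges from `start` and
// reports whether the chain returns to a thread that was already visited.
bool hasDeadlock(const LockState& s, const std::string& start) {
    std::set<std::string> seen;
    std::string t = start;
    while (seen.insert(t).second) {
        auto w = s.waitsFor.find(t);
        if (w == s.waitsFor.end()) return false;  // thread is not blocked
        auto h = s.holder.find(w->second);
        if (h == s.holder.end()) return false;    // awaited resource is free
        t = h->second;
    }
    return true;
}

// Builds the state described in the report. Whether thread 2 holds
// local.oplog.rs is the unverified assumption, so it is a parameter.
LockState reportedState(bool thread2HoldsOplog) {
    LockState s;
    s.holder["IndexBuildsCoordinator mutex"] = "thread1";
    s.holder["test.one"] = "thread1";
    s.holder["test.tmp.two"] = "thread2";
    s.holder["test.two"] = "thread2";
    if (thread2HoldsOplog) s.holder["local.oplog.rs"] = "thread2";
    s.waitsFor["thread1"] = "local.oplog.rs";
    s.waitsFor["thread2"] = "IndexBuildsCoordinator mutex";
    return s;
}
```

      In this model hasDeadlock(reportedState(true), "thread1") finds a cycle, while reportedState(false) has none from either thread, which matches the reasoning that the deadlock requires thread 2 to hold a conflicting lock on local.oplog.rs.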

            Assignee:
            [DO NOT USE] Backlog - Storage Execution Team (backlog-server-execution)
            Reporter:
            Gregory Wlodarek (gregory.wlodarek@mongodb.com)
            Votes:
            0
            Watchers:
            3
