  Core Server / SERVER-45409

Rollback-via-refetch should wait for aborted two-phase index build threads to exit

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • 4.3.4
    • Affects Version/s: None
    • Component/s: None
    • None
    • Fully Compatible
    • ALL
    • Execution Team 2020-01-27, Execution Team 2020-02-10
    • 33

      Before rollback, we abort and record all active index builds so that upon completion of rollback, we know which index builds may need to be restarted.

      The IndexBuildsCoordinator::onRollback() function does not wait for these index build threads to exit, so an aborted index build can still be running concurrently with rollback.
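
      The abort-then-wait behavior this ticket asks for can be sketched as follows. This is a minimal illustration in Python, not the server's C++ code: IndexBuild, signal_abort, join, and abort_and_wait are hypothetical names, and the timeout is an assumption added only so the example cannot hang.

```python
import threading


class IndexBuild:
    """Hypothetical stand-in for one two-phase index build on its own thread."""

    def __init__(self, name):
        self.name = name
        self._abort = threading.Event()
        self.exited = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Simulated build loop: keep working until an abort is requested.
        while not self._abort.wait(timeout=0.01):
            pass
        self.exited.set()

    def signal_abort(self):
        self._abort.set()

    def join(self, timeout=None):
        # Returns True only if the build thread has actually exited.
        self._thread.join(timeout)
        return not self._thread.is_alive()


def abort_and_wait(builds, timeout=5.0):
    """Signal every build to abort, then block until each thread has exited.

    Returns the names of the aborted builds so the caller can restart them
    once rollback completes.
    """
    for build in builds:
        build.signal_abort()
    for build in builds:
        if not build.join(timeout):
            raise RuntimeError(f"index build {build.name!r} did not exit in time")
    return [build.name for build in builds]


builds = [IndexBuild("a_1"), IndexBuild("b_1")]
for b in builds:
    b.start()
aborted = abort_and_wait(builds)
```

      Signalling all builds first and joining afterwards lets the threads wind down in parallel; rollback proceeds only after every join has returned.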

      This can cause rollback to fail with errors like "There's already an index with name 'a_1' being built on the collection". This error is not an UnrecoverableRollbackError, so rollback-via-refetch will restart. On its next attempt, it no longer has the record of the aborted index builds and therefore will not restart them. This leads to index inconsistencies.

      In general, any non-fatal error during rollback-via-refetch will result in two-phase index builds not being restarted. We should potentially store these aborted index builds at a higher level to be more resilient to this issue.
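
      One way to read "store at a higher level" is to keep the set of aborted builds outside any single rollback attempt, so a retry cannot lose it. The sketch below assumes that reading; RecoverableRollbackError, rollback_with_retries, and flaky_attempt are hypothetical names, and the retry limit is an assumption made for the example.

```python
class RecoverableRollbackError(Exception):
    """Hypothetical non-fatal error: the rollback attempt may be retried."""


def rollback_with_retries(active_builds, attempt_rollback, max_attempts=3):
    """Retry rollback while keeping the aborted-build set across attempts.

    active_builds: mutable set of index names currently building.
    attempt_rollback: callable performing one rollback attempt; it may raise
        RecoverableRollbackError.
    Returns the sorted names of index builds to restart after success.
    """
    aborted = set()  # lives across attempts, so a retry cannot lose it
    for _ in range(max_attempts):
        # Abort whatever is still running and remember it.
        aborted |= active_builds
        active_builds.clear()
        try:
            attempt_rollback()
        except RecoverableRollbackError:
            continue  # retry; `aborted` is preserved
        return sorted(aborted)
    raise RuntimeError("rollback did not succeed within max_attempts")


# Usage: the first attempt fails with a non-fatal error, the second succeeds.
calls = []


def flaky_attempt():
    calls.append(1)
    if len(calls) == 1:
        raise RecoverableRollbackError(
            "There's already an index with name 'a_1' being built on the collection"
        )


to_restart = rollback_with_retries({"a_1", "b_1"}, flaky_attempt)
```

      If the set were instead created inside each attempt, the second attempt would see no active builds and return an empty list, which is the failure mode described above.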

            Assignee:
            louis.williams@mongodb.com Louis Williams
            Reporter:
            louis.williams@mongodb.com Louis Williams
            Votes:
            0
            Watchers:
            3
