Core Server / SERVER-82967

Stepdown after calling ActiveIndexBuilds::registerIndexBuild() during index build setup doesn't unregister itself

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.2.1, 7.3.0-rc0, 7.0.5, 6.0.13
    • Affects Version/s: None
    • Component/s: None
    • None
    • Storage Execution
    • Fully Compatible
    • ALL
    • v7.2, v7.0, v6.0, v5.0
    • Execution Team 2023-11-27, Execution Team 2023-12-11
    • 16

      After a stepdown leaves the index build registered in ActiveIndexBuilds on the stepped-down node, building the same index on the new primary has the following outcomes:

      In debug builds, the stepped-down node crashes with a fatal assertion while applying the startIndexBuild oplog entry:

      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.169+00:00"},"s":"I",  "c":"STORAGE",  "id":20661,   "ctx":"ReplWriterWorker-0","msg":"Index build conflict. There's already an index with the same name being built under an existing index build","attr":{"buildUUID":{"uuid":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"}},"existingBuildUUID":{"uuid":{"$uuid":"c5ea1251-ae19-45f1-8858-2411dd90abc8"}},"index":"a_1","collectionUUID":{"uuid":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"}}}}
      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.171+00:00"},"s":"E",  "c":"REPL",     "id":21262,   "ctx":"ReplWriterWorker-0","msg":"Failed command during oplog application","attr":{"command":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"db":"test","error":{"code":285,"codeName":"IndexBuildAlreadyInProgress","errmsg":"Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}}
      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.172+00:00"},"s":"F",  "c":"REPL",     "id":21237,   "ctx":"ReplWriterWorker-0","msg":"Error applying operation","attr":{"oplogEntry":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"error":" :: caused by :: IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}
      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.173+00:00"},"s":"F",  "c":"REPL",     "id":21235,   "ctx":"OplogApplier-0","msg":"Failed to apply batch of operations","attr":{"numOperationsInBatch":1,"firstOperation":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"lastOperation":{"oplogEntry":{"op":"c","ns":"test.$cmd","ui":{"$uuid":"ab811fbc-6eb1-4b68-9d88-9a2a150a382a"},"o":{"startIndexBuild":"coll","indexBuildUUID":{"$uuid":"8de98a3e-b852-4cbf-a08b-58656f77f966"},"indexes":[{"v":2,"key":{"a":1},"name":"a_1"}]},"ts":{"$timestamp":{"t":1699468691,"i":2}},"t":2,"v":2,"wall":{"$date":"2023-11-08T18:38:11.116Z"}}},"failedWriterThread":11,"error":"IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up"}}
      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.174+00:00"},"s":"F",  "c":"ASSERT",   "id":23095,   "ctx":"OplogApplier-0","msg":"Fatal assertion","attr":{"msgid":34437,"error":"IndexBuildAlreadyInProgress: Index build conflict: 8de98a3e-b852-4cbf-a08b-58656f77f966: There's already an index with name 'a_1' being built on the collection  ( ab811fbc-6eb1-4b68-9d88-9a2a150a382a ) under an existing index build: c5ea1251-ae19-45f1-8858-2411dd90abc8 index build state: Setting up","file":"src/mongo/db/repl/oplog_applier_impl.cpp","line":624}}
      [js_test:repro] d20040| {"t":{"$date":"2023-11-08T18:38:11.175+00:00"},"s":"F",  "c":"ASSERT",   "id":23096,   "ctx":"OplogApplier-0","msg":"\n\n***aborting after fassert() failure\n\n"}
      

      In non-debug builds the following is logged:

      [j0:n0] | 2023-11-01T05:33:25.121+00:00 I  STORAGE  20661   [ReplWriterWorker-0] "Index build conflict. There's already an index with the same name being built under an existing index build","attr":{"buildUUID":{"uuid":{"$uuid":"fa8424f2-b104-4d2d-8295-aa58eedebc85"}},"existingBuildUUID":{"uuid":{"$uuid":"b77dd15d-2abf-4181-8017-e8da192a532e"}},"index":"testDb_1","collectionUUID":{"uuid":{"$uuid":"ec105bf2-5987-4aa7-a454-27246446d37c"}}}
      [j0:n0] | 2023-11-01T05:33:25.121+00:00 W  REPL     7149001 [ReplWriterWorker-0] "Potential replication constraint violation during steady state replication","attr":{"msg":"received an acceptable error during oplog application","obj":{"oplogEntry":{"op":"c","ns":"admin.$cmd","ui":{"$uuid":"ec105bf2-5987-4aa7-a454-27246446d37c"},"o":{"startIndexBuild":"jstests_rename5","indexBuildUUID":{"$uuid":"fa8424f2-b104-4d2d-8295-aa58eedebc85"},"indexes":[{"v":2,"key":{"testDb":1},"name":"testDb_1"}]},"ts":{"$timestamp":{"t":1698816805,"i":10}},"t":4,"v":2,"wall":{"$date":"2023-11-01T05:33:25.116Z"}}},"status":{"code":285,"codeName":"IndexBuildAlreadyInProgress","errmsg":"Index build conflict: fa8424f2-b104-4d2d-8295-aa58eedebc85: There's already an index with name 'testDb_1' being built on the collection  ( ec105bf2-5987-4aa7-a454-27246446d37c ) under an existing index build: b77dd15d-2abf-4181-8017-e8da192a532e index build state: Setting up"}}
      

      As a result, the index build's commit quorum will not be satisfied, because the same index build is tracked under different buildUUIDs.

      When the node stepped down, nothing about the index build had been replicated to the secondaries yet. The affected node never actually builds the index because the build is interrupted, but the in-memory state registered by ActiveIndexBuilds::registerIndexBuild() is never reset.
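
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's code: `ActiveBuilds`, `set_up_build`, `stepdown`, `simulate`, and the `cleanup` flag are hypothetical names invented for illustration, and the real registry lives in C++. It shows a registration that survives an interrupted setup, and how unregistering on the interrupt path avoids the later name conflict.

```python
class IndexBuildAlreadyInProgress(Exception):
    """Raised when an index with the same name is already registered."""


class Interrupted(Exception):
    """Stands in for the interruption a stepdown causes (hypothetical)."""


class ActiveBuilds:
    """Toy in-memory registry keyed by (collection, index name)."""

    def __init__(self):
        self._builds = {}

    def register(self, coll, index_name, build_uuid):
        key = (coll, index_name)
        if key in self._builds:
            raise IndexBuildAlreadyInProgress(
                f"{build_uuid}: index {index_name!r} is already being built "
                f"under existing index build {self._builds[key]}"
            )
        self._builds[key] = build_uuid

    def unregister(self, coll, index_name):
        self._builds.pop((coll, index_name), None)


def stepdown():
    """Simulates a stepdown that interrupts the rest of the setup."""
    raise Interrupted("stepdown")


def set_up_build(registry, coll, index_name, build_uuid, finish_setup, cleanup):
    """Registers the build, then runs the remaining setup steps."""
    registry.register(coll, index_name, build_uuid)
    try:
        finish_setup()
    except Interrupted:
        if cleanup:
            # Reset the in-memory state before propagating the interruption.
            registry.unregister(coll, index_name)
        raise


def simulate(cleanup):
    """Returns what happens when the new primary's build is applied later."""
    registry = ActiveBuilds()
    try:
        set_up_build(registry, "coll", "a_1", "build-old", stepdown, cleanup)
    except Interrupted:
        pass  # the node is now a secondary
    try:
        # The new primary's startIndexBuild arrives with a different buildUUID.
        registry.register("coll", "a_1", "build-new")
        return "applied"
    except IndexBuildAlreadyInProgress:
        return "conflict"
```

In this model, `simulate(cleanup=False)` ends in a conflict because the stale "build-old" entry is still registered, while `simulate(cleanup=True)` lets the new build register normally.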

            Assignee:
            shinyee.tan@mongodb.com Shin Yee Tan
            Reporter:
            gregory.wlodarek@mongodb.com Gregory Wlodarek
            Votes:
            0
            Watchers:
            6
