Enable fsm workloads that use moveChunk in sharded stepdown suites


    • Type: Task
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Sharding
    • Sharding
    • Sharding 2019-06-17, Sharding 2019-07-15, Sharding 2019-08-26
    • 3

      A few concurrency workloads explicitly use moveChunk. They are currently disallowed in the concurrency stepdown suites because the network error retry override considers moveChunk a non-retryable command. Conceptually, moving a chunk is retryable, but it was likely disallowed because, if interrupted, it can return error codes that are not retryable by default (e.g. OperationFailed if persisting the critical section signal fails).

      To get more coverage of stepdowns concurrent with moveChunks, it should be possible to add special logic to the network override that handles the particular errors returned by moveChunk, similar to the workarounds for other operations that return inconsistent codes. This would be especially valuable for the workloads that move chunks while running transactions, like random_moveChunk_broadcast_update_transaction.js, and for agg_with_chunk_migrations.js when running in the concurrency_sharded_multi_stmt_txn_with_stepdowns suite.
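
The special-casing described above could look roughly like the following. This is only a minimal sketch, not the actual network error retry override: the names `shouldRetryMoveChunk`, `runWithMoveChunkRetry`, `MOVE_CHUNK_RETRYABLE_CODES` and the `maxAttempts` parameter are hypothetical, and it assumes that an OperationFailed response from an interrupted moveChunk is always safe to retry, which would need to be confirmed before being relied on.

```javascript
// Hypothetical sketch; none of these helper names come from the real override.
// Numeric values mirror the server's error code table.
const ErrorCodes = {
  HostUnreachable: 6,
  OperationFailed: 96,
  InterruptedDueToReplStateChange: 11602,
};

// Assumption: codes that moveChunk may return when a stepdown interrupts it.
// OperationFailed is not retryable by default, so it is allowed only here.
const MOVE_CHUNK_RETRYABLE_CODES = new Set([
  ErrorCodes.HostUnreachable,
  ErrorCodes.OperationFailed,
  ErrorCodes.InterruptedDueToReplStateChange,
]);

// Returns true only for a failed moveChunk whose code is in the set above.
function shouldRetryMoveChunk(cmdName, res) {
  if (cmdName !== "moveChunk" || res.ok === 1) {
    return false;
  }
  return MOVE_CHUNK_RETRYABLE_CODES.has(res.code);
}

// Runs the command, retrying up to maxAttempts times in total.
// runCommand is any function that takes a command object and returns a
// response object with an `ok` field and, on failure, a `code` field.
function runWithMoveChunkRetry(runCommand, cmdObj, maxAttempts = 3) {
  const cmdName = Object.keys(cmdObj)[0];
  let res;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    res = runCommand(cmdObj);
    if (!shouldRetryMoveChunk(cmdName, res)) {
      return res;
    }
  }
  return res; // Still failing after the final attempt; the caller decides.
}
```

Scoping the extra codes to moveChunk keeps OperationFailed non-retryable for every other command, so unrelated failures would still surface in the stepdown suites.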

            Assignee:
            [DO NOT USE] Backlog - Sharding Team
            Reporter:
            Jack Mulrow
            Votes:
            0
            Watchers:
            4

              Created:
              Updated:
              Resolved: