Core Server / SERVER-83680

Write commands are unsafely auto-retried by the cloud upon tenant migration error, causing duplicate commits.

    • Type: Bug
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • ALL
    • v8.0
    • Execution Team 2024-02-05, Execution Team 2024-02-19, Execution Team 2024-03-04, Execution Team 2024-03-18, Execution Team 2024-04-29, Execution Team 2024-06-10, Execution Team 2024-06-24
    • 200

      When batched updates (with multi: false) that perform non-idempotent writes fail with TenantMigrationCommitted or TenantMigrationAborted errors, the Atlas proxy should auto-retry only the unsuccessful update statements, not the entire update command. Otherwise, statements that already succeeded are applied a second time, which can lead to duplicate commits. CLOUDP-77814 addressed only batched inserts and missed batched updates and deletes. Note that this is a problem only if the batched updates are run outside of retryable writes. Because client drivers run with retryWrites: true by default, the severity is lower in this case.
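To illustrate the statement-level retry described above, here is a minimal sketch. It is not the Atlas proxy's code: `WriteError`, `statements_to_retry`, and the simplified error shape (a statement index plus an error name) are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

# Hypothetical, simplified stand-ins for proxy-side types (assumption, not a real API).
TENANT_MIGRATION_ERRORS = {"TenantMigrationCommitted", "TenantMigrationAborted"}


@dataclass
class WriteError:
    index: int       # position of the failed statement within the batch
    code_name: str   # error name reported for that statement


def statements_to_retry(statements, write_errors, ordered=True):
    """Return (index, statement) pairs to resend after a tenant migration error."""
    migration_failures = sorted(
        e.index for e in write_errors if e.code_name in TENANT_MIGRATION_ERRORS
    )
    if not migration_failures:
        return []
    if ordered:
        # An ordered batch stops at the first failure, so the failed statement
        # and everything after it never ran; earlier statements already applied.
        first = migration_failures[0]
        return list(enumerate(statements))[first:]
    # An unordered batch keeps going past failures, so resend only the failed ones.
    failed = set(migration_failures)
    return [(i, s) for i, s in enumerate(statements) if i in failed]


# Three non-idempotent $inc updates; the second one hit the migration error.
batch = [{"q": {"_id": i}, "u": {"$inc": {"n": 1}}} for i in range(3)]
errors = [WriteError(index=1, code_name="TenantMigrationCommitted")]
retry = statements_to_retry(batch, errors, ordered=True)
```

In this sketch the statement at index 0 is never resent, so its `$inc` is not applied twice, whereas retrying the whole command would increment that document a second time.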

       
      While investigating the BF, I found that the proxy also auto-retries the aggregation stage $merge, causing inconsistent results. $merge can perform non-transactional writes, so it is not safe to retry; for this reason, $merge is not a supported retryable write. (Note: Serverless does not support $out, but it does support $merge.)
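One way a proxy could guard against this is to refuse to auto-retry any aggregate command whose pipeline contains a write stage. The sketch below assumes the command is available as a plain dictionary; `pipeline_writes` and `safe_to_auto_retry_aggregate` are hypothetical helper names, not part of the proxy.

```python
# Aggregation stages that write to a collection.
WRITE_STAGES = {"$merge", "$out"}


def pipeline_writes(pipeline):
    """Return True if any stage in the pipeline is a write stage."""
    return any(stage.keys() & WRITE_STAGES for stage in pipeline)


def safe_to_auto_retry_aggregate(command):
    """Treat an aggregate as safe to auto-retry only if it is read-only (assumed policy)."""
    return not pipeline_writes(command.get("pipeline", []))


read_only = {"aggregate": "orders", "pipeline": [{"$match": {"status": "A"}}]}
with_merge = {
    "aggregate": "orders",
    "pipeline": [{"$match": {"status": "A"}}, {"$merge": {"into": "summary"}}],
}
```

Here `read_only` would still be eligible for an automatic retry, while `with_merge` would have its error returned to the client instead.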

       

      This problem has existed since the introduction of tenant migration in 5.0.

            Assignee: Suganthi Mani (suganthi.mani@mongodb.com)
            Reporter: Suganthi Mani (suganthi.mani@mongodb.com)
            Votes: 0
            Watchers: 9
