  1. Core Server
  2. SERVER-89650

Aggregations can retry non-idempotent operations that use cursors on certain failures

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0, 8.0.0-rc4
    • Affects Version/s: None
    • Component/s: None
    • None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • v8.0
    • CAR Team 2024-04-29

      Right now, whenever the shard detects a StaleShardVersionError or a ShardCannotRefreshDueToLocksHeld error, it checks whether the failing operation is a getMore on a cursor (each of the two errors has its own check). This is because the getMore may already have consumed the cursor and closed it as a result, which makes the error non-retryable.

      However, an aggregation with a $mergeCursors stage can run into the same situation: the stage may already have issued getMores on its cursors before the operation fails with one of the same errors. This is especially true of aggregations that use $lookup and $graphLookup.

      As a result, the operation is retried and then fails terminally with a CursorNotFound error, which is non-retryable because it does not carry the TransientTransactionError label.
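
      The retry decision described above can be sketched as follows. This is only an illustrative model in Python, not the server's C++ code and not necessarily how the fix was implemented: the two error names come from this ticket, while `OperationState`, `remote_get_mores`, and `can_retry_on_shard` are hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Error names taken from the ticket text; everything else is hypothetical.
RETRYABLE_ROUTING_ERRORS = {
    "StaleShardVersionError",
    "ShardCannotRefreshDueToLocksHeld",
}


@dataclass
class OperationState:
    """Hypothetical per-operation bookkeeping, not a real server structure."""

    # True when the failing command is itself a getMore.
    is_get_more: bool = False
    # Number of getMores already issued on other cursors on behalf of this
    # operation, for example by a $mergeCursors stage.
    remote_get_mores: int = 0


def can_retry_on_shard(error_name: str, op: OperationState) -> bool:
    """Return True only if re-running the operation cannot hit a consumed cursor."""
    if error_name not in RETRYABLE_ROUTING_ERRORS:
        return False
    if op.is_get_more:
        # The check the ticket says already exists: a getMore is not idempotent.
        return False
    if op.remote_get_mores > 0:
        # The case the ticket reports as unhandled: cursors were already
        # advanced, so a retry would end in CursorNotFound.
        return False
    return True


if __name__ == "__main__":
    fresh = OperationState()
    after_lookup = OperationState(remote_get_mores=2)
    print(can_retry_on_shard("StaleShardVersionError", fresh))
    print(can_retry_on_shard("StaleShardVersionError", after_lookup))
```

      Under these assumptions, an aggregation that has not yet touched any cursor stays retryable, while one that has already advanced a cursor is treated like a getMore and the original error is surfaced instead of a later CursorNotFound.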

            Assignee:
            jordi.olivares-provencio@mongodb.com Jordi Olivares Provencio
            Reporter:
            jordi.olivares-provencio@mongodb.com Jordi Olivares Provencio
