  1. Core Server
  2. SERVER-84623

Shard-local re-execution of a command might bubble up a misleading StaleConfig exception to the router

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0, 8.0.0-rc5
    • Affects Version/s: 5.0.0, 6.0.0, 7.0.0, 8.0.0-rc0, 7.3.0
    • Component/s: None
    • None
    • Catalog and Routing
    • Fully Compatible
    • ALL
    • v8.0, v7.0, v6.0, v5.0
    • CAR Team 2024-02-05, CAR Team 2024-02-19, CAR Team 2024-03-04, CAR Team 2024-03-18, CAR Team 2024-04-01, CAR Team 2024-04-29, CAR Team 2024-05-13, CAR Team 2024-05-27
    • 200
    • 2

      In some cases, when a shard detects that its filtering metadata is not properly installed, it fails the current execution of the command, forces a refresh, and then re-executes the original command. If that refresh fails, the error bubbled up to the router is the original StaleConfig, not the new error raised by the refresh. In general this behavior makes sense: the router retries the whole command and the shard repeats the same steps, in the hope that the refresh succeeds this time.
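
      The shard-side flow described above can be modelled roughly as follows. This is a toy Python sketch, not the server's actual C++ code: `run_on_shard`, `metadata_installed`, and `refresh` are hypothetical names introduced only for illustration, and the two exception classes merely stand in for the server error codes of the same name.

```python
class StaleConfig(Exception):
    """Stand-in for the StaleConfig error code."""


class NotPrimary(Exception):
    """Stand-in for the NotPrimary error code."""


def run_on_shard(command, metadata_installed, refresh):
    """Hypothetical shard-side handler (illustrative only).

    If the filtering metadata is missing, force a refresh and re-execute.
    If the refresh itself fails, the refresh error is dropped and the
    original StaleConfig is what the caller (the router) sees.
    """
    if not metadata_installed():
        stale = StaleConfig("filtering metadata not installed")
        try:
            refresh()
        except Exception:
            # The behaviour reported in this ticket: the refresh error
            # is masked by the original StaleConfig.
            raise stale
    # Metadata is available (or was just refreshed): run the command.
    return command()


def failing_refresh():
    # Simulates a refresh that fails because the node stepped down.
    raise NotPrimary("node stepped down during refresh")


try:
    run_on_shard(lambda: "ok", lambda: False, failing_refresh)
except StaleConfig as exc:
    reported = type(exc).__name__

print(reported)  # StaleConfig
```

      In this model the caller only ever observes StaleConfig, even though the refresh failed with NotPrimary.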

      Bubbling up StaleConfig instead of the actual error can be problematic in some cases, though. Suppose the refresh fails with a NotPrimary exception. If the shard bubbles up StaleConfig instead, the router won't realize that the primary has changed, potentially re-executing the command with exactly the same outcome until it exhausts its retries.
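
      To illustrate why the masked error matters on the router side, here is a small Python simulation. It is a sketch under stated assumptions, not the real router retry logic: `route`, `send_masked`, `send_proper`, the node names "A" and "B", and the retry cap of 3 are all hypothetical. The assumption is that StaleConfig makes the router retry against the same target, while NotPrimary makes it re-discover the primary first.

```python
class StaleConfig(Exception):
    """Stand-in for the StaleConfig error code."""


class NotPrimary(Exception):
    """Stand-in for the NotPrimary error code."""


def route(send, nodes, max_retries=3):
    """Hypothetical router loop; returns (result, attempts used).

    nodes[0] is the primary the router currently believes in,
    nodes[-1] is the node that is actually primary now.
    """
    target = nodes[0]
    for attempt in range(1, max_retries + 1):
        try:
            return send(target), attempt
        except StaleConfig:
            # Assumed reaction: refresh routing info, keep the same target.
            continue
        except NotPrimary:
            # Assumed reaction: re-discover the primary before retrying.
            target = nodes[-1]
    raise RuntimeError(f"retries exhausted after {max_retries} attempts")


def send_masked(node):
    # Old primary "A" hides its NotPrimary failure behind StaleConfig.
    if node == "A":
        raise StaleConfig("filtering metadata not installed")
    return "ok"


def send_proper(node):
    # Old primary "A" reports the real error.
    if node == "A":
        raise NotPrimary("node stepped down")
    return "ok"


print(route(send_proper, ["A", "B"]))  # ('ok', 2)

try:
    route(send_masked, ["A", "B"])
except RuntimeError as exc:
    print(exc)  # retries exhausted after 3 attempts
```

      With the real error propagated, the simulated router succeeds on its second attempt; with the masked error it keeps targeting the stepped-down node until the retry budget runs out.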

            Assignee:
            pol.pinol@mongodb.com Pol Pinol
            Reporter:
            sergi.mateo-bellido@mongodb.com Sergi Mateo Bellido
            Votes:
            0
            Watchers:
            8
