Core Server / SERVER-92005

Cluster dbStats Can Trigger Invariant with Concurrent removeShard

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 8.1.0-rc0, 8.0.4
    • Affects Version/s: 5.0.27, 8.1.0-rc0, 6.0.16, 7.3.3, 7.0.12, 8.0.0-rc10
    • Component/s: None
    • Assigned Teams: Query Optimization
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v8.0, v7.3, v7.0, v6.0, v5.0

      The sharded (cluster) version of dbStats sends a scatter-gather request to all shards and then aggregates the results. The aggregation logic assumes that every response succeeded and uses an invariant to enforce this. The caller appears to rely on the result of appendRawResponses to determine whether any shard returned an error, so that the uassert is triggered on failure and aggregateResults() is called only if all responses succeeded. However, appendRawResponses treats ShardNotFound errors as success if at least one shard returned success, so a failed response can reach aggregateResults() and trip the invariant.
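
      The mismatch described above can be illustrated with a small Python model. This is only a sketch of the reported behavior, not the server's C++ code: the names append_raw_responses, aggregate_results, cluster_db_stats and the ShardResponse type are hypothetical stand-ins, a Python assert stands in for the invariant, and a RuntimeError stands in for the uassert.

```python
from dataclasses import dataclass
from typing import List, Optional

SHARD_NOT_FOUND = "ShardNotFound"


@dataclass
class ShardResponse:
    """Hypothetical per-shard reply; error is None when the shard succeeded."""
    shard_id: str
    error: Optional[str] = None
    data_size: int = 0


def append_raw_responses(responses: List[ShardResponse]) -> bool:
    """Return True if the overall command is treated as successful.

    Models the behavior described in the ticket: ShardNotFound errors are
    ignored as long as at least one shard returned success.
    """
    successes = [r for r in responses if r.error is None]
    failures = [r for r in responses if r.error is not None]
    if successes:
        failures = [r for r in failures if r.error != SHARD_NOT_FOUND]
    return not failures


def aggregate_results(responses: List[ShardResponse]) -> int:
    """Sum a statistic across shards, assuming every response succeeded."""
    total = 0
    for r in responses:
        # Stand-in for the invariant in the aggregation logic.
        assert r.error is None, f"invariant: {r.shard_id} returned {r.error}"
        total += r.data_size
    return total


def cluster_db_stats(responses: List[ShardResponse]) -> int:
    """Caller that trusts append_raw_responses to filter out all failures."""
    if not append_raw_responses(responses):
        # Stand-in for the uassert that reports a failed shard to the user.
        raise RuntimeError("dbStats failed on at least one shard")
    return aggregate_results(responses)
```

      In this model an ordinary error such as a network failure is reported cleanly through the RuntimeError path, while a mix of one success and one ShardNotFound passes the first check and then fails the assertion inside aggregate_results.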

      In most cases this does not appear to be a problem: even if a shard is removed without the mongos running dbStats knowing about it, the mongos can still target that shard and execute a request against it (probably receiving some network error). ShardNotFound is returned only if the mongos learns that the shard was removed between targeting it and executing the request (for example, if the background refresh happens to coincide with this point in time).
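
      The timing window can be sketched as a deterministic Python simulation. Everything here is an illustrative assumption rather than mongos internals: ShardRegistry, scatter, and the two exception classes are hypothetical names, and the "network error" outcome simply follows the ticket's guess about what a request to an already-removed shard would receive.

```python
class ShardNotFound(Exception):
    """Raised when the local registry no longer knows the shard."""


class NetworkError(Exception):
    """Raised when a targeted shard is known locally but no longer reachable."""


class ShardRegistry:
    """Hypothetical local cache of the shards a router believes exist."""

    def __init__(self, shard_ids):
        self._known = set(shard_ids)

    def refresh(self, current_shard_ids):
        # Models the background refresh picking up a removeShard.
        self._known = set(current_shard_ids)

    def all_shard_ids(self):
        return sorted(self._known)

    def get_shard(self, shard_id):
        if shard_id not in self._known:
            raise ShardNotFound(shard_id)
        return shard_id


def scatter(registry, live_shards, refresh_between=False):
    """Target all known shards, then execute against each one.

    refresh_between=True models a refresh landing between the two steps.
    """
    targeted = registry.all_shard_ids()      # step 1: targeting
    if refresh_between:
        registry.refresh(live_shards)        # removal becomes visible here
    outcomes = {}
    for shard_id in targeted:                # step 2: execution
        try:
            registry.get_shard(shard_id)
            if shard_id not in live_shards:
                raise NetworkError(shard_id)
            outcomes[shard_id] = "ok"
        except (ShardNotFound, NetworkError) as exc:
            outcomes[shard_id] = type(exc).__name__
    return outcomes
```

      With a stale registry the removed shard yields a NetworkError outcome, which the caller reports as an ordinary failure. Only when the refresh lands between targeting and execution does the removed shard yield ShardNotFound, the one error that the response handling described earlier tolerates.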

            Assignee: Jess Balint (jess.balint@mongodb.com)
            Reporter: Brett Nawrocki (brett.nawrocki@mongodb.com)
