Core Server / SERVER-84709

Resharding critical section timeout is not honored on stepdown

    • Cluster Scalability
    • ALL
    • The attached repro is not perfect since it assumes that the stepdown will happen before the timeout is hit, but it has reproduced the problem pretty consistently in my environment.
    • Cluster Scalability 2024-09-02, Cluster Scalability 2024-10-14, Cluster Scalability 2024-10-28, Cluster Scalability 2024-11-11

      The reshardingCriticalSectionTimeoutMillis parameter is intended to bound the amount of time that the critical section will be held during resharding. This is implemented by scheduling a callback which sets an error if the timeout is exceeded.

      However, the callback is only scheduled locally on the node, and it does not appear to be re-scheduled after a stepdown, so the timeout parameter is effectively ignored once a stepdown occurs.
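
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's actual code: the `Node` and `StateDoc` classes, the persisted `critical_section_started_at_ms` field, and the `reschedule_on_stepup` flag are all hypothetical names introduced here. The model uses a simulated clock in place of real timers. It shows that an in-memory deadline is lost on stepdown, and that one possible remedy (an assumption, not a confirmed fix) is to recompute the deadline on step-up from a durable start time.

```python
from dataclasses import dataclass
from typing import Optional

TIMEOUT_ERROR = "critical section timeout exceeded"


@dataclass
class StateDoc:
    """Hypothetical durable state that survives a stepdown."""
    critical_section_started_at_ms: Optional[int] = None
    error: Optional[str] = None


class Node:
    """Toy model of a node that enforces the timeout with a local callback."""

    def __init__(self, doc: StateDoc, timeout_ms: int, reschedule_on_stepup: bool):
        self.doc = doc
        self.timeout_ms = timeout_ms
        self.reschedule_on_stepup = reschedule_on_stepup
        # In-memory only: stands in for the locally scheduled callback.
        self.deadline_ms: Optional[int] = None

    def begin_critical_section(self, now_ms: int) -> None:
        self.doc.critical_section_started_at_ms = now_ms
        self.deadline_ms = now_ms + self.timeout_ms

    def step_down(self) -> None:
        # The local callback does not outlive the node's term as primary.
        self.deadline_ms = None

    def step_up(self, now_ms: int) -> None:
        started = self.doc.critical_section_started_at_ms
        if self.reschedule_on_stepup and started is not None:
            # Assumed remedy: rebuild the deadline from the durable start time.
            self.deadline_ms = started + self.timeout_ms

    def tick(self, now_ms: int) -> None:
        """Advance the simulated clock; fire the callback if it is due."""
        if (self.deadline_ms is not None
                and now_ms >= self.deadline_ms
                and self.doc.error is None):
            self.doc.error = TIMEOUT_ERROR


def simulate(reschedule_on_stepup: bool) -> Optional[str]:
    """Stepdown at t=1s, new primary at t=2s, clock advanced to t=10s."""
    doc = StateDoc()
    old_primary = Node(doc, timeout_ms=5000, reschedule_on_stepup=reschedule_on_stepup)
    new_primary = Node(doc, timeout_ms=5000, reschedule_on_stepup=reschedule_on_stepup)

    old_primary.begin_critical_section(now_ms=0)
    old_primary.tick(now_ms=1000)   # timeout not yet reached
    old_primary.step_down()         # callback is dropped here
    new_primary.step_up(now_ms=2000)
    new_primary.tick(now_ms=10000)  # well past the 5s timeout
    return doc.error
```

In this model, `simulate(False)` leaves the error unset even though the critical section has been held for twice the timeout, which mirrors the reported behavior. `simulate(True)` sets the error because the new primary re-derives the deadline on step-up.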

        1. cs_timeout_repro.patch
          7 kB
          Allison Easton

            Assignee:
            daisy.kucharski@mongodb.com Daisy Kucharski
            Reporter:
            allison.easton@mongodb.com Allison Easton
            Votes:
            0
            Watchers:
            9

              Created:
              Updated: