  1. Core Server
  2. SERVER-53168

Support 50 concurrent migrations on a single recipient

    • Fully Compatible
    • Repl 2021-02-08, Repl 2021-02-22

      Currently, the default size of the tenant migration recipient thread pool is 8 (it is a tunable server startup parameter). For each migration, recipient-side components such as the oplog fetcher and the cloner do synchronous work (fetching data from the remote donor node) on a tenant migration recipient thread without yielding it. With the default thread pool size of 8, at most 3 concurrent migrations can be initiated on the recipient side (per migration, 2 threads for sync jobs + 1 thread for async jobs). Beyond that, concurrent tenant migrations can completely stall all active tenant migrations on the recipient side.

      Consider the case where the tenant migration recipient thread pool size is 4.
      1) Assume the recipient received the recipientSyncData command for migration ids 1, 2, and 3, and all of them have started the oplog fetcher and are at runQuery(). At this point, only one free worker thread is left in the tenant migration recipient thread pool.
      2) Now the recipient receives the recipientSyncData command for migration id 4, which is able to start its oplog fetcher successfully.

      So now there are no free worker threads left in the tenant migration recipient thread pool to start the cloner. All 4 tenant migrations hang on the recipient side until one migration is cancelled explicitly using the ForgetMigration command.
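
      The scenario above can be sketched as a small, deterministic model. This is an illustration only, not server code: the function name simulate_recipient_pool is hypothetical, and the model assumes the worst-case ordering in which every migration starts its oplog fetcher before any cloner is scheduled, that each of those jobs holds one worker thread without yielding, and it ignores the additional thread needed for async jobs.

```python
def simulate_recipient_pool(pool_size, num_migrations):
    """Model blocking jobs competing for a fixed-size worker pool.

    Hypothetical helper for illustration; it does not reflect the real
    server implementation. Returns (fetching, cloning): the migration ids
    whose oplog fetcher got a thread, and those whose cloner also got one.
    """
    free_threads = pool_size

    # Phase 1: each migration starts its oplog fetcher, which blocks a
    # worker thread (e.g. inside runQuery()) and never yields it.
    fetching = []
    for migration_id in range(1, num_migrations + 1):
        if free_threads == 0:
            break
        free_threads -= 1
        fetching.append(migration_id)

    # Phase 2: each migration that is fetching now needs a second
    # blocking thread for its cloner.
    cloning = []
    for migration_id in fetching:
        if free_threads == 0:
            break
        free_threads -= 1
        cloning.append(migration_id)

    return fetching, cloning


if __name__ == "__main__":
    # Pool size 4, 4 migrations: every thread is held by an oplog fetcher,
    # so no cloner can start and all migrations stall.
    print(simulate_recipient_pool(4, 4))
    # Pool size 8, 3 migrations: every migration gets both threads.
    print(simulate_recipient_pool(8, 3))
```

      In this model, a run where the fetching list is non-empty and the cloning list is empty corresponds to the complete stall described above.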

            Assignee:
            lingzhi.deng@mongodb.com Lingzhi Deng
            Reporter:
            suganthi.mani@mongodb.com Suganthi Mani