- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- Fully Compatible
- Repl 2021-02-08, Repl 2021-02-22
Currently, the default size of the tenant migration recipient thread pool is 8 (it is a tunable server startup parameter). For each migration, recipient-side components such as the oplog fetcher and the cloner run synchronous jobs (fetching data from the remote donor node) on a tenant migration recipient thread without yielding it. With the default thread pool size of 8, we can expect at most 3 concurrent migrations to be initiated on the recipient side (per migration, 2 threads for sync jobs + 1 thread for async jobs). Beyond that, concurrent tenant migrations can completely stall all active tenant migrations on the recipient side.
Consider the case where the tenant migration recipient thread pool size is 4.
1) Assume the recipient received the recipientSyncData command for migration ids 1, 2, and 3, and all of them have started the oplog fetcher and are at runQuery(). At this point, only one free worker thread is left in the tenant migration recipient thread pool.
2) Now the recipient receives the recipientSyncData command for migration id 4, which is able to start its oplog fetcher successfully on the last free thread.
So now there are no free worker threads left in the tenant migration recipient thread pool to start the cloner. All 4 tenant migrations hang on the recipient side until one migration is cancelled explicitly using the ForgetMigration command.
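The scenario above can be reproduced in miniature with a toy model. The sketch below is not server code: it uses Python's ThreadPoolExecutor as a stand-in for the recipient thread pool, and the names simulate, fetcher, and cloner are hypothetical. It assumes each migration's oplog fetcher holds one worker thread until the migration ends, and that each cloner needs a free worker from the same pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate(pool_size, num_migrations, wait_secs=0.2):
    """Return how many migrations manage to run their cloner.

    Toy model (assumption): one blocking "oplog fetcher" job per migration,
    followed by one "cloner" job submitted to the same fixed-size pool.
    """
    pool = ThreadPoolExecutor(max_workers=pool_size)
    stop = [threading.Event() for _ in range(num_migrations)]
    cloned = [threading.Event() for _ in range(num_migrations)]

    def fetcher(i):
        # Synchronous job: holds its worker thread without yielding
        # until the migration is stopped.
        stop[i].wait()

    def cloner(i):
        # Needs a free worker thread; just records that it ran.
        cloned[i].set()

    # Every migration starts its fetcher first ...
    for i in range(num_migrations):
        pool.submit(fetcher, i)
    # ... and then tries to start its cloner on the same pool.
    for i in range(num_migrations):
        pool.submit(cloner, i)

    # Count the cloners that got a thread within the time limit.
    progressed = sum(1 for event in cloned if event.wait(wait_secs))

    # Clean up: stop all fetchers so the pool can drain and shut down.
    for event in stop:
        event.set()
    pool.shutdown(wait=True)
    return progressed


if __name__ == "__main__":
    print("pool=4, migrations=3:", simulate(4, 3))
    print("pool=4, migrations=4:", simulate(4, 4))
```

In this model, 3 migrations on a pool of 4 leave one free worker, so every cloner eventually runs. With 4 migrations the fetchers occupy every worker, and no cloner starts until a fetcher is stopped, which mirrors the need to cancel one migration explicitly.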
- depends on
- SERVER-54090 SSLConfiguration use after free when running concurrent migrations (Closed)
- SERVER-54328 Refactor creation of transient SSLConnectionContext to own its own instance of SSLManagerInterface (Closed)