Core Server / SERVER-102410

Investigate performance regressions on large sharded clusters with taskExecutorPoolSize=1

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Networking & Observability
    • ALL
    • N&O Prioritized List

      In SERVER-54504, which landed in 6.2, we removed the ability to tune taskExecutorPoolSize and instead fixed it at 1.

      I believe this was due to the addition of client-thread polling in the baton in SERVER-34739. My theory is that we were seeing heavy lock contention when we used the baton to run work on client threads while also using more than one ShardingTaskExecutor (SERVER-77539 demonstrates this).

      However, some customers (see linked HELP tickets) experience performance regressions when taskExecutorPoolSize is set to 1, which prevents them from upgrading to 7.0+. These customers run workloads on very large sharded clusters with queries that fan out to many shards, resulting in higher-than-usual load on the mongos egress networking stack. In one case, we saw very high waitTime metrics on the ShardingTaskExecutor reactor thread, suggesting that the single reactor thread and client baton were unable to keep up with the heavy load of egress networking requests.

      We should re-evaluate the decision to fix taskExecutorPoolSize at 1 on 6.2+ given these customers' needs, and understand whether there are limits to the single-ShardingTaskExecutor model that we were previously unaware of.
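
      For reference, before SERVER-54504 the pool size could be tuned at mongos startup via setParameter. A minimal sketch of the pre-6.2 configuration; the value 4 is illustrative only, not a recommendation:

      ```yaml
      # mongos configuration file (pre-6.2, when the parameter was still tunable).
      # taskExecutorPoolSize controls the number of ShardingTaskExecutor
      # connection pools / reactor threads used for egress networking.
      setParameter:
        taskExecutorPoolSize: 4
      ```

      The same parameter could be passed on the command line as --setParameter taskExecutorPoolSize=4. On 6.2+ this knob is ignored and the pool size is always 1, which is the behavior under investigation here.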

            Assignee:
            Unassigned
            Reporter:
            Erin McNulty (erin.mcnulty@mongodb.com)
            Votes:
            0
            Watchers:
            13

              Created:
              Updated: