-
Type: Bug
-
Resolution: Works as Designed
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.8.1
-
Component/s: Performance
-
None
-
Environment:Windows 10 Version 1703
.NET Framework 4.6.1 & .NET Core 2.2
MongoDB 4.0.5 & 4.0.9
Issue
We use use the C# MongoDB driver in an ASP .NET Core application and experience occasionally extremely poor performance. Some requests take 80-90 seconds, where they normally take ~1 seconds. This problem is triggered there is issued a large number of parallel tasks which fetches data from MongoDB. The data is fetched using the Async API.
Repro & Test Results
I've put together a minimal repro example illustrating the problem. The Visual Studio solution is attached on this ticket.
The issue occurs when the following is run:
async Task Fetch()
Unknown macro: { await collection.Find(filter).ToListAsync(); await collection.Aggregate<BsonDocument>(pipeline).ToListAsync(); }var tasks = Enumerable.Range(0, count).Select(i => Fetch()).ToList();
await Task.WhenAll(tasks);
The filter is simply for a single _id and the pipeline is a match for the same _id. The collection contains a single document with the _id requested. The 'count' variable is used to test various number in parallel. Strangely the problem goes away if I swap the order of the Find & Aggregate queries, or comment out one of them.
When this code is run it starts to run extremely slow with a count of ~200-250 tasks. However it appears to run slow only the first time. Here is some run times with various counts, each time time the application is restarted and the queries are run two times:
Run 1. Fetched 150 in 606 ms
Run 2. Fetched 150 in 192 msRun 1. Fetched 200 in 46 801 ms
Run 2. Fetched 200 in 606 msRun 1. Fetched 250 in 68 233 ms
Run 2. Fetched 250 in 1 667 msRun 1. Fetched 20 in 359 ms
Run 2. Fetched 2500 in 2 843 ms
At ~200 the first parallel queries take over 46 seconds. However if the first query has few in parallel, the second handles a lot more (Like the last example above, where running 2500 i parallel finishes in only 2.8 seconds). When the queries run fast (the first example above), I see in Windows Task Manager the number of threads for the application staying around ~30, when the problem occurs the number of threads increases by 1 for every second. So to me, when the problem occurs it appears that even though I'm using the Async methods, something is blocking causing new threads to constantly be spawned.
If the problem only would occur at the first request it would not be such a big deal, however in our ASP .NET Core application we see the problem not only for the first request, but occasionally (The IMongoClient & IMongoDatabase are singletons).
Temporary Fix
Increasing the MaxConnectionPool of the MongoClient to f.ex. 1000 makes the problem disappear. However I think this is a band-aid fix, just masking the problem.
Sidenote
I suspect this issue is somewhat related to: https://jira.mongodb.org/browse/CSHARP-1895
When I change the fetch function in the repro example to use the Sync API:
async Task Fetch()
{
await Task.Run(() =>
Unknown macro: { collection.Find(filter).ToList(); collection.Aggregate<BsonDocument>(pipeline).ToList(); });
}
I get the following example after 30 seconds:
System.TimeoutException: 'A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector
Unknown macro: { AllowedLatencyRange = 00}}. Client view of cluster state is { ClusterId : "1", ConnectionMode : "Automatic", Type : "Unknown", State : "Disconnected", Servers : [{ ServerId: "
Unknown macro: { ClusterId }", EndPoint: "Unspecified/localhost:27017", State: "Disconnected", Type: "Unknown", LastUpdateTimestamp: "2019-08-14T09:37:22.4361867Z" }] }.
Which is the same exception seen in 1895.