Uploaded image for project: 'C# Driver'
  1. C# Driver
  2. CSHARP-2689

Occasional extremely poor performance when connection pool is exhausted

    • Type: Icon: Bug Bug
    • Resolution: Works as Designed
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.8.1
    • Component/s: Performance
    • None
    • Environment:
      Windows 10 Version 1703
      .NET Framework 4.6.1 & .NET Core 2.2
      MongoDB 4.0.5 & 4.0.9

      Issue

      We use use the C# MongoDB driver in an ASP .NET Core application and experience occasionally extremely poor performance. Some requests take 80-90 seconds, where they normally take ~1 seconds. This problem is triggered there is issued a large number of parallel tasks which fetches data from MongoDB. The data is fetched using the Async API.

      Repro & Test Results

      I've put together a minimal repro example illustrating the problem. The Visual Studio solution is attached on this ticket.

      The issue occurs when the following is run:

      async Task Fetch()

      Unknown macro: {   await collection.Find(filter).ToListAsync();   await collection.Aggregate<BsonDocument>(pipeline).ToListAsync(); }

      var tasks = Enumerable.Range(0, count).Select(i => Fetch()).ToList();
      await Task.WhenAll(tasks);

      The filter is simply for a single _id and the pipeline is a match for the same _id. The collection contains a single document with the _id requested. The 'count' variable is used to test various number in parallel. Strangely the problem goes away if I swap the order of the Find & Aggregate queries, or comment out one of them.

      When this code is run it starts to run extremely slow with a count of ~200-250 tasks. However it appears to run slow only the first time. Here is some run times with various counts, each time time the application is restarted and the queries are run two times:

      Run 1. Fetched 150 in 606 ms
      Run 2. Fetched 150 in 192 ms

      Run 1. Fetched 200 in 46 801 ms
      Run 2. Fetched 200 in 606 ms

      Run 1. Fetched 250 in 68 233 ms
      Run 2. Fetched 250 in 1 667 ms

      Run 1. Fetched 20 in 359 ms
      Run 2. Fetched 2500 in 2 843 ms

      At ~200 the first parallel queries take over 46 seconds. However if the first query has few in parallel, the second handles a lot more (Like the last example above, where running 2500 i parallel finishes in only 2.8 seconds). When the queries run fast (the first example above), I see in Windows Task Manager the number of threads for the application staying around ~30, when the problem occurs the number of threads increases by 1 for every second. So to me, when the problem occurs it appears that even though I'm using the Async methods, something is blocking causing new threads to constantly be spawned.

      If the problem only would occur at the first request it would not be such a big deal, however in our ASP .NET Core application we see the problem not only for the first request, but occasionally (The IMongoClient & IMongoDatabase are singletons).

      Temporary Fix

      Increasing the MaxConnectionPool of the MongoClient to f.ex. 1000 makes the problem disappear. However I think this is a band-aid fix, just masking the problem.

      Sidenote

      I suspect this issue is somewhat related to: https://jira.mongodb.org/browse/CSHARP-1895

      When I change the fetch function in the repro example to use the Sync API:

      async Task Fetch()
      {
        await Task.Run(() =>
       

      Unknown macro: {     collection.Find(filter).ToList();     collection.Aggregate<BsonDocument>(pipeline).ToList();   }

      );
      }

      I get the following example after 30 seconds:

      System.TimeoutException: 'A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector

      Unknown macro: { AllowedLatencyRange = 00}

      }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "Automatic", Type : "Unknown", State : "Disconnected", Servers : [{ ServerId: "

      Unknown macro: { ClusterId }

      ", EndPoint: "Unspecified/localhost:27017", State: "Disconnected", Type: "Unknown", LastUpdateTimestamp: "2019-08-14T09:37:22.4361867Z" }] }.

      Which is the same exception seen in 1895.

       

       

       

            Assignee:
            robert@mongodb.com Robert Stam
            Reporter:
            a.theodorsen@krohne.com Anders Theodorsen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: