  C# Driver / CSHARP-204

Concurrency Issue on failure recovery

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 1.1
    • Affects Version/s: 1.0
    • Component/s: None
    • Environment:
      Windows 2003, IIS, connected to a single mongos instance
    • Fully Compatible

      Note: using git revision 1a730b for line numbers.

      While running high-concurrency load tests (~2000 connections) against our web service, which calls MongoDB, I ran into a concurrency issue that I narrowed down to an interaction between MongoServer.Disconnect() and MongoServer.AcquireConnection(MongoDatabase database, bool slaveOk).

      The issue arises when an exception is thrown within a MongoConnection after the GetConectionPoolEndPoint call (MongoServer.cs:798) but before the connectionPool.AcquireConnection(database) call (MongoServer.cs:807) in AcquireConnection(database, endPoint) (MongoServer.cs:799).

      Because MongoConnection.HandleException() forces a server disconnect, the connection pool is already closed by the time connectionPool.AcquireConnection(database) is called, and the call eventually fails with an InvalidOperationException("Attempt to get a connection from a closed connection pool") (MongoConnectionPool.cs:103).
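
      As a rough illustration of that interleaving, here is a single-threaded replay of the three steps. ReplayPool and RaceReplay are hypothetical stand-ins written for this sketch, not the driver's MongoConnectionPool or MongoServer; only the exception type and message are taken from the report above.

```csharp
using System;

// Hypothetical stand-in for a connection pool (assumption: not driver code).
class ReplayPool
{
    private bool closed;

    public void Close()
    {
        closed = true;
    }

    public string AcquireConnection()
    {
        if (closed)
        {
            throw new InvalidOperationException(
                "Attempt to get a connection from a closed connection pool");
        }
        return "connection";
    }
}

static class RaceReplay
{
    // Replays the bad ordering and returns the exception message, or null if nothing failed.
    public static string Run()
    {
        // Step 1: the caller looks up the pool it intends to use.
        ReplayPool pool = new ReplayPool();

        // Step 2: meanwhile, a failure elsewhere forces a disconnect, which closes that pool.
        pool.Close();

        // Step 3: the caller, still holding the stale reference, tries to acquire a connection.
        try
        {
            pool.AcquireConnection();
            return null;
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }
}
```

      In the real scenario step 2 runs on a different thread, which is why the failure only shows up under heavy load.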

      As this is a concurrency issue that I've only seen in a very specific test, I have been unable to write a reliable test for it (we have to saturate the data link with several Gbps of data to make it fail semi-consistently).

      Attached is the patch that I've implemented to fix this issue. Since lock() is reentrant, the patch simply wraps both GetConectionPoolEndPoint and AcquireConnection in a lock(serverLock), so a disconnect can no longer happen between the two calls. So far this code has performed well, but it needs to be thoroughly reviewed to make sure my simple fix will not cause deadlocks in some scenario I have not taken into account.
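
      The shape of the fix can be sketched roughly as follows. This is not the attached patch: SketchPool, SketchServer and GetOpenPool are hypothetical names, and the reconnect-on-demand behaviour in GetOpenPool is an assumption made to keep the example self-contained. The point is only that the pool lookup and the acquire happen under one lock(serverLock), and that the inner lock on the same object is re-entered without blocking.

```csharp
using System;

// Hypothetical stand-in for a connection pool (assumption: not driver code).
class SketchPool
{
    private bool closed;

    public void Close()
    {
        closed = true;
    }

    public string AcquireConnection()
    {
        if (closed)
        {
            throw new InvalidOperationException(
                "Attempt to get a connection from a closed connection pool");
        }
        return "connection";
    }
}

// Hypothetical stand-in for the server object (assumption: not driver code).
class SketchServer
{
    private readonly object serverLock = new object();
    private SketchPool pool; // null while disconnected

    public void Disconnect()
    {
        lock (serverLock)
        {
            if (pool != null)
            {
                pool.Close();
                pool = null;
            }
        }
    }

    private SketchPool GetOpenPool()
    {
        // Re-entered by the same thread when called from AcquireConnection.
        lock (serverLock)
        {
            if (pool == null)
            {
                pool = new SketchPool(); // assumption: reconnect on demand
            }
            return pool;
        }
    }

    public string AcquireConnection()
    {
        // Lookup and acquire form one critical section, so Disconnect
        // cannot close the pool between the two steps.
        lock (serverLock)
        {
            SketchPool p = GetOpenPool();
            return p.AcquireConnection();
        }
    }
}
```

      One thing a review would need to weigh: holding serverLock while the pool hands out a connection serializes all callers on that lock, so anything that blocks inside the acquire also blocks Disconnect.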

      This bug may also affect earlier versions.

            Assignee: Robert Stam (robert@mongodb.com)
            Reporter: Miguel Pilar (mpilar)
