  1. Node.js Driver
  2. NODE-978

Unable to connect to replicaset when secondary offline

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.2.26
    • Affects Version/s: 2.2.25
    • Component/s: None
    • Environment:
      Node.JS 0.10.26, 0.10.48
      RHEL 6.8

      Hi,

      We recently upgraded our MongoDB Node.js driver as our MongoDB server was upgraded to 3.2.

      Mongodb 2.0.34 --> 2.2.24
      Mongodb-core 1.2.0 --> 2.1.8

      Our MongoDB deployment is a 3-node replica set using self-signed SSL certificates.
      Since upgrading the Node.js driver, our application exits with an uncaught exception whenever one of the secondary nodes goes offline.
      We are also unable to start the application if one of the secondary nodes is already offline, because the same uncaught exception is thrown.

      The error is:

      events.js:72
              throw er; // Unhandled 'error' event
                    ^
      Error: socket hang up
          at SecurePair.error (tls.js:1060:23)
          at EncryptedStream.CryptoStream._done (tls.js:752:22)
          at EncryptedStream.read [as _read] (tls.js:544:12)
          at EncryptedStream.Readable.read (_stream_readable.js:341:10)
          at CleartextStream.onCryptoStreamFinish (tls.js:353:47)
          at CleartextStream.g (events.js:180:16)
          at CleartextStream.emit (events.js:92:17)
          at finishMaybe (_stream_writable.js:360:12)
          at endWritable (_stream_writable.js:367:3)
          at CleartextStream.Writable.end (_stream_writable.js:345:5)
          at CleartextStream.CryptoStream.end (tls.js:690:31)
          at Connection.destroy (/xyz/node_modules/mongodb-core/lib/connection/connection.js:494:21)
          at null.<anonymous> (/xyz/install/node_modules/mongodb-core/lib/connection/pool.js:238:10)
          at g (events.js:180:16)
          at emit (events.js:98:17)
          at CleartextStream.<anonymous> (/xyz/node_modules/mongodb-core/lib/connection/connection.js:177:49)
          at CleartextStream.g (events.js:180:16)
          at CleartextStream.emit (events.js:95:17)
          at Socket.onerror (tls.js:1501:17)
          at Socket.emit (events.js:117:20)
          at net.js:441:14
          at process._tickCallback (node.js:458:13)
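
The first frame of the trace (`throw er; // Unhandled 'error' event`) is Node's standard behaviour when an `EventEmitter` emits `'error'` with no listener attached. A minimal sketch of that mechanism, using a plain `EventEmitter` as a stand-in for the TLS stream (`fakeSocket` is a hypothetical name, not the driver's real socket):

```javascript
const { EventEmitter } = require('events');

// Stand-in for the TLS stream; this is NOT the driver's real socket object.
const fakeSocket = new EventEmitter();

// Emitting 'error' with no listener attached makes emit() throw the error.
let thrown = null;
try {
  fakeSocket.emit('error', new Error('socket hang up'));
} catch (err) {
  thrown = err;
}
console.log(thrown.message); // socket hang up

// With a listener attached, the same emit is delivered instead of thrown.
let handled = null;
fakeSocket.on('error', (err) => { handled = err; });
fakeSocket.emit('error', new Error('socket hang up'));
console.log(handled.message); // socket hang up
```

The throw happens synchronously inside `emit()`, which would explain why a try/catch around the call that triggers the emit is enough to stop the process from exiting.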
      

      If I remove the offline server from the client's connection string, the driver obtains the list of servers in the replica set from the master node. It then tries to connect to the offline server, because it is still listed as a member of the replica set, and exits with the same error.

      Having a look at the code:
      lib/connection/pool.js

      function connectionFailureHandler(self, event) {
        return function(err) {
          if (this._connectionFailHandled) return;
          this._connectionFailHandled = true;
          // Destroy the connection
          this.destroy(); // <-- this call leads to the exception
      
          // Remove the connection
      

      connectionFailureHandler gets triggered with event = 'error' and err =

      { [MongoError: connect ECONNREFUSED] name: 'MongoError', message: 'connect ECONNREFUSED' }

      It will then try to destroy the connection, which leads to:

      lib/connection/connection.js

      Connection.prototype.destroy = function() {
        // Set the connections
        if(connectionAccounting) deleteConnection(this.id);
        if(this.connection) {
          this.connection.end(); // <-- throws here
          this.connection.destroy();
        }
      
        this.destroyed = true;
      }
      

      When this.connection.end() is called, it throws an exception that is never caught, which causes our application to exit.

      If I put this.connection.end() inside a try/catch block, our application connects correctly, although I am not sure whether this could lead to a lot of connection retries.
      I have also tried putting a try/catch around the this.destroy() call in connectionFailureHandler in pool.js, which also allows our application to connect correctly.
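
The first workaround can be sketched roughly as follows. This is only an illustration under assumptions, not the driver's code or the fix that shipped: `safeDestroy` and `FlakySocket` are hypothetical names, and the fake socket simply emits an unhandled `'error'` from `end()` to imitate the failure in the trace.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not part of mongodb-core): end and destroy a socket
// without letting a synchronous failure in end() escape to the caller.
function safeDestroy(socket) {
  if (!socket) return null;
  let caught = null;
  try {
    socket.end();
  } catch (err) {
    caught = err; // keep the error so the caller can log or inspect it
  }
  socket.destroy(); // still runs when end() failed
  return caught;
}

// Fake socket that imitates the report: end() emits 'error' while no
// listener is attached, so emit() throws synchronously.
class FlakySocket extends EventEmitter {
  constructor() {
    super();
    this.destroyed = false;
  }
  end() {
    this.emit('error', new Error('socket hang up'));
  }
  destroy() {
    this.destroyed = true;
  }
}

const socket = new FlakySocket();
const endError = safeDestroy(socket);
console.log(endError.message, socket.destroyed); // socket hang up true
```

Catching the error here only keeps the process alive and lets `destroy()` run; it says nothing about how often the pool will retry the offline member afterwards.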

      Thanks for any help you can provide,
      Geoff

            Assignee: Christian Amor Kvalheim (christkv)
            Reporter: Geoff Cummings (gcummings)
            Votes: 0
            Watchers: 3