GODRIVER-2525 (Go Driver)

Occasional handshake error when using mongodb+srv with mongos pool

    • Type: Bug
    • Resolution: Gone away
    • Priority: Unknown
    • Affects Version/s: 1.9.1
    • Component/s: Connections, Error Handling

      Summary

      About once a day, we see an error like this:

          connection() error occurred during connection handshake: dial tcp: lookup foo-bar-mongos.svc.cluster.local on 169.254.25.10:53: no such host

      We are using version 1.9.1 of the MongoDB Go driver with the following setup:

      • sharded cluster
      • mongos instances run as an auto-scaled pool
      • access to mongos is via an SRV record (mongodb+srv)

      Because these errors are relatively rare, we assume they occur when one of the mongos instances is either starting up or shutting down.

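      Our guess is that the issue stems from a race between the SRV and A records, possibly compounded by DNS caching: for example, the SRV record may list a mongos host whose A record does not resolve yet, or no longer resolves. This seems like the kind of issue that is better handled inside the driver itself.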

      At this time we cannot propose a simple way to reproduce (WTR) this issue. If we can help diagnose it, for example by enabling verbose logging and sending you the logs, please let us know how.

            Assignee:
            benji.rewis@mongodb.com Benji Rewis (Inactive)
            Reporter:
            petr.ivanov.s@gmail.com Peter Ivanov
            Votes:
            0
            Watchers:
            6

              Created:
              Updated:
              Resolved: