- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Connections
We have been migrating our applications from the community mgo driver to the official mongo driver.
Some of our higher-traffic applications (100-200 rps) have started to experience sporadic issues in production. We are also able to reproduce the same problem under a load test in our staging environment.
After much debugging, it seems the driver does not have a connection available to the database, so it attempts to open a new connection. However, we are seeing that operation take upwards of 3 seconds in some cases (we have pprof outputs showing 1.6+ seconds). The issue is isolated to individual pods, with no correlation to the connecting AZ, etc.
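For reference, one way to observe when the driver opens new connections alongside the pprof data would be something like the following. This is only a minimal sketch assuming the event.PoolMonitor hook exposed by newer driver versions; the hook and the URI below are assumptions for illustration, not our production code:

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/event"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	// Log every pool event so slow connection establishment shows up in the
	// application logs next to the pprof captures. Assumes event.PoolMonitor
	// is available in the driver version in use.
	monitor := &event.PoolMonitor{
		Event: func(e *event.PoolEvent) {
			log.Printf("pool event: type=%s address=%s", e.Type, e.Address)
		},
	}

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	client, err := mongo.Connect(ctx, options.Client().
		ApplyURI("mongodb://localhost:27017"). // placeholder URI
		SetPoolMonitor(monitor))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())
}
```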
To counter the issue we tried to specify minPoolSize, which was added in v1.1.0; however, if this option is specified the driver doesn't actually start (LINK ISSUE HERE). We also tried building from master (22646d953d8106e567b1da9aab98b627a2fb204f): the driver is able to connect to mongo but then panics (LINK ISSUE HERE).
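For clarity, a minimal sketch of how minPoolSize can be specified with the v1.1.0 options API (the URI and pool sizes below are placeholders, not our production values):

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// minPoolSize can be set as a URI option or via the options builder.
	// When it is specified, the driver fails to start for us (see above).
	opts := options.Client().
		ApplyURI("mongodb://localhost:27017/?minPoolSize=10"). // placeholder URI
		SetMinPoolSize(10).                                    // option added in v1.1.0
		SetMaxPoolSize(100)

	client, err := mongo.Connect(ctx, opts)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())
}
```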
Here you can see the case I'm trying to describe. All of the time seems to be taken in sha1.blockAMD64:
A pprof profile taken from one of the other pods that didn't have any issues at the same time:
It doesn't seem like I can attach the pprof profiles here, but if you need access to them I can email them or send them over a direct message.
We are connecting to mongo in the following way:
https://gist.github.com/BradErz/a947198bddf537532190fdb5ea015af3#file-mongoconnection-go
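For anyone without access to the gist, the connection setup follows roughly this shape; a simplified sketch rather than the exact gist contents (the URI is a placeholder):

```go
package main

import (
	"context"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
	"go.mongodb.org/mongo-driver/mongo/readpref"
)

// newMongoClient connects once at startup and verifies the deployment is
// reachable before the client is handed to the rest of the application.
func newMongoClient(uri string) (*mongo.Client, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	client, err := mongo.Connect(ctx, options.Client().ApplyURI(uri))
	if err != nil {
		return nil, err
	}
	if err := client.Ping(ctx, readpref.Primary()); err != nil {
		return nil, err
	}
	return client, nil
}

func main() {
	client, err := newMongoClient("mongodb://localhost:27017") // placeholder URI
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(context.Background())
}
```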
This is the code handling the query:
https://gist.github.com/BradErz/a947198bddf537532190fdb5ea015af3#file-persistence-go
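Likewise, the query path is essentially a single lookup with a per-request timeout; a simplified sketch rather than the exact gist contents (the database, collection, struct, and timeout below are placeholders):

```go
package main

import (
	"context"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

// user is a placeholder document shape for this sketch.
type user struct {
	ID   string `bson:"_id"`
	Name string `bson:"name"`
}

// findUser runs a single FindOne with a per-request timeout; this is the kind
// of call where we see the occasional multi-second stall when the driver has
// to open a new connection.
func findUser(client *mongo.Client, id string) (*user, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	var u user
	err := client.Database("appdb").Collection("users"). // placeholder names
		FindOne(ctx, bson.M{"_id": id}).
		Decode(&u)
	if err != nil {
		return nil, err
	}
	return &u, nil
}
```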
If we can help by providing any more information or debugging, please let me know.
- depends on: GODRIVER-1298 Panic in topology/pool.go:416 (Closed)