-
Type: Bug
-
Resolution: Gone away
-
Priority: Major - P3
-
None
-
Affects Version/s: 1.4.0
-
None
-
Environment:Local testing on:
macOS 10.15.7
go1.16.3 darwin/amd64
tested against mongo docker image https://github.com/docker-library/mongo/blob/9db9e3d4704f5d963e424a3894fa740b8ce4ea70/4.4/Dockerfile
We see delays when we upgrade our mongo driver from `v1.3.2` to `v1.4.0`.
Attached is the `main.go` used to generate the following output.
It will connect a client, prepare the test space, then perform 3 read (`find`) queries and 3 write (`insert`) queries with and without the failCommand enabled. Below is a snippet of the `find` with failCommand enabled.
Output with `v1.3.2`:
```
== Testing Find queries failpoint ENABLED ==
- Time elapsed to do: connect client: 38.833µs
- Time elapsed to do: clear failpoint: 7.497043ms
- Time elapsed to do: populate table: 995.285µs
- Time elapsed to do: set failpoint: 890.543µs
- Time elapsed to do: find query 0: 502.632805ms
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: find query 1: 998.847787ms
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: find query 2: 999.679731ms
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: clear failpoint: 500.759202ms
- Time elapsed to do: clear failpoint: 1.258652ms
- Time elapsed to do: clean table post test: 1.381183ms
- Time elapsed to do: disconnect client: 1.103236ms
```
First: ~500m
Second/third: ~1s
Output with `v1.4.0`:
```
== Testing Find queries failpoint ENABLED ==
- Time elapsed to do: connect client: 41.041µs
- Time elapsed to do: clear failpoint: 5.948902ms
- Time elapsed to do: populate table: 946.695µs
- Time elapsed to do: set failpoint: 849.93µs
- Time elapsed to do: find query 0: 9.996635035s
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: find query 1: 20.015427043s
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: find query 2: 19.987305077s
Encountered expected error: (mongo.CommandError): (NotWritablePrimary) Failing command due to 'failCommand' failpoint - Time elapsed to do: clear failpoint: 10.022557671s
- Time elapsed to do: clear failpoint: 1.460051ms
- Time elapsed to do: clean table post test: 1.456988ms
- Time elapsed to do: disconnect client: 1.322098ms
```
First: ~10s
Second/third: ~20s
The query latency jumps x10. The time to clear the failpoint does as well.
- Why does this happen?
- Why does it appear to double for every query after the first?
The same pattern appears for the write query.
Impact: As a result we cannot depend on our unit tests as all queries with failCommand have this delay.