Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: 6.10.0
Affects Version/s: 6.5.0, 6.6.2
Component/s: Connection Layer
Labels:
- external-user

Story Points: 0
Investigation Story Points: 5
Case:
Compass/DevTools Changes: Not Needed
Confidence Status: None

Documentation Changes: Not Needed
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

Our issue has been ongoing for several versions now, and we have not been able to upgrade beyond version 5.9.2 of the driver. I believe it is the same issue as ~~NODE-6166~~, which was closed without the cause being identified.

Both our `find` and `bulkWrite` operations show the same behavior. We only see the issue in production, where it occurs in roughly 1 in 3,000 of our updates; each of those updates involves 4 or 5 `bulkWrite` operations. We run under moderately heavy load: about 4,000 updates per second across all nodes, or about 40 per second on each node.
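As a back-of-the-envelope check on those figures, the sketch below estimates how often a hang would be expected. It assumes hangs are spread evenly across updates and nodes, which is not confirmed here; the variable names are illustrative only.

```typescript
// Figures taken from the description above; names are illustrative.
const updatesPerSecondCluster = 4000; // all nodes combined
const updatesPerSecondPerNode = 40;   // a single node
const updatesPerHang = 3000;          // roughly 1 hang per 3,000 updates

// Node count implied by the two throughput figures.
const impliedNodeCount = updatesPerSecondCluster / updatesPerSecondPerNode;

// Expected hung updates per second across the whole deployment.
const hangsPerSecondCluster = updatesPerSecondCluster / updatesPerHang;

// Expected time between hangs on any one node, in seconds.
const secondsBetweenHangsPerNode = updatesPerHang / updatesPerSecondPerNode;

console.log({ impliedNodeCount, hangsPerSecondCluster, secondsBetweenHangsPerNode });
```

Under that assumption the numbers work out to roughly 1.3 hung updates per second across the deployment, or about one every 75 seconds on a given node, which fits with the problem being rare per update yet easy to hit under load.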

Seemingly at random, those operations never return, yet there is no active connection to the db server. I have inspected the driver internals, and a connection from the connection pool is in use for each of the updates that hang. I believe the problem must be in reading the response from the socket, but the driver code has changed so much since 5.9.2 that it is very difficult to know what the cause could be.
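One way to surface operations that never return is an application-side watchdog around each driver call. The following is only a minimal sketch: `withWatchdog` is a hypothetical helper, not part of the driver, and it does not cancel the underlying operation or release the pooled connection. It only turns a silent hang into a rejection that can be logged.

```typescript
// Hypothetical helper (not a driver API): rejects if `operation` has not
// settled within `timeoutMs`. The underlying operation is NOT cancelled.
function withWatchdog<T>(
  label: string,
  operation: Promise<T>,
  timeoutMs: number,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(label + " did not settle within " + timeoutMs + " ms"));
    }, timeoutMs);

    operation.then(
      (value) => {
        clearTimeout(timer); // settled in time: stop the watchdog
        resolve(value);
      },
      (error) => {
        clearTimeout(timer); // failed in time: pass the original error through
        reject(error);
      },
    );
  });
}

// Illustrative usage, assuming `collection` and `ops` exist in the caller:
//   await withWatchdog("bulkWrite", collection.bulkWrite(ops), 30000);
```

Logging the label and a timestamp whenever the watchdog fires would give a record of which operations hung and when, which could then be lined up against connection pool state.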

We are connected to Atlas with a 3-replica cluster. Server monitoring shows no unusual activity. Although the hangs are random and rare relative to the number of updates, they are easy to reproduce when making many updates. I have not, however, been able to reproduce them locally with simulated load testing.

The server is running v5.

Related to: NODE-6166 Write Ops Are Persisted But Sometimes Driver Function Does Not Return (Closed)

Assignee: Bailey Pearson
Reporter: Garr Godfrey
Reviewers: None
Votes: 0
Watchers: 8

Created: Sep 11 2024 06:49:06 AM UTC
Updated: Oct 22 2024 02:50:47 PM UTC
Resolved: Oct 22 2024 02:50:47 PM UTC
