Type: Task
Resolution: Unresolved
Priority: Unknown
Component/s: CSOT
Summary
What is the problem or use case, what are we trying to achieve?
The "Non-tailable cursor lifetime remaining timeoutMS applied to getMore if timeoutMode is unset" test flakes fairly regularly in Node with the following error:
AssertionError: Expected event count mismatch, expected [ 'commandStartedEvent', 'commandStartedEvent' ] but got [ 'CommandStartedEvent' ]
It turns out that connection establishment takes roughly 40ms. Because that time counts against timeoutMS, the `find` times out instead of the `getMore`. As a result, the test still gets a timeout error, but only one CommandStartedEvent is emitted instead of the expected two, so the test fails.
I modified the test locally to pre-populate the pool with a connection and could no longer reproduce the failure. Increasing the timeouts so that an unusually slow connection establishment doesn't cause the `find` to fail also fixes the test (timeoutMS=200, blockTimeMS=120 works fine for me locally).
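To make the timing argument concrete, here is a rough model of the budget, not driver code. It assumes the fail point delays both the `find` and the `getMore` by blockTimeMS and that connection establishment is charged to the same timeoutMS. The names `Scenario`, `Outcome`, and `simulate` are made up for illustration, and the 50/30 values in the "tight" case are hypothetical, not the values currently in the spec test.

```typescript
// Simplified timing model of the flaky test. All names are hypothetical.
interface Scenario {
  timeoutMS: number;    // total budget shared by find and getMore
  blockTimeMS: number;  // delay the fail point adds to each command (assumed)
  connectionMS: number; // time spent establishing a connection before find
}

interface Outcome {
  commandsStarted: string[]; // commands that would emit a CommandStartedEvent
  timedOutIn: string | null; // command during which the budget ran out
}

function simulate({ timeoutMS, blockTimeMS, connectionMS }: Scenario): Outcome {
  const commandsStarted: string[] = [];
  let elapsed = connectionMS; // connection setup counts against the budget

  for (const command of ["find", "getMore"]) {
    if (elapsed >= timeoutMS) {
      // Budget exhausted before the command could even be sent.
      return { commandsStarted, timedOutIn: command };
    }
    commandsStarted.push(command);
    elapsed += blockTimeMS; // the fail point blocks the command
    if (elapsed > timeoutMS) {
      return { commandsStarted, timedOutIn: command };
    }
  }
  return { commandsStarted, timedOutIn: null };
}

// Hypothetical tight budget with a cold pool: the find itself times out.
const tightCold = simulate({ timeoutMS: 50, blockTimeMS: 30, connectionMS: 40 });
// Same budget with a pre-populated pool: the getMore times out, as intended.
const tightWarm = simulate({ timeoutMS: 50, blockTimeMS: 30, connectionMS: 0 });
// Values from this ticket with a 40ms connection: the getMore times out.
const relaxedCold = simulate({ timeoutMS: 200, blockTimeMS: 120, connectionMS: 40 });

console.log(tightCold, tightWarm, relaxedCold);
```

Under this model, timeoutMS=200 with blockTimeMS=120 leaves 200 - 120 = 80ms of headroom for connection establishment before the `find` starts timing out, which is why a 40ms connection no longer breaks the test.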
Motivation
Who is the affected end user?
Driver authors who have implemented CSOT.
How does this affect the end user?
Driver authors might see flaky CI.
How likely is it that this problem or use case will occur?
At least in Node, fairly common.
If the problem does occur, what are the consequences and how severe are they?
Red CI for no reason.
Is this issue urgent?
No.
Is this ticket required by a downstream team?
No.
Is this ticket only for tests?
Yes.
Acceptance Criteria
Fix the flaky test, either by pre-populating the pool with a connection or increasing the test timeouts.