Type: Improvement
Resolution: Unresolved
Priority: Unknown
Component/s: Server Selection
Summary
This is a follow-up ticket to DRIVERS-1998. We need to clarify the expected behavior when a pausable pool exception is raised for a pinned server.
Motivation
Who is the affected end user?
Drivers
How likely is it that this problem or use case will occur?
It is always reproducible in a sharded transaction where a pinned server is involved. Steps:
- A valid server is pinned by the previous operation.
- A heartbeat then fails and the pool is paused. The server is not unpinned, because heartbeats do not participate in unpinning.
- The next operation in the transaction sees the pinned server and skips server selection, then tries to acquire a connection and gets a pausable pool exception with no indication of why it was triggered.
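The steps above can be sketched as a small toy model. This is only an illustration under stated assumptions, not any driver's actual implementation: `Pool`, `Server`, `Session`, `PoolPausedError`, `select_server`, `run_operation`, and the `mongos-a`/`mongos-b` addresses are all hypothetical names, and a paused pool stands in for a server that a heartbeat failure has made unselectable.

```python
class PoolPausedError(Exception):
    """Hypothetical stand-in for a driver's pausable pool exception."""


class Pool:
    def __init__(self):
        self.paused = False

    def clear(self):
        # In this toy model, clearing the pool simply pauses it.
        self.paused = True

    def check_out(self):
        if self.paused:
            raise PoolPausedError("connection pool is paused")
        return "connection"


class Server:
    def __init__(self, address):
        self.address = address
        self.pool = Pool()


class Session:
    def __init__(self):
        self.pinned_server = None  # set by the first operation in the transaction


def select_server(servers):
    # Toy server selection: skip servers whose pool is paused.
    for server in servers:
        if not server.pool.paused:
            return server
    raise RuntimeError("no selectable server")


def run_operation(session, servers):
    # A pinned server bypasses server selection entirely.
    server = session.pinned_server or select_server(servers)
    session.pinned_server = server
    return server.pool.check_out()


def simulate():
    servers = [Server("mongos-a:27017"), Server("mongos-b:27017")]
    session = Session()
    events = []

    # Step 1: the first operation selects and pins a valid server.
    run_operation(session, servers)
    events.append("pinned " + session.pinned_server.address)

    # Step 2: a heartbeat fails; the pool is paused but the pin stays.
    servers[0].pool.clear()

    # Step 3: the next operation reuses the pin and hits the paused pool.
    try:
        run_operation(session, servers)
        events.append("ok")
    except PoolPausedError as exc:
        events.append("error: " + str(exc))
    return events


if __name__ == "__main__":
    for event in simulate():
        print(event)
```

In this sketch the second server is still selectable, but the pin means `select_server` is never consulted, which is the gap this ticket asks the specs to address.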
The same behavior, reusing a previously selected server, occurs with cursors. We pass the server selected for the initial operation into the cursor and do not select any other server for internal cursor operations such as GetMore or KillCursor (regardless of wire protocol).
Our SDAM spec does not consider this case, while the CMAP spec implicitly assumes that every operation (regardless of logic such as pinning) goes through the server selection process. We should clarify what behavior is expected here.
Is this issue urgent?
no
Is this ticket required by a downstream team?
no
Is this ticket only for tests?
no
Acceptance Criteria
- Un-skip the following tests from DRIVERS-746:
  - source/transactions/tests/unified/retryable-abort-handshake.yml
  - source/transactions/tests/unified/retryable-commit-handshake.yml
- is related to: DRIVERS-746 Drivers should retry operations if connection handshake fails (Implementing)
- related to: DRIVERS-1998 Add a reason to connection pool Clear method (Backlog)