For network errors on an application socket, SDAM currently says that a post-handshake network timeout should not mark the server Unknown:
If there is a network timeout on the connection after the handshake completes, the client MUST NOT mark the server Unknown. (A timeout may indicate a slow operation on the server, rather than an unavailable server.)
libmongoc currently distinguishes between read timeouts and write timeouts. A read timeout does not mark the server Unknown, but a write timeout does.
Changes in CDRIVER-3615 will remove this distinction to adhere to the specs, but the existing behavior may be worth considering.
I think this boils down to whether a write timeout means the socket is bad. I could see an argument made that a write timeout is more likely to indicate a bad socket than a read timeout, but here are some counterarguments:
- Long-running server operations could still cause a write timeout, e.g. an unsatisfied write concern, or an update with a filter requiring a collection scan.
- In general, if a user specifies a small socket timeout, we may not want to mark the server Unknown, since that would require repeatedly rediscovering the server.
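
To make the two behaviors concrete, here is a minimal sketch of the decision for a post-handshake error on an application socket. It is not libmongoc code: the enum names and `should_mark_unknown` are hypothetical and exist only for illustration. It also assumes that a non-timeout network error after the handshake marks the server Unknown under either policy, which is my reading of SDAM rather than something stated above.

```c
#include <stdbool.h>

/* Hypothetical classification of a post-handshake error on an
 * application socket. These names are illustrative, not libmongoc's. */
typedef enum {
   APP_ERR_READ_TIMEOUT,
   APP_ERR_WRITE_TIMEOUT,
   APP_ERR_OTHER_NETWORK /* e.g. connection reset; not a timeout */
} app_socket_error_t;

typedef enum {
   POLICY_SDAM,   /* no post-handshake timeout marks the server Unknown */
   POLICY_CURRENT /* libmongoc before CDRIVER-3615: write timeouts do */
} timeout_policy_t;

/* Returns true if the server should be marked Unknown.
 * Only covers errors that occur after the handshake completes. */
static bool
should_mark_unknown (timeout_policy_t policy, app_socket_error_t err)
{
   switch (err) {
   case APP_ERR_READ_TIMEOUT:
      /* Both policies agree: the server may just be slow. */
      return false;
   case APP_ERR_WRITE_TIMEOUT:
      /* The only case where the two policies differ. */
      return policy == POLICY_CURRENT;
   case APP_ERR_OTHER_NETWORK:
   default:
      /* Assumption: non-timeout network errors mark the server Unknown. */
      return true;
   }
}
```

The only case where the two policies disagree is the write timeout, so the question in this ticket reduces to what that single branch should return.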