-
Type: Bug
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: 5.0.2, 5.0.3
-
Component/s: None
-
Server Programmability
-
ALL
-
-
(copied to CRM)
Since the upgrade from 4.4.4 to 5.0.3, we have observed a steadily increasing number of user assertions in the metrics. On debug level, user assertions with the error message "HostUnreachable: Connection closed by peer" became visible.
As a cause for these user assertions we identified our Kubernetes readiness probe. It is successfully executed every 10 and opens a tcp socket on the mongodb port.
Further analysis showed that this probe also generates error messages about "HostUnreachable" in 4.4.4, but these messages do not result in user assertions.
This is the relevant log excerpt from 4.4.4:
--->> {"t":{"$date":"2021-10-12T14:38:47.811+00:00"},"s":"D2", "c":"NETWORK", "id":22986, "ctx":"conn8778","msg":"Session from remote encountered a network error during SourceMessage","attr":{"remote":"172.23.43.233:44014","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Connection closed by peer"}} {"t":{"$date":"2021-10-12T14:38:47.811+00:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn8778","msg":"Connection ended","attr":{"remote":"172.23.43.233:44014","connectionId":8778,"connectionCount":24}}
And this is the excerpt from 5.0.3:
{"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D4", "c":"NETWORK", "id":4898000, "ctx":"listener","msg":"Setting initial ServiceExecutor context for client","attr":{"client":"conn3385","threadingModel":"dedicated","canUseReserved":false}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D3", "c":"-", "id":5127801, "ctx":"thread3756","msg":"Setting the Client","attr":{"client":"conn3385"}} -->> {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D2", "c":"NETWORK", "id":22986, "ctx":"conn3385","msg":"Session from remote encountered a network error during SourceMessage","attr":{"remote":"172.23.38.209:53800","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Connection closed by peer"}}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D1", "c":"-", "id":23074, "ctx":"conn3385","msg":"User assertion","attr":{"error":"HostUnreachable: Connection closed by peer","file":"src/mongo/transport/service_state_machine.cpp","line":394}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D1", "c":"-", "id":23074, "ctx":"conn3385","msg":"User assertion","attr":{"error":"HostUnreachable: Connection closed by peer","file":"src/mongo/transport/service_state_machine.cpp","line":536}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D2", "c":"NETWORK", "id":5763901, "ctx":"conn3385","msg":"Terminating session due to error","attr":{"error":{"code":6,"codeName":"HostUnreachable","errmsg":"Connection closed by peer"}}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D2", "c":"NETWORK", "id":5127900, "ctx":"conn3385","msg":"Ending session","attr":{"error":{"code":6,"codeName":"HostUnreachable","errmsg":"Connection closed by peer"}}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D4", "c":"NETWORK", "id":4898001, "ctx":"conn3385","msg":"Resetting ServiceExecutor context for client","attr":{"client":"conn3385","threadingModel":"dedicated","canUseReserved":false}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D4", "c":"NETWORK", "id":4898001, "ctx":"conn3385","msg":"Resetting ServiceExecutor context for client","attr":{"client":"conn3385","threadingModel":"dedicated","canUseReserved":false}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn3385","msg":"Connection ended","attr":{"remote":"172.23.38.209:53800","uuid":"3402de9f-26f6-4a35-aff3-6806a83695c1","connectionId":3385,"connectionCount":30}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D3", "c":"-", "id":5127803, "ctx":"conn3385","msg":"Released the Client","attr":{"client":"conn3385"}} {"t":{"$date":"2021-10-12T14:22:34.700+00:00"},"s":"D3", "c":"-", "id":5127803, "ctx":"conn3385","msg":"Released the Client","attr":{"client":"conn3385"}}
Please note how the marked lines appear in both versions but only in 5.0.3 there are the two additional user assertions.
As an attempt to workaround the problem we changed the tcp socket probe to the following command:
mongo -u user -p password --authenticationDatabase admin --quiet --eval 'db.adminCommand("ping").ok'
However, the user assertions persisted.
Our questions:
Why do both commands each return successfully but still generate a DEBUG message indicating an error?
Why is the DEBUG message followed by a user assertion in version 5.0.3?
- is related to
-
SERVER-50867 Roll back ServiceStateMachine changes temporarily
- Closed
- related to
-
SERVER-73561 Consider exception origin when counting assertions
- Closed