Running a simple 3-node replset over localhost, the cluster idles and outputs the normal log messages for heartbeat connections being created and then shut down 30 seconds later.
Restarting the cluster with SSL enabled, the same result would be expected. Instead, a variety of SSL errors and socket exceptions appear. The cluster otherwise functions normally: heartbeat connections are successfully re-established, and normal replset operations proceed without issue.
The socket exceptions are mostly CONNECT_ERROR and CLOSED, and sometimes RECV_ERROR or SEND_ERROR.
The SSL errors are usually
- SSL23_GET_SERVER_HELLO
- could not negotiate SSL connection: EOF detected
and sometimes
- SSL Error ret when receiving: -1 err: 2 error:00000000:lib(0):func(0):reason(0)
- SSL Error ret when receiving: -1 err: 5 error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
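To compare how often each of the errors above shows up across nodes, something like the following could be used to tally them from the mongod logs. This is only a rough sketch: the function name `tally_errors` is made up for illustration, the patterns are just the substrings quoted in this report, and the exact layout of the surrounding log line is not assumed.

```python
from collections import Counter

# Substrings taken from the messages quoted in this report.
# More specific patterns come first so a line is counted under them
# before falling through to the generic socket exception codes.
PATTERNS = [
    "SSL23_GET_SERVER_HELLO",
    "could not negotiate SSL connection: EOF detected",
    "SSL Error ret when receiving",
    "CONNECT_ERROR",
    "RECV_ERROR",
    "SEND_ERROR",
    "CLOSED",
]


def tally_errors(lines):
    """Count log lines per error pattern (hypothetical helper).

    Each line is counted at most once, under the first pattern it contains.
    """
    counts = Counter()
    for line in lines:
        for pattern in PATTERNS:
            if pattern in line:
                counts[pattern] += 1
                break
    return counts
```

Passing an open log file (for example `tally_errors(open(path))`, where `path` is one node's log) gives a per-error count. A bare substring such as CLOSED may also match unrelated lines, so the numbers are only a rough guide.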
For testing, starting a large replset (e.g. 12 nodes) means there are a lot of connections, so the errors are easily observed in under a minute of idling. Smaller replsets still hit the issue, but take proportionately longer.
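A possible way to generate the mongod command lines for such a test is sketched below. Everything specific here is an assumption and not taken from the original setup: the replset name, base port, data directory, and PEM path are placeholders, and the function name `replset_commands` is made up. The sketch uses `--sslOnNormalPorts` with `--sslPEMKeyFile`; newer releases use `--sslMode requireSSL` in its place, so adjust for the version under test.

```python
def replset_commands(nodes=12, base_port=27017, name="ssltest",
                     pem_file="/path/to/server.pem",
                     data_root="/tmp/ssltest"):
    """Build one mongod argument list per replset member.

    All defaults are placeholders chosen for illustration.
    """
    commands = []
    for i in range(nodes):
        commands.append([
            "mongod",
            "--replSet", name,
            "--port", str(base_port + i),
            "--dbpath", f"{data_root}/node{i}",
            "--logpath", f"{data_root}/node{i}.log",
            # SSL on the regular port; assumed flag choice, see note above.
            "--sslOnNormalPorts",
            "--sslPEMKeyFile", pem_file,
        ])
    return commands


if __name__ == "__main__":
    for cmd in replset_commands():
        print(" ".join(cmd))
```

Each printed line can be started in its own shell (the data directories must exist first), after which the replset still needs to be initiated as usual. Leaving the cluster idle and watching the logs should then show the errors.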
This is fixed by SERVER-8864 (specifically commit 9ca2fb0), SERVER-10968, and SERVER-11806.
depends on
- SERVER-11806 Distinct SSL messages for distinct causes of closed connections (Closed)
- SERVER-8864 Allow mixed SSL and non-SSL connections (Closed)
- SERVER-10968 Improved SSL error handling (Closed)