- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 2.4.6, 2.4.7, 2.5.3
- Component/s: Networking
- None
- Fully Compatible
- Linux
ISSUE SUMMARY
Users can see a rare, intermittent server crash due to a race condition in the OpenSSL interface. This affects only deployments running with SSL enabled. The crash can manifest in several ways, but the most common signature is a segmentation fault (signal 11) or an abort (signal 6) reported in the mongod logs, with a backtrace that references libcrypto.
USER IMPACT
With OpenSSL versions earlier than 1.x, the server exhibits random, intermittent crashes in the OpenSSL interface.
SOLUTION
The crashes were caused by repeated registration and unregistration of an OpenSSL callback function. The registration is now performed only once, and the callback is never unregistered.
WORKAROUNDS
Upgrade to OpenSSL 1.x.
PATCHES
Production release v2.4.9 contains the fix for this issue, and production release v2.6.0 will contain the fix as well.
Original Description
The calls to register the OpenSSL callbacks in SSLThreadInfo() and to unregister the callback in ~SSLThreadInfo() are misplaced: the callback is a static global, whereas SSLThreadInfo objects are per-thread. The callbacks should be registered once, early (before any possible SSL activity), and never need to be unregistered (or at least should not be unregistered on every thread exit, which has been shown to cause crashes due to duplicate frees). Removing the unregister call from the destructor addresses the second point and solves the immediate problem. However, there may be latent issues from not registering the callback until the first SSLThreadInfo object is constructed, so I think the callback registration should probably be moved somewhere else as well.
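The register-once, never-unregister pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual server patch: it does not call OpenSSL at all. The global slot `g_idCallback`, the counter `g_registrations`, and the helpers `currentThreadId`, `registerCallbacksOnce`, and `runWorkers` are hypothetical stand-ins for the library's process-wide callback state. Only the class name `SSLThreadInfo` comes from the description.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a process-wide callback slot owned by the
// crypto library. In the real server this would be set via OpenSSL's API.
using IdCallback = unsigned long (*)();
std::atomic<IdCallback> g_idCallback{nullptr};

// Counts how many times registration actually ran (for illustration only).
std::atomic<int> g_registrations{0};

// Hypothetical callback: returns a numeric id for the calling thread.
unsigned long currentThreadId() {
    return static_cast<unsigned long>(
        std::hash<std::thread::id>{}(std::this_thread::get_id()));
}

// Registers the global callback exactly once, no matter how many threads
// call this function or how often.
void registerCallbacksOnce() {
    static std::once_flag once;
    std::call_once(once, [] {
        g_idCallback.store(&currentThreadId);
        g_registrations.fetch_add(1);
    });
}

// Per-thread object. The constructor only ensures registration happened;
// the destructor deliberately does NOT unregister the global callback,
// because other threads may still depend on it.
class SSLThreadInfo {
public:
    SSLThreadInfo() { registerCallbacksOnce(); }
    ~SSLThreadInfo() = default;
};

// Starts n threads that each create and destroy an SSLThreadInfo, then
// returns the number of registrations that were performed.
int runWorkers(int n) {
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([] {
            SSLThreadInfo info;
            (void)info;
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    return g_registrations.load();
}
```

In this sketch, registration is still triggered lazily by the first `SSLThreadInfo` constructor. To address the latent issue raised in the description, `registerCallbacksOnce()` could instead be called from startup code before any SSL activity; the `std::call_once` guard would then make the later constructor calls no-ops.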