ISSUE SUMMARY
Under very rare circumstances mongos may incorrectly report a write as successful. The bug can manifest in the unlikely event that the mongos reuses a previously-used connection from the shared pool which contains a stale writeback field. In this situation, mongos cannot guarantee the correct post-migration location of writes and thus may incorrectly report the write as successful. Since mongos outgoing connections are tied to incoming client connections, this can only occur in cases of high connection turnover and low latency. The bug is difficult to trigger, but has caused a lost write in one known case.
This race condition can only occur on the first occurrence of a writeback being queued for a shard. Once a writeback is queued, the connection is cached.
USER IMPACT
Affected Version: All versions of MongoDB prior to and including v2.4.8.
Conditions Required: Sharded cluster with balancing enabled and active.
Frequency: Extremely rare.
Root Cause: In certain cases, it is possible for the getLastError aggregation in mongos ClientInfo to not return the correct code to the writeback listener. We ignore any previous writebacks when reprocessing a write in the writeback listener, but incorrectly do not append the other getLastError fields contained in "res" (the getLastError result from the shard).
In short, when retrying a write via the writeback listener, it is possible for the writeback listener to miss the special stale config code it needs to continue retrying.
SOLUTION
Always aggregate results from getLastError even in the presence of previous writebacks.
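The effect of this change can be sketched in isolation. The following is a simplified model, not the actual mongos code: `Reply`, `aggregateBeforeFix`, `aggregateAfterFix`, `listenerShouldRetry`, the shard name, and the `kStaleConfigCode` value are all hypothetical stand-ins, and the real implementation works on BSONObj results inside ClientInfo. It assumes the fix amounts to copying the shard's getLastError fields into the aggregated result even when previous writebacks are being ignored.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a getLastError reply (the real code uses BSONObj).
using Reply = std::map<std::string, std::string>;

// Placeholder value; the real stale config code is not reproduced here.
const std::string kStaleConfigCode = "STALE_CONFIG_PLACEHOLDER";

// Pre-fix shape, modelled on the snippet in the original description.
Reply aggregateBeforeFix(const Reply& res, bool hasWritebacks,
                         bool fromWriteBackListener) {
    Reply result;
    if (hasWritebacks) {
        if (fromWriteBackListener) {
            // Previous writebacks are ignored here, and so is everything
            // in res, including any stale config code.
        }
        // The remaining branch is elided in the ticket and is not modelled.
    } else {
        result["singleShard"] = "shard0000";  // placeholder shard name
        result.insert(res.begin(), res.end());
    }
    return result;
}

// Assumed post-fix shape: the shard's fields are always carried over.
Reply aggregateAfterFix(const Reply& res, bool hasWritebacks,
                        bool fromWriteBackListener) {
    Reply result;
    if (hasWritebacks) {
        if (fromWriteBackListener) {
            // Still ignore previous writebacks, but keep the shard's fields.
            result.insert(res.begin(), res.end());
        }
        // The remaining branch is elided in the ticket and is not modelled.
    } else {
        result["singleShard"] = "shard0000";  // placeholder shard name
        result.insert(res.begin(), res.end());
    }
    return result;
}

// Hypothetical listener check: retry only if the stale config code is visible.
bool listenerShouldRetry(const Reply& aggregated) {
    auto it = aggregated.find("code");
    return it != aggregated.end() && it->second == kStaleConfigCode;
}
```

In this model, a shard reply carrying the stale config code makes `listenerShouldRetry` return false after `aggregateBeforeFix` when writebacks are present and the call comes from the listener, so the retry stops early. With `aggregateAfterFix` the code is preserved and the retry continues.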
WORKAROUNDS
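Temporarily disable the balancer (for example, by running sh.stopBalancer() or sh.setBalancerState(false) from a mongo shell connected to a mongos) until all mongos processes are updated, to ensure your sharded cluster is not susceptible to this bug.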
PATCHES
Production releases v2.4.9 and v2.2.7 contain the fix for this issue, and production release v2.6.0 will contain the fix as well. Upgrading all mongos processes to MongoDB v2.4.9 or MongoDB v2.2.7 is required to avoid this issue.
Original Description
In certain cases, it seems possible for the getLastError aggregation in mongos ClientInfo to not return the correct code to the writeback listener.
The core issue is here:
if ( writebacks.size() ){
    vector<BSONObj> v = _handleWriteBacks( writebacks , fromWriteBackListener );
    if ( v.size() == 0 && fromWriteBackListener ) {
        // ok
    }
    ...
}
else {
    result.append( "singleShard" , theShard );
    result.appendElements( res );
}
We ignore any writebacks when reprocessing a write in the writeback listener (WBL), but incorrectly do not append the other getLastError fields contained in "res" (the getLastError result from the shard).
In short, when retrying a command in the WBL, it's possible for the WBL to not get the special stale config code it needs to continue retrying.