- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Replication
- None
- ALL
ISSUE SUMMARY
In a replica set, if a resync operation is attempted on a node before it has loaded a valid replica set config, the mongod process crashes with a segmentation fault.
A mongod newly started with the --replSet parameter does not immediately have a config. It must first load a valid config from disk, receive a config from another node, or have an administrator run the replica set initiate command.
USER IMPACT
The mongod process crashes, and a stack trace is printed in the log. This only affects newly started mongod processes that have not yet had a chance to join a replica set, so the impact of this issue on a replica set is minimal.
WORKAROUNDS
Do not run resync on a mongod before loading a valid replica set config.
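The workaround can be scripted as a client-side check. The following is a minimal sketch, not an official procedure: `run_command` is a hypothetical callable that sends one admin command and returns the raw reply document, and the sketch assumes that `replSetGetStatus` replies with `ok: 0` until the node has loaded a config.

```python
from typing import Any, Callable, Dict


def resync_if_configured(run_command: Callable[[str], Dict[str, Any]]) -> bool:
    """Send `resync` only after `replSetGetStatus` reports success.

    `run_command` is a hypothetical helper: it sends a single admin
    command by name and returns the reply document, including `ok`.
    Returns True if `resync` was sent, False if it was skipped.
    """
    status = run_command("replSetGetStatus")
    if not status.get("ok"):
        # Assumption: a node without a loaded config answers with ok: 0,
        # so we skip resync instead of risking the crash described here.
        return False
    run_command("resync")
    return True
```

Some drivers raise an exception on an `ok: 0` reply rather than returning it; a wrapper around such a driver would need to catch that exception and return the reply (or `{"ok": 0}`) for this check to work.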
AFFECTED VERSIONS
MongoDB production releases from version 2.6.0 up to 2.6.3 are affected by this issue.
FIX VERSION
The fix is included in the 2.6.4 production release.
RESOLUTION DETAILS
Do not allow resync commands if the replica set config has not yet been loaded.
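The shape of this fix can be illustrated with a toy model. This is a sketch of the guard pattern only, not the actual server change: the `ReplicaNode` class, its method names, and the error message are all hypothetical.

```python
from typing import Any, Dict, Optional


class ReplicaNode:
    """Toy stand-in for a mongod started with --replSet (hypothetical model)."""

    def __init__(self) -> None:
        # None until a config is loaded from disk, delivered by another
        # node, or created by an initiate command.
        self.config: Optional[Dict[str, Any]] = None
        self.resync_requested = False

    def load_config(self, config: Dict[str, Any]) -> None:
        self.config = config

    def handle_resync(self) -> Dict[str, Any]:
        # Guard: reject the command instead of touching replica set
        # state that does not exist yet.
        if self.config is None:
            # Hypothetical error text, not the server's real message.
            return {"ok": 0, "errmsg": "no replica set config has been loaded yet"}
        self.resync_requested = True
        return {"ok": 1}
```

With the guard in place, a resync sent too early yields an error reply that the caller can retry later, rather than dereferencing state that has not been set up.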
Original description
https://mci.10gen.com/ui/task/mongodb_mongo_master_osx_108_dur_off_b1300e3f5656423eac55efaedf6440ab10c37125_14_04_16_21_30_07_replicasets_osx_108_dur_off
https://mci.10gen.com/ui/task/mongodb_mongo_master_osx_108_b1300e3f5656423eac55efaedf6440ab10c37125_14_04_16_21_30_07_replicasets_osx_108
m31001| 2014-04-16T20:14:35.279-0400 [conn2] SEVERE: Invalid access at address: 0
m31001| 2014-04-16T20:14:35.280-0400 [rsStart] replSet I am mci-osx108-5.build.10gen.cc:31001
m31001| 2014-04-16T20:14:35.283-0400 [conn2] SEVERE: Got signal: 11 (Segmentation fault: 11).
m31001| 0x1006b125b 0x1006b0dfe 0x7fff88b2790a 0 0x1001aa945 0x1001ab3db 0x1001ac09c 0x1003c0d5f 0x1002927b0 0x1000065b4 0x1006760f1 0x1006e57d5 0x7fff88b39772 0x7fff88b261a1
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo15printStackTraceERSo+0x2b) [0x1006b125b]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_GLOBAL__N_124abruptQuitWithAddrSignalEiP9__siginfoPv+0xde) [0x1006b0dfe]
m31001| /usr/lib/system/libsystem_c.dylib(_sigtramp+0x1a) [0x7fff88b2790a]
m31001| ??? [0]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x25) [0x1001aa945]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x85f) [0x1001ab3db]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x56c) [0x1001ac09c]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x64f) [0x1003c0d5f]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x7b0) [0x1002927b0]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x134) [0x1000065b4]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x691) [0x1006760f1]
m31001| /data/mci/shell/mongodb-mongo-master/mongod(thread_proxy+0xe5) [0x1006e57d5]
m31001| /usr/lib/system/libsystem_c.dylib(_pthread_start+0x147) [0x7fff88b39772]
m31001| /usr/lib/system/libsystem_c.dylib(thread_start+0xd) [0x7fff88b261a1]
The only change to actual code in the intersection of the blamelists is: https://github.com/mongodb/mongo/commit/0fbd76d233e213e43f53b8882c4dd3c71897a7f3
Other changes:
https://github.com/mongodb/mongo/commit/8bbe304cde912c0e2f96ff6b8f6e4badd90d60f0
https://github.com/mongodb/mongo/commit/b1300e3f5656423eac55efaedf6440ab10c37125