Core Server / SERVER-15056

Sharded connection cleanup on setup error can crash mongos

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.4.12, 2.6.5, 2.7.6
    • Affects Version/s: 2.4.8
    • Component/s: Sharding
    • Labels: None
    • Operating System: ALL

      Tue Aug 26 03:18:25.731 [conn28959] end connection 172.16.3.130:47505 (30 connections now open)
      Tue Aug 26 03:18:56.306 [mongosMain] connection accepted from 172.16.3.130:47831 #28960 (31 connections now open)
      Tue Aug 26 03:18:56.311 [conn28960] end connection 172.16.3.130:47831 (30 connections now open)
      Tue Aug 26 03:19:23.292 [conn28559] Socket say send() errno:32 Broken pipe 172.16.3.195:27017
      Tue Aug 26 03:19:23.294 [conn28559] Socket say send() errno:32 Broken pipe 172.16.3.202:27017
      Tue Aug 26 03:19:23.294 [conn28559] warning: socket exception when initializing on messageC:messageC/inny-p-p0639-prd-mdb-13.sailthru.pvt:27017,inny-p-p0738-prd-mdb-23.sailthru.pvt:27017,nj1-p-ownalmond-prd-mdb-08.flt:27017, current connection state is { state: { conn: "messageC/inny-p-p0639-prd-mdb-13.sailthru.pvt:27017,inny-p-p0738-prd-mdb-23.sailthru.pvt:27017,nj1-p-ownalmond-prd-mdb-08.flt:27017,nj1-p-madshadow-pr...", vinfo: "blast.message.blast.20140820 @ 5|283||53ed1ac0cb32289f7d9e59ad", cursor: "(none)", count: 0, done: false }, retryNext: false, init: false, finish: false, errored: false } :: caused by :: 9001 socket exception [SEND_ERROR] server [172.16.3.202:27017]
      pure virtual method called
      Received signal 6
      Backtrace: 0xa88935 0x356ca32920 0x356ca328a5 0x356ca34085 0x356debea5d 0x356debcbe6 0x356debcc13 0x356debd53f 0x6d44ec 0x6cb2af 0x723312 0x704659 0x728f4f 0x72959a 0x7054bc 0x9bc2ae 0x997181 0x6666c4 0xa7645e 0x356ce07851
      /usr/bin/mongos(_ZN5mongo17printStackAndExitEi+0x75)[0xa88935]
      /lib64/libc.so.6[0x356ca32920]
      /lib64/libc.so.6(gsignal+0x35)[0x356ca328a5]
      /lib64/libc.so.6(abort+0x175)[0x356ca34085]
      /usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x12d)[0x356debea5d]
      /usr/lib64/libstdc++.so.6[0x356debcbe6]
      /usr/lib64/libstdc++.so.6[0x356debcc13]
      /usr/lib64/libstdc++.so.6[0x356debd53f]
      /usr/bin/mongos(_ZN5mongo14DBClientCursorD0Ev+0x59c)[0x6d44ec]
      /usr/bin/mongos(_ZN5boost6detail12shared_countD1Ev+0x3f)[0x6cb2af]
      /usr/bin/mongos(_ZN5boost6detail17sp_counted_impl_pIN5mongo23ParallelConnectionStateEE7disposeEv+0x32)[0x723312]
      /usr/bin/mongos(_ZN5mongo26ParallelConnectionMetadata7cleanupEb+0x1c9)[0x704659]
      /usr/bin/mongos(_ZNSt4pairIKN5mongo5ShardENS0_26ParallelConnectionMetadataEED1Ev+0x2f)[0x728f4f]
      /usr/bin/mongos(_ZNSt8_Rb_treeIN5mongo5ShardESt4pairIKS1_NS0_26ParallelConnectionMetadataEESt10_Select1stIS5_ESt4lessIS1_ESaIS5_EE8_M_eraseEPSt13_Rb_tree_nodeIS5_E+0x52a)[0x72959a]
      /usr/bin/mongos(_ZN5mongo27ParallelSortClusteredCursorD0Ev+0xdc)[0x7054bc]
      /usr/bin/mongos(_ZN5mongo13ShardStrategy7queryOpERNS_7RequestE+0x201e)[0x9bc2ae]
      /usr/bin/mongos(_ZN5mongo7Request7processEi+0x1d1)[0x997181]
      /usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x74)[0x6666c4]
      /usr/bin/mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e)[0xa7645e]
      /lib64/libpthread.so.0[0x356ce07851]
      

      Demangled stack-trace:

      /usr/bin/mongos(mongo::printStackAndExit(int)+0x75)
      /lib64/libc.so.6
      /lib64/libc.so.6(gsignal+0x35)
      /lib64/libc.so.6(abort+0x175)
      /usr/lib64/libstdc++.so.6(__gnu_cxx::__verbose_terminate_handler()+0x12d)
      /usr/lib64/libstdc++.so.6
      /usr/lib64/libstdc++.so.6
      /usr/lib64/libstdc++.so.6
      /usr/bin/mongos(mongo::DBClientCursor::~DBClientCursor()+0x59c)
      /usr/bin/mongos(boost::detail::shared_count::~shared_count()+0x3f)
      /usr/bin/mongos(boost::detail::sp_counted_impl_p<mongo::ParallelConnectionState>::dispose()+0x32)
      /usr/bin/mongos(mongo::ParallelConnectionMetadata::cleanup(bool)+0x1c9)
      /usr/bin/mongos(std::pair<mongo::Shard const, mongo::ParallelConnectionMetadata>::~pair()+0x2f)
      /usr/bin/mongos(std::_Rb_tree<mongo::Shard, std::pair<mongo::Shard const, mongo::ParallelConnectionMetadata>, std::_Select1st<std::pair<mongo::Shard const, mongo::ParallelConnectionMetadata> >, std::less<mongo::Shard>, std::allocator<std::pair<mongo::Shard const, mongo::ParallelConnectionMetadata> > >::_M_erase(std::_Rb_tree_node<std::pair<mongo::Shard const, mongo::ParallelConnectionMetadata> >*)+0x52a)
      /usr/bin/mongos(mongo::ParallelSortClusteredCursor::~ParallelSortClusteredCursor()+0xdc)
      /usr/bin/mongos(mongo::ShardStrategy::queryOp(mongo::Request&)+0x201e) 
      /usr/bin/mongos(mongo::Request::process(int)+0x1d1)
      /usr/bin/mongos(mongo::ShardedMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*)+0x74)
      /usr/bin/mongos(mongo::PortMessageServer::handleIncomingMsg(void*)+0x42e)
      /lib64/libpthread.so.0
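
The demangled names above can be reproduced from the raw backtrace. A minimal sketch, assuming a GCC or Clang toolchain that ships `<cxxabi.h>`; the helper name `demangle` is made up for illustration and is not part of mongos:

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper: demangle one Itanium-ABI symbol name.
// Returns the input unchanged if the name cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'd buffer, so release it with std::free.
    std::unique_ptr<char, void (*)(void*)> out(
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
        std::free);
    return (status == 0 && out) ? std::string(out.get()) : mangled;
}
```

For example, `demangle("_ZN5mongo14DBClientCursorD0Ev")` yields `mongo::DBClientCursor::~DBClientCursor()`, matching the frame in the trace. The `c++filt` command-line tool does the same job for a whole log when piped through it.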
      

      Sequence of events:
      1. ParallelConnectionMetadata::cleanup() frees the connection pointer:
      https://github.com/mongodb/mongo/blob/r2.7.5/src/mongo/client/parallel.cpp#L348

      2. The cursor destructor then runs and tries to send killCursors over the connection that was freed in step 1:

      https://github.com/mongodb/mongo/blob/r2.7.5/src/mongo/client/dbclientcursor.cpp#L370-373
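
The ordering problem in the two steps above can be sketched with simplified stand-in classes. Everything below (`Connection`, `Cursor`, `ShardState`, `detach`, and the two helper functions) is hypothetical and only models the ownership relationship; it is not the mongos code and not necessarily how the fix was implemented. The "pure virtual method called" abort in the log is consistent with a virtual call through an object that has already been destroyed, which is what this sketch avoids.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a shard connection; records what was "sent".
struct Connection {
    explicit Connection(std::vector<std::string>& sent) : _sent(sent) {}
    virtual ~Connection() = default;
    virtual void say(const std::string& msg) { _sent.push_back(msg); }
private:
    std::vector<std::string>& _sent;
};

// Hypothetical stand-in for a cursor that kills itself server-side on destruction.
class Cursor {
public:
    Cursor(Connection* conn, long long id) : _conn(conn), _cursorId(id) {}
    // Owner calls this before freeing the connection.
    void detach() { _conn = nullptr; }
    ~Cursor() {
        // Only send killCursors while the connection is still alive.
        if (_conn && _cursorId != 0)
            _conn->say("killCursors " + std::to_string(_cursorId));
    }
private:
    Connection* _conn;  // non-owning
    long long _cursorId;
};

// Per-shard state owning the connection and sharing the cursor.
struct ShardState {
    std::unique_ptr<Connection> conn;
    std::shared_ptr<Cursor> cursor;

    // Option A: release the cursor first, then the connection.
    void cleanup() {
        cursor.reset();
        conn.reset();
    }
    // Option B: the cursor may outlive this state, so detach it first.
    void cleanupAfterSocketError() {
        if (cursor) cursor->detach();
        conn.reset();
        cursor.reset();
    }
};

std::vector<std::string> runOrderedCleanup() {
    std::vector<std::string> sent;
    ShardState s;
    s.conn = std::make_unique<Connection>(sent);
    s.cursor = std::make_shared<Cursor>(s.conn.get(), 42);
    s.cleanup();  // cursor destructor runs while conn is still valid
    assert(!s.conn && !s.cursor);
    return sent;
}

std::vector<std::string> runDetachedCleanup() {
    std::vector<std::string> sent;
    ShardState s;
    s.conn = std::make_unique<Connection>(sent);
    s.cursor = std::make_shared<Cursor>(s.conn.get(), 42);
    std::shared_ptr<Cursor> extra = s.cursor;  // second owner keeps the cursor alive
    s.cleanupAfterSocketError();
    extra.reset();  // destructor runs after conn is gone, but the cursor is detached
    return sent;
}
```

In this model, `runOrderedCleanup()` records a single `killCursors 42` message, while `runDetachedCleanup()` records nothing because the cursor no longer references the freed connection. Which option fits depends on whether the cursor can have other owners at cleanup time.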

            Assignee: Randolph Tan (randolph@mongodb.com)
            Reporter: Randolph Tan (randolph@mongodb.com)