  1. Core Server
  2. SERVER-6745

Unhandled empty result in RunOnAllShardsCommand

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Minor - P4
    • Affects Version/s: 2.0.6, 2.2.0-rc0
    • Component/s: Sharding
    • Environment:
      All
    • ALL

      In v2.0, the RunOnAllShardsCommand::run() method reads each shard's result without first checking whether any result was returned. In the event of a socket exception such as:

      Thu Aug  9 02:03:31 [conn60] Socket recv() errno:104 Connection reset by peer 10.217.63.247:27018
      Thu Aug  9 02:03:31 [conn60] SocketException: remote: 10.217.63.247:27018 error: 9001 socket exception [1] server [10.217.63.247:27018] 
      Thu Aug  9 02:03:31 [conn60] DBClientCursor::init lazy say() failed
      Thu Aug  9 02:03:31 [conn60] DBClientCursor::init message from say() was empty
      Thu Aug  9 02:03:31 [conn60] ERROR: Future::spawnComand (part 2) exception: Error running command on server: rsname/server1:27018,server2:27018,server3:27018
      Thu Aug  9 02:03:31 [conn60]   Assertion failure !e.eoo() s/../util/net/../../db/../bson/bsonobjbuilder.h 127
      0x52b5f6 0x53613b 0x7a3284 0x794f24 0x777c5c 0x7b6467 0x7c89c1 0x5e9747 0x2aaaaacce73d 0x2aaaab7494bd 
       /usr/bin/mongos(_ZN5mongo12sayDbContextEPKc+0x96) [0x52b5f6]
       /usr/bin/mongos(_ZN5mongo8assertedEPKcS1_j+0xfb) [0x53613b]
       /usr/bin/mongos(_ZN5mongo15dbgrid_pub_cmds21RunOnAllShardsCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x904) [0x7a3284]
       /usr/bin/mongos(_ZN5mongo7Command20runAgainstRegisteredEPKcRNS_7BSONObjERNS_14BSONObjBuilderEi+0x894) [0x794f24]
       /usr/bin/mongos(_ZN5mongo14SingleStrategy7queryOpERNS_7RequestE+0x5ac) [0x777c5c]
       /usr/bin/mongos(_ZN5mongo7Request7processEi+0x187) [0x7b6467]
       /usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71) [0x7c89c1]
       /usr/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x287) [0x5e9747]
       /lib64/libpthread.so.0 [0x2aaaaacce73d]
       /lib64/libc.so.6(clone+0x6d) [0x2aaaab7494bd]
      

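      This appears to be caused by a Future::CommandResult whose result() was never populated being queried for an errmsg. Looking up "errmsg" on the empty object yields an EOO element, and passing that element to appendAs() trips the !e.eoo() assertion shown above: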

                          shared_ptr<Future::CommandResult> res = *i;
                          if ( ! res->join() ) {
                              errors.appendAs(res->result()["errmsg"], res->getServer());
                          }
      

      Note that this code was slightly modified in master for more robust error reporting; however, this case still seems possible there. The impact of this issue seems relatively minimal, given that the command had already failed due to a network issue.
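
      One way to guard the errmsg lookup described above could look like the following sketch. It does not use the real MongoDB classes: FakeCommandResult is a hypothetical stand-in for Future::CommandResult, a std::map stands in for the BSON result object, and collectErrors and the placeholder message are invented for illustration. The point is only that the errmsg field is checked for existence before it is appended.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for Future::CommandResult; NOT the real MongoDB class.
struct FakeCommandResult {
    std::string server;
    bool ok;
    std::map<std::string, std::string> fields;  // stand-in for the BSON result

    bool join() const { return ok; }
    const std::map<std::string, std::string>& result() const { return fields; }
    const std::string& getServer() const { return server; }
};

// Collect one error string per failed shard, keyed by server name.
// A result with no "errmsg" field (for example, one that was never populated
// because the socket failed) gets a placeholder instead of being read blindly.
std::map<std::string, std::string> collectErrors(
        const std::vector<std::shared_ptr<FakeCommandResult>>& results) {
    std::map<std::string, std::string> errors;
    for (const auto& res : results) {
        if (res->join()) {
            continue;  // command succeeded on this shard
        }
        auto it = res->result().find("errmsg");
        if (it == res->result().end()) {
            // Assumed placeholder text; the real fix may report this differently.
            errors[res->getServer()] = "no result returned (possible network error)";
        } else {
            errors[res->getServer()] = it->second;
        }
    }
    return errors;
}
```

      In the real code the equivalent check would presumably test whether res->result()["errmsg"] is EOO before calling appendAs(), so that a shard that failed at the network level is reported with some message rather than triggering the assertion.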

            Assignee:
            Unassigned
            Reporter:
            Ben Becker (benjamin.becker)
            Votes:
            0
            Watchers:
            1
