Core Server / SERVER-13570

mongos does not work when it reports "ReplicaSetMonitor no master found"

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 2.2.2, 2.4.9
    • ALL

      Test configuration:
      The sharded database video_user_data is spread across two replica sets.
      ReplicaSet-1

      video-test-mongodb-1:> rs.conf()
      {
      	"_id" : "video-test-mongodb-1",
      	"version" : 3,
      	"members" : [
      		{
      			"_id" : 0,
      			"host" : "mongo01a.vd:27018"
      		},
      		{
      			"_id" : 1,
      			"host" : "mongo01b.vd:27018"
      		},
      		{
      			"_id" : 2,
      			"host" : "mongo01c.vd:27018"
      		}
      	]
      }
      

      ReplicaSet-2

      video-test-mongodb-2:SECONDARY> rs.conf()
      {
      	"_id" : "video-test-mongodb-2",
      	"version" : 3,
      	"members" : [
      		{
      			"_id" : 0,
      			"host" : "mongo02a.vd:27018"
      		},
      		{
      			"_id" : 1,
      			"host" : "mongo02b.vd:27018"
      		},
      		{
      			"_id" : 2,
      			"host" : "mongo02c.vd:27018"
      		}
      	]
      }
      

      A mongos runs in front of them.
      The problem arises when one of the replica sets loses its PRIMARY.

      A simple Python script for the test:

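      #!/usr/bin/env python
      # -*- coding: UTF-8 -*-
      
      import pymongo
      
      # Connect through mongos (port 27017); the replica set members listen on 27018.
      conn = pymongo.Connection('mongo01b.vd:27017')
      db = conn.video_user_data
      coll = db.films
      
      # partial=True asks mongos to return partial results if some shards are down.
      counter = 0
      for user in coll.find({},partial=True):
          counter += 1
      print "%s" % counter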
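
For readers on a current driver, a rough Python 3 equivalent of the script above might look as follows. This is a sketch, not the reporter's code: it assumes PyMongo 3 or later, where `pymongo.Connection` is replaced by `MongoClient` and the `partial` option of `find()` is named `allow_partial_results`. The helper names `count_films` and `main` and the 5-second `serverSelectionTimeoutMS` are illustrative choices.

```python
def count_films(coll):
    """Count documents in coll, asking mongos for partial results.

    Assumes coll exposes PyMongo 3+'s find(filter, allow_partial_results=...).
    """
    counter = 0
    for _ in coll.find({}, allow_partial_results=True):
        counter += 1
    return counter


def main():
    # Imported here so count_films can be exercised without a driver installed.
    import pymongo

    # Host and port come from the report; the timeout value is an assumption.
    client = pymongo.MongoClient("mongo01b.vd", 27017, serverSelectionTimeoutMS=5000)
    print(count_films(client.video_user_data.films))
```

Calling `main()` needs a reachable mongos. Whether partial results are actually returned when a shard has no PRIMARY is exactly what this report questions, so the sketch only mirrors the original request.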
      
      

      Normal run of the script:

      mongo01b.vd:~# ./get.py  
      96601
      

      Failing run, with one replica set missing its PRIMARY:

      mongo01b.vd:~# time ./get.py  
      Traceback (most recent call last):
        File "./get.py", line 16, in <module>
          for user in coll.find({},partial=True):
        File "/usr/lib/python2.6/dist-packages/pymongo/cursor.py", line 814, in next
          if len(self.__data) or self._refresh():
        File "/usr/lib/python2.6/dist-packages/pymongo/cursor.py", line 763, in _refresh
          self.__uuid_subtype))
        File "/usr/lib/python2.6/dist-packages/pymongo/cursor.py", line 720, in __send_message
          self.__uuid_subtype)
        File "/usr/lib/python2.6/dist-packages/pymongo/helpers.py", line 100, in _unpack_response
          error_object["$err"])
      pymongo.errors.OperationFailure: database error: ReplicaSetMonitor no master found for set: video-test-mongodb-2
      
      real	0m32.829s
      user	0m0.056s
      sys	0m0.000s
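
The failing run above took about 32 seconds to return an error. One client-side way to protect a synchronous backend from such a stall, independent of any mongos behavior, is to run the query in a worker thread and stop waiting after a deadline. The sketch below uses only the Python 3 standard library. `run_with_deadline` is a hypothetical helper name, and the abandoned call still runs to completion in the background, so this bounds the caller's wait rather than the work itself.

```python
import concurrent.futures
import time


def run_with_deadline(fn, timeout_s):
    """Run fn() in a worker thread.

    Returns (True, result) if fn finishes within timeout_s seconds,
    otherwise (False, None). Exceptions raised by fn propagate to the caller.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn)
    try:
        return True, future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return False, None
    finally:
        # Do not block on the worker; it keeps running until fn returns.
        executor.shutdown(wait=False)


def slow_query():
    # Stand-in for a query that stalls, such as the 32-second run above.
    time.sleep(0.5)
    return "late"
```

In the test script, the loop over `coll.find(...)` would be wrapped in a function and passed as `fn`, so the caller can fail fast while the driver call is still pending.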
      

      Questions:
      1) Why does it take so long (about 32 seconds here) to detect the problem? On a synchronous backend under heavy load, a delay like this completely kills the request queue.
      2) When one of the replica sets is lost, I expect to still get data from the other replica sets of the sharded cluster.

      Or am I using the driver incorrectly?

      MongoDB version:

      mongodb=1:2.4.9.yandex1
      

      Thanks!

        1. mongos.log (147 kB, Andrey Godin)

            Assignee:
            jacob.ribnik@mongodb.com Jacob Ribnik
            Reporter:
            airesp Andrey Godin
            Votes:
            4
            Watchers:
            16
