Core Server / SERVER-7260

Balancer lock is not relinquished

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.0.7, 2.2.0
    • Component/s: Sharding
    • ALL

      Under certain conditions, the balancer lock may never be relinquished. One case appears to have occurred when the balancer was disabled during a chunk migration:

      mongos> db.locks.findOne({_id:"balancer"});
      {
              "_id" : "balancer",
              "process" : "r5.10gen.cc:27017:1349297686:1804289383",
              "state" : 2,
              "ts" : ObjectId("506cae1f13bf56db8d1b0856"),
              "when" : ISODate("2012-10-03T21:29:03.359Z"),
              "who" : "r5.10gen.cc:27017:1349297686:1804289383:Balancer:846930886",
              "why" : "doing balance round"
      }
      
      mongos> db.changelog.find().sort({$natural:-1}).limit(10).skip(10).pretty()
      {
              "_id" : "r5.10gen.cc-2012-10-03T21:30:05-17",
              "server" : "r5.10gen.cc",
              "clientAddr" : "127.0.0.1:57957",
              "time" : ISODate("2012-10-03T21:30:05.136Z"),
              "what" : "moveChunk.from",
              "ns" : "sh.test",
              "details" : {
                      "min" : {
                              "id" : "16540452295883480447516388304186410329865247257024"
                      },
                      "max" : {
                              "id" : "22754752024366413683521379069776306796548182491720"
                      },
                      "step1 of 6" : 0,
                      "step2 of 6" : 305,
                      "step3 of 6" : 378,
                      "step4 of 6" : 32007,
                      "step5 of 6" : 4542,
                      "step6 of 6" : 24280
              }
      }
      

      Note that the above output was taken 15 hours after the last moveChunk was logged to the config server. It is unclear whether the mongos process holding the lock was killed before it had a chance to release it.

      The net effect is that sh.isBalancerRunning() never returns false, even if the balancer is no longer running.
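A rough way to spot this condition is to compare the lock's "when" field with the most recent changelog entry. The sketch below is only an illustration: looksStale and its maxIdleMs threshold are hypothetical names, not part of the mongo shell, and it assumes that state 2 marks a held lock, as in the output above. The sample values are copied from that output.

```javascript
// Hypothetical helper, not a MongoDB API: flags a balancer lock that is
// still marked as held although nothing has happened for a long time.
// Assumes state 2 means "held", as in the locks document shown above.
function looksStale(lockDoc, lastChangelogTime, now, maxIdleMs) {
  if (!lockDoc || lockDoc.state !== 2) {
    return false; // no lock document, or lock not held: nothing to flag
  }
  // Latest known activity: lock acquisition or last changelog entry.
  var lastActivity = Math.max(lockDoc.when.getTime(),
                              lastChangelogTime.getTime());
  return now.getTime() - lastActivity > maxIdleMs;
}

// Values taken from the output in this report.
var lock = {
  _id: "balancer",
  state: 2,
  when: new Date("2012-10-03T21:29:03.359Z")
};
var lastMove = new Date("2012-10-03T21:30:05.136Z");

// The output was captured roughly 15 hours after the last moveChunk.
var fifteenHoursLater = new Date(lastMove.getTime() + 15 * 60 * 60 * 1000);
var oneHourMs = 60 * 60 * 1000; // illustrative threshold, chosen arbitrarily

var stale = looksStale(lock, lastMove, fifteenHoursLater, oneHourMs);
```

In a mongos shell the same function could be fed live values, for example db.getSiblingDB("config").locks.findOne({_id:"balancer"}) for the lock and the "time" field of the newest config.changelog document for the last activity. A positive result only suggests that the lock deserves a closer look; it does not prove that the holding process is gone.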

            Assignee:
            randolph@mongodb.com Randolph Tan
            Reporter:
            benjamin.becker Ben Becker (Inactive)
            Votes:
            4
            Watchers:
            11
