  1. Core Server
  2. SERVER-3751

mongodb crashing on repairDatabase

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Blocker - P1
    • Affects Version/s: 2.0.0-rc1
    • Component/s: Replication
    • Environment:
      windows 64bit 24cpu 48gb ram san drive
    • Windows

      We have a replica set of 3 members.
      Running repairDatabase on member 1 while it is master works fine.
      We then step down 1, and 2 becomes master.
      When we attempt repairDatabase on 2, the server restarts.
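
The steps above can be wrapped in a small mongo shell helper that records the member's role before and after the repair, which makes it easier to tell a failed repair from a role change in the middle of one. This is only a sketch: `repairWithRoleCheck` is a hypothetical helper name, and it assumes the shell's `db.isMaster()` and `db.repairDatabase()` helpers as they existed in shells of this era.

```javascript
// Hypothetical helper, not part of MongoDB. `db` is the mongo shell's
// database handle for the database to repair.
function repairWithRoleCheck(db) {
  // db.isMaster() reports this member's current replica set role.
  var before = db.isMaster();
  if (!before.ismaster) {
    return { ran: false, reason: "not primary before repair", before: before };
  }

  var result;
  try {
    result = db.repairDatabase();
  } catch (e) {
    // The shell can throw if the connection drops while the server restarts.
    result = { ok: 0, errmsg: String(e) };
  }

  var after;
  try {
    after = db.isMaster();
  } catch (e) {
    after = { ismaster: false, errmsg: String(e) };
  }

  // roleChanged is true when the member was primary going in but not coming out.
  return {
    ran: true,
    result: result,
    roleChanged: !after.ismaster,
    before: before,
    after: after
  };
}
```

In the shell on member 2, `printjson(repairWithRoleCheck(db))` would show the repair result next to the role information. A `roleChanged: true` result would match the PRIMARY to SECONDARY prompt change in the output that follows.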

      PRIMARY> db.repairDatabase();
      {
      "errmsg" : "exception: nextSafe(): { $err: \"not master and slaveok=false\", code: 13435 }",
      "code" : 13106,
      "ok" : 0
      }
      SECONDARY>

      So it looks like 2 crashed and the set failed over to 1 again.

      We looked at the logs and see:

      Fri Sep 02 12:55:45 [conn14] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: pr_blue_spruce.sessions top: { opid: 63, active: true, lockType: "write", waitingForLock: false, secs_running: 68, op: "query", ns: "pr_blue_spruce", query: { repairDatabase: 1.0 }, client: "127.0.0.1:53667", desc: "conn", msg: "index: (3/3) btree-middle", numYields: 0 }
      Fri Sep 02 12:55:45 [conn14] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: pr_blue_spruce.sessions top: { opid: 63, active: true, lockType: "write", waitingForLock: false, secs_running: 68, op: "query", ns: "pr_blue_spruce", query: { repairDatabase: 1.0 }, client: "127.0.0.1:53667", desc: "conn", msg: "index: (3/3) btree-middle", numYields: 0 }
      Fri Sep 02 12:55:45 [conn14] warning: ClientCursor::yield can't unlock b/c of recursive lock ns: pr_blue_spruce.sessions top: { opid: 63, active: true, lockType: "write", waitingForLock: false, secs_running: 68, op: "query", ns: "pr_blue_spruce", query: { repairDatabase: 1.0 }, client: "127.0.0.1:53667", desc: "conn", msg: "index: (3/3) btree-middle", numYields: 0 }

      Then it keeps failing on startup, and we see this exception:

      Fri Sep 02 12:55:57 [websvr] User Assertion: 13142:timeout getting readlock
      Fri Sep 02 12:55:57 [websvr] Socket http response send() errno:0 The operation completed successfully. 192.168.16.35:36451
      Fri Sep 02 12:55:57 unhandled windows exception
      Fri Sep 02 12:55:57 ec=0xe06d7363
      Fri Sep 02 12:55:57 [conn14] external sort used : 4 files in 11 secs
      Fri Sep 02 12:55:57 [conn14] New namespace: pr_blue_spruce.sessions.$id
      Fri Sep 02 12:55:57 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:57 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:57 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:57 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:57 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:58 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
      Fri Sep 02 12:55:58 [conn16] run command admin.$cmd { replSetHeartbeat: "prod_rudy", v: 4, pv: 1, checkEmpty: false, from: "monru02.colo.rrgroup.com:27017" }
      Fri Sep 02 12:55:58 [conn16] command admin.$cmd command: { replSetHeartbeat: "prod_rudy", v: 4, pv: 1, checkEmpty: false, from: "monru02.colo.rrgroup.com:27017" } ntoreturn:1 reslen:125 0ms
      Fri Sep 02 12:55:58 [conn14] allocating new extent for pr_blue_spruce.sessions.$id padding:1 lenWHdr: 8192
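
With a 16 MB repair log attached, a quick per-thread tally can help separate the `[conn14]` repair activity from lines that carry no thread tag, such as the unhandled exception report. The following is a rough sketch that could be run under Node.js. `summarizeLog` is a hypothetical name, and the line pattern is an assumption based only on the excerpts in this ticket, not a specification of the server's log format.

```javascript
// Assumed line shape, taken from the excerpts above:
//   "Fri Sep 02 12:55:45 [conn14] warning: ..."
//   "Fri Sep 02 12:55:57 unhandled windows exception"   (no thread tag)
var LINE_RE = /^\s*(\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}) (?:\[([^\]]+)\] )?(.*)$/;

// Hypothetical helper: counts lines per thread tag and collects untagged lines.
function summarizeLog(text) {
  var byThread = {};
  var untagged = [];
  text.split(/\r?\n/).forEach(function (line) {
    var m = LINE_RE.exec(line);
    if (!m) {
      return; // blank lines and wrapped continuation lines are skipped
    }
    var thread = m[2];
    if (thread) {
      byThread[thread] = (byThread[thread] || 0) + 1;
    } else {
      untagged.push(m[1] + " " + m[3]);
    }
  });
  return { byThread: byThread, untagged: untagged };
}
```

Passing the contents of a log file to `summarizeLog` returns the counts as `byThread` and the untagged lines, timestamps included, as `untagged`.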

      We have tried wiping the db folder for 2 and letting it resync a few times, but the error doesn't go away.

        1. mongodb_logs_20110906.4-crash.zip
          310 kB
        2. mongodb_monru02.zip
          97 kB
        3. mongodb_repairdatabase.log
          16.63 MB

            Assignee:
            mathias@mongodb.com Mathias Stearn
            Reporter:
            pbrumm Pete Brumm
            Votes:
            0
            Watchers:
            3
