Core Server / SERVER-20993

Lock file not deleted when server terminates due to "moveChunk commit failed" error

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: 2.6.9
    • Component/s: Admin, Sharding
    • ALL

      Happened once; not sure how to reproduce.

      The mongod process terminated due to "ERROR: moveChunk commit failed:". Journal files were cleaned up, but the lock file was not removed, causing subsequent start-ups to fail because of the old lock file.

      ... for command :{ $err: "socket exception [SEND_ERROR] for 0.1.2.187:1, code: 9001 }
      2015-10-03T00:53:43.494+0000 [conn12906] waiting till out of critical section
      2015-10-03T00:53:43.501+0000 [conn12900] waiting till out of critical section
      2015-10-03T00:53:43.643+0000 [conn12904] waiting till out of critical section
      2015-10-03T00:53:43.651+0000 [conn12907] waiting till out of critical section
      2015-10-03T00:53:43.735+0000 [conn12905] waiting till out of critical section
      2015-10-03T00:53:43.743+0000 [conn12902] waiting till out of critical section
      2015-10-03T00:53:43.893+0000 [conn12903] waiting till out of critical section
      2015-10-03T00:53:43.901+0000 [conn12901] waiting till out of critical section
      2015-10-03T00:53:53.461+0000 [conn12908] ERROR: moveChunk commit failed: version is at 5|5||000000000000000000000000 instead of 6|1||55dce04fae57789a22a4d141
      2015-10-03T00:53:53.461+0000 [conn12908] ERROR: TERMINATING
      2015-10-03T00:53:53.461+0000 [conn12908] dbexit:
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: going to close listening sockets...
      2015-10-03T00:53:53.461+0000 [conn12908] closing listening socket: 10
      2015-10-03T00:53:53.461+0000 [conn12908] closing listening socket: 13
      2015-10-03T00:53:53.461+0000 [conn12908] removing socket file: /tmp/mongodb-27022.sock
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: going to flush diaglog...
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: going to close sockets...
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: waiting for fs preallocator...
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: lock for final commit...
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: final commit...
      2015-10-03T00:53:53.461+0000 [conn12908] shutdown: closing all files...
      2015-10-03T00:53:53.461+0000 [conn12750] end connection 0.1.2.187:1 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12929] end connection 0.1.2.187:2 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12936] end connection 0.1.2.187:3 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12915] end connection 0.1.2.187:4 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12966] end connection 0.1.2.187:5 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12967] end connection 0.1.2.134:6 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12949] end connection 0.1.2.134:7 (22 connections now open)
      2015-10-03T00:53:53.461+0000 [conn12736] end connection 0.1.2.134:8 (22 connections now open)
      2015-10-03T00:53:53.462+0000 [conn12913] end connection 0.1.2.134:9 (22 connections now open)
      2015-10-03T00:53:53.462+0000 [conn12912] end connection 0.1.2.134:10 (22 connections now open)
      2015-10-03T00:53:53.462+0000 [conn12726] end connection 0.1.2.134:11 (22 connections now open)
      2015-10-03T00:53:53.465+0000 [conn12908] closeAllFiles() finished
      2015-10-03T00:53:53.465+0000 [conn12908] journalCleanup...
      2015-10-03T00:53:53.465+0000 [conn12908] removeJournalFiles
      2015-10-03T00:53:53.483+0000 [initandlisten] now exiting
      2015-10-03T00:53:53.483+0000 [initandlisten] dbexit: ; exiting immediately
      

      I was not able to verify from the documentation or Google Groups whether this is expected behavior or a MongoDB bug.

      Related defect I've found: SERVER-3009
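
      For anyone hitting the same start-up failure, the leftover lock file can be checked before restarting. The following is only a rough sketch, not anything taken from the server code. It assumes the lock file is named mongod.lock inside the dbpath, that it holds the owning process ID as plain decimal text, and that it is empty after a clean shutdown. The function names and the returned state labels are made up for illustration, and the liveness probe is POSIX-only.

```python
import os
from pathlib import Path


def pid_is_running(pid: int) -> bool:
    """POSIX-only probe: signal 0 tests for existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def lock_file_state(lock_path, is_running=pid_is_running) -> str:
    """Classify a lock file (hypothetical labels, see assumptions above).

    "missing"    - no lock file at that path
    "empty"      - zero-length file, assumed to mean a clean shutdown
    "unreadable" - content is not a positive decimal PID
    "held"       - the recorded PID is still running
    "stale"      - the recorded PID is gone, so the file was left behind
    """
    path = Path(lock_path)
    if not path.exists():
        return "missing"
    content = path.read_text().strip()
    if not content:
        return "empty"
    try:
        pid = int(content)
    except ValueError:
        return "unreadable"
    if pid <= 0:
        return "unreadable"
    return "held" if is_running(pid) else "stale"
```

      A "stale" result would match what is described here: the process has exited but the lock file still names it. Whether it is safe to clear the file by hand depends on the state of the data files, so this only reports the state and changes nothing.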

      Exact server version

      2015-10-03T01:08:10.101+0000 [initandlisten] db version v2.6.9
      2015-10-03T01:08:10.101+0000 [initandlisten] git version: df313bc75aa94d192330cb92756fc486ea604e64
      2015-10-03T01:08:10.101+0000 [initandlisten] build info: Linux build20.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
      

            Assignee: Unassigned
            Reporter: Avi Ribchinsky
            Votes: 0
            Watchers: 3
