Core Server / SERVER-6303

shouldRetry needs a write lock

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 2.2.0-rc0
    • Affects Version/s: None
    • Component/s: Replication
    • Labels: None
    • Environment:
      git version: 873e06d16edfe8f6cdc0030e924e3b55087afd6d
    • Operating System: ALL

      Replication fails with:

      Wed Jul  4 20:10:48 [conn550] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:34033] 
      Wed Jul  4 20:10:48 [conn646] command config.$cmd command: { serverStatus: 1 } ntoreturn:1 keyUpdates:0 locks(micros) r:1293 reslen:2549 82344ms
      Wed Jul  4 20:10:48 [conn558] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:44962] 
      Wed Jul  4 20:10:48 [conn614] command config.$cmd command: { serverStatus: 1 } ntoreturn:1 keyUpdates:0 locks(micros) r:1691 reslen:2549 562401ms
      Wed Jul  4 20:10:48 [conn646] end connection 192.168.1.21:54959 (10 connections now open)
      Wed Jul  4 20:10:48 [conn514] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:39918] 
      Wed Jul  4 20:10:48 [conn614] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:38702] 
      Wed Jul  4 20:10:48 [conn602] serverStatus was very slow: { after basic: 0, middle of mem: 0, after mem: 0, after connections: 0, after extra info: 0, after counters: 0, after repl: 0, after asserts: 0, after dur: 734417, at end: 734417 }
      Wed Jul  4 20:10:48 [conn602] command config.$cmd command: { serverStatus: 1 } ntoreturn:1 keyUpdates:0 locks(micros) r:52 reslen:2549 742522ms
      Wed Jul  4 20:10:48 [conn622] serverStatus was very slow: { after basic: 0, middle of mem: 0, after mem: 0, after connections: 0, after extra info: 0, after counters: 0, after repl: 0, after asserts: 0, after dur: 437590, at end: 437590 }
      Wed Jul  4 20:10:48 [conn610] serverStatus was very slow: { after basic: 0, middle of mem: 0, after mem: 0, after connections: 0, after extra info: 0, after counters: 0, after repl: 0, after asserts: 0, after dur: 615416, at end: 615416 }
      Wed Jul  4 20:10:48 [conn602] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:49500] 
      Wed Jul  4 20:10:48 [conn622] command config.$cmd command: { serverStatus: 1 } ntoreturn:1 keyUpdates:0 locks(micros) r:26183 reslen:2549 442455ms
      Wed Jul  4 20:10:48 [conn610] command config.$cmd command: { serverStatus: 1 } ntoreturn:1 keyUpdates:0 locks(micros) r:26216 reslen:2549 622220ms
      Wed Jul  4 20:10:48 [conn622] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:49885] 
      Wed Jul  4 20:10:48 [conn610] SocketException handling request, closing client connection: 9001 socket exception [2] server [192.168.1.21:33146] 
      Wed Jul  4 20:10:49 [rsSync] replSet initialSyncOplogApplication applied 1124 operations, synced to Jul  4 17:25:28:32
      Wed Jul  4 20:10:49 [conn650] end connection 192.168.1.21:33004 (3 connections now open)
      Wed Jul  4 20:10:49 [repl writer worker] error expected my_db.ad to be locked 
      Wed Jul  4 20:10:49 [repl writer worker] lock status: unlocked
      Wed Jul  4 20:10:49 [repl writer worker] Assertion: 16104:expected to be read locked for my_db.ad
      0x72b5f3 0x8ac744 0x8ac82c 0xb0af56 0xa88c7b 0x99d669 0x99d872 0x734f64 0x5e8e1f 0x95428a 0x7fe6be 0x7f38dcb40b50 0x7f38dbee46dd 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0x72b5f3]
       /usr/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x94) [0x8ac744]
       /usr/bin/mongod() [0x8ac82c]
       /usr/bin/mongod(_ZN5mongo4Lock23assertAtLeastReadLockedERKNS_10StringDataE+0x146) [0xb0af56]
       /usr/bin/mongod(_ZN5mongo14DatabaseHolder11getOrCreateERKSsS2_Rb+0x10b) [0xa88c7b]
       /usr/bin/mongod(_ZN5mongo6Client7Context11_finishInitEb+0x49) [0x99d669]
       /usr/bin/mongod(_ZN5mongo6Client7ContextC2ERKSsSsbb+0x92) [0x99d872]
       /usr/bin/mongod(_ZN5mongo4Sync11shouldRetryERKNS_7BSONObjE+0xb4) [0x734f64]
       /usr/bin/mongod(_ZN5mongo7replset21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x4f) [0x5e8e1f]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x2a) [0x95428a]
       /usr/bin/mongod() [0x7fe6be]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f38dcb40b50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f38dbee46dd]
      Wed Jul  4 20:10:49 [repl writer worker] ERROR: exception: expected to be read locked for my_db.ad on: { ts: Timestamp 1341437174000|22, h: -8912815504339682628, op: "u", ns: "my_db.ad", o2: { _id: ObjectId('4ff216da347caf99133789bb') }, o: { $set: { count: 29 }, $set: { ts_end: new Date(1341437174303) } } }
      Wed Jul  4 20:10:49 [repl writer worker]   Fatal Assertion 16361
      0x72b5f3 0x8ac493 0x5e8f1f 0x95428a 0x7fe6be 0x7f38dcb40b50 0x7f38dbee46dd 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0x72b5f3]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x63) [0x8ac493]
       /usr/bin/mongod(_ZN5mongo7replset21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x14f) [0x5e8f1f]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x2a) [0x95428a]
       /usr/bin/mongod() [0x7fe6be]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f38dcb40b50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f38dbee46dd]
      Wed Jul  4 20:10:49 [repl writer worker] 
      
      ***aborting after fassert() failure
      
      
      Wed Jul  4 20:10:49 Got signal: 6 (Aborted).
      
      Wed Jul  4 20:10:49 Backtrace:
      0x72b5f3 0x574c74 0x7f38dbe3e4f0 0x7f38dbe3e475 0x7f38dbe416f0 0x8ac4ce 0x5e8f1f 0x95428a 0x7fe6be 0x7f38dcb40b50 0x7f38dbee46dd 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0x72b5f3]
       /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x3e4) [0x574c74]
       /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x7f38dbe3e4f0]
       /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7f38dbe3e475]
       /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x7f38dbe416f0]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x9e) [0x8ac4ce]
       /usr/bin/mongod(_ZN5mongo7replset21multiInitialSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x14f) [0x5e8f1f]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x2a) [0x95428a]
       /usr/bin/mongod() [0x7fe6be]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f38dcb40b50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f38dbee46dd]
      

      When starting mongod one more time:

      Thu Jul  5 01:18:31 [repl writer worker]   Fatal Assertion 16359
      0x72b5f3 0x8ac493 0x5e8d11 0x95428a 0x7fe6be 0x7f07e55aab50 0x7f07e494e6dd 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0x72b5f3]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x63) [0x8ac493]
       /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x51) [0x5e8d11]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x2a) [0x95428a]
       /usr/bin/mongod() [0x7fe6be]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f07e55aab50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f07e494e6dd]
      Thu Jul  5 01:18:31 [repl writer worker] 
      
      ***aborting after fassert() failure
      
      
      Thu Jul  5 01:18:31 Got signal: 6 (Aborted).
      
      Thu Jul  5 01:18:31 Backtrace:
      0x72b5f3 0x574c74 0x7f07e48a84f0 0x7f07e48a8475 0x7f07e48ab6f0 0x8ac4ce 0x5e8d11 0x95428a 0x7fe6be 0x7f07e55aab50 0x7f07e494e6dd 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0x72b5f3]
       /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x3e4) [0x574c74]
       /lib/x86_64-linux-gnu/libc.so.6(+0x324f0) [0x7f07e48a84f0]
       /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7f07e48a8475]
       /lib/x86_64-linux-gnu/libc.so.6(abort+0x180) [0x7f07e48ab6f0]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x9e) [0x8ac4ce]
       /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x51) [0x5e8d11]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x2a) [0x95428a]
       /usr/bin/mongod() [0x7fe6be]
       /lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50) [0x7f07e55aab50]
       /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f07e494e6dd]
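
      The first crash's backtrace shows mongo::Sync::shouldRetry constructing a Client::Context, whose _finishInit calls DatabaseHolder::getOrCreate and then Lock::assertAtLeastReadLocked, while the repl writer worker enters shouldRetry without holding any database lock. That is what trips assertion 16104 ("expected to be read locked for my_db.ad") and the fatal assertion that aborts the node; hence the title, shouldRetry needs a write lock (a write lock, presumably, because shouldRetry also fetches the missing document from the sync source and stores it locally). The following is a minimal, self-contained C++ sketch of that pattern, not MongoDB source: DBWriteLock and Context are illustrative stand-ins for the Lock::DBWrite and Client::Context seen in the backtrace, and the surrounding logic is assumed.

      // Minimal, self-contained model of the failure mode above; NOT MongoDB source.
      // DBWriteLock and Context stand in for mongo::Lock::DBWrite and
      // mongo::Client::Context; the thread_local flag stands in for the per-thread
      // lock-state tracking behind Lock::assertAtLeastReadLocked.
      #include <cassert>
      #include <iostream>
      #include <string>

      thread_local bool dbLocked = false;          // "does this thread hold a lock?"

      struct DBWriteLock {                         // RAII write lock on one database
          DBWriteLock()  { dbLocked = true;  }
          ~DBWriteLock() { dbLocked = false; }
      };

      struct Context {                             // requires the lock to already be held
          explicit Context(const std::string& ns) {
              // corresponds to "Assertion: 16104: expected to be read locked for my_db.ad"
              assert(dbLocked && "expected to be read locked");
              std::cout << "opened context on " << ns << "\n";
          }
      };

      // Buggy shape: the context is built while unlocked, so the assertion aborts,
      // exactly as in the repl writer worker trace.
      bool shouldRetryBuggy(const std::string& ns) {
          Context ctx(ns);
          return true;
      }

      // Fixed shape: take the write lock first, then open the context.
      bool shouldRetryFixed(const std::string& ns) {
          DBWriteLock lk;                          // the missing write lock
          Context ctx(ns);                         // now satisfies the lock assertion
          // ... fetch the missing document, insert it, decide whether to retry ...
          return true;
      }

      int main() {
          shouldRetryFixed("my_db.ad");            // prints "opened context on my_db.ad"
          // shouldRetryBuggy("my_db.ad");         // would abort via the assertion
          return 0;
      }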
      

            Assignee:
            Eric Milkie (milkie@mongodb.com)
            Reporter:
            Azat Khuzhin (azat)
            Votes:
            0
            Watchers:
            2
