Core Server / SERVER-11802

duplicate key exception in replication

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 2.4.6
    • Component/s: Internal Code
    • Environment:
      Ubuntu 10.04.4
      32 logical CPUs: Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz / 127 GB memory
    • Linux

      Hi.

      We have a MongoDB replica set:

      > rs.conf()
      {
      	"_id" : "mdb01",
      	"version" : 19959,
      	"members" : [
      		{
      			"_id" : 0,
      			"host" : "mdb01d:27018"
      		},
      		{
      			"_id" : 1,
      			"host" : "mdb01:27018"
      		},
      		{
      			"_id" : 2,
      			"host" : "mdb01g:27018"
      		},
      		{
      			"_id" : 3,
      			"host" : "mdb-backup01d:27018",
      			"votes" : 0,
      			"priority" : 0,
      			"hidden" : true
      		}
      	]
      }
      

      Today the secondary nodes of our replica set went out of service. Log from one of the affected nodes:

      Wed Nov 20 21:28:40.921 [conn20] replSet RECOVERING
      Wed Nov 20 21:28:40.921 [conn20] replSet info voting yea for mdb01e:27018 (1)
      Wed Nov 20 21:28:41.055 [conn7] end connection 1.1.1.1:59411 (28 connections now open)
      Wed Nov 20 21:28:41.483 [rsHealthPoll] DBClientCursor::init call() failed
      Wed Nov 20 21:28:41.484 [rsHealthPoll] replset info rty-mdb-backup01d:27018 heartbeat failed, retrying
      Wed Nov 20 21:28:41.485 [rsHealthPoll] replSet info rty-mdb-backup01d:27018 is down (or slow to respond): 
      Wed Nov 20 21:28:41.485 [rsHealthPoll] replSet member rty-mdb-backup01dt:27018 is now in state DOWN
      Wed Nov 20 21:28:41.487 [rsHealthPoll] replSet member rty-mdb01e:27018 is now in state PRIMARY
      Wed Nov 20 21:28:41.488 [rsHealthPoll] DBClientCursor::init call() failed
      Wed Nov 20 21:28:41.488 [rsHealthPoll] replset info rty-mdb01g:27018 heartbeat failed, retrying
      Wed Nov 20 21:28:41.490 [rsHealthPoll] replSet info rty-mdb01g:27018 is down (or slow to respond): 
      Wed Nov 20 21:28:41.490 [rsHealthPoll] replSet member rty-mdb01g:27018 is now in state DOWN
      Wed Nov 20 21:28:42.474 [rsBackgroundSync] replSet syncing to: rty-mdb01e:27018
      Wed Nov 20 21:28:42.475 [rsSync] replSet still syncing, not yet to minValid optime 528ce54b:20
      Wed Nov 20 21:28:42.565 [rsSync] replSet SECONDARY
      Wed Nov 20 21:28:42.648 [repl writer worker 2] ERROR: writer worker caught exception: E11000 duplicate key error index: fotki-bazinga.onetimeJobs.$activeUniqueIdentifier_1  dup key: { : "refreshCounters_{"userId":100544310,"albumId":377350}" } on: { ts: Timestamp 1384965478000|3, h: -6072474236908557566, v: 2, op: "i", ns: "fotki-bazinga.onetimeJobs", o: { _id: { taskId: "refreshCounters", jobId: 1384965478119 }, scheduleTime: new Date(1384965508119), activeUniqueIdentifier: "refreshCounters_{"userId":100544310,"albumId":377350}", parameters: "{"userId":100544310,"albumId":377350}", status: "ready", workers: {}, priority: 20 } }
      Wed Nov 20 21:28:42.648 [repl writer worker 2]   Fatal Assertion 16360
      0xc5e916 0xc23273 0xb14021 0xc30b59 0xc9bbac 0x7f5cb0dd69ca 0x7f5cb017d21d 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xc5e916]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x63) [0xc23273]
       /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x121) [0xb14021]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x279) [0xc30b59]
       /usr/bin/mongod() [0xc9bbac]
       /lib/libpthread.so.0(+0x69ca) [0x7f5cb0dd69ca]
       /lib/libc.so.6(clone+0x6d) [0x7f5cb017d21d]
      Wed Nov 20 21:28:42.651 [repl writer worker 2] 
      
      ***aborting after fassert() failure
      
      
      Wed Nov 20 21:28:42.651 Got signal: 6 (Aborted).
      
      Wed Nov 20 21:28:42.653 Backtrace:
      0xc5e916 0x70c044 0x7f5cb00c7ba0 0x7f5cb00c7b25 0x7f5cb00cb670 0xc232ae 0xb14021 0xc30b59 0xc9bbac 0x7f5cb0dd69ca 0x7f5cb017d21d 
       /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x26) [0xc5e916]
       /usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x3c4) [0x70c044]
       /lib/libc.so.6(+0x33ba0) [0x7f5cb00c7ba0]
       /lib/libc.so.6(gsignal+0x35) [0x7f5cb00c7b25]
       /lib/libc.so.6(abort+0x180) [0x7f5cb00cb670]
       /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x9e) [0xc232ae]
       /usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x121) [0xb14021]
       /usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x279) [0xc30b59]
       /usr/bin/mongod() [0xc9bbac]
       /lib/libpthread.so.0(+0x69ca) [0x7f5cb0dd69ca]
       /lib/libc.so.6(clone+0x6d) [0x7f5cb017d21d]
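
      A note for reading the log above: the two optimes it mentions can be compared once they are converted to the same unit. The sketch below does that in plain JavaScript. It rests on two assumptions about the 2.4 log format that the ticket does not state: that "minValid optime 528ce54b:20" prints the seconds part in hexadecimal, and that "Timestamp 1384965478000|3" prints milliseconds since the epoch followed by an increment. The helper names are hypothetical.

```javascript
// Values copied verbatim from the log excerpt in this ticket.
const minValidOptime = "528ce54b:20";      // from "not yet to minValid optime 528ce54b:20"
const failedOpTs = "1384965478000|3";      // from "ts: Timestamp 1384965478000|3"

// Hypothetical helper. ASSUMPTION: the part before ":" is epoch seconds in hex.
function hexOptimeToSeconds(optime) {
  const [hexSeconds] = optime.split(":");
  return parseInt(hexSeconds, 16);
}

// Hypothetical helper. ASSUMPTION: the part before "|" is epoch milliseconds.
function oplogTsToSeconds(ts) {
  const [millis] = ts.split("|");
  return Math.floor(Number(millis) / 1000);
}

// Render epoch seconds as an ISO 8601 UTC string.
function toIso(seconds) {
  return new Date(seconds * 1000).toISOString();
}

const minValidSec = hexOptimeToSeconds(minValidOptime);
const failedOpSec = oplogTsToSeconds(failedOpTs);

console.log("minValid :", minValidSec, toIso(minValidSec));
console.log("failed op:", failedOpSec, toIso(failedOpSec));
console.log("failed op is", failedOpSec - minValidSec, "s after minValid");
```

      Under those assumptions the rejected insert is 27 seconds newer than minValid, and its seconds value matches the leading digits of the jobId in the same log line (1384965478119).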
      

      Our replica set became read-only because we lost all secondary nodes, which made the service completely unusable.
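
      Why losing the secondaries leaves the set without a writable primary can be sketched from the rs.conf() output in this ticket. A primary needs to see a strict majority of the configured votes; members without an explicit "votes" field have one vote. The following is a minimal illustration in plain JavaScript, not MongoDB's election code. The helper names are hypothetical, and which member stayed up is an assumption made only for the example.

```javascript
// Members as reported by rs.conf() in this ticket.
const members = [
  { _id: 0, host: "mdb01d:27018" },
  { _id: 1, host: "mdb01:27018" },
  { _id: 2, host: "mdb01g:27018" },
  { _id: 3, host: "mdb-backup01d:27018", votes: 0, priority: 0, hidden: true },
];

// Hypothetical helper: votes a member contributes (default is 1).
function votesOf(member) {
  return member.votes === undefined ? 1 : member.votes;
}

// Hypothetical helper: strict majority of all configured votes.
function majorityNeeded(config) {
  const total = config.reduce((sum, m) => sum + votesOf(m), 0);
  return Math.floor(total / 2) + 1;
}

// Hypothetical helper: can a primary be sustained with only these hosts up?
function canHavePrimary(config, reachableHosts) {
  const reachableVotes = config
    .filter((m) => reachableHosts.includes(m.host))
    .reduce((sum, m) => sum + votesOf(m), 0);
  return reachableVotes >= majorityNeeded(config);
}

console.log(majorityNeeded(members));                                  // 2
// ASSUMPTION for illustration: only one voting member is still running.
console.log(canHavePrimary(members, ["mdb01d:27018"]));                // false
console.log(canHavePrimary(members, ["mdb01d:27018", "mdb01:27018"])); // true
```

      With three voting members, two votes are required, so a single surviving voting member cannot remain primary. The hidden backup member has votes set to 0 and does not change the result.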

            Assignee: Matt Dannenberg (matt.dannenberg)
            Reporter: Andrey Godin (airesp)
            Votes: 0
            Watchers: 5
