-
Type: Bug
-
Resolution: Incomplete
-
Priority: Major - P3
-
None
-
Affects Version/s: 3.0.1
-
Component/s: Replication, WiredTiger
-
None
-
ALL
-
I've upgraded my UAT environment to Mongo 3.0.1 with the WiredTiger storage engine & am repeatedly seeing my primary replicaset fallback to RECOVERY mode with the below stacktrace. The "prism-pubsub.intergate" collection mentioned in the stack trace is a capped collection which we're opening a tailable cursor on using the (just forked) mubsub library here: https://github.com/evanlh/mubsub.
2015-04-02T15:52:21.286-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\stacktrace_win.cpp(175) mongo::printStackTrace+0x43 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\log.cpp(135) mongo::logContext+0x97 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\assert_util.cpp(214) mongo::msgasserted+0xd7 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\storage\wiredtiger\wiredtiger_record_store.cpp(330) mongo::WiredTigerRecordStore::dataFor+0xc7 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\catalog\collection.cpp(157) mongo::Collection::docFor+0x52 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\exec\idhack.cpp(153) mongo::IDHackStage::work+0x176 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\query\plan_executor.cpp(328) mongo::PlanExecutor::getNextSnapshotted+0xdd 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\query\plan_executor.cpp(286) mongo::PlanExecutor::getNext+0x41 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\dbhelpers.cpp(151) mongo::Helpers::findOne+0x227 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\dbhelpers.cpp(116) mongo::Helpers::findOne+0x37 2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(583) mongo::repl::`anonymous namespace'::syncFixUp+0x18e7 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(774) mongo::repl::`anonymous namespace'::_syncRollback+0x395 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(827) mongo::repl::syncRollback+0x1f5 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(428) mongo::repl::BackgroundSync::_rollbackIfNeeded+0x590 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(262) mongo::repl::BackgroundSync::produce+0x406 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(200) mongo::repl::BackgroundSync::_producerThread+0x9a 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(153) mongo::repl::BackgroundSync::producerThread+0x86 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185) boost::`anonymous namespace'::thread_start_function+0x21 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] MSVCR120.dll beginthreadex+0x107 2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] MSVCR120.dll endthreadex+0x192 2015-04-02T15:52:21.289-0400 I CONTROL [rsBackgroundSync] kernel32.dll BaseThreadInitThunk+0xd 2015-04-02T15:52:21.289-0400 I CONTROL [rsBackgroundSync] 2015-04-02T15:52:21.486-0400 I REPL [rsBackgroundSync] replSet exception in rollback ns:prism-pubsub.intergate { _id: ObjectId('551d9db163449994281a07b4') } 28556 Didn't find RecordId in WiredTigerRecordStore ndeletes:9 2015-04-02T15:52:21.486-0400 I REPL [rsBackgroundSync] replSet 9 delete and 20 update operations processed out of 40 total operations 2015-04-02T15:52:21.765-0400 I REPL [ReplicationExecutor] replSetElect voting yea for algo-rh01-sdny.itgssi.com:27017 (1) 2015-04-02T15:52:22.379-0400 I REPL [ReplicationExecutor] Member algo-rh01-sdny.itgssi.com:27017 is now in state PRIMARY 2015-04-02T15:52:22.934-0400 I REPL [rsBackgroundSync] rollback 5 d:9 u:30 2015-04-02T15:52:22.934-0400 I REPL [rsBackgroundSync] rollback 6 2015-04-02T15:52:23.075-0400 W REPL [rsBackgroundSync] issues during syncRollback, see log 2015-04-02T15:52:23.075-0400 I REPL [ReplicationExecutor] transition to RECOVERING 2015-04-02T15:52:23.076-0400 I REPL [rsBackgroundSync] rollback finished 2015-04-02T15:52:23.091-0400 I REPL [ReplicationExecutor] syncing from: algo-rh01-sdny.itgssi.com:27017
- is related to
-
SERVER-18006 Shard replicas crash after 3.0.2 upgrade with "Didn't find RecordId in WiredTigerRecordStore"
- Closed