- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: WiredTiger
- Fully Compatible
- ALL
ISSUE SUMMARY
MongoDB running with the WiredTiger storage engine, under high load with append-only workloads and no reads, may fail to find pages to evict from cache and hang.
USER IMPACT
mongod keeps running but becomes unresponsive.
WORKAROUNDS
Once the process becomes stuck, mongod must be restarted.
AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.6
FIX VERSION
The fix is included in the 3.0.7 production release.
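As a rough illustration of the version range listed above, the following sketch checks whether a version string falls between 3.0.0 and 3.0.6 inclusive. The helper names (`parse_version`, `in_affected_range`) are hypothetical and the parsing is deliberately simple: it assumes a plain `X.Y.Z` string with an optional suffix after a hyphen.

```python
# Hypothetical helpers; the bounds come from the AFFECTED VERSIONS field above.
FIRST_AFFECTED = (3, 0, 0)
LAST_AFFECTED = (3, 0, 6)


def parse_version(text):
    """Turn 'X.Y.Z' (optionally followed by a suffix such as '-pre-') into a tuple of ints."""
    core = text.split("-", 1)[0]  # drop anything after the first hyphen
    return tuple(int(part) for part in core.split("."))


def in_affected_range(version_text):
    """Return True if the version lies in the listed range 3.0.0 through 3.0.6."""
    return FIRST_AFFECTED <= parse_version(version_text) <= LAST_AFFECTED


for candidate in ("3.0.0", "3.0.6", "3.0.7", "3.1.7-pre-"):
    print(candidate, in_affected_range(candidate))
```

The check only mirrors the listed release range. A development build such as the 3.1.7-pre- build named in the report is outside that range by number, so a negative result here does not by itself show that a build is unaffected.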
Configuration:
3-member replica set
db version v3.1.7-pre-
git version: 4cf56d86a386039839dc10bb761bd28c829be426
Two problems:
1) The primary node is up and running but cannot perform any CRUD operations (mongostat and operations such as db. . insert({}) hang); however, failover did not occur.
2) WiredTiger executes an endless loop in __wt_tree_walk and holds up CRUD operations, with no timeout or watchdog for robustness (see the debugger output for the lock owner).
0:460> !cs -l
-----------------------------------------
DebugInfo          = 0x000000de00a25740
Critical section   = 0x000000de7fc780c0 (+0xDE7FC780C0)
LOCKED
LockCount          = 0x0
WaiterWoken        = No
OwningThread       = 0x000000000000097c
RecursionCount     = 0x1
LockSemaphore      = 0xD8C
SpinCount          = 0x0000000000000fa0

   2  Id: 11d4.97c Suspend: 1 Teb: 00007ff7`4fe68000 Unfrozen
Child-SP          RetAddr           Call Site
000000de`01dafc30 00007ff7`51567749 mongod!__wt_tree_walk+0x1a8 [c:\data\mci\src\src\third_party\wiredtiger\src\btree\bt_walk.c @ 243]
000000de`01dafcc0 00007ff7`515672e7 mongod!__evict_walk_file+0x329 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 1154]
000000de`01dafd60 00007ff7`51566764 mongod!__evict_walk+0x2b7 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 1032]
000000de`01dafdf0 00007ff7`51566d5b mongod!__evict_lru_walk+0x24 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 789]
000000de`01dafe20 00007ff7`51566f58 mongod!__evict_pass+0x25b [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 502]
000000de`01dafe80 00007ffb`5e534f7f mongod!__evict_server+0x38 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 169]
000000de`01dafeb0 00007ffb`5e535126 MSVCR120!beginthreadex+0x107
000000de`01dafee0 00007ffb`6d3f15dd MSVCR120!endthreadex+0x192
000000de`01daff10 00007ffb`6d7343d1 KERNEL32!BaseThreadInitThunk+0xd
000000de`01daff40 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
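To illustrate the kind of timeout the report asks for, here is a minimal client-side sketch that runs an operation in a background thread and reports it as hung if it does not finish before a deadline. This is not how mongod or WiredTiger handle the situation; `probe` and the `operation` callable are hypothetical names, and the callable merely stands in for something like an insert against the primary.

```python
import threading
import time


def probe(operation, timeout_s):
    """Run operation() in a daemon thread; return 'ok', 'error', or 'hung'."""
    outcome = {}
    done = threading.Event()

    def runner():
        try:
            operation()
            outcome["status"] = "ok"
        except Exception as exc:  # record the failure instead of losing it in the thread
            outcome["status"] = "error"
            outcome["error"] = exc
        finally:
            done.set()

    # A daemon thread does not keep the process alive if the operation never returns.
    threading.Thread(target=runner, daemon=True).start()
    if not done.wait(timeout_s):
        return "hung"
    return outcome["status"]


# Stand-ins for real database calls (assumptions for illustration only).
print(probe(lambda: None, timeout_s=1.0))
print(probe(lambda: time.sleep(0.5), timeout_s=0.05))
```

The probe only detects the stall from the outside; the stuck operation itself keeps running in its thread, which matches the report's observation that the server-side work is never abandoned.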
RS.Status
EitanRs3a:PRIMARY> rs.status()
{
    "set" : "EitanRs3a",
    "date" : ISODate("2015-08-18T14:45:29.611Z"),
    "myState" : 1,
    "term" : NumberLong(0),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "eitan5:5002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 66715,
            "optime" : Timestamp(1439894846, 12455),
            "optimeDate" : ISODate("2015-08-18T10:47:26Z"),
            "electionTime" : Timestamp(1439842421, 2),
            "electionDate" : ISODate("2015-08-17T20:13:41Z"),
            "configVersion" : 3,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "Eitan1:5002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 66704,
            "optime" : Timestamp(1439894723, 4673),
            "optimeDate" : ISODate("2015-08-18T10:45:23Z"),
            "lastHeartbeat" : ISODate("2015-08-18T14:45:29.030Z"),
            "lastHeartbeatRecv" : ISODate("2015-08-18T14:45:28.841Z"),
            "pingMs" : 2,
            "electionTime" : Timestamp(1439906145, 1),
            "electionDate" : ISODate("2015-08-18T13:55:45Z"),
            "configVersion" : 3
        },
        {
            "_id" : 2,
            "name" : "Eitan6:5002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 66663,
            "optime" : Timestamp(1439893521, 7147),
            "optimeDate" : ISODate("2015-08-18T10:25:21Z"),
            "lastHeartbeat" : ISODate("2015-08-18T14:45:29.041Z"),
            "lastHeartbeatRecv" : ISODate("2015-08-18T14:45:28.844Z"),
            "pingMs" : 1,
            "syncingTo" : "eitan5:5002",
            "configVersion" : 3
        }
    ],
    "ok" : 1
}
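The status output above can be read mechanically. The sketch below copies only the relevant fields into a plain Python dictionary (milliseconds dropped from the status date) and reports which members claim to be PRIMARY and how far each member's optimeDate trails the status date. The function names are hypothetical; this is an illustration of reading the output, not a MongoDB API.

```python
from datetime import datetime, timezone


def utc(hour, minute, second):
    """Build a UTC timestamp on 2015-08-18, the date of the status output."""
    return datetime(2015, 8, 18, hour, minute, second, tzinfo=timezone.utc)


# Fields copied by hand from the rs.status() output; milliseconds are dropped.
status = {
    "date": utc(14, 45, 29),
    "members": [
        {"name": "eitan5:5002", "stateStr": "PRIMARY", "optimeDate": utc(10, 47, 26)},
        {"name": "Eitan1:5002", "stateStr": "PRIMARY", "optimeDate": utc(10, 45, 23)},
        {"name": "Eitan6:5002", "stateStr": "SECONDARY", "optimeDate": utc(10, 25, 21)},
    ],
}


def primaries(status_doc):
    """Names of all members reported in the PRIMARY state."""
    return [m["name"] for m in status_doc["members"] if m["stateStr"] == "PRIMARY"]


def optime_lag_seconds(status_doc):
    """Seconds between the status date and each member's last applied operation."""
    return {
        m["name"]: int((status_doc["date"] - m["optimeDate"]).total_seconds())
        for m in status_doc["members"]
    }


print(primaries(status))
print(optime_lag_seconds(status))
```

Read this way, the output shows two members reported as PRIMARY at the same time, and every member's last applied operation is several hours older than the status date, which is consistent with writes having stalled.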
- is depended on by
- WT-1973 MongoDB changes for WiredTiger 2.7.0 - Closed