- Type: Bug
- Resolution: Done
- Priority: Major - P3
- None
- Affects Version/s: 3.2.9
- Component/s: WiredTiger
- None
- ALL
In a two-shard MongoDB cluster, one primary's WiredTiger cache usage stays at about 90%.
After examining the stack trace, I found that the eviction thread never sleeps and consumes one CPU core all the time.
# top
top - 14:47:31 up 87 days,  3:19,  1 user,  load average: 1.23, 1.26, 1.22
Tasks: 683 total,   1 running, 682 sleeping,   0 stopped,   0 zombie
%Cpu0  :  1.0 us,  1.0 sy,  0.0 ni, 98.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
...
%Cpu10 :  1.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st  <==
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
...
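For anyone who wants to spot the saturated core in per-CPU top output without reading it by eye, here is a minimal Python sketch. It assumes the `%CpuN : ... us, ... id` line layout shown above; the function name `busy_cores` and the 5% idle cutoff are hypothetical choices, not anything taken from top or MongoDB.

```python
import re

# Matches per-core lines such as "%Cpu11 :100.0 us, ... 0.0 id, ..."
# and captures the core number, user time, and idle time.
CPU_LINE = re.compile(r"%Cpu(\d+)\s*:\s*([\d.]+)\s+us,.*?([\d.]+)\s+id")

def busy_cores(top_text, idle_below=5.0):
    """Return (core, user_pct) pairs for cores whose idle time is below the cutoff."""
    busy = []
    for match in CPU_LINE.finditer(top_text):
        core = int(match.group(1))
        user_pct = float(match.group(2))
        idle_pct = float(match.group(3))
        if idle_pct < idle_below:
            busy.append((core, user_pct))
    return busy

# Sample lines copied from the top output in this ticket.
SAMPLE_TOP = """\
%Cpu0  :  1.0 us,  1.0 sy,  0.0 ni, 98.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :  1.0 us,  0.0 sy,  0.0 ni, 99.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
"""

print(busy_cores(SAMPLE_TOP))
```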
We ran a read-heavy query workload (about 27,000 docs x 30 times) against both primary members.
One primary is fine, but the other is not and keeps consuming one CPU core evicting pages.
It looks like cache usage never drops to a stable level (around 80~85%), so the eviction thread never stops scanning pages. I don't know why cache usage never drops to a stable level.
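The comparison between cache usage and the eviction thresholds can be sketched roughly as follows. This is only an illustration in Python: it assumes the `wiredTiger.cache` section of `db.serverStatus()` exposes the "bytes currently in the cache" and "maximum bytes configured" counters, and that WiredTiger is running with its default `eviction_target` of 80% and `eviction_trigger` of 95%. The function name, state labels, and sample sizes are hypothetical.

```python
def cache_fill_state(cache_stats, eviction_target=80.0, eviction_trigger=95.0):
    """Classify cache usage relative to the (assumed default) eviction thresholds."""
    used = cache_stats["bytes currently in the cache"]
    limit = cache_stats["maximum bytes configured"]
    pct = 100.0 * used / limit
    if pct >= eviction_trigger:
        state = "at_or_above_trigger"   # eviction pressure is at its highest
    elif pct > eviction_target:
        state = "above_target"          # eviction keeps working to shrink the cache
    else:
        state = "at_or_below_target"    # eviction has no reason to keep scanning
    return pct, state

# Hypothetical sample: 9 GiB used out of a 10 GiB cache, i.e. about 90% as in this ticket.
sample = {
    "bytes currently in the cache": 9 * 1024**3,
    "maximum bytes configured": 10 * 1024**3,
}
pct, state = cache_fill_state(sample)
print(f"{pct:.1f}% -> {state}")
```

With usage stuck near 90%, the cache stays in the "above_target" band, which would be consistent with an eviction thread that never goes idle.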
WiredTiger statistics report that a lot of blocks are being read into the WiredTiger cache (270MB per 10 seconds).
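For reference, the read-in rate can be derived from two samples of a cumulative counter. The Python sketch below assumes the counter is the "bytes read into cache" statistic from `db.serverStatus().wiredTiger.cache`, sampled 10 seconds apart, and treats 1MB as 1024 * 1024 bytes. The function name and the sample values are hypothetical and only reproduce the 270MB figure above.

```python
def read_in_rate_mb_per_sec(bytes_before, bytes_after, interval_sec):
    """Average read-into-cache rate in MB/s between two counter samples."""
    return (bytes_after - bytes_before) / interval_sec / (1024 * 1024)

# Hypothetical samples: the counter grew by 270MB over a 10-second window.
before = 0
after = 270 * 1024 * 1024
rate = read_in_rate_mb_per_sec(before, after, 10)
print(f"{rate:.1f} MB/s read into cache")
```

So 270MB per 10 seconds works out to about 27MB per second being read into the cache.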
The odd thing is that there are no disk reads, no major faults, and not many minor faults on either primary server. All system metrics (except CPU) are almost the same as on the other (stable) primary.
According to the stack trace, one thread is in "__tree_walk_internal()"; actually there are two such threads, and they take turns consuming one CPU core.