- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 3.0.0-rc7
- Component/s: WiredTiger
- Fully Compatible
- ALL
During performance tests on a standalone instance running on a dedicated Amazon Linux EC2 host (c3.4xlarge), I am seeing extended pauses during a mixed 50/50 update and read run. The pauses seem to correspond to high general write activity, with no corresponding write activity on disk or in the cache.
The benchmark runs a mixed 50/50 workload of updates and reads. The read threads all return in a reasonable time, so read latency stays low, but the update threads appear to be blocked. When updates resume, their reported latency is very high.
388 sec: 10043995 operations; 10020.5 current ops/sec; [UPDATE AverageLatency(us)=2054.55] [READ AverageLatency(us)=1162.3]
390 sec: 10134561 operations; 45260.37 current ops/sec; [UPDATE AverageLatency(us)=492.66] [READ AverageLatency(us)=203.15]
392 sec: 10255505 operations; 60441.78 current ops/sec; [UPDATE AverageLatency(us)=333.89] [READ AverageLatency(us)=183.77]
394 sec: 10329907 operations; 37201 current ops/sec; [UPDATE AverageLatency(us)=546.05] [READ AverageLatency(us)=211.6]
396 sec: 10329907 operations; 0 current ops/sec;
398 sec: 10329907 operations; 0 current ops/sec;
400 sec: 10329907 operations; 0 current ops/sec;
402 sec: 10329907 operations; 0 current ops/sec;
404 sec: 10329907 operations; 0 current ops/sec;
406 sec: 10329907 operations; 0 current ops/sec;
408 sec: 10329907 operations; 0 current ops/sec;
410 sec: 10329907 operations; 0 current ops/sec;
412 sec: 10329907 operations; 0 current ops/sec;
414 sec: 10329907 operations; 0 current ops/sec;
416 sec: 10329907 operations; 0 current ops/sec;
418 sec: 10329907 operations; 0 current ops/sec;
420 sec: 10329907 operations; 0 current ops/sec;
422 sec: 10358050 operations; 14071.5 current ops/sec; [UPDATE AverageLatency(us)=32062.47] [READ AverageLatency(us)=178.26]
424 sec: 10409153 operations; 25538.73 current ops/sec; [UPDATE AverageLatency(us)=1044.98] [READ AverageLatency(us)=193.84]
426 sec: 10520720 operations; 55755.62 current ops/sec; [UPDATE AverageLatency(us)=375.93] [READ AverageLatency(us)=186.22]
428 sec: 10645124 operations; 62202 current ops/sec; [UPDATE AverageLatency(us)=319.98] [READ AverageLatency(us)=183.11]
430 sec: 10759784 operations; 57301.35 current ops/sec; [UPDATE AverageLatency(us)=364] [READ AverageLatency(us)=182.66]
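For anyone trying to reproduce this, the stall windows can be pulled out of the benchmark status output mechanically. The following is only a rough sketch, assuming the status lines keep the `<N> sec: <M> operations; <R> current ops/sec;` shape shown above. The names `STATUS_RE`, `find_stalls`, and `min_intervals` are made up for this illustration and are not part of the benchmark tool.

```python
import re

# Matches status samples such as
# "396 sec: 10329907 operations; 0 current ops/sec;"
# (format assumed from the output pasted in this ticket).
STATUS_RE = re.compile(r"(\d+) sec: (\d+) operations; ([\d.]+) current ops/sec;")


def find_stalls(log_text, min_intervals=2):
    """Return (first_sec, last_sec) for each run of zero-throughput samples.

    A run is reported only if it contains at least `min_intervals`
    consecutive samples with 0 current ops/sec.
    """
    stalls = []
    run = []  # timestamps of the current run of zero-throughput samples
    for match in STATUS_RE.finditer(log_text):
        sec = int(match.group(1))
        ops_per_sec = float(match.group(3))
        if ops_per_sec == 0.0:
            run.append(sec)
            continue
        # Throughput resumed: close out the current run, if it was long enough.
        if len(run) >= min_intervals:
            stalls.append((run[0], run[-1]))
        run = []
    # Handle a stall that is still in progress at the end of the log.
    if len(run) >= min_intervals:
        stalls.append((run[0], run[-1]))
    return stalls


# Short excerpt of the output above, trimmed for the example.
SAMPLE = """\
394 sec: 10329907 operations; 37201 current ops/sec; [UPDATE AverageLatency(us)=546.05] [READ AverageLatency(us)=211.6]
396 sec: 10329907 operations; 0 current ops/sec;
398 sec: 10329907 operations; 0 current ops/sec;
400 sec: 10329907 operations; 0 current ops/sec;
422 sec: 10358050 operations; 14071.5 current ops/sec; [UPDATE AverageLatency(us)=32062.47] [READ AverageLatency(us)=178.26]
"""

print(find_stalls(SAMPLE))  # [(396, 400)]
```

Because it uses `finditer` rather than splitting on newlines, the same function should also work on output that has been pasted as a single line.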
The metrics show a large plateau in general write activity, but there is no noticeable uptick in write activity in WiredTiger, on disk (xvdb is the data-only disk), or in the cache. I have tested with the cache both full and not full, with the same result.
This appears similar to SERVER-16269 and SERVER-16662, and is the same as what I was seeing in CAP-1822.
This was actually tested with the b70d96e build (post rc7), but has been showing up in multiple RCs.
- is related to
  - SERVER-17907 B-tree eviction blocks access to collection for extended period under WiredTiger (Closed)
  - SERVER-16575 intermittent slow inserts with WiredTiger b-tree (Closed)
  - SERVER-16790 Lengthy pauses associated with checkpoints under WiredTiger (Closed)
  - SERVER-16938 60-second stall between checkpoints under WiredTiger (Closed)
- related to
  - SERVER-17194 Low Throughput for YCSB 50-50 workload with high client threads (Closed)