Type: Bug
Resolution: Done
Priority: Major - P3
Affects Version/s: WT2.9.1
Component/s: None
Hi!
We have a very simple workload that sequentially scans one table and writes slightly modified data to another. (It's an upgrade pass due to a small change in the table key encoding.)
The table has 2 indexes with the following sizes (no compression):
$ ls -lh db/persistence/sor_se/si_db/wt/custom_audit*
-rw-r--r-- 1 sbn tbeng 620K Mar 9 10:44 db/persistence/sor_se/si_db/wt/custom_audit_idx-175.wti
-rw-r--r-- 1 sbn tbeng 348K Mar 9 10:44 db/persistence/sor_se/si_db/wt/custom_audit_idx-65.wti
-rw-r--r-- 1 sbn tbeng 57M Mar 9 10:44 db/persistence/sor_se/si_db/wt/custom_audit.wt
I.e. the total size is around 58M.
The cache size is 128M.
In WiredTiger 2.8 this upgrade took 25 sec; in 2.9.1 it seems to be stuck forever, with stacks like:
application thread:
calloc
__wt_calloc
__wt_row_insert_alloc
__wt_row_modify
__split_multi_inmem
__wt_split_rewrite
__evict_page_dirty_update
__wt_evict
__evict_page
__wt_cache_eviction_worker
__wt_cache_eviction_check
__wt_txn_begin
__session_begin_transaction
...
eviction thread:
__wt_row_insert_alloc
__wt_row_modify
__split_multi_inmem
__wt_split_rewrite
__evict_page_dirty_update
__wt_evict
__evict_page
__evict_lru_pages
__evict_pass
__evict_server
__wt_evict_thread_run
__wt_thread_run
start_thread
clone
It is also constantly consuming around 100-150% CPU (as reported by Linux top).
I'm sure this is caused by the change of the eviction settings for dirty data made in 2.9.0.
If I set them back to the values they had in 2.8.0, the workload finishes in 30 sec.
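For reference, these settings can be overridden in the `wiredtiger_open` configuration string. A minimal sketch of what I mean by "setting them back": `eviction_dirty_target` and `eviction_dirty_trigger` are the relevant options, but the percentage values below are illustrative, not the verified 2.8.0 defaults:

```ini
# Sketch of a wiredtiger_open() configuration string.
# eviction_dirty_target / eviction_dirty_trigger are percentages of the
# cache at which eviction starts/aggressively works on dirty data.
# The values shown are illustrative only.
cache_size=128M,eviction_dirty_target=80,eviction_dirty_trigger=95
```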
So some performance degradation for such a write-heavy workload is expected (WT-3089).
But is a complete lock-up expected?
Thanks!