Since the changes in WT-3009, test/format runs that use LSM have hit a number of stuck cache aborts.
These reproduce fairly quickly (in under 50 runs) on Linux with configs such as the following:
############################################
# RUN PARAMETERS
############################################
abort=0
auto_throttle=1
backups=0
bitcnt=6
bloom=1
bloom_bit_count=45
bloom_hash_count=31
bloom_oldest=0
cache=30
checkpoints=1
checksum=uncompressed
chunk_size=1
compaction=0
compression=zlib
data_extend=0
data_source=lsm
delete_pct=14
dictionary=0
direct_io=0
encryption=none
evict_max=4
file_type=row-store
firstfit=0
huffman_key=0
huffman_value=0
in_memory=0
insert_pct=73
internal_key_truncation=0
internal_page_max=10
isolation=random
key_gap=12
key_max=64
key_min=26
leaf_page_max=17
leak_memory=0
logging=1
logging_archive=0
logging_compression=none
logging_prealloc=0
long_running_txn=0
lsm_worker_threads=4
merge_max=17
mmap=1
ops=100000
prefix_compression=1
prefix_compression_min=6
quiet=1
repeat_data_pct=29
reverse=0
rows=100000
runs=1
rebalance=1
salvage=1
split_pct=85
statistics=1
statistics_server=0
threads=11
timer=20
transaction-frequency=36
value_max=1638
value_min=15
verify=1
wiredtiger_config=
write_pct=42
############################################
One solution is to adjust the eviction trigger setting that was changed in WT-3009. The more correct option is likely to change how dirty page accounting works in LSM. Currently, dirty pages on the primary LSM chunk are counted towards the dirty page total. Because these dirty pages are fully expected, capped in size, and dealt with by LSM merges, they can potentially be removed from the count.
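The accounting change proposed above can be illustrated with a small model. This is only a sketch, not WiredTiger code: the `Page` class, the `lsm_primary` flag, the 20% dirty trigger, and the page sizes are all hypothetical values chosen for illustration. Only the 30MB cache size is taken from the config above (`cache=30`).

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class Page:
    size: int                  # in-memory footprint in bytes
    dirty: bool
    lsm_primary: bool = False  # hypothetical flag: page is in the primary LSM chunk


def dirty_bytes(pages, exclude_lsm_primary=False):
    """Sum dirty bytes, optionally skipping pages in the primary LSM chunk."""
    total = 0
    for page in pages:
        if not page.dirty:
            continue
        if exclude_lsm_primary and page.lsm_primary:
            continue
        total += page.size
    return total


def dirty_trigger_reached(pages, cache_size, trigger_pct, exclude_lsm_primary=False):
    """True when counted dirty bytes reach trigger_pct percent of the cache."""
    return dirty_bytes(pages, exclude_lsm_primary) * 100 >= cache_size * trigger_pct


# Hypothetical cache contents: a 5MB dirty primary chunk, 2MB of other
# dirty data, and 10MB of clean pages, in a 30MB cache.
pages = [
    Page(size=5 * MB, dirty=True, lsm_primary=True),
    Page(size=2 * MB, dirty=True),
    Page(size=10 * MB, dirty=False),
]
cache_size = 30 * MB
trigger_pct = 20  # assumed value for illustration only

current = dirty_trigger_reached(pages, cache_size, trigger_pct)
proposed = dirty_trigger_reached(pages, cache_size, trigger_pct, exclude_lsm_primary=True)
print(current, proposed)
```

In this model the primary chunk alone accounts for most of the counted dirty data, so with the current accounting the trigger is reached even though eviction cannot usefully clean those pages. Excluding them leaves only the 2MB of other dirty data in the count.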