-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
8
-
Storage Engines 2019-09-09, Storage Engines 2019-09-23, Storage Engines 2019-10-07
We encountered a cache stuck with clean page failure in the test/format stress sanitiser job. Please note timestamp was disabled in the test. We had WT-4608 and WT-3694 created to track LSM cache stuck clean failures. This case is likely the 1st time we see the btree cache stuck with clean pages failure.
http://build.wiredtiger.com:8080/job/wiredtiger-test-format-stress-sanitizer/23878/
[1554809849:456697][2330:0x7fb095f0b700], t, eviction-server: __evict_server, 451: Cache stuck for too long, giving up: Connection timed out t: process 2330 0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= 0x7fb09f590488:transaction state dump 0x7fb09f590488:current ID: 4647 0x7fb09f590488:last running ID: 4647 0x7fb09f590488:metadata_pinned ID: 4370 0x7fb09f590488:oldest ID: 4370 0x7fb09f590488:commit timestamp: (0,0) 0x7fb09f590488:oldest timestamp: (0,0) 0x7fb09f590488:pinned timestamp: (0,0) 0x7fb09f590488:stable timestamp: (0,0) 0x7fb09f590488:has_commit_timestamp: no 0x7fb09f590488:has_oldest_timestamp: no 0x7fb09f590488:has_pinned_timestamp: no 0x7fb09f590488:has_stable_timestamp: no 0x7fb09f590488:oldest_is_pinned: no 0x7fb09f590488:stable_is_pinned: no 0x7fb09f590488:checkpoint running: no 0x7fb09f590488:checkpoint generation: 18 0x7fb09f590488:checkpoint pinned ID: 0 0x7fb09f590488:checkpoint txn ID: 0 0x7fb09f590488:oldest named snapshot ID: 0 0x7fb09f590488:session count: 36 0x7fb09f590488:Transaction state of active sessions: 0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev 0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4495, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT 0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.prev 0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED 0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next 0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4504, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED 0x7fb09f590488:ID: 0, pinned ID: 4370, metadata pinned ID: 0, name: WT_CURSOR.search 0x7fb09f590488:mod count: 0, snap min: 4370, snap max: 4375, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00004008, isolation: WT_ISO_SNAPSHOT 0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next 0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED 0x7fb09f590488:ID: 0, pinned ID: 4491, metadata pinned ID: 0, name: WT_CURSOR.next 0x7fb09f590488:mod count: 0, snap min: 4491, snap max: 4503, commit_timestamp: (0,0), durable_timestamp: (0,0), first_commit_timestamp: (0,0), read_timestamp: (0,0), flags: 0x00000008, isolation: WT_ISO_READ_COMMITTED 0x7fb09f590488:=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= 0x7fb09f590488:cache dump 0x7fb09f590488:cache full: no 0x7fb09f590488:cache clean check: yes (97.548%) 0x7fb09f590488:cache dirty check: no (0.091%) 0x7fb09f590488:file:wt.wt(checkpoint=WiredTigerCheckpoint.1): 0x7fb09f590488:internal: 3 pages, 0MB, 3/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:leaf: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:file:wt.wt(<live>): 0x7fb09f590488:internal: 64 pages, 24MB, 64/0 clean/dirty pages, 24/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:leaf: 4 pages, 0MB, 4/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:file:WiredTigerLAS.wt(<live>): 0x7fb09f590488:internal: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:leaf: 0 pages 0x7fb09f590488:file:WiredTiger.wt(<live>): 0x7fb09f590488:internal: 1 pages, 0MB, 1/0 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:leaf: 1 pages, 0MB, 0/1 clean/dirty pages, 0/0 clean/dirty MB, 0MB max page, 0MB max dirty page 0x7fb09f590488:cache dump: total found: 27MB vs tracked inuse 27MB 0x7fb09f590488:total dirty bytes: 0MB [1554809849:462413][2330:0x7fb095f0b700], t, eviction-server: __wt_evict_thread_run, 321: cache eviction thread error: Connection timed out [1554809849:462425][2330:0x7fb095f0b700], t, eviction-server: __wt_panic, 520: the process must exit and restart: WT_PANIC: WiredTiger library panic [1554809849:462430][2330:0x7fb095f0b700], t, eviction-server: __wt_abort, 30: aborting WiredTiger library /tmp/jenkins5083747742990501913.sh: line 24: 2330 Aborted nice ./t -1 -c ../../../test/format/CONFIG.stress + cleanup + status=134
The configuration:
############################################ # RUN PARAMETERS ############################################ abort=0 alter=0 auto_throttle=1 backups=1 bitcnt=8 bloom=1 bloom_bit_count=4 bloom_hash_count=32 bloom_oldest=0 cache=28 cache_minimum=20 checkpoints=on checkpoint_log_size=44 checkpoint_wait=100 checksum=uncompressed chunk_size=4 compaction=0 compression=none data_extend=0 data_source=table delete_pct=37 dictionary=0 direct_io=0 encryption=rotn-7 evict_max=0 file_type=row-store firstfit=0 huffman_key=0 huffman_value=0 independent_thread_rng=1 in_memory=0 insert_pct=0 internal_key_truncation=1 internal_page_max=16 isolation=snapshot key_gap=16 key_max=79 key_min=18 leaf_page_max=9 leak_memory=0 logging=1 logging_archive=1 logging_compression=zlib logging_file_max=416851 logging_prealloc=1 long_running_txn=0 lsm_worker_threads=4 memory_page_max=1 merge_max=10 mmap=1 modify_pct=17 ops=0 prefix_compression=1 prefix_compression_min=7 prepare=0 quiet=1 read_pct=45 rebalance=1 repeat_data_pct=40 reverse=0 rows=1000000 runs=1 salvage=1 split_pct=75 statistics=0 statistics_server=1 threads=14 timer=4 timing_stress_aggressive_sweep=0 timing_stress_checkpoint=0 timing_stress_lookaside_sweep=0 timing_stress_split_1=0 timing_stress_split_2=0 timing_stress_split_3=0 timing_stress_split_4=0 timing_stress_split_5=0 timing_stress_split_6=0 timing_stress_split_7=0 timing_stress_split_8=0 transaction_timestamps=0 transaction-frequency=94 truncate=1 value_max=1148 value_min=17 verify=1 wiredtiger_config= write_pct=1 ############################################
- causes
-
WT-5082 Application threads are tasked with eviction even when pinning the oldest transaction ID
- Closed
-
WT-5149 Clear the debugging field value so it cannot get stale
- Closed
- depends on
-
WT-4949 Spike: Cache stuck with clean pages with timestamp disabled format test
- Closed
- is related to
-
WT-5043 Add debugging to aid in test/format hang
- Closed
- related to
-
WT-5142 Don't create huge root pages when rebalancing or bulk-loading objects with overflow keys
- Closed
-
WT-5330 Read-only transactions are being rolled back in MongoDB
- Closed
-
WT-5266 application threads are tasked with eviction even when pinning the oldest transaction ID
- Closed