-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
One of the stress test of WT-6491 reports a history store key out of order issue:
[2020/07/16 00:17:33.221] format.sh: job in /data/mci/c3724b2883abef6c9acbd2a8b46fe64a/wiredtiger/test/format/RUNDIR.28 failed [2020/07/16 00:17:33.228] t: process 28252 running [2020/07/16 00:17:33.228] [1594858646:235582][28252:0x7f24067f7700], file:WiredTigerHS.wt, WT_CURSOR.prev: __cursor_key_order_check_row, 485: WT_CURSOR.prev out-of-order returns: returned key 2,0000038218.00/opqrstuvwxyza,0,0 then key 2,0000849217.00/opqrstuvwxyzabc,0,0: Invalid argument [2020/07/16 00:17:33.228] [1594858646:235643][28252:0x7f24067f7700], file:WiredTigerHS.wt, WT_CURSOR.prev: __cursor_key_order_check_row, 485: the process must exit and restart: WT_PANIC: WiredTiger library panic [2020/07/16 00:17:33.228] [1594858646:235662][28252:0x7f24067f7700], file:WiredTigerHS.wt, WT_CURSOR.prev: __wt_abort, 28: aborting WiredTiger library [2020/07/16 00:17:33.230] format.sh: /data/mci/c3724b2883abef6c9acbd2a8b46fe64a/wiredtiger/test/format/RUNDIR.28/CONFIG: [2020/07/16 00:17:33.235] ############################################ [2020/07/16 00:17:33.235] # RUN PARAMETERS: V2 [2020/07/16 00:17:33.235] ############################################ [2020/07/16 00:17:33.235] assert.commit_timestamp=1 [2020/07/16 00:17:33.235] assert.read_timestamp=0 [2020/07/16 00:17:33.235] backup=0 [2020/07/16 00:17:33.235] backup.incremental=off [2020/07/16 00:17:33.235] backup.incr_granularity=9326 [2020/07/16 00:17:33.235] btree.bitcnt=1 [2020/07/16 00:17:33.235] btree.compression=zlib [2020/07/16 00:17:33.235] btree.dictionary=0 [2020/07/16 00:17:33.235] btree.huffman_key=0 [2020/07/16 00:17:33.235] btree.huffman_value=0 [2020/07/16 00:17:33.235] btree.internal_key_truncation=1 [2020/07/16 00:17:33.235] btree.internal_page_max=11 [2020/07/16 00:17:33.235] btree.key_gap=9 [2020/07/16 00:17:33.235] btree.key_max=29 [2020/07/16 00:17:33.235] btree.key_min=18 [2020/07/16 00:17:33.235] btree.leaf_page_max=9 [2020/07/16 00:17:33.235] btree.memory_page_max=8 [2020/07/16 00:17:33.235] btree.prefix_compression=1 [2020/07/16 00:17:33.235] btree.prefix_compression_min=7 [2020/07/16 00:17:33.235] btree.repeat_data_pct=21 [2020/07/16 00:17:33.235] btree.reverse=0 [2020/07/16 00:17:33.235] btree.split_pct=87 [2020/07/16 00:17:33.235] btree.value_max=607 [2020/07/16 00:17:33.235] btree.value_min=13 [2020/07/16 00:17:33.235] cache=80 [2020/07/16 00:17:33.235] cache.evict_max=5 [2020/07/16 00:17:33.235] cache.minimum=20 [2020/07/16 00:17:33.235] checkpoint=on [2020/07/16 00:17:33.235] checkpoint.log_size=67 [2020/07/16 00:17:33.235] checkpoint.wait=19 [2020/07/16 00:17:33.235] disk.checksum=off [2020/07/16 00:17:33.235] disk.data_extend=0 [2020/07/16 00:17:33.235] disk.direct_io=0 [2020/07/16 00:17:33.235] disk.encryption=none [2020/07/16 00:17:33.236] disk.firstfit=0 [2020/07/16 00:17:33.236] disk.mmap=0 [2020/07/16 00:17:33.236] disk.mmap_all=0 [2020/07/16 00:17:33.236] format.abort=0 [2020/07/16 00:17:33.236] format.independent_thread_rng=1 [2020/07/16 00:17:33.236] format.major_timeout=0 [2020/07/16 00:17:33.236] logging=1 [2020/07/16 00:17:33.236] logging.archive=0 [2020/07/16 00:17:33.236] logging.compression=none [2020/07/16 00:17:33.236] logging.file_max=286135 [2020/07/16 00:17:33.236] logging.prealloc=1 [2020/07/16 00:17:33.236] lsm.auto_throttle=1 [2020/07/16 00:17:33.236] lsm.bloom=1 [2020/07/16 00:17:33.236] lsm.bloom_bit_count=24 [2020/07/16 00:17:33.236] lsm.bloom_hash_count=12 [2020/07/16 00:17:33.236] lsm.bloom_oldest=1 [2020/07/16 00:17:33.236] lsm.chunk_size=4 [2020/07/16 00:17:33.236] lsm.merge_max=16 [2020/07/16 00:17:33.236] lsm.worker_threads=4 [2020/07/16 00:17:33.236] ops.alter=0 [2020/07/16 00:17:33.236] ops.compaction=0 [2020/07/16 00:17:33.236] ops.hs_cursor=1 [2020/07/16 00:17:33.236] ops.pct.delete=7 [2020/07/16 00:17:33.236] ops.pct.insert=11 [2020/07/16 00:17:33.236] ops.pct.modify=9 [2020/07/16 00:17:33.236] ops.pct.read=24 [2020/07/16 00:17:33.236] ops.pct.write=49 [2020/07/16 00:17:33.236] ops.prepare=0 [2020/07/16 00:17:33.236] ops.random_cursor=0 [2020/07/16 00:17:33.236] ops.rebalance=0 [2020/07/16 00:17:33.236] ops.salvage=1 [2020/07/16 00:17:33.236] ops.truncate=1 [2020/07/16 00:17:33.236] ops.verify=1 [2020/07/16 00:17:33.236] quiet=1 [2020/07/16 00:17:33.236] runs=1 [2020/07/16 00:17:33.236] runs.in_memory=0 [2020/07/16 00:17:33.236] runs.ops=0 [2020/07/16 00:17:33.236] runs.rows=1000000 [2020/07/16 00:17:33.236] runs.source=table [2020/07/16 00:17:33.236] runs.threads=4 [2020/07/16 00:17:33.236] runs.timer=6 [2020/07/16 00:17:33.236] runs.type=row-store [2020/07/16 00:17:33.236] runs.verify_failure_dump=0 [2020/07/16 00:17:33.236] statistics=0 [2020/07/16 00:17:33.236] statistics.server=0 [2020/07/16 00:17:33.236] stress.aggressive_sweep=0 [2020/07/16 00:17:33.236] stress.checkpoint=0 [2020/07/16 00:17:33.236] stress.hs_checkpoint_delay=0 [2020/07/16 00:17:33.236] stress.hs_sweep=0 [2020/07/16 00:17:33.236] stress.split_1=1 [2020/07/16 00:17:33.236] stress.split_2=1 [2020/07/16 00:17:33.236] stress.split_3=1 [2020/07/16 00:17:33.236] stress.split_4=0 [2020/07/16 00:17:33.236] stress.split_5=1 [2020/07/16 00:17:33.236] stress.split_6=0 [2020/07/16 00:17:33.236] stress.split_7=0 [2020/07/16 00:17:33.236] stress.split_8=0 [2020/07/16 00:17:33.236] transaction.frequency=100 [2020/07/16 00:17:33.236] transaction.isolation=snapshot [2020/07/16 00:17:33.236] transaction.rollback_to_stable=0 [2020/07/16 00:17:33.236] transaction.timestamps=1 [2020/07/16 00:17:33.236] wiredtiger.config= [2020/07/16 00:17:33.236] wiredtiger.rwlock=1 [2020/07/16 00:17:33.236] wiredtiger.leak_memory=0 [2020/07/16 00:17:33.236] ############################################ [2020/07/16 00:17:35.278] format.sh: 25 successful jobs, 1 failed jobs
Run recovery and then dump the history store, the keys reported by the error are no longer there presumably removed by rollback to stable. However, there are a lot of out of order keys in the dumped file.
WiredTiger Dump (WiredTiger Version 10.0.0) Format=print Header file:WiredTigerHS.wt access_pattern_hint=none,allocation_size=4KB,app_metadata=,assert=(commit_timestamp=none,durable_timestamp=none,read_timestamp=none),block_allocation=best,block_compressor=snappy,cache_resident=false,checksum=uncompressed,colgroups=,collator=,columns=,dictionary=0,encryption=(keyid=,name=),exclusive=false,extractor=,format=btree,huffman_key=,huffman_value=,ignore_in_memory_cache_size=false,immutable=false,internal_item_max=0,internal_key_max=0,internal_key_truncate=true,internal_page_max=4KB,key_format=IuQQ,key_gap=10,leaf_item_max=0,leaf_key_max=0,leaf_page_max=32KB,leaf_value_max=64MB,log=(enabled=true),lsm=(auto_throttle=true,bloom=true,bloom_bit_count=16,bloom_config=,bloom_hash_count=8,bloom_oldest=false,chunk_count_limit=0,chunk_max=5GB,chunk_size=10MB,merge_custom=(prefix=,start_generation=0,suffix=),merge_max=15,merge_min=0),memory_page_image_max=0,memory_page_max=5MB,os_cache_dirty_max=0,os_cache_max=0,prefix_compression=false,prefix_compression_min=4,source=,split_deepen_min_child=0,split_deepen_per_child=0,split_pct=90,type=file,value_format=QQQu Data 2,0000299183.00/opqr,0,0 105011,0,3,0000299183/LMNOPQRSTUVWXYZAB 2,0000299601.00/opqr,0,0 110229,0,3,0000299601/LMNOPQ 2,0000297465.00/opqrs,0,0 104687,0,3,0000297465/LMNOPQRSTUVWXYZABC 2,0000310681.00/opqrs,0,0 106570,0,3,0000310681/LMNOPQRSTUVW 2,0000321322.00/opqrs,0,0 106189,0,3,0000321322/LMNOPQRS 2,0000327999.00/opqrs,0,0 106150,0,3,0000327999/LMNO 2,0000328455.00/opqrs,0,0 105786,0,3,0000328455/LMNOPQRS 2,0000299074.00/opqrst,0,0 107038,0,3,0000299074/LMNOPQRST 2,0000307802.00/opqrst,0,0 104777,0,3,0000307802/LMNOPQ 2,0000314760.00/opqrst,0,0 106698,0,3,0000314760/LMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVW 2,0000318097.00/opqrst,0,0 105110,0,3,0000318097/LMNOPQRSTUVWXYZABCDE 2,0000330649.00/opqrstu,0,0 105755,0,3,0000330649/LMNOPQRSTUVWXY 2,0000299596.00/opqrstuv,0,0 112399,0,3,0000299596/LMNOPQRSTU 2,0000308804.00/opqrstuv,0,0 105170,0,3,0000308804/LMNOPQRSTUVWXYZ 2,0000315705.00/opqrstuv,0,0 105604,0,3,0000315705/LMNOP 2,0000323661.00/opqrstuv,0,0 105405,0,3,0000323661/LMNOPQRSTUVWX 2,0000328895.00/opqrstuv,0,0 106394,0,3,0000328895/LMNOP 2,0000299417.00/opqrstuvw,78427,0 107673,78427,3,0000299417/LMNOPQRSTU 2,0000299064.00/opqrstuvwx,0,0 110600,0,3,0000299064/LMNOP 2,0000324768.00/opqrstuvwx,0,0
I don't think WT-6491 can cause this issue as it doesn't change any code relevant to key insertion and the config doesn't enable prepare.
Here's the call stack:
#0 0x00007f2413125f47 in raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007f24131278b1 in abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x0000000000763664 in __wt_abort (session=0x7f241000ab70) at ../src/os_common/os_abort.c:30 #3 0x000000000088a950 in __wt_panic_func (session=0x7f241000ab70, error=22, func=0xe09280 <__func__.__cursor_key_order_check_row> "__cursor_key_order_check_row", line=485, fmt=0xe092c0 <str> "WT_CURSOR.%s out-of-order returns: returned key %.1024s then key %.1024s") at ../src/support/err.c:378 #4 0x00000000009eb1ef in __cursor_key_order_check_row (session=0x7f241000ab70, cbt=0x619000031180, next=false) at ../src/btree/bt_curnext.c:480 #5 0x00000000009ea38e in __wt_cursor_key_order_check (session=0x7f241000ab70, cbt=0x619000031180, next=false) at ../src/btree/bt_curnext.c:506 #6 0x0000000000a0d800 in __wt_btcur_prev (cbt=0x619000031180, truncating=false) at ../src/btree/bt_curprev.c:717 #7 0x0000000000624254 in __curfile_prev (cursor=0x619000031180) at ../src/cursor/cur_file.c:146 #8 0x00000000006be502 in __hs_cursor_position_int (session=0x7f241000ab70, cursor=0x619000031180, btree_id=2, key=0x6040002f8010, timestamp=0, user_srch_key=0x6040002f80d0) at ../src/history/hs.c:1128 #9 0x00000000006bd8d0 in __wt_hs_cursor_position (session=0x7f241000ab70, cursor=0x619000031180, btree_id=2, key=0x6040002f8010, timestamp=0, user_srch_key=0x6040002f80d0) at ../src/history/hs.c:1163 #10 0x00000000006ca07c in __hs_insert_record_with_btree (session=0x7f241000ab70, cursor=0x619000031180, btree=0x617000002000, key=0x6040002f8010, upd=0x60700040b450, type=3 '\003', hs_value=0x6040002f8050, stop_time_point=0x7f24067f39e0, clear_hs=true) at ../src/history/hs.c:547 #11 0x00000000006bc787 in __hs_insert_record (session=0x7f241000ab70, cursor=0x619000031180, btree=0x617000002000, key=0x6040002f8010, upd=0x60700040b450, type=3 '\003', hs_value=0x6040002f8050, stop_time_point=0x7f24067f39e0, clear_hs=true) at ../src/history/hs.c:637 #12 0x00000000006b9be8 in __wt_hs_insert_updates (session=0x7f241000ab70, page=0x60d000562d40, multi=0x61900618a080) at ../src/history/hs.c:1022 #13 0x00000000007ecfb4 in __rec_hs_wrapup (session=0x7f241000ab70, r=0x61d0003fe880) at ../src/reconcile/rec_write.c:2280 #14 0x00000000007d3029 in __rec_write_wrapup (session=0x7f241000ab70, r=0x61d0003fe880, page=0x60d000562d40) at ../src/reconcile/rec_write.c:2114 #15 0x00000000007c2659 in __reconcile (session=0x7f241000ab70, ref=0x60e00390f6e0, salvage=0x0, flags=280, page_lockedp=0x7f24067f56e0) at ../src/reconcile/rec_write.c:210 #16 0x00000000007c0abf in __wt_reconcile (session=0x7f241000ab70, ref=0x60e00390f6e0, salvage=0x0, flags=280) at ../src/reconcile/rec_write.c:89 #17 0x00000000006a9417 in __evict_review (session=0x7f241000ab70, ref=0x60e00390f6e0, evict_flags=0, inmem_splitp=0x7f24067f5b70) at ../src/evict/evict_page.c:665 #18 0x00000000006a560e in __wt_evict (session=0x7f241000ab70, ref=0x60e00390f6e0, previous_state=3 '\003', flags=0) at ../src/evict/evict_page.c:186 #19 0x00000000006866ba in __evict_page (session=0x7f241000ab70, is_server=false) at ../src/evict/evict_lru.c:2254 #20 0x000000000067ed01 in __evict_lru_pages (session=0x7f241000ab70, is_server=false) at ../src/evict/evict_lru.c:1134 #21 0x000000000067dd02 in __wt_evict_thread_run (session=0x7f241000ab70, thread=0x607000004930) at ../src/evict/evict_lru.c:324 #22 0x00000000008ef474 in __thread_run (arg=0x607000004930) at ../src/support/thread_group.c:31 #23 0x00007f24140d66db in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0 #24 0x00007f2413208a3f in clone () from /lib/x86_64-linux-gnu/libc.so.6
- duplicates
-
WT-6436 Fix not resetting the key when retrying to search the history store
- Closed