Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-6770

assert in __wt_ref_out in eviction thread

    • Type: Icon: Bug Bug
    • Resolution: Gone away
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: WT3.2.1
    • Component/s: None
    • None
    • 8
    • Storage - Ra 2020-10-19, Storage - Ra 2020-11-02, Storage - Ra 2020-11-16, Storage - Ra 2020-11-30

      Hi!

      Our application crashes in WiredTiger library.

      In release build it segfaults with the following stack:

      #1  <signal handler called>
      #2  __evict_walk_tree (session=0x7f4f467c29d8, queue=0x7f4f47447028, max_entries=181, slotp=<optimized out>) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:1857
      #3  __evict_walk (session=0x7f4f467c29d8, queue=<optimized out>) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:1509
      #4  __evict_lru_walk (session=0x7f4f467c5330) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:1186
      #5  __evict_pass (session=0x7f4f467c5330) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:701
      #6  __evict_server (session=0x7f4f467c5330, did_work=<optimized out>) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:379
      #7  __wt_evict_thread_run (session=<optimized out>, thread=0x7f4f4674c3a0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:291
      #8  0x00007f4f61911aa9 in __thread_run (arg=0x7f4f4674c3a0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/support/thread_group.c:31
      #9  0x00007f4f5e259aa1 in start_thread () from /lib64/libpthread.so.0
      #10 0x00007f4f5dfa6c4d in clone () from /lib64/libc.so.6
      

      When built with diagnostic enabled stack is

      #1  <signal handler called>
      #2  0x00007f081afc24f5 in raise () from /lib64/libc.so.6
      #3  0x00007f081afc3cd5 in abort () from /lib64/libc.so.6
      #4  0x00007f081fc0ae5f in __wt_abort (session=0x7f08020f21e0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/os_common/os_abort.c:30
      #5  0x00007f081fad7f91 in __wt_ref_out (session=0x7f08020f21e0, ref=0x7f07a84400c0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/btree/bt_discard.c:32
      #6  0x00007f081fb0868f in __wt_split_rewrite (session=0x7f08020f21e0, ref=0x7f07a84400c0, multi=0x7f07fd3fe528) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/btree/bt_split.c:2282
      #7  0x00007f081fbcab16 in __evict_page_dirty_update (session=0x7f08020f21e0, ref=0x7f07a84400c0, evict_flags=0)
          at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_page.c:426
      #8  0x00007f081fbc91c0 in __wt_evict (session=0x7f08020f21e0, ref=0x7f07a84400c0, previous_state=5, flags=0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_page.c:192
      #9  0x00007f081fbc0df6 in __evict_page (session=0x7f08020f21e0, is_server=true) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:2245
      #10 0x00007f081fbbef1b in __evict_lru_pages (session=0x7f08020f21e0, is_server=true) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:1121
      #11 0x00007f081fbc1f7e in __evict_pass (session=0x7f08020f21e0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:716
      #12 0x00007f081fbbeb41 in __evict_server (session=0x7f08020f21e0, did_work=0x7f07fd3fe843) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:379
      #13 0x00007f081fbbe8a1 in __wt_evict_thread_run (session=0x7f08020f21e0, thread=0x7f08094f23e0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/evict/evict_lru.c:291
      #14 0x00007f081fc91e36 in __thread_run (arg=0x7f08094f23e0) at /tb/builds/thd/sbn/tbricks-2.14.7/src/thirdparty/wiredtiger/3.2.1/src/src/support/thread_group.c:31
      #15 0x00007f081b32baa1 in start_thread () from /lib64/libpthread.so.0
      #16 0x00007f081b078c4d in clone () from /lib64/libc.so.6
      

      Address sanitizer complains about read of already free()'d memory:

      ==131021==ERROR: AddressSanitizer: heap-use-after-free on address 0x60e0001e3908 at pc 0x7ffff36f18c6 bp 0x7fffd0fa1060 sp 0x7fffd0fa1058
      READ of size 8 at 0x60e0001e3908 thread T25
          #0 0x7ffff36f18c5 in __wt_ref_is_root wiredtiger/3.2.1/src/src/include/btree.i:16:18
          #1 0x7ffff36fce2e in __evict_walk_tree wiredtiger/3.2.1/src/src/evict/evict_lru.c:1853:13
          #2 0x7ffff36fb553 in __evict_walk wiredtiger/3.2.1/src/src/evict/evict_lru.c:1509:17
          #3 0x7ffff36f9fd2 in __evict_lru_walk wiredtiger/3.2.1/src/src/evict/evict_lru.c:1186:16
          #4 0x7ffff36f7fe8 in __evict_pass wiredtiger/3.2.1/src/src/evict/evict_lru.c:701:13
          #5 0x7ffff36f239d in __evict_server wiredtiger/3.2.1/src/src/evict/evict_lru.c:379:5
          #6 0x7ffff36f1eca in __wt_evict_thread_run wiredtiger/3.2.1/src/src/evict/evict_lru.c:291:15
          #7 0x7ffff383b716 in __thread_run wiredtiger/3.2.1/src/src/support/thread_group.c:31:9
          #8 0x7fffed95caa0 in start_thread ??:0:0
          #9 0x7fffed289c4c in clone ??:0:0
      
      0x60e0001e3908 is located 8 bytes inside of 160-byte region [0x60e0001e3900,0x60e0001e39a0)
      freed by thread T25 here:
          #0 0x49642d in free _asan_rtl_:3
          #1 0x7ffff3771f4a in __wt_free_int wiredtiger/3.2.1/src/src/os_common/os_alloc.c:302:5
          #2 0x7ffff381f5cc in __stash_discard wiredtiger/3.2.1/src/src/support/generation.c:286:9
          #3 0x7ffff381f860 in __wt_stash_add wiredtiger/3.2.1/src/src/support/generation.c:347:9
          #4 0x7ffff35d2694 in __split_safe_free wiredtiger/3.2.1/src/src/btree/bt_split.c:52:13
          #5 0x7ffff35d06ac in __split_parent wiredtiger/3.2.1/src/src/btree/bt_split.c:819:17
          #6 0x7ffff35da320 in __split_multi wiredtiger/3.2.1/src/src/btree/bt_split.c:2122:5
          #7 0x7ffff35cb549 in __split_multi_lock wiredtiger/3.2.1/src/src/btree/bt_split.c:2162:16
          #8 0x7ffff35cb436 in __wt_split_multi wiredtiger/3.2.1/src/src/btree/bt_split.c:2189:5
          #9 0x7ffff3705ad3 in __evict_page_dirty_update wiredtiger/3.2.1/src/src/evict/evict_page.c:383:13
          #10 0x7ffff37038b1 in __wt_evict wiredtiger/3.2.1/src/src/evict/evict_page.c:192:9
          #11 0x7ffff36f5f0b in __evict_page wiredtiger/3.2.1/src/src/evict/evict_lru.c:2245:5
          #12 0x7ffff36f297a in __evict_lru_pages wiredtiger/3.2.1/src/src/evict/evict_lru.c:1121:20
          #13 0x7ffff36f8091 in __evict_pass wiredtiger/3.2.1/src/src/evict/evict_lru.c:716:13
          #14 0x7ffff36f239d in __evict_server wiredtiger/3.2.1/src/src/evict/evict_lru.c:379:5
          #15 0x7ffff36f1eca in __wt_evict_thread_run wiredtiger/3.2.1/src/src/evict/evict_lru.c:291:15
          #16 0x7ffff383b716 in __thread_run wiredtiger/3.2.1/src/src/support/thread_group.c:31:9
          #17 0x7fffed95caa0 in start_thread ??:0:0
      
      previously allocated by thread T25 here:
          #0 0x496822 in calloc _asan_rtl_:3
          #1 0x7ffff37713ac in __wt_calloc wiredtiger/3.2.1/src/src/os_common/os_alloc.c:50:14
          #2 0x7ffff35c96b4 in __wt_multi_to_ref wiredtiger/3.2.1/src/src/btree/bt_split.c:1624:5
          #3 0x7ffff35da2c0 in __split_multi wiredtiger/3.2.1/src/src/btree/bt_split.c:2116:9
          #4 0x7ffff35cb549 in __split_multi_lock wiredtiger/3.2.1/src/src/btree/bt_split.c:2162:16
          #5 0x7ffff35cb436 in __wt_split_multi wiredtiger/3.2.1/src/src/btree/bt_split.c:2189:5
          #6 0x7ffff3705ad3 in __evict_page_dirty_update wiredtiger/3.2.1/src/src/evict/evict_page.c:383:13
          #7 0x7ffff37038b1 in __wt_evict wiredtiger/3.2.1/src/src/evict/evict_page.c:192:9
          #8 0x7ffff36f5f0b in __evict_page wiredtiger/3.2.1/src/src/evict/evict_lru.c:2245:5
          #9 0x7ffff36f297a in __evict_lru_pages wiredtiger/3.2.1/src/src/evict/evict_lru.c:1121:20
          #10 0x7ffff36f8091 in __evict_pass wiredtiger/3.2.1/src/src/evict/evict_lru.c:716:13
          #11 0x7ffff36f239d in __evict_server wiredtiger/3.2.1/src/src/evict/evict_lru.c:379:5
          #12 0x7ffff36f1eca in __wt_evict_thread_run wiredtiger/3.2.1/src/src/evict/evict_lru.c:291:15
          #13 0x7ffff383b716 in __thread_run wiredtiger/3.2.1/src/src/support/thread_group.c:31:9
          #14 0x7fffed95caa0 in start_thread ??:0:0
      
      Thread T25 created by T0 here:
          #0 0x4814ba in pthread_create _asan_rtl_:3
          #1 0x7ffff37814bb in __wt_thread_create wiredtiger/3.2.1/src/src/os_posix/os_thread.c:28:5
          #2 0x7ffff383a006 in __thread_group_resize wiredtiger/3.2.1/src/src/support/thread_group.c:201:9
          #3 0x7ffff383a608 in __wt_thread_group_create wiredtiger/3.2.1/src/src/support/thread_group.c:287:5
          #4 0x7ffff36f34a2 in __wt_evict_create wiredtiger/3.2.1/src/src/evict/evict_lru.c:481:5
          #5 0x7ffff3658b57 in __wt_connection_workers wiredtiger/3.2.1/src/src/conn/conn_open.c:225:5
          #6 0x7ffff362bcae in wiredtiger_open wiredtiger/3.2.1/src/src/conn/conn_api.c:2666:5
      <application frames>
      
      SUMMARY: AddressSanitizer: heap-use-after-free (/home/sbn/tb/tb/build.x86_64-unknown-linux/asan/bin/../platform/lib64/libwiredtiger-3.2.1.so+0x2c88c5)
      Shadow bytes around the buggy address:
        0x0c1c800346d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c1c800346e0: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
        0x0c1c800346f0: fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa
        0x0c1c80034700: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
        0x0c1c80034710: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
      =>0x0c1c80034720: fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd
        0x0c1c80034730: fd fd fd fd fa fa fa fa fa fa fa fa 00 00 00 00
        0x0c1c80034740: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        0x0c1c80034750: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
        0x0c1c80034760: 00 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa
        0x0c1c80034770: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
      Shadow byte legend (one shadow byte represents 8 application bytes):
        Addressable:           00
        Partially addressable: 01 02 03 04 05 06 07
        Heap left redzone:       fa
        Freed heap region:       fd
        Stack left redzone:      f1
        Stack mid redzone:       f2
        Stack right redzone:     f3
        Stack after return:      f5
        Stack use after scope:   f8
        Global redzone:          f9
        Global init order:       f6
        Poisoned by user:        f7
        Container overflow:      fc
        Array cookie:            ac
        Intra object redzone:    bb
        ASan internal:           fe
        Left alloca redzone:     ca
        Right alloca redzone:    cb
        Shadow gap:              cc
      ==131021==ABORTING
      

      We can reproduce it rather easily with our application.

      The crash happens when we load data into empty DB.
      DB contains two tables, one of each is marked as cache-resident (it's for "hot" data).
      Both tables have the same set of indices and data is written to both: each record under explicit transaction.

      We also have background thread can increase DB cache size during data loading.

      Please advise.
      Thanks!

            Assignee:
            backlog-server-storage-engines [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Dmitri Shubin Dmitri Shubin
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: