  1. WiredTiger
  2. WT-4504

dhandle gets zeroed after sweep server frees it

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • 8
    • Storage Engines 2019-11-18

      The error is caused by a WiredTiger dhandle being modified after it has been freed:

      Found a corrupted memory buffer in MallocBlock (may be offset from user ptr): buffer index: 0, buffer ptr: 0x7fd3b94bca00, size of buffer: 456
      Buffer byte 144 is 0x00 (should be 0xcd).
      Buffer byte 145 is 0x00 (should be 0xcd).
      Buffer byte 146 is 0x00 (should be 0xcd).
      Buffer byte 147 is 0x00 (should be 0xcd).
      Buffer byte 148 is 0x00 (should be 0xcd).
      Buffer byte 149 is 0x00 (should be 0xcd).
      Buffer byte 150 is 0x00 (should be 0xcd).
      Buffer byte 151 is 0x00 (should be 0xcd).
      Deleted by thread 0x7fd3dd6ca700
      *** WARNING: Cannot convert addresses to symbols in output below.
      *** Reason: Cannot find 'pprof' (is PPROF_PATH set correctly?)
      *** If you cannot fix this, try running pprof directly.
      @ 0x7fd3ebb84d86
      @ 0x7fd3ebb8970d
      @ 0x7fd3ebb89a03
      @ 0x7fd3e9ad8aa1
      Memory was written to after being freed.
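
The report has the shape a poisoning debug allocator produces: freed blocks are filled with a known byte (0xcd here) and any byte that later differs is flagged. A minimal sketch of that check follows, assuming nothing about the real allocator's internals; `poison_block`, `stale_write`, and `check_block` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define POISON_BYTE 0xcd /* fill value taken from the report above */

/* Hypothetical helper: poison a block at "free" time instead of reusing it. */
static void
poison_block(unsigned char *buf, size_t size)
{
	memset(buf, POISON_BYTE, size);
}

/* Hypothetical stand-in for a stale writer: zero len bytes at offset. */
static void
stale_write(unsigned char *buf, size_t size, size_t offset, size_t len)
{
	assert(offset <= size && len <= size - offset);
	memset(buf + offset, 0, len);
}

/* Report every byte that no longer holds the poison value; return the count. */
static size_t
check_block(const unsigned char *buf, size_t size)
{
	size_t bad, i;

	for (bad = 0, i = 0; i < size; i++)
		if (buf[i] != POISON_BYTE) {
			printf("Buffer byte %zu is 0x%02x (should be 0x%02x).\n",
			    i, (unsigned)buf[i], (unsigned)POISON_BYTE);
			bad++;
		}
	return (bad);
}
```

Calling `poison_block` on a 456-byte buffer, then `stale_write(buf, 456, 144, 8)`, makes `check_block` flag eight bytes, 144 through 151, the same pattern as in the report.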
      

      Resolving the first three addresses from the stack of the thread that freed the memory:

      (gdb) l *0x7fd3ebb84d86
      0x7fd3ebb84d86 is in __wt_conn_dhandle_discard_single (src/third_party/wiredtiger/src/conn/conn_dhandle.c:771).
      warning: Source file is more recent than executable.
      766	
      767		/*
      768		 * After successfully removing the handle, clean it up.
      769		 */
      770		if (ret == 0 || final) {
      771			WT_TRET(__conn_dhandle_destroy(session, dhandle));
      772			session->dhandle = NULL;
      773		}
      774	
      775		return (ret);
      (gdb) l *0x7fd3ebb8970d
      0x7fd3ebb8970d is in __sweep_remove_one (src/third_party/wiredtiger/src/conn/conn_sweep.c:217).
      warning: Source file is more recent than executable.
      212	
      213		/*
      214		 * If the handle was not successfully discarded, unlock it and
      215		 * don't retry the discard until it times out again.
      216		 */
      217		if (ret != 0) {
      218	err:		__wt_writeunlock(session, &dhandle->rwlock);
      219		}
      220	
      221		return (ret);
      (gdb) l *0x7fd3ebb89a03
      0x7fd3ebb89a03 is in __sweep_server (src/third_party/wiredtiger/src/conn/conn_sweep.c:248).
      243			if (dhandle->type == WT_DHANDLE_TYPE_TABLE)
      244				WT_WITH_TABLE_WRITE_LOCK(session,
      245				    WT_WITH_HANDLE_LIST_WRITE_LOCK(session,
      246					ret = __sweep_remove_one(session, dhandle)));
      247			else
      248				WT_WITH_HANDLE_LIST_WRITE_LOCK(session,
      249				    ret = __sweep_remove_one(session, dhandle));
      250			if (ret == 0)
      251				WT_STAT_CONN_INCR(session, dh_sweep_remove);
      252			else
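
The listings above identify who freed the block, not who wrote to it afterwards. Since the eight zeroed bytes (144 through 151) are contiguous, they are consistent with a single 8-byte field being cleared, and one way to narrow down the writer is to map that offset to a structure member. The sketch below shows the idea with `offsetof`. The `example_handle` layout is invented for illustration and is not the real WT_DATA_HANDLE; the report also warns that the buffer may be offset from the user pointer, so the real offsets would need adjusting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 456-byte layout; NOT the real WT_DATA_HANDLE. */
struct example_handle {
	char header[144];	/* placeholder for earlier members */
	uint64_t handle;	/* placeholder 8-byte field at offset 144 */
	char trailer[304];	/* placeholder for later members */
};

struct field_info {
	const char *name;
	size_t offset;
	size_t size;
};

/* Record a member's name, offset, and size without needing an instance. */
#define FIELD(type, member) \
	{ #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const struct field_info example_fields[] = {
	FIELD(struct example_handle, header),
	FIELD(struct example_handle, handle),
	FIELD(struct example_handle, trailer),
};
#define EXAMPLE_NFIELDS (sizeof(example_fields) / sizeof(example_fields[0]))

/* Return the name of the member covering byte_offset, or NULL if none. */
static const char *
field_at(const struct field_info *fields, size_t nfields, size_t byte_offset)
{
	size_t i;

	assert(fields != NULL);
	for (i = 0; i < nfields; i++)
		if (byte_offset >= fields[i].offset &&
		    byte_offset - fields[i].offset < fields[i].size)
			return (fields[i].name);
	return (NULL);
}
```

With the real structure definition substituted for the placeholder, looking up offsets 144 and 151 would name the member (or members) that were overwritten, which in turn limits the code paths worth auditing.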
      

      This appears to be a WiredTiger issue, since mongod never deals with dhandles directly, so I'm assigning it to the Storage Engines team.

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            geert.bosch@mongodb.com Geert Bosch
            Votes:
            0
            Watchers:
            9
