The default parallel-pop-lsm runner works, but if you increase the number of populate threads to 10, it fails for me on pixiebob.
# wtperf options file: Run populate thread multi-threaded and with groups
# of operations in each transaction.
conn_config="cache_size=200MB"
table_config="lsm_chunk_size=1M,type=lsm"
transaction_config="isolation=snapshot"
icount=10000000
report_interval=5
stat_interval=4
run_time=20
populate_ops_per_txn=100
populate_threads=10
verbose=1
Here are the stacks:
thread 15 execute_populate
thread 14 eviction server
thread 13 __wt_lsm_stat_init (waiting on LSM lock)
thread 12 failing thread
thread 11 failing thread
threads 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 __clsm_put, sleeping in this loop:
    while (clsm->dsk_gen == lsm_tree->dsk_gen)
        __wt_sleep(0, 10);
Thread 12:
#4  0x0000000000478398 in __wt_abort (session=0x80204ba28) at ../src/os_posix/os_abort.c:21
#5  0x0000000000426597 in __wt_assert (session=Could not find the frame base for "__wt_assert".
) at ../src/support/err.c:408
#6  0x0000000000411f44 in __lsm_free_chunks (session=0x80204ba28, lsm_tree=0x8023de600) at ../src/lsm/lsm_worker.c:621
#7  0x0000000000410b58 in __wt_lsm_merge_worker (vargs=0x80201e450) at ../src/lsm/lsm_worker.c:127
(gdb) frame 6
#6  0x0000000000411f44 in __lsm_free_chunks (session=0x80204ba28, lsm_tree=0x8023de600) at ../src/lsm/lsm_worker.c:621
621	WT_ASSERT(session, lsm_tree->old_chunks[skipped] == chunk);
(gdb) p cookie
$13 = {chunk_array = 0x802619c00, chunk_alloc = 1280, nchunks = 82}
(gdb) p i
$14 = 75
(gdb) p skipped
$15 = 0
(gdb) p progress
$16 = 1
(gdb) p chunk
$99 = (WT_LSM_CHUNK *) 0x802728ce0
(gdb) p *chunk
$100 = {id = 269, generation = 1, uri = 0x8027e1540 "file:test-000269.lsm", bloom_uri = 0x8027ed460 "file:test-000269.bf", count = 135834, create_ts = {tv_sec = 1381325195, tv_nsec = 118098271}, refcnt = 1, txnid_max = 0, flags = 24}
If I look at the list of chunks in the cookie, all of them have a refcnt of 2 except for the chunk we're looking at.
OK, I think the problem here is that we're not incrementing skipped if we continue in the loop because chunk->refcnt > 1.
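To make the suspected failure concrete, here is a minimal standalone model of that loop. It is a sketch, not the code in lsm_worker.c: struct chunk, free_old_chunks and the count_skipped switch are hypothetical names introduced for illustration, and the model assumes the cookie snapshot and old_chunks start out holding the same chunks in the same order. Instead of aborting, it returns -1 where the real code would trip the assertion at line 621.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for WT_LSM_CHUNK: only the fields the loop needs. */
struct chunk {
	int id;
	int refcnt;
};

/*
 * Walk a snapshot of the old-chunk array and drop every chunk that no other
 * worker references (refcnt <= 1). Returns the number of chunks dropped, or
 * -1 if old[skipped] is not the chunk being examined, which is the condition
 * the failing assertion checks.
 *
 * count_skipped == 0 models the suspected bug (the counter is not advanced
 * for busy chunks); count_skipped != 0 models the proposed fix.
 */
static int
free_old_chunks(struct chunk **old, size_t *nold,
    struct chunk *const *snapshot, size_t nsnap, int count_skipped)
{
	size_t i, skipped = 0;
	int freed = 0;

	for (i = 0; i < nsnap; i++) {
		struct chunk *c = snapshot[i];

		/* Another worker still holds this chunk: leave it in place. */
		if (c->refcnt > 1) {
			if (count_skipped)
				++skipped;
			continue;
		}

		/* The first slot not skipped over must be this chunk. */
		if (skipped >= *nold || old[skipped] != c)
			return (-1);

		/* Shuffle the remaining entries down over the freed slot. */
		--*nold;
		memmove(old + skipped, old + skipped + 1,
		    (*nold - skipped) * sizeof(old[0]));
		old[*nold] = NULL;
		++freed;
	}
	return (freed);
}
```

With refcnts of 2, 2, 1, 2, the buggy variant reaches the third chunk with skipped still 0 and reports the mismatch, which matches the dump above (i = 75, skipped = 0, every other chunk at refcnt 2). With the counter advanced on each skip, the third chunk is found at old[2] and removed.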
Thread 11:
(gdb) where
#5  0x0000000000426597 in __wt_assert (session=Could not find the frame base for "__wt_assert".
) at ../src/support/err.c:408
#6  0x0000000000411b1d in __lsm_discard_handle (session=0x80204b820, uri=0x805ff32c0 "file:test-000243.lsm", checkpoint=0x0) at ../src/lsm/lsm_worker.c:491
#7  0x000000000041109f in __wt_lsm_checkpoint_worker (arg=0x8023de600) at ../src/lsm/lsm_worker.c:295
(gdb) frame 6
#6  0x0000000000411b1d in __lsm_discard_handle (session=0x80204b820, uri=0x805ff32c0 "file:test-000243.lsm", checkpoint=0x0) at ../src/lsm/lsm_worker.c:491
491	WT_ASSERT(session, S2BT(session)->modified == 0);
(gdb) p ((WT_BTREE *)session->dhandle->handle)->modified
$191 = 1
(gdb) p session->dhandle->name
$192 = 0x805ff3340 "file:test-000243.lsm"
(gdb) frame 7
#7  0x000000000041109f in __wt_lsm_checkpoint_worker (arg=0x8023de600) at ../src/lsm/lsm_worker.c:295
295	if ((ret = __lsm_discard_handle(
(gdb) p *chunk
$193 = {id = 243, generation = 0, uri = 0x805ff32c0 "file:test-000243.lsm", bloom_uri = 0x0, count = 10292, create_ts = {tv_sec = 1381325191, tv_nsec = 118344315}, refcnt = 2, txnid_max = 19371, flags = 24}
(gdb) p chunk->flags & 0x10
$194 = 16
(gdb) p chunk->flags & 0x04
$195 = 0
So, we're discarding a chunk that is WT_LSM_CHUNK_ONDISK but not WT_LSM_CHUNK_EVICTED, and the assertion fires because the btree handle's modified flag is still set.
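For anyone decoding flags = 24 by hand, the two gdb mask tests above amount to the check sketched below. The bit values are an assumption read off the gdb session (0x10 for the on-disk bit, 0x04 for the evicted bit), not copied from the WiredTiger headers, and the macro and function names are hypothetical.

```c
#include <stdint.h>

/* ASSUMED bit values, inferred from the gdb mask tests; not taken from lsm.h. */
#define CHUNK_ONDISK	0x10u
#define CHUNK_EVICTED	0x04u

/* Nonzero if the assumed on-disk bit is set. */
static int
chunk_is_ondisk(uint32_t flags)
{
	return ((flags & CHUNK_ONDISK) != 0);
}

/* Nonzero if the assumed evicted bit is set. */
static int
chunk_is_evicted(uint32_t flags)
{
	return ((flags & CHUNK_EVICTED) != 0);
}

/* The state chunk 243 is in: written to disk, but not yet evicted. */
static int
chunk_ondisk_not_evicted(uint32_t flags)
{
	return (chunk_is_ondisk(flags) && !chunk_is_evicted(flags));
}
```

For flags = 24 (0x18), the on-disk test is true and the evicted test is false. The remaining 0x08 bit is not examined in the session, so it is left uninterpreted here.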