Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 2.8.0-rc1
Affects Version/s: 2.8.0-rc0
Component/s: Performance, Storage
Labels:
- wiredtiger

Operating System:
ALL
Steps To Reproduce:

Add a large amount of data into a collection (my test data generation is outlined in this gist).

Here are the various storage config options used:

# mmapv1
storage:
    dbPath: "/ssd/db/mmap"
    engine: "mmapv1"

# WT with compression off
storage:
    dbPath: "/ssd/db/wt_none"
    engine: "wiredtiger"
    wiredtiger:
        collectionConfig: "block_compressor="

# WT with snappy
storage:
    dbPath: "/ssd/db/wt_snappy"
    engine: "wiredtiger"

# WT with zlib
storage:
    dbPath: "/ssd/db/wt_zlib"
    engine: "wiredtiger"
    wiredtiger:
        collectionConfig: "block_compressor=zlib"

To force a collection scan, run the following:

var start = new Date().getTime();
db.data.find().explain("executionStats");
var end = new Date().getTime();
print("Time to touch data: " + (end - start) + "ms");

Start and end are not really required since this is an explain and contains timing info, but I was also using this to compare to (for example) the touch command, so I wanted apples to apples timing comparisons.
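
To make the client-side and server-side timings easier to compare, the snippet above could be wrapped roughly as follows. This is only a sketch: `timeIt` and `summarizeExplain` are hypothetical helper names, not mongo shell built-ins, and the `executionStats` field names are assumed to match what the explain output reports.

```javascript
// Hypothetical helper (not part of the mongo shell): wall-clock timing of any function.
function timeIt(fn) {
    var start = new Date().getTime();
    var result = fn();
    var end = new Date().getTime();
    return { result: result, elapsedMs: end - start };
}

// Pulls the server-side counters out of an explain("executionStats") document.
// Assumption: the document carries an executionStats section with these field names.
function summarizeExplain(explainDoc) {
    var stats = (explainDoc && explainDoc.executionStats) || {};
    return {
        serverMs: stats.executionTimeMillis,
        docsExamined: stats.totalDocsExamined,
        returned: stats.nReturned
    };
}

// In the mongo shell (assumes the "data" collection from the steps above):
//   var timed = timeIt(function () { return db.data.find().explain("executionStats"); });
//   var summary = summarizeExplain(timed.result);
//   print("client: " + timed.elapsedMs + "ms, server: " + summary.serverMs + "ms");
```

Keeping the wall-clock number alongside the explain number preserves the like-for-like comparison with commands (such as touch) that do not report their own timing.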
Work Order:
3


This may be "works as designed" (i.e. that WT is going to be slower for traversing large data structures in memory) but I would like to make sure and quantify the expected behavior here if that is the case.

While attempting to profile the benefits of compression in terms of bandwidth savings, I found that the default snappy compression (which delivered decent on-disk compression) performed slower than expected, and significantly slower than mmapv1.

That led to a round of testing to better understand what was going on here. So, I used 4 basic storage engine configurations:

mmapv1
WT with no compression ("block_compressor=")
WT with snappy (default, so no block_compressor specified)
WT with zlib ("block_compressor=zlib")
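
Before comparing timings it may be worth confirming which compressor each collection was actually created with. A minimal sketch, assuming the collection's WiredTiger creation string is available as a comma-separated list of `key=value` settings (for example via `db.data.stats().wiredTiger.creationString`, which is an assumption here and may not be exposed in 2.8.0-rc0); `parseBlockCompressor` is a hypothetical helper name:

```javascript
// Hypothetical helper: extracts the block_compressor setting from a
// comma-separated WiredTiger configuration string.
// Returns "none" for an empty value (block_compressor=), the compressor name
// otherwise, or null when the setting does not appear at all.
function parseBlockCompressor(creationString) {
    var match = /(?:^|,)block_compressor=([^,]*)/.exec(creationString || "");
    if (match === null) {
        return null;
    }
    return match[1] === "" ? "none" : match[1];
}

// In the mongo shell (assumption: stats() exposes wiredTiger.creationString):
//   print(parseBlockCompressor(db.data.stats().wiredTiger.creationString));
```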

The only WT config that came close to the mmapv1 performance was zlib, and that was on the read-from-disk test. So, I decided to test on SSD rather than spinning media. The result was that everything got a bit faster, but the relative differences remained: WT was still significantly slower.

For my initial testing methodology, since I was trying to demonstrate the benefits of compression for IO bandwidth savings, I had been clearing the caches on the system after each run.

Now that IO appeared to have no effect, I decided to do consecutive runs of the collection scan, which would make the second run all in-memory (the collection is <16GiB and the test machine has 32GiB RAM, so it would fit in memory even with indexes, although indexes are not in play).

However, the collection scan was still slow with WiredTiger even when the data was already loaded into RAM. The mmapv1 test dropped from the ~300 second range down to 13 seconds, but the WT testing showed no similar reduction: it did improve, but was still in the hundreds of seconds range rather than double digits.
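
The consecutive-run comparison can be sketched as below. `timeConsecutiveRuns` and `coldToWarmRatio` are hypothetical helper names, and the first run only counts as "cold" if the caches were cleared beforehand.

```javascript
// Hypothetical helper: runs scanFn `runs` times back to back and records
// the wall-clock time of each run in milliseconds.
function timeConsecutiveRuns(scanFn, runs) {
    var timings = [];
    for (var i = 0; i < runs; i++) {
        var start = new Date().getTime();
        scanFn();
        timings.push(new Date().getTime() - start);
    }
    return timings;
}

// Ratio of a cold run to a warm run; returns null if the warm run took 0 ms.
function coldToWarmRatio(coldMs, warmMs) {
    return warmMs > 0 ? coldMs / warmMs : null;
}

// In the mongo shell (assumes the "data" collection from the steps above):
//   var t = timeConsecutiveRuns(function () { db.data.find().explain("executionStats"); }, 2);
//   print("cold: " + t[0] + "ms, warm: " + t[1] + "ms, ratio: " + coldToWarmRatio(t[0], t[1]));
```

Taking the approximate mmapv1 figures quoted above (about 300 seconds cold, 13 seconds warm), `coldToWarmRatio(300000, 13000)` comes out at roughly 23, which is the kind of reduction the WT runs did not show.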

I have tried tweaking cache_size, lsm, directio, and readahead (the last two before I had ruled out IO issues completely), but none of them produced a significant improvement.

A basic initial graph is attached. I will add detailed timing information, graphs, and perf output below to avoid bloating the description too much.

Attachments:

ttld_compression.png
180 kB
Nov 14 2014 02:04:12 PM UTC

is duplicated by

SERVER-16164 optimize table scans on wiredtiger

Closed

related to

SERVER-16444 Avoid copying data out of WT buffers during tables scans

Closed

Assignee: Mathias Stearn
Reporter: Adam Comerford
Participants: Adam Comerford, Alex Gorrod, Eliot Horowitz, Mathias Stearn
Votes: 0
Watchers: 20

Created: Nov 14 2014 02:04:12 PM UTC
Updated: Feb 23 2015 02:16:18 PM UTC
Resolved: Dec 05 2014 11:47:52 PM UTC