- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 3.0.6, 3.1.7
- Component/s: WiredTiger
- None
- Fully Compatible
- ALL
- 6 cores, 64 GB memory (everything fits in cache)
- test below issues 10 inserts/s per connection and ramps up to 10k connections, for a total expected throughput of 100k inserts/s
- measured max throughput at small connection counts was 300k/s without journal and 200k/s with journal; with a maximum expected throughput of only 100k/s, this test does not tax the total capacity of the system but rather probes the effect of high connection counts at relatively low op rates per connection
- 3.0.6 build used is actually 3.0.6 + fixes for SERVER-20091
- expected scaling is achieved without journal (green)
- under 3.0.6 with journal enabled only 25% of expected throughput is reached; this is consistent run to run (red)
- in 3.1, 50-75% of expected throughput is reached, but with striking run-to-run variability (yellow, blue, purple)
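For reference, the expected-throughput arithmetic behind the percentages above can be sketched as follows. The helper names are hypothetical (not part of the repro), and the 25000 value is simply 25% of the 100k target worked out from the figures above, not a separately measured number.

```javascript
// Hypothetical helper: expected aggregate insert rate, assuming every
// connection sustains its nominal per-connection rate.
function expectedThroughput(connections, insertsPerSecPerConn) {
  return connections * insertsPerSecPerConn;
}

// Fraction of the expected rate that a measured rate represents (1.0 = ideal scaling).
function fractionOfExpected(measuredOpsPerSec, connections, insertsPerSecPerConn) {
  return measuredOpsPerSec / expectedThroughput(connections, insertsPerSecPerConn);
}

// 10k connections at 10 inserts/s each, as in the test description.
const fullRate = expectedThroughput(10000, 10);

// Illustrative only: the rate that corresponds to "25% of expected" at 10k connections.
const quarter = fractionOfExpected(25000, 10000, 10);

console.log(fullRate, quarter);
```

Since the measured ceilings (200k/s with journal, 300k/s without) are well above the 100k/s target, a fraction below 1.0 here points at connection-count scaling rather than raw capacity.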
Repro code:
function conns() {
    return db.serverStatus().connections.current
}

function ops() {
    return db.serverStatus().opcounters.insert
}

function repro(threads_insert) {

    // run forever
    seconds = 10000

    // starting stats
    last_conns = curr_conns = start_conns = conns()
    last_time = new Date()
    last_ops = ops()

    // loop starting new connections
    while (curr_conns < start_conns+threads_insert) {

        // start 10 more insert threads with a random delay around 100ms (10 inserts/second/thread)
        res = benchStart({
            ops: [{
                op: "insert",
                ns: "test.c",
                doc: {},
                delay: NumberInt(100 + Math.random()*10-5)
            }],
            seconds: seconds,
            parallel: 10,
        })

        // 10 new connections every 100ms
        sleep(100)

        // print op rate vs connections
        curr_conns = conns()
        if (curr_conns-last_conns >= 100) {
            curr_time = new Date()
            curr_ops = ops()
            ops_per_sec = Math.round((curr_ops - last_ops) / ((curr_time - last_time) / 1000.0))
            avg_conns = (last_conns+curr_conns) / 2
            print('' + avg_conns + '\t' + ops_per_sec)
            last_time = curr_time
            last_ops = curr_ops
            last_conns = curr_conns
        }
    }

    // run forever
    sleep(seconds*1000)
}
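The repro prints one tab-separated line per sample (average connection count, then inserts/s). A minimal sketch for turning such a line into a fraction of the expected rate is shown below; `parseReproLine` and the sample line are hypothetical and the sample values are made up for illustration, not taken from an actual run.

```javascript
// Nominal per-connection rate from the test description (10 inserts/s).
const INSERTS_PER_SEC_PER_CONN = 10;

// Hypothetical parser for one "avg_conns<TAB>ops_per_sec" line printed by repro().
function parseReproLine(line) {
  const [connsField, opsField] = line.split('\t');
  const avgConns = parseFloat(connsField);   // avg_conns may be fractional
  const opsPerSec = parseInt(opsField, 10);  // ops_per_sec is rounded by the repro
  return {
    avgConns: avgConns,
    opsPerSec: opsPerSec,
    // 1.0 means the server kept up with every connection's nominal rate
    fraction: opsPerSec / (avgConns * INSERTS_PER_SEC_PER_CONN),
  };
}

// Made-up sample line, for illustration only.
const sample = parseReproLine('1050\t10500');
console.log(sample);
```

Plotting `fraction` against `avgConns` for each run gives a scaling curve with one series per configuration.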
- depends on: WT-2031 Buffer log records in memory to improve performance (Closed)
- related to: SERVER-20409 Negative scaling with more than 10K connections (Closed)