Type: Bug
Resolution: Done
Priority: Critical - P2
Fix Version/s: 3.0.0-rc9, 3.1.0
Affects Version/s: 3.0.0-rc8
Component/s: Replication, Storage
Labels:
None

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Completed:

3.0.0-rc9
Steps To Reproduce:
Socialite load workload (available at https://github.com/10gen-labs/socialite).

java -jar target/socialite-0.0.1-SNAPSHOT.jar load --users 10000000 --maxfollows 1000 --messages 2000 --threads 32 sample-config.yml

3-node replica set on c3.2xlarge instances (8 CPUs, 15 GB RAM each).
Replica set configuration:

{
  "_id" : "shard1",
  "version" : 1,
  "members" : [
    { "_id" : 1, "host" : "shard1-01.knuckleboys.com:27017" },
    { "_id" : 2, "host" : "shard1-02.knuckleboys.com:27017" },
    { "_id" : 3, "host" : "shard1-03.knuckleboys.com:27017" }
  ]
}

Indexes:

[
  { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "socialite.content" },
  { "v" : 1, "key" : { "_a" : 1, "_id" : 1 }, "name" : "_a_1__id_1", "ns" : "socialite.content" }
]
[
  { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "socialite.followers" },
  { "v" : 1, "unique" : true, "key" : { "_f" : 1, "_t" : 1 }, "name" : "_f_1__t_1", "ns" : "socialite.followers" }
]
[
  { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "socialite.following" },
  { "v" : 1, "unique" : true, "key" : { "_f" : 1, "_t" : 1 }, "name" : "_f_1__t_1", "ns" : "socialite.following" }
]
[
  { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "socialite.users" }
]



Running the Socialite load workload (primarily writes) against a three-member replica set with a 4 GB oplog, where one secondary uses MMAPv1 and the other uses WiredTiger, the WT secondary starts falling behind. The MMAPv1 secondary keeps up.
Typically, the WT secondary gets to about 1500 s behind the primary, then the lag clears up. This takes about 2 hours to reproduce. After about 8-10 hours of total run time, the WT secondary starts to fall behind again and then falls off the tail of the oplog.
Attached are screenshots of MMS during a recovered lag and of timeline output (from https://github.com/10gen/support-tools) with correlated activity. The full timeline files are also attached.
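
For anyone reproducing this, the lag figure can be tracked roughly as sketched below. This is a minimal sketch, not the tooling used for the attached timelines. It assumes a document shaped like `replSetGetStatus` output (the `members`, `name`, `stateStr`, and `optimeDate` fields); the helper names `secondary_lag_seconds` and `fell_off_oplog` and all sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def secondary_lag_seconds(status):
    """Return {member name: seconds behind the primary} for each secondary.

    `status` is assumed to look like replSetGetStatus output: a dict with a
    "members" list whose entries carry "name", "stateStr" and "optimeDate".
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def fell_off_oplog(secondary_optime, primary_oldest_oplog_entry):
    """True when the secondary's last applied op is older than the oldest
    entry still held in the primary's oplog, so it can no longer catch up."""
    return secondary_optime < primary_oldest_oplog_entry


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the attached logs.
    now = datetime(2015, 2, 6, 12, 0, 0)
    status = {
        "members": [
            {"name": "shard1-01.knuckleboys.com:27017",
             "stateStr": "PRIMARY", "optimeDate": now},
            {"name": "shard1-02.knuckleboys.com:27017",
             "stateStr": "SECONDARY", "optimeDate": now},
            {"name": "shard1-03.knuckleboys.com:27017",
             "stateStr": "SECONDARY",
             "optimeDate": now - timedelta(seconds=1500)},
        ]
    }
    for name, lag in secondary_lag_seconds(status).items():
        print(name, lag)
```

With a live deployment, the `status` dict would come from running `replSetGetStatus` against any member, and the oldest oplog entry from the first document in the primary's `local.oplog.rs`.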

Attachments:
socialite-3wtnode.png (110 kB, Feb 11 2015 11:08:12 AM UTC)
socialite_oplog_1G.png (73 kB, Feb 11 2015 05:28:02 AM UTC)
node 3 WT secondary timeline.png (294 kB, Feb 06 2015 07:17:02 PM UTC)
node 2 MMAPv1 secondary timeline.png (213 kB, Feb 06 2015 07:17:02 PM UTC)
node 1 WT primary timeline.png (288 kB, Feb 06 2015 07:17:02 PM UTC)
mongod-3-.log.gz (36 kB, Feb 06 2015 07:17:02 PM UTC)
mongod-2-.log.gz (35 kB, Feb 06 2015 07:17:02 PM UTC)
mongod-1-.log.gz (3.99 MB, Feb 06 2015 07:17:02 PM UTC)
Dashboard___MMS__MongoDB_Management_Service.png (83 kB, Feb 06 2015 07:17:02 PM UTC)
3-ss.log (3.32 MB, Feb 06 2015 07:17:02 PM UTC)
3-oplog-cs.html (475 kB, Feb 07 2015 04:37:54 PM UTC)
3-op-cs.log (1.08 MB, Feb 07 2015 04:37:54 PM UTC)
3lag.html (1.21 MB, Feb 06 2015 07:17:02 PM UTC)
2-ss.log (2.01 MB, Feb 06 2015 07:17:02 PM UTC)
2-oplog-cs.html (63 kB, Feb 07 2015 04:37:54 PM UTC)
2-op-cs.log (136 kB, Feb 07 2015 04:37:54 PM UTC)
2lag.html (554 kB, Feb 06 2015 07:17:02 PM UTC)
1-ss.log (3.39 MB, Feb 06 2015 07:17:02 PM UTC)
1-oplog-cs.html (467 kB, Feb 07 2015 04:37:54 PM UTC)
1-op-cs.log (1.06 MB, Feb 07 2015 04:37:54 PM UTC)
1lag.html (1.45 MB, Feb 06 2015 07:17:02 PM UTC)