- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 3.0.2
- Component/s: Admin
- None
- ALL
Hello,
I have a replica set with three nodes (primary, secondary, and arbiter). The MongoDB version is 3.0.2, and I started my replica set with this command:
sudo mongod --storageEngine wiredTiger --dbpath /data/ --replSet rs0 --fork --logpath /var/log/mongodb/fork.log
This is the output of db.stats():
rs0:PRIMARY> db.stats()
{
    "db" : "test",
    "collections" : 52,
    "objects" : 1697582895,
    "avgObjSize" : 745.3563943956916,
    "dataSize" : 1265304265805,
    "storageSize" : 647557865472,
    "numExtents" : 0,
    "indexes" : 176,
    "indexSize" : 22991790080,
    "ok" : 1
}
And this is what mongostat shows while the secondary is syncing:
root@mongodb-replica1:/data# mongostat --discover
                       insert query update delete getmore command % dirty % used flushes vsize   res qr|qw ar|aw netIn netOut conn set repl     time
       localhost:27017     56     6     13     *0       6     8|0     0.0   80.0       0 31.9G 31.3G   0|2   2|2   49k   214k   58 rs0  PRI 09:49:24
mongodb-replica1:27017     56     6     13     *0       6     4|0     0.0   80.0       0 31.9G 31.3G   0|0   2|0   52k   213k   58 rs0  PRI 09:49:24
mongodb-replica2:27017     *0    *0     *0     *0       0     1|0     1.2    1.3       0 31.9G 29.5G   0|1   1|0   79b    15k    5 rs0  UNK 09:49:24
       localhost:27017     26     6      8     *0       5     5|0     0.0   80.0       0 31.9G 31.3G   0|0   1|0   37k   634k   58 rs0  PRI 09:49:25
mongodb-replica1:27017     27     6      8     *0       5     5|0     0.0   80.0       0 31.9G 31.3G   0|0   1|0   34k   633k   58 rs0  PRI 09:49:25
mongodb-replica2:27017     *0    *0     *0     *0       0     7|0     1.2    1.4       0 31.9G 29.5G   0|1   1|0  596b   147k    5 rs0  UNK 09:49:25
       localhost:27017     35     3      8     *0       5    20|0     0.0   80.0       0 31.9G 31.3G   0|0   2|0   39k   146k   58 rs0  PRI 09:49:26
mongodb-replica1:27017     34     3      8     *0       5    20|0     0.0   80.0       0 31.9G 31.3G   0|0   2|0   39k   146k   58 rs0  PRI 09:49:26
mongodb-replica2:27017     *0    *0     *0     *0       0     2|0     1.3    1.4       0 31.9G 29.5G   0|1   1|0  137b    16k    5 rs0  UNK 09:49:26
       localhost:27017     22     6     24     *0       4     5|0     0.0   80.0       0 31.9G 31.3G   0|0   2|0   54k   212k   58 rs0  PRI 09:49:27
mongodb-replica1:27017     22     6     24     *0       4     4|0     0.0   80.0       0 31.9G 31.3G   0|0   2|0   53k   197k   58 rs0  PRI 09:49:27
mongodb-replica2:27017     *0    *0     *0     *0       0     4|0     1.3    1.5       0 31.9G 29.5G   0|1   1|0  422b    16k    5 rs0  UNK 09:49:27
This is the second time I have tried a full resync of my secondary. Near the end, when the secondary's storage is almost equal to the primary's (650GB) and the secondary is building indexes, the primary suddenly shows high CPU usage and eventually freezes. The SSH connection drops and the machine becomes inaccessible. Judging by the alerts at both the MMS and application level, all operations on the primary are blocked as well: there are no inserts, updates, or queries.
I didn't wait to see what would have happened once the secondary finished building its indexes. It was at 22%, and I would have had to wait a long time without a primary. But when I restarted the primary, the secondary suddenly removed everything and started the sync from the beginning.
The hardware spec is a 10-core CPU with 80GB of memory and 3TB of storage on both the primary and the secondary. I don't have CPU profiling enabled on MMS (as I recall, I couldn't enable it a while back), so let me know if you need more info or want me to log something next time.
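For reference, the db.stats() figures above can be sanity-checked with a few lines of Python. This is only a rough sketch: the function and variable names are illustrative (not part of any MongoDB tool), the numbers are copied from the output above, and it assumes "dataSize" is the uncompressed data size while "storageSize" is the space allocated on disk.

```python
# Figures copied from the db.stats() output in this report.
data_size = 1265304265805     # "dataSize" (bytes, uncompressed)
storage_size = 647557865472   # "storageSize" (bytes allocated on disk)
index_size = 22991790080      # "indexSize" (bytes)
objects = 1697582895          # "objects" (document count)

GB = 10 ** 9  # decimal gigabytes, to match the "650GB" quoted in the report


def summarize(data_size, storage_size, index_size, objects):
    """Derive a few ratios from db.stats() fields (illustrative helper)."""
    return {
        "avg_obj_size_bytes": data_size / objects,
        "storage_to_data_ratio": storage_size / data_size,
        "data_gb": data_size / GB,
        "storage_gb": storage_size / GB,
        "index_gb": index_size / GB,
    }


summary = summarize(data_size, storage_size, index_size, objects)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The derived average object size agrees with the reported "avgObjSize", and the storage size comes out at roughly half of the data size, which matches the ~650GB the secondary has to reach before it starts building indexes.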
- duplicates
  SERVER-17424 WiredTiger uses substantially more memory than accounted for by cache (Closed)