- Type: Bug
- Resolution: Gone away
- Priority: Major - P3
- None
- Affects Version/s: 3.4.10
- Component/s: None
- None
- Environment: Ubuntu 16.04
- ALL
mongod 3.4.10 on Ubuntu 16.04 in a replica set with 3 nodes. The primary and a secondary consumed all RAM and were killed by the kernel OOM killer within a couple of minutes of each other.
It's on the default setting for storage.wiredTiger.engineConfig.cacheSizeGB, so I would expect it to use around 50% of RAM.
I think there was heavy insert activity at the time.
Possibly related, the other secondary had started lagging about 15-20 minutes earlier. That node is in Azure and tends to lag under load because of SERVER-31215 / WT-3461.
Primary detecting lag:
Jan 10 23:19:50 primary monit[12013]: 'mongo_replcheck' '/usr/local/bin/mongo_replcheck.sh' failed with exit status (1) -- azureslave:27017 lag of 445 sec exceeds threshold 300
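For reference, a lag check of this kind can be sketched as follows. This is not the mongo_replcheck.sh script named in the log line above, whose contents are not part of this report. It is a minimal Python sketch that assumes input shaped like the members array returned by replSetGetStatus (fields name, stateStr, optimeDate). The function name, the sample timestamps, and the host names other than azureslave:27017 are hypothetical.

```python
from datetime import datetime

def lagging_members(status, threshold_sec=300):
    """Return {member name: lag in seconds} for secondaries over the threshold.

    `status` is assumed to look like the output of replSetGetStatus:
    a dict with a "members" list whose entries carry "name",
    "stateStr" and "optimeDate" (a datetime).
    """
    primary = next(m for m in status["members"] if m["stateStr"] == "PRIMARY")
    lagging = {}
    for member in status["members"]:
        if member["stateStr"] != "SECONDARY":
            continue
        # Lag is the gap between the primary's and the secondary's last applied op.
        lag = (primary["optimeDate"] - member["optimeDate"]).total_seconds()
        if lag > threshold_sec:
            lagging[member["name"]] = lag
    return lagging

# Hypothetical sample data, chosen to reproduce the 445 second lag in the log.
sample_status = {
    "members": [
        {"name": "primary:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2018, 1, 10, 23, 19, 50)},
        {"name": "secondary:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2018, 1, 10, 23, 19, 48)},
        {"name": "azureslave:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2018, 1, 10, 23, 12, 25)},
    ]
}

for name, lag in lagging_members(sample_status).items():
    print(f"{name} lag of {lag:.0f} sec exceeds threshold 300")
```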
Primary running out of memory:
Jan 10 23:31:01 primary kernel: [2494318.535214] Out of memory: Kill process 1100 (mongod) score 954 or sacrifice child
Jan 10 23:31:01 primary kernel: [2494318.548316] Killed process 1100 (mongod) total-vm:17422016kB, anon-rss:15654812kB, file-rss:0kB
Secondary running out of memory:
Jan 10 23:33:46 secondary kernel: [2496035.849027] Out of memory: Kill process 26160 (mongod) score 955 or sacrifice child
Jan 10 23:33:46 secondary kernel: [2496035.862134] Killed process 26160 (mongod) total-vm:17415872kB, anon-rss:15675724kB, file-rss:0kB
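In case it helps with comparing the two kills, the memory figures can be pulled out of the "Killed process" lines with a small script. This is only a sketch: the regular expression is written against the exact line format shown above and may not match kernels that print additional fields, and the function name is made up for illustration.

```python
import re

# Matches the "Killed process" kernel line format shown in the logs above.
OOM_KILL_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm_kb>\d+)kB, "
    r"anon-rss:(?P<anon_rss_kb>\d+)kB, "
    r"file-rss:(?P<file_rss_kb>\d+)kB"
)

def parse_oom_kill(line):
    """Return pid, process name and memory sizes (kB) from an OOM kill line,
    or None if the line does not match the expected format."""
    match = OOM_KILL_RE.search(line)
    if match is None:
        return None
    # Every captured field except the process name is an integer.
    return {
        key: (value if key == "comm" else int(value))
        for key, value in match.groupdict().items()
    }

primary_line = (
    "Jan 10 23:31:01 primary kernel: [2494318.548316] Killed process 1100 "
    "(mongod) total-vm:17422016kB, anon-rss:15654812kB, file-rss:0kB"
)
print(parse_oom_kill(primary_line))
```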
Total memory on the primary and on the secondary is 16431148 KB each. The Azure secondary has 14350764 KB.
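To put the numbers side by side, here is a rough calculation of the expected default cache size against the resident memory the kernel reported. It assumes the default documented for MongoDB 3.4, which is the larger of 50% of (RAM minus 1 GB) and 256 MB. It also treats the KB figures above as KiB. The function and variable names are only for illustration.

```python
KIB_PER_GIB = 1024 * 1024

def default_wt_cache_kib(total_ram_kib):
    """Default WiredTiger cache size, assuming the MongoDB 3.4 rule:
    the larger of 50% of (RAM - 1 GB) and 256 MB."""
    half_of_ram_minus_1g = (total_ram_kib - KIB_PER_GIB) // 2
    return max(half_of_ram_minus_1g, 256 * 1024)

total_ram_kib = 16431148  # primary and secondary, from this report
anon_rss_kib = {
    "primary": 15654812,    # from the kernel OOM log above
    "secondary": 15675724,  # from the kernel OOM log above
}

expected_cache_kib = default_wt_cache_kib(total_ram_kib)
print(f"expected default cache: {expected_cache_kib / KIB_PER_GIB:.2f} GiB")

for node, rss in anon_rss_kib.items():
    print(
        f"{node}: anon-rss {rss / KIB_PER_GIB:.2f} GiB, "
        f"{rss / total_ram_kib:.1%} of RAM, "
        f"{rss / expected_cache_kib:.2f}x the expected cache"
    )
```

Under these assumptions the cache alone should account for roughly 7.3 GiB, while mongod's anonymous resident memory was about 95% of total RAM on both nodes when it was killed.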
I can provide the FTDC logs privately if you are interested.
- related to
- SERVER-32398 Primary freezes during background index build (Closed)