-
Type: Bug
-
Resolution: Won't Fix
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Internal Code
-
None
-
ALL
-
This is different from the OOM killer, which can't be controlled. When MongoDB detects that the system cannot allocate memory, it should try to recover if possible rather than die immediately. If recovery is not possible, it would also be preferable to shut down cleanly.
A pause in the server is far preferable to immediate process death: MongoDB can afford to flush or free cache memory under this type of memory pressure.
ENOMEM or a NULL return from malloc does not need to cause MongoDB to blow itself up, especially as MongoDB is most often the owner of most of the memory on a system.
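To illustrate the kind of handling being asked for, here is a minimal C++ sketch. It is not mongod's actual allocation path. `allocOrRecover`, `AllocFn`, and `ReleaseMemoryFn` are hypothetical names, and the release hook stands in for whatever would flush or free cache memory. The allocator is injected so the failure path can be exercised without exhausting real memory.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Hypothetical allocator signature; in practice this could wrap std::malloc.
using AllocFn = std::function<void*(std::size_t)>;

// Hypothetical hook that tries to release memory (for example, cache pages).
// Returns true if it believes some memory was freed.
using ReleaseMemoryFn = std::function<bool()>;

// Try to allocate; on failure, ask the hook to release memory and retry.
// Throws std::bad_alloc once the retries are exhausted or the hook gives up,
// so the caller can fail the operation or begin a clean shutdown.
void* allocOrRecover(std::size_t bytes,
                     const AllocFn& alloc,
                     const ReleaseMemoryFn& release,
                     int maxRetries = 3) {
    for (int attempt = 0; attempt <= maxRetries; ++attempt) {
        if (void* p = alloc(bytes)) {
            return p;  // allocation succeeded
        }
        if (attempt == maxRetries || !release()) {
            break;  // no retries left, or nothing more could be freed
        }
    }
    throw std::bad_alloc();
}
```

Throwing `std::bad_alloc` instead of exiting gives the caller a chance to fail a single operation or start an orderly shutdown, rather than killing the process at the allocation site.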
The benefits of recovery and/or a clean shutdown should be obvious. Gracefully handling system and resource limits could help avoid rollbacks and/or corrupted data.
This would be very useful when overcommit is disabled on Linux, which is very desirable in enterprise server environments, where predictable behavior is highly preferable to allowing the OOM killer to kill things.
Here's a backtrace from an unclean out-of-memory shutdown on a replicated shard in a cluster.
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"564C71955000","o":"15786B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"564C71955000","o":"1577CE4","s":"_ZN5mongo29reportOutOfMemoryErrorAndExitEv"},{"b":"564C71955000","o":"14E5A81","s":"_ZN5mongo12mongoReallocEPvm"},{"b":"564C71955000","o":"889385","s":"_ZN5mongo11_BufBuilderINS_21SharedBufferAllocatorEE15grow_reallocateEi"},{"b":"564C71955000","o":"131FF75","s":"_ZN5mongo3rpc19CommandReplyBuilder22getInPlaceReplyBuilderEm"},{"b":"564C71955000","o":"A6F131","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"564C71955000","o":"A70C61","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"564C71955000","o":"10895C0","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},{"b":"564C71955000","o":"C8EE98","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"564C71955000","o":"88BFFD","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},{"b":"564C71955000","o":"88C92D"},{"b":"564C71955000","o":"14E0401"},{"b":"7FF2C4B92000","o":"7E25"},{"b":"7FF2C47CF000","o":"F834D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.10", "gitVersion" : "078f28920cb24de0dd479b5ea6c66c644f6326e9", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.10.0-693.11.1.el7.x86_64", "version" : "#1 SMP Mon Dec 4 23:52:40 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "564C71955000", "elfType" : 3, "buildId" : "94C7FAB092E567C9338D13DB9B68751363D15EFD" }, { "b" : "7FFF4D0F9000", "elfType" : 3, "buildId" : "4D9C78C211890A0E48180A6194B1837FC9DECA70" }, { "b" : "7FF2C5B33000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "ED0AC7DEB91A242C194B3DEF27A215F41CE43116" }, { "b" : "7FF2C56D2000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "BC0AE9CA0705BEC1F0C0375AAD839843BB219CB1" }, { "b" : "7FF2C54CA000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "6D322588B36D2617C03C0F3B93677E62FCFFDA81" }, { "b" : "7FF2C52C6000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "1E42EBFB272D37B726F457D6FE3C33D2B094BB69" }, { "b" : "7FF2C4FC4000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "808BD35686C193F218A5AAAC6194C49214CFF379" }, { "b" : "7FF2C4DAE000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3E85E6D20D2CE9CDAD535084BEA56620BAAD687C" }, { "b" : "7FF2C4B92000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A48D21B2578A8381FBD8857802EAA660504248DC" }, { "b" : "7FF2C47CF000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "95FF02A4BEBABC573C7827A66D447F7BABDDAA44" }, { "b" : "7FF2C5DA5000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "22FA66DA7D14C88BF36C69454A357E5F1DEFAE4E" }, { "b" : "7FF2C4582000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "DA322D74F55A0C4293085371A8D0E94B5962F5E7" }, { "b" : "7FF2C429A000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "B69E63024D408E400401EEA6815317BDA38FB7C2" }, { "b" : "7FF2C4096000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "A3832734347DCA522438308C9F08F45524C65C9B" }, { "b" : "7FF2C3E63000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "A48639BF901DB554479BFAD114CB354CF63D7D6E" }, { "b" : "7FF2C3C4D000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "EA8E45DC8E395CC5E26890470112D97A1F1E0B65" }, { "b" : "7FF2C3A3F000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "6FDF5B013FD2739D304CFB9D723DCBC149EE03C9" }, { "b" : "7FF2C383B000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "2E01D5AC08C1280D013AAB96B292AC58BC30A263" }, { "b" : "7FF2C3621000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "FF4E72F4E574E143330FB3C66DB51613B0EC65EA" }, { "b" : "7FF2C33FA000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "A88379F56A51950A33198890D37F5F8AEE71F8B4" }, { "b" : "7FF2C3198000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "9CA3D11F018BEEB719CDB34BE800BF1641350D0A" } ] }}
mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x564c72ecd6b1]
mongod(_ZN5mongo29reportOutOfMemoryErrorAndExitEv+0x84) [0x564c72eccce4]
mongod(_ZN5mongo12mongoReallocEPvm+0x21) [0x564c72e3aa81]
mongod(_ZN5mongo11_BufBuilderINS_21SharedBufferAllocatorEE15grow_reallocateEi+0x55) [0x564c721de385]
mongod(_ZN5mongo3rpc19CommandReplyBuilder22getInPlaceReplyBuilderEm+0x35) [0x564c72c74f75]
mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0xB1) [0x564c723c4131]
mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xF81) [0x564c723c5c61]
mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x240) [0x564c729de5c0]
mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xD38) [0x564c725e3e98]
mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x564c721e0ffd]
mongod(+0x88C92D) [0x564c721e192d]
mongod(+0x14E0401) [0x564c72e35401]
libpthread.so.0(+0x7E25) [0x7ff2c4b99e25]
libc.so.6(clone+0x6D) [0x7ff2c48c734d]
----- END BACKTRACE -----