- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.6.6, 2.8.0-rc3
- Component/s: Aggregation Framework, Concurrency
- None
- Fully Compatible
- ALL
Dropping a collection during an active aggregation operation associated with that collection can crash the server.
The aggregate command creates and pins a cursor for the operation being run, and it is expected that an intent lock on the collection is held during the pin and unpin operations, to guard access to ClientCursor::_collection (which can be written to by other threads, e.g. from CollectionCursorCache::invalidateAll()). However, the aggregate command drops the collection intent lock before the unpin happens. This causes the write to _collection from ClientCursor::kill() in the drop operation to race with the read of _collection from ClientCursorPin::deleteUnderlying() in the aggregation operation.
The following scenarios also suffer from the same issue:
- The case in which the aggregate command saves the cursor.
- getMore on an aggregation cursor.
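The lock ordering described above can be sketched in a few lines of C++. This is a simplified illustration, not the server's code: `Collection`, `Cursor`, `collectionLock`, and all function names are hypothetical stand-ins, and a single `std::mutex` stands in for the server's intent locks. The only point is that the read of the cursor's collection pointer during unpin must happen while the lock that drop also takes is still held.

```cpp
#include <mutex>

// Hypothetical stand-ins for illustration; these are NOT the real MongoDB classes.
struct Collection {
    int registeredCursors = 0;
};

struct Cursor {
    Collection* collection;  // written by drop, read by unpin
};

// Assumption: one plain mutex plays the role of the collection lock.
std::mutex collectionLock;

// Models the drop path: under the lock, detach the cursor from its collection.
void dropCollection(Cursor& cursor) {
    std::lock_guard<std::mutex> guard(collectionLock);
    cursor.collection = nullptr;
}

// Models the unpin path: reads cursor.collection, so the caller must hold the lock.
// Returns true if the cursor was still registered and has now been deregistered.
bool unpinAndDelete(Cursor& cursor) {
    if (cursor.collection == nullptr) {
        return false;  // cursor was already killed by a drop
    }
    --cursor.collection->registeredCursors;
    cursor.collection = nullptr;
    return true;
}

// Ordering the ticket describes as buggy: the lock is released before the unpin,
// so a concurrent dropCollection() can race with the read in unpinAndDelete().
bool runAggregationUnsafe(Cursor& cursor) {
    {
        std::lock_guard<std::mutex> guard(collectionLock);
        // ... pin the cursor and run the pipeline ...
    }
    return unpinAndDelete(cursor);  // lock no longer held
}

// Safe ordering: the lock covers both the pin and the unpin.
bool runAggregationSafe(Cursor& cursor) {
    std::lock_guard<std::mutex> guard(collectionLock);
    // ... pin the cursor and run the pipeline ...
    return unpinAndDelete(cursor);  // lock still held
}
```

With `runAggregationSafe`, a drop running on another thread either finishes before the aggregation takes the lock (the unpin then sees a null pointer and returns false) or waits until the unpin is complete. With `runAggregationUnsafe`, the two threads can touch `cursor.collection` at the same time.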
Example log output:
2014-12-19T12:32:54.420-0500 I COMMAND [conn1] CMD: drop test.foo
2014-12-19T12:32:54.911-0500 I NETWORK [initandlisten] connection accepted from 127.0.0.1:43924 #2 (2 connections now open)
2014-12-19T12:32:55.518-0500 I COMMAND [conn1] CMD: drop test.foo
2014-12-19T12:32:56.524-0500 I NETWORK [conn1] end connection 127.0.0.1:43923 (1 connection now open)
2014-12-19T12:32:59.931-0500 F - [conn2] Invalid access at address: 0x168
2014-12-19T12:32:59.955-0500 F - [conn2] Got signal: 11 (Segmentation fault).
0x194265a 0x1941b85 0x1941fbf 0x7f23ca1e5340 0x7f23ca1df414 0x12971da 0x1297281 0x1296c5e 0x12c03e8 0x13132a3 0x13147fe 0x13681a1 0x136911e 0x1369a00 0x156dc56 0x156f9bb 0x147404b 0x147515d 0x1173c9a 0x1908aef 0x7f23ca1dd182 0x7f23c92de30d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"154265A"},{"b":"400000","o":"1541B85"},{"b":"400000","o":"1541FBF"},{"b":"7F23CA1D5000","o":"10340"},{"b":"7F23CA1D5000","o":"A414"},{"b":"400000","o":"E971DA"},{"b":"400000","o":"E97281"},{"b":"400000","o":"E96C5E"},{"b":"400000","o":"EC03E8"},{"b":"400000","o":"F132A3"},{"b":"400000","o":"F147FE"},{"b":"400000","o":"F681A1"},{"b":"400000","o":"F6911E"},{"b":"400000","o":"F69A00"},{"b":"400000","o":"116DC56"},{"b":"400000","o":"116F9BB"},{"b":"400000","o":"107404B"},{"b":"400000","o":"107515D"},{"b":"400000","o":"D73C9A"},{"b":"400000","o":"1508AEF"},{"b":"7F23CA1D5000","o":"8182"},{"b":"7F23C91E3000","o":"FB30D"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc3", "gitVersion" : "2d679247f17dab05a492c8b6d2c250dab18e54f2", "uname" : { "sysname" : "Linux", "release" : "3.13.0-24-generic", "version" : "#46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "24EFBDFF2B7691EDBF58E0FE28F9238553CB26B6" }, { "b" : "7FFF77AFE000", "elfType" : 3, "buildId" : "6755FAD2CADACDF1667E5B57FF1EDFC28DD1C976" }, { "b" : "7F23CA1D5000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "FE662C4D7B14EE804E0C1902FB55218A106BC5CB" }, { "b" : "7F23C9FCD000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F23C9DC9000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F23C9AC5000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "19EFDDAB11B3BF5C71570078C59F91CF6592CE9E" }, { "b" : "7F23C97BF000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "574C6350381DA194C00FF555E0C1784618C05569" }, { "b" : "7F23C95A9000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "CC0D578C2E0D86237CA7B0CE8913261C506A629A" }, { "b" : "7F23C91E3000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B571F83A8A6F5BB22D3558CDDDA9F943A2A67FD1" }, { "b" : "7F23CA3F3000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x27) [0x194265a]
 mongod(+0x1541B85) [0x1941b85]
 mongod(+0x1541FBF) [0x1941fbf]
 libpthread.so.0(+0x10340) [0x7f23ca1e5340]
 libpthread.so.0(pthread_mutex_lock+0x4) [0x7f23ca1df414]
 mongod(_ZN5mongo11SimpleMutex4lockEv+0x18) [0x12971da]
 mongod(_ZN5mongo11SimpleMutex11scoped_lockC1ERS0_+0x37) [0x1297281]
 mongod(_ZN5mongo21CollectionCursorCache16deregisterCursorEPNS_12ClientCursorE+0x28) [0x1296c5e]
 mongod(_ZN5mongo15ClientCursorPin16deleteUnderlyingEv+0x90) [0x12c03e8]
 mongod(+0xF132A3) [0x13132a3]
 mongod(_ZN5mongo15PipelineCommand3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x8EA) [0x13147fe]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x96) [0x13681a1]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xB9E) [0x136911e]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x491) [0x1369a00]
 mongod(+0x116DC56) [0x156dc56]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERNS_5CurOpES3_b+0x39A) [0x156f9bb]
 mongod(+0x107404B) [0x147404b]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0x470) [0x147515d]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x102) [0x1173c9a]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4D6) [0x1908aef]
 libpthread.so.0(+0x8182) [0x7f23ca1dd182]
 libc.so.6(clone+0x6D) [0x7f23c92de30d]
----- END BACKTRACE -----
Related to:
- SERVER-17624 Interrupting aggregation operation can trip fatal assertion (Closed)