Type: Bug
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: 3.4.6
Component/s: Aggregation Framework, Performance
Environment: ALL
We have the following collection, with 1773732 documents:
COMPANIES:
{ "_id" : "", "companyId" : "aCompanyId", "name" : "aCompany", "status" : "ACTIVE", "description" : "a description", "departments" : [ { "departmentId" : "dep1" } ], "createdOn" : NumberLong(1475667096236), "modifiedOn" : NumberLong(1490715174421), "version" : NumberLong(2) }
Now we have a query that takes a random sample of 20 companies, filters out inactive ones, performs a basic projection, and then picks 10 random documents again (let's assume more than 50% of the companies are active):
db.companies.aggregate([
  { "$sample": { "size": 20 } },
  { "$match": { "$or": [ { "status": "ACTIVE" }, { "status": "$exists" } ] } },
  { "$project": { "postcodeId": 1, "postTown": 1, "description": 1 } },
  { "$sample": { "size": 10 } }
])
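For reference, the same pipeline expressed from Python (a minimal sketch: the driver call is only shown as a comment and assumes the standard pymongo API, so the snippet itself runs without a live server; database and collection names are taken from this report):

```python
# The pipeline from this report as plain Python data. Building it needs no
# server connection; executing it would require pymongo and a deployment.
pipeline = [
    {"$sample": {"size": 20}},                   # draw 20 random documents
    {"$match": {"$or": [{"status": "ACTIVE"},
                        # note: as written this clause matches documents whose
                        # status equals the literal string "$exists"
                        {"status": "$exists"}]}},
    {"$project": {"postcodeId": 1, "postTown": 1, "description": 1}},
    {"$sample": {"size": 10}},                   # re-sample 10 of the survivors
]

# With a live deployment this would be executed roughly as:
#   from pymongo import MongoClient
#   db = MongoClient()["Whatever"]
#   docs = list(db.companies.aggregate(pipeline))
```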
This query works fine on MongoDB 3.2 (we don't have detailed measurements, but I'd estimate p95 < 200 ms).
However, we've noticed serious performance problems after upgrading to 3.4.6: execution times are an order of magnitude higher than on 3.2 (it takes ~1.5-2 s to complete).
This is what I saw in mongodb logs on 3.4:
2017-07-26T14:36:36.256+0000 I COMMAND [conn542] command Whatever.companies command: aggregate { aggregate: "companies", pipeline: [ { $sample: { size: 20.0 } }, { $match: { $or: [ { status: "ACTIVE" }, { status: "$exists" } ] } }, { $project: { companyId: 1.0, name: 1.0 } }, { $sample: { size: 10.0 } } ], explain: false, cursor: {} } planSummary: MULTI_ITERATOR keysExamined:0 docsExamined:0 cursorExhausted:1 numYields:104 nreturned:10 reslen:1181 locks:{ Global: { acquireCount: { r: 218 } }, Database: { acquireCount: { r: 109 } }, Collection: { acquireCount: { r: 108 } } } protocol:op_command 1190ms
As you can see, there is a lot of locking going on; on 3.2 the same query produces only about ~15 global read locks. What is also strange: if I get rid of the $project and fetch all the data, latencies are fine.
However, when I reverse the logic of the projection, so it becomes:
db.companies.aggregate([
  { "$sample": { "size": 20 } },
  { "$match": { "$or": [ { "status": "ACTIVE" }, { "status": "$exists" } ] } },
  { "$project": { "departments": 0, "createdOn": 0 ... (exclude every single field) } },
  { "$sample": { "size": 10 } }
])
it works as fast as on 3.2, and the number of global locks also decreases.
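To keep the two forms in sync while using the faster exclusion projection, the exclusion document can be generated mechanically from the list of fields we want to keep. A small hypothetical helper (the field list below is an assumption taken from the sample document in this report):

```python
def inclusion_to_exclusion(keep, all_fields):
    """Turn an inclusion projection (fields to keep) into the equivalent
    exclusion projection (every other field set to 0).

    all_fields must list every top-level field of the documents. _id is
    returned by default in both projection forms, so it is never excluded.
    """
    return {f: 0 for f in all_fields if f not in keep and f != "_id"}

# Top-level fields of the sample company document from this report:
ALL_FIELDS = ["_id", "companyId", "name", "status", "description",
              "departments", "createdOn", "modifiedOn", "version"]

# Exclusion equivalent of {"companyId": 1, "name": 1}:
exclusion = inclusion_to_exclusion(["companyId", "name"], ALL_FIELDS)
```

This only works while the document schema is stable; a field added to the collection but missing from ALL_FIELDS would silently leak through the exclusion projection.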
Additional information:
- cluster configuration: 3-node replica set
- indexes: companyId, departments.departmentId
Should I consider it a bug? Or maybe I'm missing something?
Duplicates:
- SERVER-23408 Use a smarter batch size for small samples (Closed)

Is related to:
- SERVER-22402 Add dependency tracking to the $match stage (Closed)