-
Type: Improvement
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Query Optimization 2021-05-03
A find command with a large skip, run through mongos, generates heavy network traffic from mongod to mongos. For example: db.collection.find({xxx}).sort({xxx}).skip(5000).limit(10)
Mongos fetches the documents from all shards with no skip applied, performs the skip itself, and then returns the final documents to the client.
In my opinion, this strategy is fine when the request is sent to multiple shards, because an individual mongod cannot know how many of its own documents to skip.
But when the request is sent to a single shard, the situation is different. The target mongod knows how to skip correctly, so the skip can be done in mongod and there is no need to return so many redundant documents. This would mean less network traffic and less work for mongos.
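To make the difference concrete, here is a minimal sketch in plain JavaScript that simulates the two strategies. It is not mongos code: the function names routerSideSkip and shardSideSkip are hypothetical, each "shard" is just an in-memory array already sorted on b, and "transferred" simply counts the documents that would cross the network.

```javascript
// Hypothetical simulation only; this is not how mongos is implemented.
// Each "shard" is an array of documents already sorted by field b.

// Multi-shard strategy: a shard cannot know which of its documents fall
// inside the global skip window, so the router asks every shard for
// (skip + limit) documents with no skip, then merges and skips itself.
function routerSideSkip(shards, skip, limit) {
  let transferred = 0;
  let merged = [];
  for (const shard of shards) {
    const batch = shard.slice(0, skip + limit);
    transferred += batch.length;
    merged = merged.concat(batch);
  }
  merged.sort((x, y) => x.b - y.b);
  return { docs: merged.slice(skip, skip + limit), transferred };
}

// Single-shard strategy: the shard sees the whole result set, so the
// skip can be applied on the shard and only `limit` documents travel.
function shardSideSkip(shard, skip, limit) {
  const docs = shard.slice(skip, skip + limit);
  return { docs, transferred: docs.length };
}

// One shard holding 6000 matching documents (made-up data for the sketch).
const shard = [];
for (let i = 0; i < 6000; i++) shard.push({ a: 1, b: i });

const before = routerSideSkip([shard], 5000, 10);
const after = shardSideSkip(shard, 5000, 10);
console.log(before.transferred, after.transferred); // 5010 10
```

Both calls return the same 10 documents; only the number of documents sent to the router changes.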
The idea is shown below (the black area is what needs to be done):
I tested this idea in mongos 3.2 with an Intel Xeon CPU and a 10Gbps network, and I saw a great performance improvement (10x).
mongos version | request num | thread num | total time cost (seconds) | network traffic (MB/s) | mongos CPU (peak) | mongod CPU (peak) |
---|---|---|---|---|---|---|
original | 200 | 5 | 6.3 | 120 | 30% | 13% |
after optimization | 200 | 5 | 0.6 | <1 | 1.7% | 14% |
CPU utilization was observed with the Linux tool top.
Network traffic was observed with the Linux tool sar.
The test data was generated with:
for (var i=0;i<10;i++) {db.testcoll.insert({a:1,b:i,c:"someBigString"}); sleep(10);}
The query requests were:
db.testcoll.find({a:1}).skip(5000).limit(10)
- duplicates
-
SERVER-36290 find command on unsharded collection through mongos returns too much data from mongod to mongos
- Closed
- is related to
-
SERVER-10844 Single shard queries can be optimized (sort/skip)
- Closed
-
SERVER-36290 find command on unsharded collection through mongos returns too much data from mongod to mongos
- Closed