Type: Improvement
Resolution: Duplicate
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: pull-request
Sprint: Query Optimization 2021-05-03
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

A find command with a large skip, executed through mongos, generates heavy network traffic from mongod to mongos. For example: db.collection.find({xxx}).sort({xxx}).skip(5000).limit(10)

Mongos fetches the documents from all shards without applying the skip, applies the skip itself, and then returns the final documents to the client.

In my opinion, this strategy is fine when the request is sent to multiple shards, because an individual mongod cannot know how many documents it should skip.

But when the request is sent to a single shard, the situation is different. The target mongod knows how to skip correctly, so the skip can be done in mongod and there is no need to return the redundant documents. This would lead to less network traffic and less work for mongos.
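The difference between the two cases can be sketched with a small in-memory model. This is only an illustration: the function names are hypothetical and are not MongoDB APIs, each "shard" is a plain array already sorted by b, and the model assumes that in the merged case every shard returns its first skip + limit documents.

```javascript
// Hypothetical in-memory model; none of these names are MongoDB APIs.
// A "shard" is an array of documents already sorted by the key b.
function shardFind(shard, skip, limit) {
  return shard.slice(skip, skip + limit);
}

// Merged case: the merger cannot know how many of the skipped documents
// live on each shard, so each shard returns its first skip + limit
// documents and the skip is applied only after merging.
function mergeFind(shards, skip, limit) {
  const partial = shards.flatMap((s) => shardFind(s, 0, skip + limit));
  partial.sort((x, y) => x.b - y.b);
  return {
    docs: partial.slice(skip, skip + limit),
    transferred: partial.length, // documents sent from shards to the merger
  };
}

// Single-shard case: the skip can be pushed down to the shard itself.
function singleShardFind(shard, skip, limit) {
  const docs = shardFind(shard, skip, limit);
  return { docs, transferred: docs.length };
}

// Same query against one shard, with and without pushing the skip down.
const demoShard = Array.from({ length: 6000 }, (_, i) => ({ a: 1, b: i }));
console.log(mergeFind([demoShard], 5000, 10).transferred);     // 5010
console.log(singleShardFind(demoShard, 5000, 10).transferred); // 10
```

In this model both paths return the same ten documents for a single shard, but the pushed-down version moves 10 documents instead of 5010.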

The idea is shown below (the black area is what needs to be done):

I tested this idea in mongos 3.2 on an Intel Xeon CPU with a 10 Gbps network, and I saw a large performance improvement (about 10x).

mongos version	requests	threads	total time (seconds)	network traffic (MB/s)	mongos CPU (peak)	mongod CPU (peak)
original	200	5	6.3	120	30%	13%
optimized	200	5	0.6	<1	1.7%	14%

CPU utilization was observed with the Linux tool top.

Network traffic was observed with the Linux tool sar.
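For reference, the ratios implied by the table can be worked out directly. The snippet below is only arithmetic on the reported request counts and total times; the variable and function names are illustrative.

```javascript
// Figures copied from the results table in this report.
const runs = {
  original: { requests: 200, seconds: 6.3 },
  optimized: { requests: 200, seconds: 0.6 },
};

// Requests completed per second for one run.
function throughput(run) {
  return run.requests / run.seconds;
}

// Ratio of total times, i.e. the overall speedup.
const speedup = runs.original.seconds / runs.optimized.seconds;

console.log(speedup.toFixed(1));                    // 10.5
console.log(throughput(runs.original).toFixed(1));  // 31.7
console.log(throughput(runs.optimized).toFixed(1)); // 333.3
```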

The test data is:

for (var i=0;i<10;i++) {db.testcoll.insert({a:1,b:i,c:"someBigString"}); sleep(10);}

The query requests are:

db.testcoll.find({a:1}).skip(5000).limit(10)

planExplainAfter.txt (5 kB, May 30 2019 03:56:12 AM UTC)
planExplainBefore.txt (5 kB, May 30 2019 03:56:12 AM UTC)
skipOptimize.png (34 kB, May 28 2019 10:04:49 AM UTC)
skiptest.go (3 kB, May 28 2019 10:04:03 AM UTC)

duplicates

SERVER-36290 find command on unsharded collection through mongos returns too much data from mongod to mongos

Closed

is related to

SERVER-10844 Single shard queries can be optimized (sort/skip)

Closed

SERVER-36290 find command on unsharded collection through mongos returns too much data from mongod to mongos

Closed

links to

Pull Request for v4.0 #1326

Assignee: James Wahlin
Reporter: peng zhenyi
Participants: AN D, Eric Sedor, James Wahlin, peng zhenyi
Votes: 4
Watchers: 17

Created: May 28 2019 10:03:27 AM UTC
Updated: Apr 29 2021 04:56:24 PM UTC
Resolved: Apr 29 2021 04:56:24 PM UTC
