- Type: Task
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Optimization
The BigCollection benchmark in mongo-perf runs multiple tests in which the total data size of the collection is held constant while the number of documents increases and the document size decreases by the same factor. Both the Scan and Filter queries show a cliff in throughput for the collection with the largest number of documents (1638400). This holds for both the classic and SBE engines.
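For context, a minimal reproduction sketch (not the actual mongo-perf BigCollection test definition): it builds collections whose total data size stays at roughly 400 MiB (the product of the counts and sizes in the table below) while the document count varies, then runs a full scan and a simple filter. The connection string, database and collection names, and the document shape (`key`/`pad` fields) are assumptions for illustration only.

```python
# Reproduction sketch (assumptions noted above); not the mongo-perf BigCollection code.
from pymongo import MongoClient

TOTAL_BYTES = 400 * 1024 * 1024                    # fixed total data size, per the table
DOC_COUNTS = [25, 400, 6400, 102400, 1638400]      # document counts from the table

client = MongoClient("mongodb://localhost:27017")  # assumed local mongod
db = client["bigcollection_repro"]                 # hypothetical database name

for count in DOC_COUNTS:
    doc_size = TOTAL_BYTES // count                # target per-document size in bytes
    coll = db[f"docs_{count}"]                     # hypothetical collection name
    coll.drop()

    # Pad each document to roughly doc_size bytes, leaving headroom for BSON overhead
    # so the 16 MiB documents stay under the BSON document size limit.
    padding = "x" * max(doc_size - 256, 1)
    coll.insert_many(
        ({"_id": i, "key": i % 100, "pad": padding} for i in range(count))
    )

    # "Scan": full collection scan.  "Filter": a predicate that touches every document.
    scanned = sum(1 for _ in coll.find({}))
    matched = sum(1 for _ in coll.find({"key": {"$lt": 50}}))
    print(f"{count} docs of ~{doc_size} B: scanned={scanned}, matched={matched}")
```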
From the investigation in SERVER-80583 on a VM, the single-thread throughput (ops/sec) was:
| Document count | Document size (bytes) | Batch size | Classic | SBE |
|---|---|---|---|---|
| 25 | 16777216 | 0 | 2.286 | 2.152 |
| 400 | 1048576 | 0 | 2.781 | 2.505 |
| 6400 | 65536 | 0 | 2.788 | 2.583 |
| 102400 | 4096 | 0 | 2.248 | 2.145 |
| 1638400 | 256 | 0 | 0.702 | 0.799 |
| 400 | 1048576 | 1 | 2.937 | 2.823 |
| 6400 | 65536 | 16 | 2.846 | 2.825 |
| 102400 | 4096 | 256 | 2.358 | 2.432 |
| 1638400 | 256 | 4096 | 0.745 | 0.901 |
This appears to be partially due to WiredTiger and partially due to predicate evaluation (higher computational cost for the larger number of documents). Excerpt from the flame graphs in the attachment for the SBE engine:
- PlanExecutorSBE::getNext: 1.89% vs. 44.73%
- FilterStage::getNext: 1.67% vs. 40.77%
- WiredTigerRecordStoreCursorBase::next: 0.93% vs. 23.84%
- sbe::vm::ByteCode::runPredicate: 0.24% vs. 7.46%
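As a complementary check that does not require a profiler, explain with executionStats reports per-query docsExamined and execution time. The sketch below reuses the hypothetical database/collection names and filter from the earlier snippet; they are assumptions, not part of the benchmark.

```python
# Compare executionStats for the smallest and largest document counts.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumed local mongod
db = client["bigcollection_repro"]                 # hypothetical database from the sketch above

for count in (25, 1638400):
    plan = db.command(
        "explain",
        {"find": f"docs_{count}", "filter": {"key": {"$lt": 50}}},
        verbosity="executionStats",
    )
    stats = plan["executionStats"]
    print(count, stats["totalDocsExamined"], stats["executionTimeMillis"])
```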