- Type: Bug
- Resolution: Works as Designed
- Priority: Major - P3
- Affects Version/s: 4.2.0
- Component/s: Performance
- Assigned Team: Server Triage
- Operating System: ALL
I am seeing a huge performance difference between db.col.find({_id: someid}) and db.col.aggregate([{$match: {_id: someid}}]) when run against the same _id values. In the simplest case, demonstrated with the test script below, the documents contain nothing but the _id field; in the actual application, a $lookup stage follows the $match.
Since the query is on the _id field, there is no collection scan or any of the other usual suspects. I checked profiling and the logs and found nothing of interest: each of the slow queries takes under 1 ms, so nothing flags it. With real data, however, the difference is significant enough that it would be cheaper to abandon the aggregation framework entirely, which is undesirable given its obvious benefits.
All tests below are against a MongoDB server 4.2.0.
The test below inserts 10,000 documents containing only an _id field, then looks up the first 5 of them 10,000 times each, once with find and once with aggregate/$match. The measured times are:
find: 0.544 seconds
aggregate: 15.551 seconds
I ran this on Windows and on Linux with the same outcome. The hardware has plenty of RAM and an SSD.
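The attached test script did not survive in this copy of the ticket. A minimal mongosh sketch with the shape described above (it must be run against a live deployment; the database name "perftest", the collection name "coll", and the helper names are illustrative, not taken from the ticket) could look like:

```javascript
// Reconstruction of the benchmark described above. Run in mongosh against a
// running mongod; names below are placeholders, not from the original ticket.
const coll = db.getSiblingDB("perftest").coll;
coll.drop();

// Insert 10,000 documents that contain nothing but an _id field.
const docs = [];
for (let i = 0; i < 10000; i++) docs.push({ _id: i });
coll.insertMany(docs);

// Time 10,000 rounds of looking up the first 5 _id values with a given access path.
function bench(lookup) {
  const start = Date.now();
  for (let round = 0; round < 10000; round++) {
    for (let id = 0; id < 5; id++) lookup(id);
  }
  return (Date.now() - start) / 1000;
}

const findSecs = bench(id => coll.find({ _id: id }).toArray());
const aggSecs = bench(id => coll.aggregate([{ $match: { _id: id } }]).toArray());
print(`find: ${findSecs}s  aggregate: ${aggSecs}s`);
```

Both lookups return the same single document per call; only the access path differs, which is what isolates the find-vs-aggregate overhead being reported.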