Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Team: Storage Execution
Consider a set of updates in a single update command.
Each update is a pair: [{q1, u1}, {q2, u2}, ..., {qN, uN}] - apply update ui to all documents that match qi.
Right now we execute them sequentially: fetch q1, apply u1; fetch q2, apply u2; and so on.
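The current sequential behavior can be sketched roughly as follows, assuming an in-memory list of dicts, equality-only predicates, and $set-only updates; all names here are illustrative, not actual server code.

```python
def matches(doc, query):
    """Equality-only predicate: every key in `query` must equal the doc's value."""
    return all(doc.get(k) == v for k, v in query.items())

def execute_sequentially(collection, updates):
    """updates: list of (qi, ui) pairs; one scan of the collection per pair."""
    for qi, ui in updates:
        # "fetch qi, apply ui" - each update runs its own query.
        for doc in collection:
            if matches(doc, qi):
                doc.update(ui.get("$set", {}))
                break  # default (non-multi) update touches one document
    return collection
```

This is N independent query executions, which is the overhead the proposal below tries to amortize.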
What if we do the following:
- Create a big query Q = {$or: [q1, q2, ..., qN]}.
- Start by marking each update as not executed.
- For each document D returned by Q:
  - For each not-yet-executed update {qi, ui}: if the document matches qi, apply update ui and mark the update as executed.
  - Proceed with the updated document. So if a document D initially matches q1 and, after applying u1, starts to match q2, we apply u2 to it as well.
- After Q is exhausted, insert a new document for each not-executed update that has upsert: true.
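The steps above can be sketched in Python, again assuming an in-memory list of dicts, equality-only predicates, and $set-only updates; the function names are illustrative. Note that, as written, an update is skipped once it has been marked executed, which mirrors default (non-multi) update semantics.

```python
def matches(doc, query):
    """Equality-only predicate: every key in `query` must equal the doc's value."""
    return all(doc.get(k) == v for k, v in query.items())

def apply_update(doc, update):
    """$set-only update, applied in place."""
    doc.update(update.get("$set", {}))

def execute_batched_updates(collection, updates):
    """updates: list of (qi, ui, upsert) triples."""
    executed = [False] * len(updates)

    # A single scan models the big query Q = {$or: [q1, ..., qN]}: each
    # document is tested against every not-yet-executed predicate.
    for doc in collection:
        for i, (qi, ui, _) in enumerate(updates):
            if not executed[i] and matches(doc, qi):
                apply_update(doc, ui)  # proceed with the updated document,
                executed[i] = True     # so a later qi may now match it too

    # After Q is exhausted, upsert any update that never matched a document.
    for (qi, ui, upsert), done in zip(updates, executed):
        if not done and upsert:
            new_doc = dict(qi)       # seed the upserted doc from the predicate
            apply_update(new_doc, ui)
            collection.append(new_doc)
    return collection
```

The inner loop over all N predicates per document is where the O(N^2) cost discussed below comes from.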
This might save us the overhead of running N separate queries (and possibly multi-planning each one separately), but it also forces us to apply all N predicates to every candidate document, leading to O(N^2) complexity.
For small N (<= 10), however, it might still be beneficial.
It could also improve the performance of write operations like $merge.
Related issues:
- is related to SERVER-91191: Re-use CollectionAcquisition in write_ops_exec::performUpdates (Backlog)