Type: New Feature
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: M0
Assigned Teams: Query Execution
Sprint: Execution Team 2024-11-11, Execution Team 2024-11-25

Investigate which fields need to be pruned and regenerated from a recorded query to allow it to be replayed by a client. Cover interesting variations of queries which read or modify user data, e.g.:

find, findAndModify
aggregate
insert
remove

For example, a recorded aggregate:

{
    "aggregate": "sharded",
    "pipeline": [
        {
            "$match": {
                "a": {
                    "$gte": 0.0
                }
            }
        }
    ],
    "cursor": {},
    "lsid": {
        "id": {
            "$uuid": "5e8a24d38c7b4656959b93350a82b5d0"
        }
    },
    "$clusterTime": {
        "clusterTime": {
            "$timestamp": {
                "t": 1730306867,
                "i": 1
            }
        },
        "signature": {
            "hash": {
                "$binary": "AAAAAAAAAAAAAAAAAAAAAAAAAAA=",
                "$type": "00"
            },
            "keyId": 0
        }
    },
    "$db": "test"
}

Only a subset of the fields here would be correct to use in a replayed query.
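
As a rough illustration of the pruning step, the sketch below drops top-level fields that a replaying client would presumably regenerate itself. The deny-list (lsid, $clusterTime, $db) is an assumption based on the example above, not a confirmed result of the investigation, and prune_for_replay is a hypothetical helper name. Python dicts stand in for BSONObj here only to keep the sketch self-contained; the actual POC targets the C++ driver.

```python
# Hypothetical deny-list: fields assumed to be session- or cluster-specific,
# which the replaying client or driver is expected to supply again.
REGENERATED_FIELDS = {"lsid", "$clusterTime", "$db"}


def prune_for_replay(command: dict) -> tuple:
    """Return (database name, command without the regenerated fields).

    The input is not modified. Key order is preserved, so the command
    name (e.g. "aggregate") stays the first field of the result.
    """
    db_name = command.get("$db")
    pruned = {k: v for k, v in command.items() if k not in REGENERATED_FIELDS}
    return db_name, pruned


# The recorded aggregate from the example above, abbreviated.
recorded = {
    "aggregate": "sharded",
    "pipeline": [{"$match": {"a": {"$gte": 0.0}}}],
    "cursor": {},
    "lsid": {"id": {"$uuid": "5e8a24d38c7b4656959b93350a82b5d0"}},
    "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1730306867, "i": 1}}},
    "$db": "test",
}

db_name, replayable = prune_for_replay(recorded)
print(db_name, list(replayable))
```

The database name is returned separately because the replaying client still needs it to choose where to run the command, even if $db itself is not sent back verbatim.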

Workload recording is not yet implemented, so initial exploration will require "manually" collecting example queries, e.g., by logging RequestExecutionContext::_request at some point during the life of a query.

Then investigate how easily the C++ driver can be used to replay such queries, with a simple POC to inform later work. Assume queries will be provided as BSONObj.

It may be the case that the driver can be given the query object directly (with unsuitable fields removed), or that the easiest path is an if/else if chain that rebuilds the query using the normal C++ driver methods (bearing in mind that queries may have provided .limit(...), .hint(...) and so on).
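
The second option, an if/else if chain keyed on the command name, could look roughly like the sketch below. It only builds a description of the call to make (operation, collection, main argument, options) rather than invoking a driver, so it stays self-contained. plan_replay and the tuple layout are hypothetical, the set of options carried over is an assumption, and the shell's remove is assumed to arrive as the server's delete command.

```python
# Options on a recorded find that are assumed to map onto driver-side
# setters such as .limit(...) and .hint(...).
FIND_OPTIONS = ("limit", "hint", "sort", "skip", "projection")


def plan_replay(command: dict) -> tuple:
    """Map a pruned command to (operation, collection, argument, options)."""
    # The command name is the first key and its value is the collection name.
    name = next(iter(command))
    collection = command[name]

    if name == "find":
        options = {k: command[k] for k in FIND_OPTIONS if k in command}
        return ("find", collection, command.get("filter", {}), options)
    elif name == "aggregate":
        return ("aggregate", collection, command["pipeline"], {})
    elif name == "insert":
        return ("insert", collection, command["documents"], {})
    elif name == "delete":
        return ("delete", collection, command["deletes"], {})
    else:
        # Anything not handled yet (e.g. findAndModify, getMore) is reported
        # rather than silently replayed incorrectly.
        raise ValueError(f"unsupported command for replay: {name}")


plan = plan_replay({"find": "sharded", "filter": {"a": 1}, "limit": 5})
print(plan)
```

A chain like this needs one branch per command and per option, which is the maintenance cost to weigh against passing the pruned object to the driver directly.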

Note: there will be "inter-query relationships", e.g., a getMore for a particular cursor following a find. Directly replaying the getMore is likely to fail, as the cursor info will differ during replay. This does not need to be considered yet; later work will investigate and address replaying a series of queries "properly".

Assignee: Nicola Cabiddu

Reporter: James Harrison

Participants: James Harrison, Nicola Cabiddu

Votes: 0

Watchers: 2

Created: Nov 04 2024 02:36:27 PM UTC

Updated: Nov 11 2024 08:20:54 PM UTC
