The task is to add support for readConcern level snapshot to the sharded aggregate command. The implementation assumes that the shards also support it by establishing snapshots at the passed atClusterTime argument.
Implementation:
1. Compute atClusterTime (ACT): SERVER-33027
The algorithm: compute the greatest lastCommittedOpTime from the targeted shards.
The function should be added to cluster_commands_helpers.h, as it will be used by other cluster commands.
/**
 * Compute the greatest lastCommittedOpTime from the targeted shards.
 */
LogicalTime computeAtClusterTime(OperationContext* opCtx, std::set<ShardId> shardIds);
The call to this function should be made once the targeted shards are determined:
https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L393
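A minimal sketch of the "greatest lastCommittedOpTime" computation from step 1, not the server implementation. It assumes stand-in types: ShardId is a plain string, LogicalTime is an integer, and the per-shard lastCommittedOpTime values (the API from SERVER-33016) are modeled as a hypothetical lookup table passed in place of OperationContext.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Stand-in types for illustration only; the real ShardId and LogicalTime
// classes in the MongoDB source tree have richer interfaces.
using ShardId = std::string;
using LogicalTime = std::uint64_t;

// Hypothetical table standing in for the per-shard lastCommittedOpTime API.
using LastCommittedTimes = std::map<ShardId, LogicalTime>;

// Returns the greatest lastCommittedOpTime among the targeted shards.
// Shards missing from the table are skipped; returns 0 if none are known.
LogicalTime computeAtClusterTime(const LastCommittedTimes& lastCommitted,
                                 const std::set<ShardId>& shardIds) {
    LogicalTime atClusterTime = 0;
    for (const auto& shardId : shardIds) {
        auto it = lastCommitted.find(shardId);
        if (it != lastCommitted.end()) {
            atClusterTime = std::max(atClusterTime, it->second);
        }
    }
    return atClusterTime;
}
```

Only the targeted shards contribute to the result, so a shard that the aggregate does not touch cannot push the ACT forward.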
2. Verify targeting
Once the ACT is computed, verify that the targeted shards owned the chunks at the ACT. This will use the multi-version routing table. The function should be added to cluster_commands_helpers.h.
/**
 * Verifies that the shardIds are the same as they were at atClusterTime, using the versioned routing table.
 */
bool verifyTargetedShardsAtClusterTime(OperationContext* opCtx, std::set<ShardId> shardIds, LogicalTime atClusterTime);
If the function returns false, use the current cluster time on mongos.
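The check in step 2 and its fallback can be sketched as follows. This is an illustration under assumptions, not the server code: the multi-version routing table is modeled as a hypothetical map from the cluster time at which a routing version took effect to the set of shards targeted from then on, and chooseSnapshotTime is an invented helper name for the fallback rule.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <set>
#include <string>

// Stand-in types for illustration only.
using ShardId = std::string;
using LogicalTime = std::uint64_t;

// Hypothetical multi-version routing table: the key is the cluster time at
// which a routing version took effect, the value is the shard set it targets.
using RoutingHistory = std::map<LogicalTime, std::set<ShardId>>;

// Returns true if the routing version in effect at atClusterTime targets
// exactly the given shards.
bool verifyTargetedShardsAtClusterTime(const RoutingHistory& history,
                                       const std::set<ShardId>& shardIds,
                                       LogicalTime atClusterTime) {
    // First version that took effect strictly after atClusterTime.
    auto it = history.upper_bound(atClusterTime);
    if (it == history.begin()) {
        return false;  // No routing information at or before atClusterTime.
    }
    // The preceding entry is the version in effect at atClusterTime.
    return std::prev(it)->second == shardIds;
}

// Hypothetical helper: keep the computed ACT if targeting is still valid at
// that time, otherwise fall back to the current cluster time on mongos.
LogicalTime chooseSnapshotTime(const RoutingHistory& history,
                               const std::set<ShardId>& shardIds,
                               LogicalTime computedAct,
                               LogicalTime currentClusterTime) {
    return verifyTargetedShardsAtClusterTime(history, shardIds, computedAct)
        ? computedAct
        : currentClusterTime;
}
```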
3. Amend the command objects sent to individual shards per the API:
lsid, txnNumber, autocommit: true, atClusterTime: ACT
https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L426 is the call site; createCommandForTargetedShards needs to add the missing fields.
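As a rough illustration of step 3, the sketch below appends the four listed fields to a toy command. Everything here is an assumption made for the example: the command is modeled as a flat map of field names to rendered strings rather than a BSON object, SnapshotReadInfo and amendCommandForShard are hypothetical names, and the flat placement of atClusterTime simply mirrors the field list above rather than the actual wire format.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in type for illustration only.
using LogicalTime = std::uint64_t;

// Toy command representation: field name -> rendered value.
// The real code builds BSON objects instead.
using CommandFields = std::map<std::string, std::string>;

// Hypothetical bundle of the session and snapshot information to attach.
struct SnapshotReadInfo {
    std::string lsid;
    std::int64_t txnNumber;
    LogicalTime atClusterTime;
};

// Returns a copy of the command with the snapshot-read fields added.
// Existing fields of the command are preserved.
CommandFields amendCommandForShard(CommandFields cmd, const SnapshotReadInfo& info) {
    cmd["lsid"] = info.lsid;
    cmd["txnNumber"] = std::to_string(info.txnNumber);
    cmd["autocommit"] = "true";
    cmd["atClusterTime"] = std::to_string(info.atClusterTime);
    return cmd;
}
```

Taking the command by value keeps the original untouched, so the same base command can be amended once and sent to each targeted shard.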
4. Add error handling
A snapshot read may return an error of the SnapshotError class. It must cause the read attempt to restart, up to the configured number of retries.
Catch the error here: https://github.com/mongodb/mongo/blob/r3.7.2/src/mongo/s/commands/cluster_aggregate.cpp#L316
Make sure that aggPassthrough, which calls https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/s/client/shard.cpp#L154, retries, by changing https://github.com/mongodb/mongo/blob/r3.7.3/src/mongo/s/client/shard_remote.cpp#L103
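The restart behavior in step 4 amounts to a bounded retry loop. The sketch below assumes a hypothetical SnapshotError exception type standing in for the SnapshotError error class and an invented helper name, runWithSnapshotRetry; the server reports errors through status codes, so this only illustrates the control flow.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical exception standing in for the SnapshotError error class.
struct SnapshotError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Runs readAttempt and restarts it whenever it throws SnapshotError.
// maxRetries is the number of restarts allowed after the first attempt;
// once they are exhausted the last SnapshotError is rethrown.
// Any other exception propagates immediately.
template <typename Result>
Result runWithSnapshotRetry(const std::function<Result()>& readAttempt, int maxRetries) {
    for (int attempt = 0;; ++attempt) {
        try {
            return readAttempt();
        } catch (const SnapshotError&) {
            if (attempt >= maxRetries) {
                throw;
            }
            // A real implementation would recompute atClusterTime and
            // re-target the shards before the next attempt.
        }
    }
}
```

The result type has to be given explicitly when passing a lambda, for example runWithSnapshotRetry<int>(attempt, 3).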
Testing
Add integration tests that validate that the aggregate command returns data from the snapshot, e.g.:
send the command with batch size 1 to establish the cursors
perform a few inserts
send getMore; it should not return the inserted data, as that data is in a different snapshot.
Depends on:
- SERVER-33016 API to get/set lastCommittedOpTime on Shard (Closed)
- SERVER-33027 compute atClusterTime (Closed)
- SERVER-33062 Amend command with readConcern atClusterTime (Closed)
- SERVER-33702 Move sessionId and txnNumber addition from ShardingTaskExecutor::scheduleRemoteCommand (Closed)
Is related to:
- SERVER-34014 Add unit tests to cluster aggregate (Closed)
Related to:
- SERVER-33683 Allow aggregation $mergeCursors stage to run inside a transaction (Closed)