Type: Task
Resolution: Duplicate
Priority: Major - P3
Affects Version/s: None
Component/s: Sharding
Sharding 2021-07-26
2
getDestinedRecipient() is called on the write path for inserts, updates, and deletes. Special care has been needed to avoid introducing a performance regression on non-shardsvrs and on collections in sharded clusters that are not undergoing a resharding operation (SERVER-52974, SERVER-53678, SERVER-53679). More care is still needed.
Converting getDestinedRecipient() into a class will make it more straightforward to state explicitly which components must be lazily initialized and which must be cached, so that introducing the resharding feature is performance neutral when it is not in active use.
class ReshardingDonorWriteRouter {
public:
    ReshardingDonorWriteRouter(OperationContext* opCtx,
                               const NamespaceString& sourceNss,
                               CatalogCache* catalogCache);

    ReshardingDonorWriteRouter(OperationContext* opCtx,
                               const NamespaceString& sourceNss,
                               CatalogCache* catalogCache,
                               CollectionShardingState* css,
                               const ScopedCollectionDescription* collDesc);

    CollectionShardingState* getCollectionShardingState() const;

    boost::optional<ShardId> getDestinedRecipient(const BSONObj& fullDocument) const;

private:
    CollectionShardingState* const _css;
    const ScopedCollectionDescription* const _collDesc;

    boost::optional<ScopedCollectionFilter> _ownershipFilter;
    boost::optional<ShardKeyPattern> _reshardingKeyPattern;
    boost::optional<ChunkManager> _tempReshardingChunkMgr;
};
Something along the above lines would be my proposal. It has the following properties:
- OpObserverImpl doesn't get the CollectionShardingState or ScopedCollectionDescription when the mongod isn't a shardsvr. The 3-argument constructor would use nullptr for both when ShardingState::enabled() returns false.
- UpdateStage already gets the CollectionShardingState and ScopedCollectionDescription. The 5-argument constructor would use those values to avoid getting them a second time.
- OpObserverImpl::onInserts() would construct the ReshardingDonorWriteRouter once outside of the for loop. ReshardingDonorWriteRouter would have cached the ScopedCollectionFilter, ShardKeyPattern, and ChunkManager upon construction so calling ReshardingDonorWriteRouter::getDestinedRecipient() in a loop won't do extra work.
- repl::logInsertOps() would be changed to take a const ReshardingDonorWriteRouter& as an argument.
- All OpObserverImpl methods would use ReshardingDonorWriteRouter::getCollectionShardingState() to avoid getting the CollectionShardingState from the map a second time. Note that when the mongod isn't a shardsvr, ReshardingDonorWriteRouter::getCollectionShardingState() will return nullptr. This is acceptable because OpObserverShardingImpl won't have been registered for non-shardsvrs either, so the CollectionShardingState* pointer will never be dereferenced.
- ReshardingDonorWriteRouter would use TypeCollectionDonorFields::getTempReshardingNss() rather than calling constructTemporaryReshardingNss().
- ReshardingDonorWriteRouter would be moved into libsharding_api_d (from libresharding_util).
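The properties listed above can be illustrated with a small, self-contained sketch. It is not the server implementation: Doc, CollectionShardingState, lookupCss(), DonorWriteRouterSketch, routeBatch(), the splitPoint field, and the recipient names are all hypothetical stand-ins, and a single integer boundary takes the place of the cached ScopedCollectionFilter, ShardKeyPattern, and ChunkManager. It only shows the shape of the idea: one constructor performs the lookup itself (or skips it when sharding is disabled), the other reuses a pointer the caller already holds, and the per-document call works from state cached at construction.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the server types; none of this is the real API.
struct Doc { int shardKey; };
struct CollectionShardingState { bool reshardingActive; int splitPoint; };

// Counts how often the simulated "get the CollectionShardingState" lookup runs.
static int g_cssLookups = 0;

CollectionShardingState* lookupCss(CollectionShardingState& registry) {
    ++g_cssLookups;
    return &registry;
}

class DonorWriteRouterSketch {
public:
    // Analogue of the 3-argument constructor: does the lookup itself and
    // uses nullptr when sharding is not enabled (the non-shardsvr case).
    DonorWriteRouterSketch(bool shardingEnabled, CollectionShardingState& registry)
        : DonorWriteRouterSketch(shardingEnabled ? lookupCss(registry) : nullptr) {}

    // Analogue of the 5-argument constructor: reuses a pointer the caller
    // already obtained, so no second lookup happens.
    explicit DonorWriteRouterSketch(CollectionShardingState* css) : _css(css) {
        if (_css && _css->reshardingActive) {
            _splitPoint = _css->splitPoint;  // cached once, at construction
        }
    }

    CollectionShardingState* getCollectionShardingState() const { return _css; }

    // Cheap per-document call: reads only the cached state.
    std::optional<std::string> getDestinedRecipient(const Doc& doc) const {
        assert(!_splitPoint || _css);  // a cached boundary implies sharding state existed
        if (!_splitPoint) {
            return std::nullopt;  // not resharding, or not a shardsvr
        }
        return doc.shardKey < *_splitPoint ? std::string("recipientA")
                                           : std::string("recipientB");
    }

private:
    CollectionShardingState* const _css;
    std::optional<int> _splitPoint;
};

// Analogue of constructing the router once outside the insert loop and
// passing it by const reference to the code that processes each document.
std::vector<std::optional<std::string>> routeBatch(const DonorWriteRouterSketch& router,
                                                   const std::vector<Doc>& docs) {
    std::vector<std::optional<std::string>> out;
    out.reserve(docs.size());
    for (const Doc& d : docs) {
        out.push_back(router.getDestinedRecipient(d));
    }
    return out;
}
```

In this sketch, routing a batch of any size triggers at most one lookup, and a router built with sharding disabled triggers none and returns no recipient for every document.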
duplicates
- SERVER-58914 Create ReshardingDonorWriteRouter class with function stubs (Closed)
- SERVER-58915 Implement ReshardingDonorWriteRouter functionality along with unit tests (Closed)
- SERVER-58918 Replace getDestinedRecipient() in the code-base with calls into the ReshardingDonorWriteRouter object (Closed)
is related to
- SERVER-52974 Checking if destined recipient has changed for resharding creates another full copy of the updated document (Closed)
- SERVER-53678 No-op for filling in destined recipient for insert oplog entries adds overhead on non-shardsvrs (Closed)
- SERVER-53679 No-op for filling in destined recipient for insert oplog entries adds overhead on shardsvrs not running resharding (Closed)