- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Internal Code
- Replication
- Fully Compatible
- v8.0
- Repl 2024-04-29, Repl 2024-05-13
- 200
We can use the following construct to store _rsConfig for the replication coordinator, allowing operations to query the configuration lock-free.
class SharedReplSetConfig {
public:
    struct Lease {
    public:
        Lease(uint64_t version, std::shared_ptr<ReplSetConfig> config)
            : _version(version), _config(std::move(config)) {}

        uint64_t version() const {
            return _version;
        }

        ReplSetConfig& config() const {
            return *_config;
        }

    private:
        uint64_t _version;
        std::shared_ptr<ReplSetConfig> _config;
    };

    SharedReplSetConfig() : _current(std::make_shared<ReplSetConfig>()) {}

    Lease renew() {
        auto readLock = _rwMutex.readLock();
        return Lease(_version.load(), _current);
    }

    bool isStale(Lease& lease) const {
        return _version.load() != lease.version();
    }

    void update(std::shared_ptr<ReplSetConfig> newConfig) {
        auto writeLk = _rwMutex.writeLock();
        _version.fetchAndAdd(1);
        _current = std::move(newConfig);
    }

private:
    WriteRarelyRWMutex _rwMutex;
    Atomic<uint64_t> _version{0};
    std::shared_ptr<ReplSetConfig> _current;
};
Any thread that wants to update the configuration must continue to hold the replication coordinator mutex to synchronize with other writers, and then call `SharedReplSetConfig::update` to publish the new configuration.
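The writer-side rule can be sketched as follows. This is a minimal illustration and not the server code: `Config`, `SharedConfig`, `Coordinator`, and `bumpTerm` are hypothetical names, `std::shared_mutex` (C++17) stands in for `WriteRarelyRWMutex`, and `std::atomic` stands in for `Atomic`. It shows why the outer mutex is still needed: a writer typically copies the current configuration, modifies the copy, and publishes it, and that read-modify-write sequence must not interleave with another writer.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for ReplSetConfig.
struct Config {
    int term = 0;
    std::string name = "rs0";
};

// Simplified holder; std::shared_mutex stands in for WriteRarelyRWMutex.
class SharedConfig {
public:
    SharedConfig() : _current(std::make_shared<const Config>()) {}

    std::shared_ptr<const Config> get() const {
        std::shared_lock<std::shared_mutex> lk(_rwMutex);
        return _current;
    }

    std::uint64_t version() const {
        return _version.load();
    }

    // Publishes a new configuration. Callers must already be serialized
    // against other writers; this method only protects readers.
    void update(std::shared_ptr<const Config> next) {
        std::unique_lock<std::shared_mutex> lk(_rwMutex);
        _version.fetch_add(1);
        _current = std::move(next);
    }

private:
    mutable std::shared_mutex _rwMutex;
    std::atomic<std::uint64_t> _version{0};
    std::shared_ptr<const Config> _current;
};

// Hypothetical coordinator; _writerMutex plays the role of the
// replication coordinator mutex.
class Coordinator {
public:
    // Read-modify-write: copy the current config, change the copy, publish it.
    void bumpTerm() {
        std::lock_guard<std::mutex> lk(_writerMutex);  // serializes writers
        auto next = std::make_shared<Config>(*_shared.get());
        next->term += 1;
        _shared.update(std::move(next));
    }

    int term() const {
        return _shared.get()->term;
    }

    std::uint64_t version() const {
        return _shared.version();
    }

private:
    std::mutex _writerMutex;
    SharedConfig _shared;
};
```

Without `_writerMutex`, two concurrent `bumpTerm` calls could both copy the same starting configuration and one increment would be lost, even though each individual `update` call is internally consistent.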
As for reading from the configuration, I propose the following:
- Create a thread-local variable of type SharedReplSetConfig::Lease that caches the latest configuration known to that thread.
- On its first attempt to read the configuration, each thread calls renew to populate its local cache.
- For all later reads, each thread calls isStale with its cached lease. If the lease is not stale, the thread can use the cached configuration without acquiring any locks. On the very rare occasion that the lease is stale, the thread calls renew to refresh its cached configuration.
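The read path described in the steps above can be sketched in standard C++. This is a hedged illustration under stated assumptions, not the proposed server code: `Config`, `SharedConfig`, `tlsLease`, `tlsRenewals`, and `readConfig` are hypothetical names, `std::shared_mutex` (C++17) stands in for `WriteRarelyRWMutex`, and the sketch assumes a single `SharedConfig` instance per process so that one thread-local lease is enough. An empty default lease is treated as stale so that the first read on each thread triggers a renew.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <utility>

// Hypothetical stand-in for ReplSetConfig.
struct Config {
    bool localHostAllowed = false;
};

class SharedConfig {
public:
    class Lease {
    public:
        Lease() = default;  // empty lease: nothing cached yet
        Lease(std::uint64_t version, std::shared_ptr<const Config> config)
            : _version(version), _config(std::move(config)) {}

        bool empty() const { return !_config; }
        std::uint64_t version() const { return _version; }
        const Config& config() const {
            assert(_config);
            return *_config;
        }

    private:
        std::uint64_t _version = 0;
        std::shared_ptr<const Config> _config;
    };

    SharedConfig() : _current(std::make_shared<const Config>()) {}

    // Slow path: takes the read lock and snapshots version and config together.
    Lease renew() const {
        std::shared_lock<std::shared_mutex> lk(_rwMutex);
        return Lease(_version.load(), _current);
    }

    // Fast path: a single atomic load, no lock.
    bool isStale(const Lease& lease) const {
        return lease.empty() || _version.load() != lease.version();
    }

    void update(std::shared_ptr<const Config> next) {
        std::unique_lock<std::shared_mutex> lk(_rwMutex);
        _version.fetch_add(1);
        _current = std::move(next);
    }

private:
    mutable std::shared_mutex _rwMutex;
    std::atomic<std::uint64_t> _version{0};
    std::shared_ptr<const Config> _current;
};

// Per-thread cache of the last configuration this thread has seen.
thread_local SharedConfig::Lease tlsLease;
// Counts slow-path renewals on this thread (for illustration only).
thread_local int tlsRenewals = 0;

// Returns the cached configuration, renewing the lease only when stale.
// The reference stays valid until this thread next renews its lease.
const Config& readConfig(const SharedConfig& shared) {
    if (shared.isStale(tlsLease)) {
        tlsLease = shared.renew();
        ++tlsRenewals;
    }
    return tlsLease.config();
}
```

In this sketch a thread pays for the read lock once at its first read and once after each `update`; every other call to `readConfig` costs one atomic load and a comparison. Because the lease holds a `shared_ptr`, a superseded configuration stays alive until the last thread holding it renews.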
Here is an example of adopting the above in replication_coordinator_impl.cpp:
bool ReplicationCoordinatorImpl::isConfigLocalHostAllowed() const {
    auto& myLease = fromThreadLocalConfigCache();
    // Slow path: taken only after the configuration has been updated.
    if (MONGO_unlikely(_sharedReplSetConfig.isStale(myLease))) {
        myLease = _sharedReplSetConfig.renew();
        setThreadLocalConfigCache(myLease);
    }
    return myLease.config().isLocalHostAllowed();
}
is duplicated by
- SERVER-87020 Use hazard pointers for reading replica-set configuration (Closed)
is related to
- SERVER-96144 Possible race condition when reading replica set config fields outside of replication (Backlog)
related to
- SERVER-96142 SharedReplSetConfig::setConfig() should use WithLock (Backlog)
- SERVER-90314 Ensure that _inlock methods in ReplicationCoordinatorImpl take the lock (Closed)
split from
- SERVER-88818 Evaluate the performance benefits of hazard pointers (Closed)