Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- repl-modularity

Assigned Teams:

Replication
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

While investigating how the replica set config is written and read for ~~SERVER-96005~~, we noticed a potential race condition in the config access pattern from higher layers.

Currently, replica set config reads are lock-free (~~SERVER-89631~~). When getConfig() is called on the replication coordinator, it checks whether the cached config is stale and, if so, refreshes the cached value. The replication coordinator's field-specific config getters are built on this getter. The config itself is updated under the replication coordinator mutex. Together, this ensures that each call to a config field getter sees the most recent value at the time of the call. Within replication, there is also a getConfig(WithLock) overload that lets callers ensure that fields remain unchanged until the lock is released. If a function relies on a particular config field staying the same, it should use this overload and read the desired field directly off the config object reference.
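To make the two access paths concrete, here is a minimal sketch of a cached, generation-checked read next to a lock-holding read. It is not the server's implementation: `Config`, `SharedConfig`, `Cache`, the generation counter, and this simplified `WithLock` token are all hypothetical stand-ins chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the replica set config.
struct Config {
    long long version = 0;
    std::vector<std::string> memberHosts;
};

// Hypothetical token type: evidence that the caller holds the config mutex.
class WithLock {
public:
    explicit WithLock(const std::unique_lock<std::mutex>& lk) {
        assert(lk.owns_lock());
    }
};

class SharedConfig {
public:
    // Reader-side cache (assumed here to be owned by one reader at a time).
    struct Cache {
        std::shared_ptr<const Config> snapshot;
        std::uint64_t generation = 0;
    };

    // Writers install a new config under the mutex and bump the generation.
    void set(Config next) {
        std::lock_guard<std::mutex> lk(_mutex);
        _current = std::make_shared<const Config>(std::move(next));
        _generation.fetch_add(1, std::memory_order_release);
    }

    // Fast path: a single atomic load when the cache is fresh.
    // Slow path: refresh the cache under the mutex when it is stale.
    const Config& get(Cache& cache) const {
        const auto seen = _generation.load(std::memory_order_acquire);
        if (!cache.snapshot || cache.generation != seen) {
            std::lock_guard<std::mutex> lk(_mutex);
            cache.snapshot = _current;
            cache.generation = _generation.load(std::memory_order_relaxed);
        }
        return *cache.snapshot;
    }

    // Callers that need a stable view take the lock explicitly.
    std::unique_lock<std::mutex> lock() const {
        return std::unique_lock<std::mutex>(_mutex);
    }

    // The config cannot be replaced while the caller's lock is held.
    const Config& get(WithLock) const {
        return *_current;
    }

private:
    mutable std::mutex _mutex;
    std::shared_ptr<const Config> _current = std::make_shared<const Config>();
    std::atomic<std::uint64_t> _generation{0};
};
```

In this sketch, `get(cache)` is only current as of the call, while `get(WithLock(lk))` stays consistent for as long as `lk` is held. That is the distinction the paragraph above relies on.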

However, there are cases where higher layers rely on replica set config fields and use the getter functions to retrieve them. One example is in addShard, which validates the hostname field of the replica set members, but nothing prevents the command from racing with a concurrent reconfig that modifies this field.
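The race can be reproduced deterministically in a small model. The sketch below is an illustration under stated assumptions, not the addShard code: `Coordinator`, `getMemberHosts`, `withStableConfig`, and the `interleavedReconfig` hook (which stands in for a concurrent reconfig landing between check and use) are all hypothetical names.

```cpp
#include <algorithm>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the replica set config.
struct Config {
    std::vector<std::string> memberHosts;
};

struct Outcome {
    bool passedCheck;  // host was present at time of check
    bool validAtUse;   // host was still present at time of use
};

bool hasHost(const std::vector<std::string>& hosts, const std::string& host) {
    return std::find(hosts.begin(), hosts.end(), host) != hosts.end();
}

class Coordinator {
public:
    explicit Coordinator(Config initial) : _config(std::move(initial)) {}

    // Field getter: locks briefly and returns a copy, so each call is
    // current only as of the moment it runs.
    std::vector<std::string> getMemberHosts() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _config.memberHosts;
    }

    void reconfig(Config next) {
        std::lock_guard<std::mutex> lk(_mutex);
        _config = std::move(next);
    }

    // Runs fn against one config while holding the mutex, so no reconfig
    // can land between the reads that fn performs.
    template <typename Fn>
    auto withStableConfig(Fn&& fn) const {
        std::lock_guard<std::mutex> lk(_mutex);
        return fn(_config);
    }

private:
    mutable std::mutex _mutex;
    Config _config;
};

// Racy pattern: check and use go through two separate getter calls.
Outcome racyCheckThenUse(Coordinator& coord,
                         const std::string& host,
                         const std::function<void()>& interleavedReconfig) {
    Outcome out{false, false};
    out.passedCheck = hasHost(coord.getMemberHosts(), host);  // time of check
    interleavedReconfig();  // a concurrent reconfig lands here
    out.validAtUse =
        out.passedCheck && hasHost(coord.getMemberHosts(), host);  // time of use
    return out;
}

// Stable pattern: check and use read the same config under one lock.
Outcome stableCheckThenUse(const Coordinator& coord, const std::string& host) {
    return coord.withStableConfig([&](const Config& cfg) {
        const bool ok = hasHost(cfg.memberHosts, host);
        return Outcome{ok, ok && hasHost(cfg.memberHosts, host)};
    });
}
```

With a reconfig that removes the host passed as the hook, `racyCheckThenUse` passes the check but finds the host gone at use. `stableCheckThenUse` can only observe the config from before or after the reconfig, never a mix. Whether higher layers should hold the coordinator mutex at all is part of the API question this ticket raises, so the second function is one possible shape rather than a proposed design.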

This ticket is to think broadly about how we can refactor our config API so that we limit higher layer access to only the most necessary fields, and address this time-of-check to time-of-use bug in the process.

related to

SERVER-96005 Delete unused getConfig* methods

Closed

SERVER-96142 SharedReplSetConfig::setConfig() should use WithLock

Closed

SERVER-89631 Make reading `ReplSetConfig` mostly lock-free

Closed

Assignee:: Unassigned
Reporter:: Ali Mir
Participants:: Ali Mir
Votes:: 0
Watchers:: 6

Created:: Oct 23 2024 07:33:47 PM UTC
Updated:: Oct 28 2024 05:42:26 PM UTC
